class documentation

class LxmlLinkExtractor(FilteringLinkExtractor):

Undocumented

Method extract_links Returns a list of scrapy.link.Link objects from the specified response.
Method __init__ Undocumented

Inherited from FilteringLinkExtractor:

Method __new__ Undocumented
Method _extract_links Undocumented
Method _link_allowed Undocumented
Method _process_links Undocumented
Method matches Undocumented
Class Variable _csstranslator Undocumented
Instance Variable allow_domains Undocumented
Instance Variable allow_res Undocumented
Instance Variable canonicalize Undocumented
Instance Variable deny_domains Undocumented
Instance Variable deny_extensions Undocumented
Instance Variable deny_res Undocumented
Instance Variable link_extractor Undocumented
Instance Variable restrict_text Undocumented
Instance Variable restrict_xpaths Undocumented
def extract_links(self, response):

Returns a list of scrapy.link.Link objects from the specified response.

Only links that match the settings passed to the __init__ method of the link extractor are returned.

Duplicate links are omitted.

def __init__(self, allow=(), deny=(), allow_domains=(), deny_domains=(), restrict_xpaths=(), tags=('a', 'area'), attrs=('href',), canonicalize=False, unique=True, process_value=None, deny_extensions=None, restrict_css=(), strip=True, restrict_text=None):