package documentation

scrapy.linkextractors

This package contains a collection of Link Extractors.

For more info see docs/topics/link-extractors.rst

Module lxmlhtml Link extractor based on lxml.html

From __init__.py:

Constant IGNORED_EXTENSIONS Undocumented
Class FilteringLinkExtractor Undocumented
Function _is_valid_url Undocumented
Function _matches Undocumented
Variable _re_type Undocumented
IGNORED_EXTENSIONS: list[str] =

Undocumented

Value
['7z',
 '7zip',
 'bz2',
 'rar',
 'tar',
 'tar.gz',
 'xz',
...
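
The page does not say how IGNORED_EXTENSIONS is consumed. As a minimal sketch, assuming the list is meant to skip links that point at non-HTML files, a URL's path could be checked against each extension. The helper name has_ignored_extension and the constant SAMPLE_IGNORED are hypothetical, and SAMPLE_IGNORED holds only the entries visible above, not the full list.

```python
from urllib.parse import urlparse

# Only the entries shown in the truncated listing above; the real
# constant is longer. This shortened copy is for illustration only.
SAMPLE_IGNORED = ["7z", "7zip", "bz2", "rar", "tar", "tar.gz", "xz"]


def has_ignored_extension(url, extensions=SAMPLE_IGNORED):
    """Return True if the URL path ends with one of the extensions.

    Hypothetical helper, not part of scrapy.linkextractors.
    """
    # Compare on the path only, so query strings and fragments are
    # ignored, and lowercase it so "FILE.RAR" matches "rar".
    path = urlparse(url).path.lower()
    # endswith() is used instead of os.path.splitext() so that
    # multi-part extensions such as "tar.gz" match as a whole.
    return any(path.endswith("." + ext) for ext in extensions)
```

Matching on the suffix rather than on a single split extension is a deliberate choice here, because entries such as 'tar.gz' contain a dot themselves.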
_re_type =

Undocumented

def _matches(url, regexs):

Undocumented
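
Since _matches is undocumented here, the following is only a sketch of one plausible reading of the signature: regexs is assumed to be an iterable of compiled patterns, and the function is assumed to report whether any of them is found in url. The name matches_any is hypothetical and chosen to avoid suggesting this is the package's own code.

```python
import re


def matches_any(url, regexs):
    """Return True if at least one compiled pattern is found in url.

    Hypothetical stand-in for illustration; the documented page does
    not confirm this behaviour for _matches.
    """
    # search() looks for the pattern anywhere in the string, unlike
    # match(), which anchors at the start.
    return any(r.search(url) for r in regexs)


patterns = [re.compile(r"/blog/"), re.compile(r"\.pdf$")]
print(matches_any("https://example.com/blog/post-1", patterns))
print(matches_any("https://example.com/about", patterns))
```

Under this reading an empty pattern list matches nothing, so a caller that wants "no patterns means allow everything" has to handle that case itself.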

def _is_valid_url(url):

Undocumented
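
The page gives no description of _is_valid_url either. A minimal sketch, assuming validity means "the URL uses a scheme a crawler can fetch", could look like the code below. The name is_valid_url and the ALLOWED_SCHEMES set are assumptions made for this example.

```python
# Assumed set of fetchable schemes; not taken from the page above.
ALLOWED_SCHEMES = {"http", "https", "file", "ftp"}


def is_valid_url(url):
    """Return True if url starts with an allowed scheme and '://'.

    Hypothetical helper for illustration only.
    """
    # Splitting on "://" once yields the scheme as the first element.
    # Strings without "://" (for example "mailto:..." links) come back
    # whole and therefore fail the membership test.
    scheme = url.split("://", 1)[0]
    return scheme in ALLOWED_SCHEMES
```

A check of this kind filters out links such as mailto: and javascript: before any pattern matching is attempted.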