class SitemapSpider(Spider):
Undocumented
Method | __init__ |
Undocumented |
Method | _get_sitemap_body |
Return the sitemap body contained in the given response, or None if the response is not a sitemap. |
Method | _parse_sitemap |
Undocumented |
Method | sitemap_filter |
This method can be used to filter sitemap entries by their attributes. For example, you can keep only the locs whose lastmod is later than a given date (see docs). |
Method | start_requests |
Undocumented |
Class Variable | sitemap_alternate_links |
Undocumented |
Class Variable | sitemap_follow |
Undocumented |
Class Variable | sitemap_rules |
Undocumented |
Class Variable | sitemap_urls |
Undocumented |
Instance Variable | _cbs |
Undocumented |
Instance Variable | _follow |
Undocumented |
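The `sitemap_filter` description above can be illustrated with a minimal sketch. It assumes that each sitemap entry behaves like a dict keyed by sitemap tag name (`loc`, `lastmod`) and that `lastmod` begins with a `YYYY-MM-DD` date. The mixin name `RecentEntriesMixin` and the attribute `lastmod_cutoff` are hypothetical and not part of Scrapy; they exist only for this example.

```python
from datetime import datetime


class RecentEntriesMixin:
    """Hypothetical mixin: keep sitemap entries modified on or after a cutoff."""

    # Assumed cutoff date, chosen only for illustration.
    lastmod_cutoff = datetime(2020, 1, 1)

    def sitemap_filter(self, entries):
        # `entries` is assumed to be an iterable of dict-like sitemap entries.
        for entry in entries:
            lastmod = entry.get("lastmod")
            if lastmod is None:
                # No date to compare against; this sketch drops such entries.
                continue
            # Only the leading YYYY-MM-DD part is parsed; any time part is ignored.
            modified = datetime.strptime(lastmod[:10], "%Y-%m-%d")
            if modified >= self.lastmod_cutoff:
                yield entry
```

One way this might be used is to list the mixin before the spider base class, for example `class MySpider(RecentEntriesMixin, SitemapSpider)`, so that its `sitemap_filter` overrides the default one. Whether entries without a `lastmod` should be dropped or kept is a design choice; this sketch drops them.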
Inherited from Spider:
Class Method | from_crawler |
Undocumented |
Class Method | handles_request |
Undocumented |
Class Method | update_settings |
Undocumented |
Static Method | close |
Undocumented |
Method | __str__ |
Undocumented |
Method | _parse |
Undocumented |
Method | _set_crawler |
Undocumented |
Method | log |
Log the given message at the given log level. |
Method | make_requests_from_url |
This method is deprecated. |
Method | parse |
Undocumented |
Class Variable | custom_settings |
Undocumented |
Instance Variable | crawler |
Undocumented |
Instance Variable | name |
Undocumented |
Instance Variable | settings |
Undocumented |
Instance Variable | start_urls |
Undocumented |
Property | logger |
Undocumented |
Inherited from object_ref (via Spider):
Method | __new__ |
Undocumented |
Class Variable | __slots__ |
Undocumented |