This is a convenient helper class that keeps track of, manages and runs
crawlers inside an already set up ~twisted.internet.reactor.
The CrawlerRunner object must be instantiated with a
~scrapy.settings.Settings object.
This class shouldn't be needed (since Scrapy is responsible for using it accordingly) unless writing scripts that manually handle the crawling process. See :ref:`run-from-script` for an example.
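A minimal sketch of such a script (QuotesSpider and its start URL are placeholders, not part of Scrapy)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.settings import Settings
    from scrapy.utils.log import configure_logging

    class QuotesSpider(scrapy.Spider):
        # Hypothetical spider, just to have something to run.
        name = "quotes"
        start_urls = ["https://quotes.toscrape.com"]

        def parse(self, response):
            for text in response.css("div.quote span.text::text").getall():
                yield {"text": text}

    configure_logging()  # CrawlerRunner does not install a log handler for you
    runner = CrawlerRunner(Settings())  # instantiate with a Settings object

    d = runner.crawl(QuotesSpider)
    d.addBoth(lambda _: reactor.stop())  # stop the reactor when crawling ends
    reactor.run()  # the script blocks here until the crawling is finished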
Method | crawl | Run a crawler with the provided arguments.
Method | create_crawler | Return a ~scrapy.crawler.Crawler object.
Method | join | Return a deferred that fires when all managed crawlers have completed their executions.
Method | stop | Stop all the crawling jobs taking place simultaneously.
Static Method | _get_spider_loader | Get a SpiderLoader instance from settings.
Method | __init__ | Undocumented
Method | _crawl | Undocumented
Method | _create_crawler | Undocumented
Method | _handle_twisted_reactor | Undocumented
Class Variable | crawlers | Undocumented
Instance Variable | _active | Undocumented
Instance Variable | _crawlers | Undocumented
Instance Variable | bootstrap_failed | Undocumented
Instance Variable | settings | Undocumented
Instance Variable | spider_loader | Undocumented
Property | spiders | Undocumented
Run a crawler with the provided arguments.
It will call the given Crawler's ~Crawler.crawl method, while keeping
track of it so it can be stopped later.
If crawler_or_spidercls isn't a ~scrapy.crawler.Crawler instance, this
method will try to create one, using this parameter as the spider class
for the crawler it creates.
Returns a deferred that is fired when the crawling is finished.
Parameters |
crawler_or_spidercls: ~scrapy.crawler.Crawler instance, ~scrapy.spiders.Spider subclass or string | an already created crawler, or a spider class or spider's name inside the project, from which to create a crawler
*args | arguments to initialize the spider
**kwargs | keyword arguments to initialize the spider
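To illustrate the three accepted forms of crawler_or_spidercls, a sketch (MySpider and the "myspider" name are hypothetical; the string form assumes project settings whose spider loader can resolve that name)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.project import get_project_settings

    class MySpider(scrapy.Spider):
        # Hypothetical spider; assumed to be known to the project as "myspider".
        name = "myspider"

        def __init__(self, category=None, **kwargs):
            super().__init__(**kwargs)
            self.category = category  # receives the keyword passed to crawl()

    runner = CrawlerRunner(get_project_settings())

    d1 = runner.crawl(MySpider, category="books")       # Spider subclass + kwargs
    d2 = runner.crawl("myspider")                       # spider's name in the project
    d3 = runner.crawl(runner.create_crawler(MySpider))  # already created Crawler

    runner.join().addBoth(lambda _: reactor.stop())
    reactor.run()  # each deferred above fires as its crawl finishes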
Return a ~scrapy.crawler.Crawler object.
join()
Returns a deferred that is fired when all managed crawlers have completed their executions.
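A sketch of the usual pattern with join(): schedule several crawls, then stop the reactor only after every one of them has finished (MySpider1 and MySpider2 are hypothetical)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.settings import Settings

    class MySpider1(scrapy.Spider):
        name = "spider1"  # hypothetical

    class MySpider2(scrapy.Spider):
        name = "spider2"  # hypothetical

    runner = CrawlerRunner(Settings())

    # Both crawls run concurrently inside the same reactor.
    runner.crawl(MySpider1)
    runner.crawl(MySpider2)

    # join() returns a deferred that fires only once every managed crawler is done.
    d = runner.join()
    d.addBoth(lambda _: reactor.stop())
    reactor.run()  # blocks until all crawling jobs are finished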