class Scraper:

Scrapy's Scraper component: runs downloaded responses (or failures) through spider callbacks/errbacks and routes the resulting requests and items back to the engine and the item pipeline.

Method __init__ Undocumented
Method _check_if_closing Undocumented
Method _itemproc_finished ItemProcessor finished for the given item and returned output
Method _log_download_errors Log and silence errors that come from the engine (typically download errors that got propagated through here)
Method _process_spidermw_output Process each Request/Item (given in the output parameter) returned from the given spider
Method _scrape Handle the downloaded response or failure through the spider callback/errback
Method _scrape2 Handle the different cases of the request's result being a Response or a Failure
Method _scrape_next Undocumented
Method call_spider Undocumented
Method close_spider Close a spider being scraped and release its resources
Method enqueue_scrape Undocumented
Method handle_spider_error Undocumented
Method handle_spider_output Undocumented
Method is_idle Return True if there are no more spiders to process
Method open_spider Open the given spider for scraping and allocate resources for it
Instance Variable concurrent_items Undocumented
Instance Variable crawler Undocumented
Instance Variable itemproc Undocumented
Instance Variable logformatter Undocumented
Instance Variable signals Undocumented
Instance Variable slot Undocumented
Instance Variable spidermw Undocumented
def __init__(self, crawler):

Undocumented

def _check_if_closing(self, spider, slot):

Undocumented

def _itemproc_finished(self, output, item, response, spider):
ItemProcessor finished for the given item and returned output
def _log_download_errors(self, spider_failure, download_failure, request, spider):
Log and silence errors that come from the engine (typically download errors that got propagated through here)
def _process_spidermw_output(self, output, request, response, spider):
Process each Request/Item (given in the output parameter) returned from the given spider
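The dispatch that _process_spidermw_output performs can be sketched as follows. This is a minimal stand-in, not Scrapy's real implementation: the Request and Item classes and the schedule/process_item hooks are hypothetical placeholders for the engine and item-pipeline wiring.

```python
# Minimal sketch (not Scrapy's actual code): classify each element of a
# spider's output as either a new request to crawl or an item to send
# through the item pipeline. Request and Item are stand-in classes.

class Request:
    def __init__(self, url):
        self.url = url

class Item(dict):
    pass

def process_spider_output(output, schedule, process_item):
    """Dispatch each Request/Item yielded by a spider callback."""
    for element in output:
        if isinstance(element, Request):
            schedule(element)          # hand new requests back to the engine
        elif isinstance(element, (Item, dict)):
            process_item(element)      # feed items into the item pipeline
        elif element is not None:
            raise TypeError(f"unexpected spider output: {element!r}")

# Example: a callback that yielded one follow-up request and one item
scheduled, items = [], []
output = [Request("http://example.com/next"), Item(title="hello")]
process_spider_output(output, scheduled.append, items.append)
```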
def _scrape(self, result, request, spider):
Handle the downloaded response or failure through the spider callback/errback
def _scrape2(self, result, request, spider):
Handle the different cases of the request's result being a Response or a Failure
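The branching that _scrape2 describes can be sketched like this; the Response and Failure classes below are stand-ins for the real scrapy.http.Response and twisted.python.failure.Failure, not the library's own types.

```python
# Sketch of routing a download result: a Response goes to the spider's
# callback, a Failure to its errback. Both classes are simplified stand-ins.

class Response:
    def __init__(self, url, body=b""):
        self.url, self.body = url, body

class Failure:
    def __init__(self, exception):
        self.value = exception

def scrape(result, callback, errback):
    """Route the download result to the appropriate spider hook."""
    if isinstance(result, Response):
        return callback(result)
    return errback(result)

ok = scrape(Response("http://example.com"),
            lambda r: ("ok", r.url), lambda f: ("err", str(f.value)))
err = scrape(Failure(TimeoutError("timed out")),
             lambda r: ("ok", r.url), lambda f: ("err", str(f.value)))
```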
def _scrape_next(self, spider, slot):

Undocumented

def call_spider(self, result, request, spider):

Undocumented

def close_spider(self, spider):
Close a spider being scraped and release its resources
def enqueue_scrape(self, response, request, spider):

Undocumented

def handle_spider_error(self, _failure, request, response, spider):

Undocumented

def handle_spider_output(self, result, request, response, spider):

Undocumented

def is_idle(self):
Return True if there are no more spiders to process
@defer.inlineCallbacks
def open_spider(self, spider):
Open the given spider for scraping and allocate resources for it
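Taken together, open_spider, enqueue_scrape, close_spider and is_idle manage a per-spider "slot" of pending work. A minimal sketch of that bookkeeping, with hypothetical names and none of the Twisted deferred machinery the real implementation uses:

```python
# Minimal sketch of per-spider slot bookkeeping: open_spider allocates a
# queue of pending responses, close_spider releases it, and is_idle reports
# whether any spider still has work. Omits Twisted deferreds and all real
# Scrapy wiring; MiniScraper is a hypothetical illustration.

from collections import deque

class MiniScraper:
    def __init__(self):
        self.slots = {}                    # spider name -> pending queue

    def open_spider(self, spider):
        self.slots[spider] = deque()       # allocate resources

    def enqueue_scrape(self, spider, response):
        self.slots[spider].append(response)

    def close_spider(self, spider):
        del self.slots[spider]             # release resources

    def is_idle(self):
        return not self.slots              # True when no spiders remain

s = MiniScraper()
started_idle = s.is_idle()
s.open_spider("quotes")
s.enqueue_scrape("quotes", "<html>...</html>")
idle_while_open = s.is_idle()
s.close_spider("quotes")
```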
concurrent_items =

Undocumented

crawler =

Undocumented

itemproc =

Undocumented

logformatter =

Undocumented

signals =

Undocumented

slot =

Undocumented

spidermw =

Undocumented