module documentation

An extension to retry failed requests that are potentially caused by temporary problems such as a connection timeout or HTTP 500 error.

You can change the behaviour of this middleware by modifying the scraping settings:

RETRY_TIMES - how many times to retry a failed page

RETRY_HTTP_CODES - which HTTP response codes to retry
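As a minimal sketch, the two settings could be overridden in a project's settings module as shown here. The values are illustrative assumptions, not defaults stated on this page.

```python
# settings.py (illustrative values; not necessarily the library defaults)

# Retry each failed page at most 3 times.
RETRY_TIMES = 3

# HTTP response codes that are treated as temporary failures and retried.
RETRY_HTTP_CODES = [500, 502, 503, 504]
```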

Failed pages are collected during the scraping process and rescheduled at the end, once the spider has finished crawling all regular (non-failed) pages.

Class RetryMiddleware Undocumented
Function get_retry_request Returns a new ~scrapy.Request object to retry the specified request, or None if retries of the specified request have been exhausted.
Variable retry_logger Undocumented
def get_retry_request(request, *, spider, reason='unspecified', max_retry_times=None, priority_adjust=None, logger=retry_logger, stats_base_key='retry'):

Returns a new ~scrapy.Request object to retry the specified request, or None if retries of the specified request have been exhausted.

For example, in a ~scrapy.Spider callback, you could use it as follows:

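from scrapy.downloadermiddlewares.retry import get_retry_request

def parse(self, response):
    # Retry the request when the response body is empty.
    if not response.text:
        new_request_or_none = get_retry_request(
            response.request,
            spider=self,
            reason='empty',
        )
        return new_request_or_none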

spider is the ~scrapy.Spider instance which is asking for the retry request. It is used to access the :ref:`settings <topics-settings>` and :ref:`stats <topics-stats>`, and to provide extra logging context (see logging.debug).

reason is a string or an Exception object that indicates the reason why the request needs to be retried. It is used to name retry stats.

max_retry_times is a number that determines the maximum number of times that request can be retried. If not specified or None, the number is read from the :reqmeta:`max_retry_times` meta key of the request. If the :reqmeta:`max_retry_times` meta key is not defined or None, the number is read from the :setting:`RETRY_TIMES` setting.
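The lookup order described above (explicit argument, then the request meta key, then the setting) can be sketched in plain Python. The helper name resolve_max_retry_times and the use of ordinary dicts for the request meta and the settings are assumptions made for illustration; this is not the library's actual code.

```python
from typing import Any, Mapping, Optional


def resolve_max_retry_times(
    explicit: Optional[int],
    meta: Mapping[str, Any],
    settings: Mapping[str, Any],
) -> int:
    """Hypothetical helper mirroring the documented lookup order."""
    # 1. An explicitly passed value wins (0 is a valid value, so test for None).
    if explicit is not None:
        return explicit
    # 2. Otherwise use the per-request meta key, if it is defined and not None.
    meta_value = meta.get("max_retry_times")
    if meta_value is not None:
        return meta_value
    # 3. Finally fall back to the RETRY_TIMES setting.
    return int(settings["RETRY_TIMES"])
```

For instance, with a setting of RETRY_TIMES = 2, a request whose meta contains max_retry_times = 5 would resolve to 5 unless the caller passes an explicit value.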

priority_adjust is a number that determines how the priority of the new request changes in relation to request. If not specified, the number is read from the :setting:`RETRY_PRIORITY_ADJUST` setting.
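A rough sketch of how such an adjustment might be applied, assuming the adjustment is simply added to the priority of the original request. The helper name retried_priority and the dict-based settings are hypothetical and only serve the illustration.

```python
from typing import Mapping, Optional


def retried_priority(
    current_priority: int,
    priority_adjust: Optional[int],
    settings: Mapping[str, int],
) -> int:
    """Hypothetical helper: compute the priority of the retry request."""
    # Fall back to the RETRY_PRIORITY_ADJUST setting when no value is passed.
    if priority_adjust is None:
        priority_adjust = int(settings["RETRY_PRIORITY_ADJUST"])
    # Assumption: the adjustment is added to the original request's priority.
    return current_priority + priority_adjust
```

Under this assumption, a negative adjustment gives the retry request a lower priority value than the original request, and a positive one gives it a higher value.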

logger is the logging.Logger object to be used when logging messages.

stats_base_key is a string to be used as the base key for the retry-related job stats.
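To illustrate how reason and stats_base_key could combine into stat names, here is a minimal sketch that uses a collections.Counter in place of a real stats collector. The key layout ("<base>/count" and "<base>/reason_count/<reason>"), the helper name record_retry, and the use of the exception class name for Exception reasons are assumptions, not behaviour confirmed by this page.

```python
from collections import Counter
from typing import Union


def record_retry(
    stats: Counter,
    reason: Union[str, Exception],
    stats_base_key: str = "retry",
) -> None:
    """Hypothetical helper: count one retry under keys derived from the base key."""
    # Assumption: an Exception reason is reduced to its class name.
    if isinstance(reason, Exception):
        reason = type(reason).__name__
    # Assumed key layout: an overall counter plus one counter per reason.
    stats[f"{stats_base_key}/count"] += 1
    stats[f"{stats_base_key}/reason_count/{reason}"] += 1


stats = Counter()
record_retry(stats, "empty")
record_retry(stats, TimeoutError("connection timed out"))
record_retry(stats, "empty", stats_base_key="custom_retry")
```

Passing a different stats_base_key, as in the last call, would keep the counters of a custom retry policy separate from those recorded under the default key.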

Parameters
request: Request - Undocumented
spider: Spider - Undocumented
reason: Union[str, Exception] - Undocumented
max_retry_times: Optional[int] - Undocumented
priority_adjust: Optional[int] - Undocumented
logger: Logger - Undocumented
stats_base_key: str - Undocumented
retry_logger =

Undocumented