class documentation

class FilesPipeline(MediaPipeline):

Known subclasses: scrapy.pipelines.images.ImagesPipeline

Abstract pipeline that implements file downloading.

This pipeline tries to minimize network transfers and file processing by checking the status (stat) of stored files and determining whether each file is new, uptodate, or expired.

new files are those that the pipeline has never processed and that need to be downloaded from the supplier site for the first time.
uptodate files are those that the pipeline has already processed and that are still valid.
expired files are those that the pipeline has already processed but that were last modified a long time ago, so reprocessing is recommended to refresh them in case they have changed.
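
The three states above can be illustrated with a minimal sketch. It assumes that EXPIRES is measured in days and that the age of a stored file is derived from its last-modified timestamp; the function name `classify` and its parameters are hypothetical and not part of the pipeline's API.

```python
import time

EXPIRES_DAYS = 90  # mirrors the EXPIRES constant listed on this page (assumed to be days)


def classify(last_modified, now=None, expires_days=EXPIRES_DAYS):
    """Return 'new', 'uptodate' or 'expired' for a stored file.

    last_modified is a POSIX timestamp, or None when no stored copy exists.
    """
    if last_modified is None:
        # Nothing stored yet: the file must be downloaded for the first time.
        return "new"
    now = time.time() if now is None else now
    age_days = (now - last_modified) / 60 / 60 / 24
    # Older than the expiry window: a fresh download is recommended.
    return "expired" if age_days > expires_days else "uptodate"
```

Only files classified as new or expired would need a network transfer; uptodate files can be skipped.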
Class Method from_settings Undocumented
Method file_path Returns the path where downloaded media should be stored
Method get_media_requests Returns the media requests to download
Method item_completed Called per item when all media requests have been processed
Constant DEFAULT_FILES_RESULT_FIELD Undocumented
Constant DEFAULT_FILES_URLS_FIELD Undocumented
Constant EXPIRES Undocumented
Constant MEDIA_NAME Undocumented
Constant STORE_SCHEMES Undocumented
Method __init__ Undocumented
Method _get_store Undocumented
Method file_downloaded Undocumented
Method inc_stats Undocumented
Method media_downloaded Handler for successful downloads
Method media_failed Handler for failed downloads
Method media_to_download Check request before starting download
Instance Variable expires Undocumented
Instance Variable FILES_RESULT_FIELD Undocumented
Instance Variable files_result_field Undocumented
Instance Variable FILES_URLS_FIELD Undocumented
Instance Variable files_urls_field Undocumented
Instance Variable store Undocumented

Inherited from MediaPipeline:

Class Method from_crawler Undocumented
Constant LOG_FAILED_RESULTS Undocumented
Class SpiderInfo Undocumented
Method _cache_result_and_execute_waiters Undocumented
Method _check_media_to_download Undocumented
Method _check_signature Undocumented
Method _compatible Wrapper for overridable methods to allow backwards compatibility
Method _handle_statuses Undocumented
Method _key_for_pipe No summary
Method _make_compatible Make overridable methods of MediaPipeline and subclasses backwards compatible
Method _modify_media_request Undocumented
Method _process_request Undocumented
Method open_spider Undocumented
Method process_item Undocumented
Instance Variable _expects_item Undocumented
Instance Variable allow_redirects Undocumented
Instance Variable download_func Undocumented
Instance Variable handle_httpstatus_list Undocumented
Instance Variable spiderinfo Undocumented
@classmethod
def from_settings(cls, settings):

Undocumented
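
Since this method is undocumented here, the following is only a sketch of the project settings that a pipeline built from settings would typically read. The setting names are the ones Scrapy documents for this pipeline; the store path and the pipeline priority are hypothetical placeholders.

```python
# settings.py (sketch): values below are illustrative assumptions
ITEM_PIPELINES = {"scrapy.pipelines.files.FilesPipeline": 1}

FILES_STORE = "/path/to/files"     # hypothetical directory; becomes the store URI
FILES_EXPIRES = 90                 # days before a stored file is treated as expired
FILES_URLS_FIELD = "file_urls"     # item field holding the URLs to download
FILES_RESULT_FIELD = "files"       # item field that receives the download results
```

The last three values match the EXPIRES, DEFAULT_FILES_URLS_FIELD and DEFAULT_FILES_RESULT_FIELD constants listed on this page, so they could be omitted.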

def file_path(self, request, response=None, info=None, *, item=None):
Returns the path where downloaded media should be stored
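
To make the idea of a storage path concrete, here is a standalone sketch that derives a stable relative path from a URL. It assumes a scheme of hashing the URL with SHA-1 under a `full/` prefix, which is how the stock behaviour is commonly described, but it is a simplification and not the pipeline's actual code. The name `sketch_file_path` is hypothetical.

```python
import hashlib
import mimetypes
from pathlib import PurePosixPath
from urllib.parse import urlparse


def sketch_file_path(url):
    """Map a URL to a deterministic relative storage path (illustrative only)."""
    # A hash of the URL gives a fixed-length name that is safe on any filesystem.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    # Keep the extension only when it is a known file type.
    ext = PurePosixPath(urlparse(url).path).suffix
    if ext not in mimetypes.types_map:
        ext = ""
    return f"full/{digest}{ext}"
```

A subclass that wants human-readable names would override `file_path` with its own mapping, taking care that distinct URLs cannot collide on the same path.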
def get_media_requests(self, item, info):
Returns the media requests to download
def item_completed(self, results, item, info):
Called per item when all media requests have been processed
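
The sketch below illustrates what this step amounts to, using plain dictionaries instead of Scrapy objects. It assumes `results` is a list of `(success, value)` pairs, where `value` describes the stored file on success and holds the failure otherwise. The function name `collect_results` and the sample data are hypothetical.

```python
def collect_results(results, item, result_field="files"):
    """Store the successful download results on the item (illustrative only)."""
    # Keep only the entries whose download succeeded.
    item[result_field] = [value for ok, value in results if ok]
    return item


# Hypothetical sample: one successful download and one failure.
sample_results = [
    (True, {"url": "https://example.com/a.pdf", "path": "full/abc.pdf"}),
    (False, Exception("download failed")),
]
sample_item = collect_results(sample_results, {"file_urls": ["https://example.com/a.pdf"]})
```

After the call, `sample_item["files"]` holds one entry, the successful download; the failed one is dropped rather than recorded on the item.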
DEFAULT_FILES_RESULT_FIELD: str =

Undocumented

Value
'files'
DEFAULT_FILES_URLS_FIELD: str =

Undocumented

Value
'file_urls'
EXPIRES: int =

Undocumented

Value
90
MEDIA_NAME: str =

Undocumented

Value
'file'
STORE_SCHEMES =

Undocumented

Value
{'': FSFilesStore,
 'file': FSFilesStore,
 's3': S3FilesStore,
 'gs': GCSFilesStore,
 'ftp': FTPFilesStore}
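
A minimal sketch of how such a mapping can be used to pick a storage backend from a store URI. Class names are replaced by strings so the example runs without Scrapy installed, and treating an absolute filesystem path as the `file` scheme is an assumption about how the lookup is done. The name `store_name_for` is hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse

# Strings stand in for the store classes listed in the mapping above.
STORE_SCHEMES = {
    "": "FSFilesStore",
    "file": "FSFilesStore",
    "s3": "S3FilesStore",
    "gs": "GCSFilesStore",
    "ftp": "FTPFilesStore",
}


def store_name_for(uri):
    """Return the name of the store that would handle the given URI."""
    # Absolute local paths carry no URL scheme, so they are treated as files.
    scheme = "file" if Path(uri).is_absolute() else urlparse(uri).scheme
    return STORE_SCHEMES[scheme]
```

With this lookup, a bare or relative path falls through to the empty-string key and a URI with an unlisted scheme raises `KeyError`.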
def __init__(self, store_uri, download_func=None, settings=None):
def _get_store(self, uri):

Undocumented

def file_downloaded(self, response, request, info, *, item=None):

Undocumented

def inc_stats(self, spider, status):

Undocumented

def media_downloaded(self, response, request, info, *, item=None):
Handler for successful downloads
def media_failed(self, failure, request, info):
Handler for failed downloads
def media_to_download(self, request, info, *, item=None):
Check request before starting download
expires =

Undocumented

FILES_RESULT_FIELD =

Undocumented

files_result_field =

Undocumented

FILES_URLS_FIELD =

Undocumented

files_urls_field =

Undocumented

store =

Undocumented