class documentation

class Repository(DataSource):

Repository(baseurl, destpath='.')

A data repository where multiple DataSource objects share a base URL/directory.

Repository extends DataSource by prepending a base URL (or directory) to all the files it handles. Use Repository when you will be working with multiple files from one base URL. Initialize Repository with the base URL, then refer to each file by its filename only.

Parameters

baseurl : str
Path to the local directory or remote location that contains the data files.
destpath : str or None, optional
Path to the directory where source files are downloaded for use. If destpath is None, a temporary directory will be created. The default path is the current directory.

Examples

To analyze all files in the repository, do something like this (note: this is not self-contained code):

>>> repos = np.lib._datasource.Repository('/home/user/data/dir/')
>>> for filename in filelist:
...     fp = repos.open(filename)
...     fp.analyze()
...     fp.close()

Similarly you could use a URL for a repository:

>>> repos = np.lib._datasource.Repository('http://www.xyz.edu/data')
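
A self-contained variant of the loop above, offered as a sketch rather than a canonical recipe. It builds a throwaway local directory, so the file names (a.txt, b.txt), their contents, and the separate cache directory passed as destpath are all assumptions made for illustration. Repository is imported from the private module numpy.lib._datasource, the same module the examples above refer to.

```python
import os
import tempfile

from numpy.lib._datasource import Repository

# Two temporary directories: one plays the repository, one is the destpath.
# Both are illustrative assumptions, not part of the documented API.
with tempfile.TemporaryDirectory() as base_dir, \
        tempfile.TemporaryDirectory() as cache_dir:
    # Populate the pretend repository with two small text files.
    for name, text in [("a.txt", "alpha\n"), ("b.txt", "beta\n")]:
        with open(os.path.join(base_dir, name), "w") as fh:
            fh.write(text)

    repos = Repository(base_dir, destpath=cache_dir)

    # listdir() returns bare file names; sort them for a stable order.
    filelist = sorted(repos.listdir())

    contents = {}
    for filename in filelist:
        # Files are addressed by name only; the base directory is prepended.
        if repos.exists(filename):
            fp = repos.open(filename)
            try:
                contents[filename] = fp.read()
            finally:
                fp.close()

    # A name that is not in the repository.
    missing = repos.exists("no_such_file.txt")
```

Only local paths are used here, so nothing is downloaded and cache_dir stays empty.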

Method __del__ Undocumented
Method __init__ Create a Repository with a shared url or directory of baseurl.
Method _findfile Extend DataSource method to prepend baseurl to path.
Method _fullpath Return complete path for path. Prepends baseurl if necessary.
Method abspath Return absolute path of file in the Repository directory.
Method exists Test if path exists, prepending the Repository base URL to path.
Method listdir List files in the source Repository.
Method open Open and return a file-like object, prepending the Repository base URL.
Instance Variable _baseurl Undocumented

Inherited from DataSource:

Method _cache Cache the file specified by path.
Method _isurl Test if path is a net location. Tests the scheme and netloc.
Method _iswritemode Test if the given mode will open a file for writing.
Method _iszip Test if the filename is a zip file by looking at the file extension.
Method _possible_names Return a tuple containing compressed filename variations.
Method _sanitize_relative_path Return a sanitised relative path for which os.path.abspath(os.path.join(base, path)).startswith(base)
Method _splitzipext Split zip extension from filename and return filename.
Instance Variable _destpath Undocumented
Instance Variable _istmpdest Undocumented

def __del__(self):

Undocumented

def __init__(self, baseurl, destpath=os.curdir):

Create a Repository with a shared url or directory of baseurl.

def _findfile(self, path):

Extend DataSource method to prepend baseurl to path.

def _fullpath(self, path):

Return complete path for path. Prepends baseurl if necessary.
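
The "prepends baseurl if necessary" rule can be illustrated with a small standalone sketch. The helper name full_path is hypothetical and this is not presented as the library's code. It assumes POSIX-style path joining and treats any path that already contains baseurl as complete.

```python
import os


def full_path(baseurl, path):
    """Hypothetical helper: prepend baseurl unless path already contains it."""
    if baseurl in path:
        # The caller already supplied the base URL or directory.
        return path
    # Otherwise join the base and the bare file name.
    return os.path.join(baseurl, path)


local = full_path("/home/user/data/dir/", "a.txt")
remote = full_path("http://www.xyz.edu/data", "a.txt")
already_full = full_path("http://www.xyz.edu/data",
                         "http://www.xyz.edu/data/a.txt")
```

This is why the path arguments documented below may, but do not have to, include the baseurl.
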
def abspath(self, path):

Return absolute path of file in the Repository directory.

If path is a URL, then abspath will return either the location where the file exists locally or the location where it would exist after being opened with the open method.

Parameters

path : str
Can be a local file or a remote URL. This may, but does not have to, include the baseurl with which the Repository was initialized.

Returns

out : str
Complete path, including the DataSource destination directory.
def exists(self, path):

Test if path exists, prepending the Repository base URL to path.

Test if path exists as (and in this order):

  • a local file.
  • a remote URL that has been downloaded and stored locally in the DataSource directory.
  • a remote URL that has not been downloaded, but is valid and accessible.

Parameters

path : str
Can be a local file or a remote URL. This may, but does not have to, include the baseurl with which the Repository was initialized.

Returns

out : bool
True if path exists.

Notes

When path is a URL, exists will return True if it is either stored locally in the DataSource directory or is a valid remote URL. DataSource does not discriminate between the two: the file is accessible if it exists in either location.
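
The lookup order described above can be sketched with the standard library alone. The names exists_sketch and cached_location are hypothetical, and the cache layout (destination directory, then host name, then the path on that host) is an assumption made for this illustration rather than a statement about the library's internals.

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse


def cached_location(path, destpath):
    """Assumed cache layout: <destpath>/<host>/<path on host>."""
    parts = urlparse(path)
    return os.path.join(destpath, parts.netloc, parts.path.lstrip("/"))


def exists_sketch(path, destpath):
    """Hypothetical stand-in for the documented lookup order."""
    # 1. A local file.
    if os.path.exists(path):
        return True
    # 2. A remote URL already downloaded into the destination directory.
    if os.path.exists(cached_location(path, destpath)):
        return True
    # 3. A remote URL that has not been downloaded but can be opened.
    parts = urlparse(path)
    if parts.scheme and parts.netloc:
        try:
            with urllib.request.urlopen(path):
                return True
        except OSError:  # urllib.error.URLError is a subclass of OSError
            return False
    return False


# Demonstration without network access: pretend the URL was already cached.
with tempfile.TemporaryDirectory() as destpath:
    url = "http://www.xyz.edu/data/a.txt"
    cached = cached_location(url, destpath)
    os.makedirs(os.path.dirname(cached))
    with open(cached, "w") as fh:
        fh.write("cached copy\n")

    found_cached = exists_sketch(url, destpath)  # satisfied at step 2
    found_missing = exists_sketch(os.path.join(destpath, "missing.txt"),
                                  destpath)
```

Because the cached copy is found at step 2, the network is never contacted in this demonstration.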

def listdir(self):

List files in the source Repository.

Returns

files : list of str
List of file names (not containing a directory part).

Notes

Does not currently work for remote repositories.

def open(self, path, mode='r', encoding=None, newline=None):

Open and return a file-like object, prepending the Repository base URL.

If path is a URL, it will be downloaded, stored in the DataSource directory, and opened from there.

Parameters

path : str
Local file path or URL to open. This may, but does not have to, include the baseurl with which the Repository was initialized.
mode : {'r', 'w', 'a'}, optional
Mode to open path. Mode 'r' for reading, 'w' for writing, 'a' to append. Available modes depend on the type of object specified by path. Default is 'r'.
encoding : {None, str}, optional
Open a text file with the given encoding. The default is the encoding that io.open uses.
newline : {None, str}, optional
Newline to use when reading a text file.

Returns

out : file object
File object.
_baseurl =

Undocumented