werkzeug.urls

module documentation

Functions for working with URLs.

Contains implementations of functions from urllib.parse that handle bytes and strings.

Class	`BaseURL`	Superclass of `URL` and `BytesURL`.
Class	`BytesURL`	Represents a parsed URL in bytes.
Class	`Href`	No summary
Class	`URL`	Represents a parsed URL. This behaves like a regular tuple but also has some extra attributes that give further insight into the URL.
Function	`iri_to_uri`	Convert an IRI to a URI. All non-ASCII and unsafe characters are quoted. If the URL has a domain, it is encoded to Punycode.
Function	`uri_to_iri`	Convert a URI to an IRI. All valid UTF-8 characters are unquoted, leaving all reserved and invalid characters quoted. If the URL has a domain, it is decoded from Punycode.
Function	`url_decode`	Parse a query string and return it as a `MultiDict`.
Function	`url_decode_stream`	Works like `url_decode` but decodes a stream.
Function	`url_encode`	URL encode a dict/`MultiDict`. If a value is `None` it will not appear in the result string. By default, only values are encoded into the target charset.
Function	`url_encode_stream`	Like `url_encode` but writes the results to a stream object. If the stream is `None` a generator over all encoded pairs is returned.
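Function	`url_fix`	Fix a user-supplied URL that contains unsafe characters such as spaces, similar to how browsers handle entered data.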
Function	`url_join`	Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.
Function	`url_parse`	Parses a URL from a string into a `URL` tuple.
Function	`url_quote`	URL encode a single string with a given encoding.
Function	`url_quote_plus`	URL encode a single string with the given encoding and convert whitespace to "+".
Function	`url_unparse`	The reverse operation to `url_parse`. This accepts arbitrary tuples as well as `URL` tuples and returns a URL as a string.
Function	`url_unquote`	URL decode a single string with a given encoding. If the charset is set to `None` no decoding is performed and raw bytes are returned.
Function	`url_unquote_plus`	URL decode a single string with the given `charset` and decode "+" to whitespace.
Class	`_URLTuple`	Undocumented
Function	`_codec_error_url_quote`	Used in `uri_to_iri` after unquoting to re-quote any invalid bytes.
Function	`_fast_url_quote_plus`	Undocumented
Function	`_make_fast_url_quote`	Precompile the translation table for a URL encoding function.
Function	`_unquote_to_bytes`	Undocumented
Function	`_url_decode_impl`	Undocumented
Function	`_url_encode_impl`	Undocumented
Function	`_url_unquote_legacy`	Undocumented
Variable	`_always_safe`	Undocumented
Variable	`_bytetohex`	Undocumented
Variable	`_fast_quote_plus`	Undocumented
Variable	`_fast_url_quote`	Undocumented
Variable	`_hexdigits`	Undocumented
Variable	`_hextobyte`	Undocumented
Variable	`_scheme_re`	Undocumented
Variable	`_to_iri_unsafe`	Undocumented
Variable	`_to_uri_safe`	Undocumented
Variable	`_unquote_maps`	Undocumented

def iri_to_uri(iri, charset='utf-8', errors='strict', safe_conversion=False):

Convert an IRI to a URI. All non-ASCII and unsafe characters are quoted. If the URL has a domain, it is encoded to Punycode.

>>> iri_to_uri('http://\u2603.net/p\xe5th?q=\xe8ry%DF')
'http://xn--n3h.net/p%C3%A5th?q=%C3%A8ry%DF'

There is a general problem with IRI conversion with some protocols that are in violation of the URI specification. Consider the following two IRIs:

magnet:?xt=uri:whatever
itms-services://?action=download-manifest

After parsing, we don't know if the scheme requires the //, which is dropped if empty, but conveys different meanings in the final URL if it's present or not. In this case, you can use safe_conversion, which will return the URL unchanged if it only contains ASCII characters and no whitespace. This can result in a URI with unquoted characters if it was not already quoted correctly, but preserves the URL's semantics. Werkzeug uses this for the Location header for redirects.

Changed in version 0.15: All reserved characters remain unquoted. Previously, only some reserved characters were left unquoted.

Changed in version 0.9.6: The safe_conversion parameter was added.

New in version 0.6.

Parameters
iri:`t.Union[str, t.Tuple[str, str, str, str, str]]`	The IRI to convert.
charset:`str`	The encoding of the IRI.
errors:`str`	Error handler to use during `str.encode`.
safe_conversion:`bool`	Return the URL unchanged if it only contains ASCII characters and no whitespace. See the explanation above.
Returns
`str`	Undocumented
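
The conversion described above can be approximated with the standard library alone. The following is a minimal sketch, not Werkzeug's implementation: `iri_to_uri_sketch` and its `safe` character sets are hypothetical, and it assumes a plain host without credentials or a port.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri_sketch(iri: str, safe_conversion: bool = False) -> str:
    """Hypothetical stand-in for iri_to_uri, built on urllib.parse."""
    if safe_conversion and iri.isascii() and not any(c.isspace() for c in iri):
        # ASCII only and no whitespace: return the input untouched so that
        # schemes such as "itms-services://" keep their exact shape.
        return iri
    parts = urlsplit(iri)
    # The built-in "idna" codec encodes each host label to Punycode.
    host = parts.netloc.encode("idna").decode("ascii")
    # "%" stays safe so existing percent-escapes are not quoted twice.
    # These safe sets are an assumption chosen for the example.
    path = quote(parts.path, safe="/%:@!$&'()*+,;=")
    query = quote(parts.query, safe="/%:@!$&'()*+,;=?")
    fragment = quote(parts.fragment, safe="/%:@!$&'()*+,;=?")
    return urlunsplit((parts.scheme, host, path, query, fragment))


converted = iri_to_uri_sketch("http://\u2603.net/p\xe5th?q=\xe8ry%DF")
unchanged = iri_to_uri_sketch(
    "itms-services://?action=download-manifest", safe_conversion=True
)
```

With `safe_conversion=True` the second call returns its input as-is, which is the property the Location header use case relies on.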

def uri_to_iri(uri, charset='utf-8', errors='werkzeug.url_quote'):

Convert a URI to an IRI. All valid UTF-8 characters are unquoted, leaving all reserved and invalid characters quoted. If the URL has a domain, it is decoded from Punycode.

>>> uri_to_iri("http://xn--n3h.net/p%C3%A5th?q=%C3%A8ry%DF")
'http://\u2603.net/p\xe5th?q=\xe8ry%DF'

Changed in version 0.15: All reserved and invalid characters remain quoted. Previously, only some reserved characters were preserved, and invalid bytes were replaced instead of left quoted.

New in version 0.6.

Parameters
uri:`t.Union[str, t.Tuple[str, str, str, str, str]]`	The URI to convert.
charset:`str`	The encoding used to decode unquoted bytes.
errors:`str`	Error handler to use during `bytes.decode`. By default, invalid bytes are left quoted.
Returns
`str`	Undocumented

def url_decode(s, charset='utf-8', decode_keys=None, include_empty=True, errors='replace', separator='&', cls=None):

Parse a query string and return it as a MultiDict.

Changed in version 2.0: The decode_keys parameter is deprecated and will be removed in Werkzeug 2.1.

Changed in version 0.5: In previous versions ";" and "&" could be used for url decoding. Now only "&" is supported. If you want to use ";", a different separator can be provided.

Changed in version 0.5: The cls parameter was added.

Parameters
s:`t.AnyStr`	The query string to parse.
charset:`str`	Decode bytes to string with this charset. If not given, bytes are returned as-is.
decode_keys:`None`	Undocumented
include_empty:`bool`	Include keys with empty values in the dict.
errors:`str`	Error handling behavior when decoding bytes.
separator:`str`	Separator character between pairs.
cls:`t.Optional[t.Type[ds.MultiDict]]`	Container to hold result instead of `MultiDict`.
Returns
`ds.MultiDict[str, str]`	Undocumented
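
To show what multi-valued parsing looks like, here is a rough standard-library analogue. It is a sketch only: `decode_query` is a hypothetical name, a plain dict of lists stands in for `MultiDict`, and the `separator` argument of `parse_qsl` assumes a reasonably recent Python (3.10, or a patched earlier release).

```python
from collections import defaultdict
from urllib.parse import parse_qsl


def decode_query(query, include_empty=True, separator="&"):
    """Hypothetical url_decode analogue returning {key: [values]}."""
    result = defaultdict(list)
    pairs = parse_qsl(
        query, keep_blank_values=include_empty, separator=separator
    )
    for key, value in pairs:
        # Repeated keys accumulate instead of overwriting each other.
        result[key].append(value)
    return dict(result)


multi = decode_query("a=1&a=2&b=")
no_empty = decode_query("a=1&b=", include_empty=False)
semicolons = decode_query("a=1;b=2", separator=";")
```

Only one separator is honored at a time, mirroring the 0.5 change noted above: with the default `&`, the string `a=1;b=2` is a single pair.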

def url_decode_stream(stream, charset='utf-8', decode_keys=None, include_empty=True, errors='replace', separator=b'&', cls=None, limit=None, return_iterator=False):

Works like url_decode but decodes a stream. The behavior of stream and limit follows functions like `werkzeug.wsgi.make_line_iter`. The generator of pairs is directly fed to the cls so you can consume the data while it's parsed.

Changed in version 2.0: The decode_keys and return_iterator parameters are deprecated and will be removed in Werkzeug 2.1.

New in version 0.8.

Parameters
stream:`t.IO[bytes]`	a stream with the encoded querystring
charset:`str`	the charset of the query string. If set to `None` no decoding will take place.
decode_keys:`None`	Undocumented
include_empty:`bool`	Set to `False` if you don't want empty values to appear in the dict.
errors:`str`	the decoding error behavior.
separator:`bytes`	the pair separator to be used, defaults to `&`
cls:`t.Optional[t.Type[ds.MultiDict]]`	an optional dict class to use. If this is not specified or `None` the default `MultiDict` is used.
limit:`t.Optional[int]`	the content length of the URL data. Not necessary if a limited stream is provided.
return_iterator:`bool`	Undocumented
Returns
`ds.MultiDict[str, str]`	Undocumented
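
The idea of decoding pairs while the stream is still being read can be sketched as a generator. This is a simplified illustration, not Werkzeug's implementation: `iter_pairs`, `_decode_pair`, and the `chunk_size` parameter are hypothetical, and `limit` is treated simply as a maximum number of bytes to read.

```python
import io
from urllib.parse import unquote_to_bytes


def _decode_pair(item, charset, errors):
    # Split on the first "=", turn "+" into spaces, then percent-decode.
    key, _, value = item.partition(b"=")
    return tuple(
        unquote_to_bytes(part.replace(b"+", b" ")).decode(charset, errors)
        for part in (key, value)
    )


def iter_pairs(stream, limit=None, separator=b"&", charset="utf-8",
               errors="replace", chunk_size=1024):
    """Hypothetical generator yielding (key, value) pairs from a byte stream."""
    remaining = limit
    buffer = b""
    while True:
        size = chunk_size if remaining is None else min(chunk_size, remaining)
        data = stream.read(size) if size else b""
        if not data:
            break
        if remaining is not None:
            remaining -= len(data)
        # Everything before the last separator is complete; keep the tail.
        *complete, buffer = (buffer + data).split(separator)
        for item in complete:
            if item:
                yield _decode_pair(item, charset, errors)
    if buffer:
        yield _decode_pair(buffer, charset, errors)


body = b"a=1&b=hello+world&c=%C3%A5"
stream = io.BytesIO(body + b"&ignored=x")
pairs = list(iter_pairs(stream, limit=len(body), chunk_size=4))
```

Because `limit` caps the bytes consumed, the trailing `&ignored=x` is never read, and the small `chunk_size` shows that pairs spanning chunk boundaries are still reassembled.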

def url_encode(obj, charset='utf-8', encode_keys=None, sort=False, key=None, separator='&'):

URL encode a dict/MultiDict. If a value is None it will not appear in the result string. By default, only values are encoded into the target charset.

Changed in version 2.0: The encode_keys parameter is deprecated and will be removed in Werkzeug 2.1.

Changed in version 0.5: Added the sort, key, and separator parameters.

Parameters
obj:`t.Union[t.Mapping[str, str], t.Iterable[t.Tuple[str, str]]]`	the object to encode into a query string.
charset:`str`	the charset of the query string.
encode_keys:`None`	Undocumented
sort:`bool`	set to `True` if you want parameters to be sorted by `key`.
key:`t.Optional[t.Callable[[t.Tuple[str, str]], t.Any]]`	an optional function to be used for sorting. For more details check out the `sorted` documentation.
separator:`str`	the separator to be used for the pairs.
Returns
`str`	Undocumented
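
A rough sketch of the documented behavior (skipping `None` values, optional sorting, custom separator) using only the standard library. The name `encode_query` is hypothetical and the details of Werkzeug's quoting rules may differ.

```python
from urllib.parse import quote_plus


def encode_query(obj, sort=False, key=None, separator="&"):
    """Hypothetical url_encode analogue built on urllib.parse.quote_plus."""
    items = obj.items() if hasattr(obj, "items") else obj
    # Pairs whose value is None are dropped entirely.
    pairs = [(k, v) for k, v in items if v is not None]
    if sort:
        pairs.sort(key=key)
    return separator.join(
        f"{quote_plus(k)}={quote_plus(v)}" for k, v in pairs
    )


skipped_none = encode_query({"b": "x y", "a": None, "c": "\xe5"})
sorted_pairs = encode_query([("b", "2"), ("a", "1")], sort=True)
custom_sep = encode_query([("a", "1"), ("b", "2")], separator=";")
```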

def url_encode_stream(obj, stream=None, charset='utf-8', encode_keys=None, sort=False, key=None, separator='&'):

Like url_encode but writes the results to a stream object. If the stream is None a generator over all encoded pairs is returned.

Changed in version 2.0: The encode_keys parameter is deprecated and will be removed in Werkzeug 2.1.

New in version 0.8.

Parameters
obj:`t.Union[t.Mapping[str, str], t.Iterable[t.Tuple[str, str]]]`	the object to encode into a query string.
stream:`t.Optional[t.IO[str]]`	a stream to write the encoded object into or `None` if an iterator over the encoded pairs should be returned. In that case the separator argument is ignored.
charset:`str`	the charset of the query string.
encode_keys:`None`	Undocumented
sort:`bool`	set to `True` if you want parameters to be sorted by `key`.
key:`t.Optional[t.Callable[[t.Tuple[str, str]], t.Any]]`	an optional function to be used for sorting. For more details check out the `sorted` documentation.
separator:`str`	the separator to be used for the pairs.
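
The two modes described above (write to a stream, or return a generator when the stream is `None`) can be sketched as follows. `encode_query_stream` is a hypothetical name and the quoting relies on `urllib.parse.quote_plus` rather than Werkzeug's own routines.

```python
import io
from urllib.parse import quote_plus


def encode_query_stream(obj, stream=None, separator="&"):
    """Hypothetical url_encode_stream analogue."""
    items = obj.items() if hasattr(obj, "items") else obj
    encoded = (
        f"{quote_plus(k)}={quote_plus(v)}" for k, v in items if v is not None
    )
    if stream is None:
        # No stream: hand back the lazy generator; separator is unused.
        return encoded
    for index, pair in enumerate(encoded):
        if index:
            stream.write(separator)
        stream.write(pair)
    return None


buffer = io.StringIO()
encode_query_stream({"a": "1", "b": "x y"}, buffer)
written = buffer.getvalue()
lazy = list(encode_query_stream([("a", "1"), ("b", None)]))
```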

def url_fix(s, charset='utf-8'):

Sometimes a user supplies a URL that isn't valid because it contains unsafe characters such as spaces. This function fixes some of these problems in a way similar to how browsers handle user-entered data:

>>> url_fix('http://de.wikipedia.org/wiki/Elf (Begriffskl\xe4rung)')
'http://de.wikipedia.org/wiki/Elf%20(Begriffskl%C3%A4rung)'

Parameters
s:`str`	the string with the URL to fix.
charset:`str`	The target charset for the URL if the url was given as a string.
Returns
`str`	Undocumented
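
The kind of repair shown in the example above can be approximated by splitting the URL and quoting the path and query separately. This is a sketch under assumptions: `url_fix_sketch` is a hypothetical name and the `safe` character sets are chosen for illustration, not taken from this page.

```python
from urllib.parse import quote, quote_plus, urlsplit, urlunsplit


def url_fix_sketch(url: str) -> str:
    """Hypothetical url_fix analogue: quote unsafe characters per component."""
    parts = urlsplit(url)
    # Keep "%" safe so already-quoted sequences are not quoted again.
    path = quote(parts.path, safe="/%+$!*'(),")
    # In the query, spaces become "+" and "&"/"=" keep their meaning.
    query = quote_plus(parts.query, safe=":&%=+$!*'(),")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, query, parts.fragment)
    )


fixed = url_fix_sketch("http://de.wikipedia.org/wiki/Elf (Begriffskl\xe4rung)")
fixed_query = url_fix_sketch("http://example.com/a b?q=x y")
```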

def url_join(base, url, allow_fragments=True):

Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.

Parameters
base:`t.Union[str, t.Tuple[str, str, str, str, str]]`	the base URL for the join operation.
url:`t.Union[str, t.Tuple[str, str, str, str, str]]`	the URL to join.
allow_fragments:`bool`	indicates whether fragments should be allowed.
Returns
`str`	Undocumented
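
Since this module reimplements functions from urllib.parse, the standard library's `urljoin` gives a feel for how joining behaves. The URLs below are made up for the example, and Werkzeug's function may differ in edge cases.

```python
from urllib.parse import urljoin

base = "http://example.com/docs/api/index.html"

# A bare file name replaces the last path segment of the base.
sibling = urljoin(base, "urls.html")

# ".." climbs one directory before descending again.
parent = urljoin(base, "../guide/")

# A URL with its own scheme and host replaces the base entirely.
absolute = urljoin(base, "https://other.example/x")
```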

def url_parse(url, scheme=None, allow_fragments=True):

Parses a URL from a string into a URL tuple. If the URL is lacking a scheme it can be provided as second argument. Otherwise, it is ignored. Optionally fragments can be stripped from the URL by setting allow_fragments to False.

The inverse of this function is url_unparse.

Parameters
url:`str`	the URL to parse.
scheme:`t.Optional[str]`	the default scheme to use if the URL has no scheme.
allow_fragments:`bool`	if set to `False` a fragment will be removed from the URL.
Returns
`BaseURL`	Undocumented
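
The default-scheme rule and the parse/unparse round trip can be illustrated with the standard library's `urlsplit` and `urlunsplit`, which are the closest stdlib counterparts. This is an analogy only; the returned object here is a stdlib `SplitResult`, not a Werkzeug `URL`.

```python
from urllib.parse import urlsplit, urlunsplit

# The URL has no scheme, so the default supplied as second argument is used.
parsed = urlsplit("//example.com/path?q=1#top", scheme="https")

# The URL already has a scheme, so the default is ignored.
explicit = urlsplit("http://example.com/", scheme="https")

# Unparsing the five-tuple gives back a URL string.
rebuilt = urlunsplit(parsed)
```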

def url_quote(string, charset='utf-8', errors='strict', safe='/:', unsafe=''):

URL encode a single string with a given encoding.

New in version 0.9.2: The unsafe parameter was added.

Parameters
string:`t.Union[str, bytes]`	the string to quote.
charset:`str`	the charset to be used.
errors:`str`	Undocumented
safe:`t.Union[str, bytes]`	an optional sequence of safe characters.
unsafe:`t.Union[str, bytes]`	an optional sequence of unsafe characters.
Returns
`str`	Undocumented
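
How `safe` and `unsafe` interact can be sketched on top of `urllib.parse.quote`. The helper `quote_sketch` is hypothetical, and it only approximates the documented parameters: the standard library cannot force always-safe characters such as letters to be encoded, so `unsafe` here can only remove characters from `safe`.

```python
from urllib.parse import quote


def quote_sketch(string, charset="utf-8", errors="strict",
                 safe="/:", unsafe=""):
    """Hypothetical url_quote analogue: unsafe characters override safe."""
    effective_safe = "".join(c for c in safe if c not in unsafe)
    return quote(string, safe=effective_safe, encoding=charset, errors=errors)


default = quote_sketch("a b/c:d")
slash_unsafe = quote_sketch("a b/c:d", unsafe="/")
non_ascii = quote_sketch("p\xe5th")
```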

def url_quote_plus(string, charset='utf-8', errors='strict', safe=''):

URL encode a single string with the given encoding and convert whitespace to "+".

Parameters
string:`str`	The string to quote.
charset:`str`	The charset to be used.
errors:`str`	Undocumented
safe:`str`	An optional sequence of safe characters.
Returns
`str`	Undocumented

def url_unparse(components):

The reverse operation to url_parse. This accepts arbitrary tuples as well as URL tuples and returns a URL as a string.

Parameters
components:`t.Tuple[str, str, str, str, str]`	the parsed URL as tuple which should be converted into a URL string.
Returns
`str`	Undocumented

def url_unquote(s, charset='utf-8', errors='replace', unsafe=''):

URL decode a single string with a given encoding. If the charset is set to None no decoding is performed and raw bytes are returned.

Parameters
s:`t.Union[str, bytes]`	the string to unquote.
charset:`str`	the charset of the query string. If set to `None` no decoding will take place.
errors:`str`	the error handling for the charset decoding.
unsafe:`str`	Undocumented
Returns
`str`	Undocumented
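
The difference between decoding to text and returning raw bytes can be seen with the standard library counterparts `unquote` and `unquote_to_bytes`. This is an analogy, not Werkzeug's code; in particular the stdlib uses a separate function for the raw-bytes case rather than `charset=None`.

```python
from urllib.parse import unquote, unquote_to_bytes

# Percent-escapes are decoded to bytes and then to text using the charset.
text = unquote("p%C3%A5th", encoding="utf-8", errors="replace")

# Skipping the charset step leaves the raw bytes.
raw = unquote_to_bytes("p%C3%A5th")

# With errors="replace", an invalid byte becomes U+FFFD.
replaced = unquote("%DF", encoding="utf-8", errors="replace")
```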

def url_unquote_plus(s, charset='utf-8', errors='replace'):

URL decode a single string with the given charset and decode "+" to whitespace.

By default, undecodable bytes are replaced, since errors defaults to 'replace'. For different behavior, set errors to 'ignore' or 'strict'.

Parameters
s:`t.Union[str, bytes]`	The string to unquote.
charset:`str`	the charset of the query string. If set to `None` no decoding will take place.
errors:`str`	The error handling for the `charset` decoding.
Returns
`str`	Undocumented
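
The "+" handling is easiest to see as a round trip. The sketch below uses the standard library's `quote_plus` and `unquote_plus` as stand-ins, on a made-up input string.

```python
from urllib.parse import quote_plus, unquote_plus

original = "hello world & more"

# Spaces become "+", and "&" is percent-encoded.
encoded = quote_plus(original)

# Decoding turns "+" back into spaces and resolves percent-escapes.
decoded = unquote_plus(encoded)

# A literal plus sign must travel as %2B to survive the round trip.
literal_plus = unquote_plus("1%2B1+equals+2")
```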

def _codec_error_url_quote(e):

Used in uri_to_iri after unquoting to re-quote any invalid bytes.

Parameters
e:`UnicodeError`	Undocumented
Returns
`t.Tuple[str, int]`	Undocumented

def _fast_url_quote_plus(string):

Undocumented

Parameters
string:`bytes`	Undocumented
Returns
`str`	Undocumented

def _make_fast_url_quote(charset='utf-8', errors='strict', safe='/:', unsafe=''):

Precompile the translation table for a URL encoding function.

Unlike url_quote, the generated function only takes the string to quote.

Parameters
charset:`str`	The charset to encode the result with.
errors:`str`	How to handle encoding errors.
safe:`t.Union[str, bytes]`	An optional sequence of safe characters to never encode.
unsafe:`t.Union[str, bytes]`	An optional sequence of unsafe characters to always encode.
Returns
`t.Callable[[bytes], str]`	Undocumented
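
Precompiling a translation table means deciding once, for each of the 256 byte values, whether it is emitted as-is or as a percent-escape. A minimal sketch of that idea follows; `make_quoter` and `_ALWAYS_SAFE` are hypothetical names, and the always-safe set is assumed to be the RFC 3986 unreserved characters.

```python
import string

# Assumption: letters, digits and "-._~" never need quoting.
_ALWAYS_SAFE = frozenset(
    (string.ascii_letters + string.digits + "-._~").encode("ascii")
)


def make_quoter(safe="/:", unsafe=""):
    """Build a function that quotes bytes using a precomputed table."""
    safe_bytes = (
        _ALWAYS_SAFE | set(safe.encode("ascii"))
    ) - set(unsafe.encode("ascii"))
    # One entry per possible byte value, computed a single time.
    table = [
        chr(b) if b in safe_bytes else f"%{b:02X}" for b in range(256)
    ]

    def quote_bytes(data: bytes) -> str:
        # Quoting is now just a table lookup per byte.
        return "".join([table[b] for b in data])

    return quote_bytes


quote_default = make_quoter()
quote_strict = make_quoter(unsafe="/")
```

The returned function takes only the bytes to quote, so the cost of building the table is paid once per configuration rather than once per call.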

def _unquote_to_bytes(string, unsafe=''):

Undocumented

Parameters
string:`t.Union[str, bytes]`	Undocumented
unsafe:`t.Union[str, bytes]`	Undocumented
Returns
`bytes`	Undocumented

def _url_decode_impl(pair_iter, charset, include_empty, errors):

Undocumented

Parameters
pair_iter:`t.Iterable[t.AnyStr]`	Undocumented
charset:`str`	Undocumented
include_empty:`bool`	Undocumented
errors:`str`	Undocumented
Returns
`t.Iterator[t.Tuple[str, str]]`	Undocumented

def _url_encode_impl(obj, charset, sort, key):

Undocumented

Parameters
obj:`t.Union[t.Mapping[str, str], t.Iterable[t.Tuple[str, str]]]`	Undocumented
charset:`str`	Undocumented
sort:`bool`	Undocumented
key:`t.Optional[t.Callable[[t.Tuple[str, str]], t.Any]]`	Undocumented
Returns
`t.Iterator[str]`	Undocumented

def _url_unquote_legacy(value, unsafe=''):

Undocumented

Parameters
value:`str`	Undocumented
unsafe:`str`	Undocumented
Returns
`str`	Undocumented

_always_safe =

Undocumented

_bytetohex =

Undocumented

_fast_quote_plus =

Undocumented

_fast_url_quote =

Undocumented

_hexdigits: str =

Undocumented

_hextobyte =

Undocumented

_scheme_re =

Undocumented

_to_iri_unsafe =

Undocumented

_to_uri_safe: str =

Undocumented

_unquote_maps: t.Dict[t.FrozenSet[int], t.Dict[bytes, int]] =

Undocumented