Core locale representation and locale data access.
License | BSD, see LICENSE for more details. |
Class | Locale |
Representation of a specific locale. |
Class | UnknownLocaleError |
Exception thrown when a locale is requested for which no locale data is available. |
Function | default_locale |
Returns the system default locale for a given category, based on environment variables. |
Function | get_global |
Return the dictionary for the given key in the global data. |
Function | get_locale_identifier |
The reverse of parse_locale. It creates a locale identifier out of a (language, territory, script, variant) tuple. Items can be set to None, and trailing None values can be left out of the tuple. |
Function | negotiate_locale |
Find the best match between available and requested locale strings. |
Function | parse_locale |
Parse a locale identifier into a tuple of the form (language, territory, script, variant). |
Constant | LOCALE_ALIASES |
Undocumented |
Function | _raise_no_data_error |
Undocumented |
Variable | _default_plural_rule |
Undocumented |
Variable | _global_data |
Undocumented |
Returns the system default locale for a given category, based on environment variables.
>>> for name in ['LANGUAGE', 'LC_ALL', 'LC_CTYPE']:
...     os.environ[name] = ''
>>> os.environ['LANG'] = 'fr_FR.UTF-8'
>>> default_locale('LC_MESSAGES')
'fr_FR'
The "C" or "POSIX" pseudo-locales are treated as aliases for the "en_US_POSIX" locale:
>>> os.environ['LC_MESSAGES'] = 'POSIX'
>>> default_locale('LC_MESSAGES')
'en_US_POSIX'
In addition to the variable named by category, the following fallback variables are always considered: LANGUAGE, LC_ALL, LC_CTYPE, and LANG.
Parameters | |
category | one of the LC_XXX environment variable names |
aliases | a dictionary of aliases for locale identifiers |
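The lookup order can be sketched with the standard library alone. This is a simplified illustration, not the library's implementation: the name `sketch_default_locale` and its `environ` parameter are hypothetical, the fallback order (LANGUAGE, LC_ALL, LC_CTYPE, LANG) is assumed from the example above, and alias handling is left out.

```python
import os

# Assumed fallback order, consulted after the requested category.
FALLBACK_VARS = ("LANGUAGE", "LC_ALL", "LC_CTYPE", "LANG")


def sketch_default_locale(category=None, environ=None):
    """Hypothetical helper: return the first usable locale found in `environ`."""
    env = os.environ if environ is None else environ
    names = [category] if category else []
    names.extend(FALLBACK_VARS)
    for name in names:
        value = env.get(name)
        if not value:
            continue  # unset or empty: try the next variable
        # Assumption: LANGUAGE may hold a colon-separated priority list.
        if name == "LANGUAGE" and ":" in value:
            value = value.split(":")[0]
        # Drop encoding (".UTF-8") and modifier ("@euro") suffixes.
        value = value.split(".")[0].split("@")[0]
        if not value:
            continue
        # "C" and "POSIX" are treated as aliases for "en_US_POSIX".
        if value in ("C", "POSIX"):
            return "en_US_POSIX"
        return value
    return None
```

Passing a plain dictionary as `environ` keeps the sketch testable without modifying the real process environment.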
Return the dictionary for the given key in the global data.
The global data is stored in the babel/global.dat file and contains information independent of individual locales.
>>> get_global('zone_aliases')['UTC']
u'Etc/UTC'
>>> get_global('zone_territories')['Europe/Berlin']
u'DE'
The available keys include zone_aliases and zone_territories, as used in the example above.
Note
The internal structure of the data may change between versions.
Parameters | |
key | the data key |
The reverse of parse_locale. It creates a locale identifier out of a (language, territory, script, variant) tuple. Items can be set to None, and trailing None values can be left out of the tuple.
>>> get_locale_identifier(('de', 'DE', None, '1999'))
'de_DE_1999'
Parameters | |
tup | the tuple as returned by parse_locale. |
sep | the separator for the identifier. |
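A minimal sketch of the joining logic follows. The name `sketch_locale_identifier` is hypothetical, and placing the script before the territory in the output is an assumption based on identifiers such as `zh_Hans_CN` that parse_locale accepts; it is not taken from the library's source.

```python
def sketch_locale_identifier(tup, sep="_"):
    """Hypothetical helper: join the non-empty parts of a locale tuple."""
    parts = tuple(tup[:4])
    # Pad with None so the caller may omit trailing items.
    language, territory, script, variant = parts + (None,) * (4 - len(parts))
    # Assumption: the script comes before the territory in the identifier.
    ordered = (language, script, territory, variant)
    return sep.join(part for part in ordered if part)
```

For example, `sketch_locale_identifier(('de', 'DE', None, '1999'))` yields `'de_DE_1999'`, matching the example above.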
Find the best match between available and requested locale strings.
>>> negotiate_locale(['de_DE', 'en_US'], ['de_DE', 'de_AT'])
'de_DE'
>>> negotiate_locale(['de_DE', 'en_US'], ['en', 'de'])
'de'
Case is ignored by the algorithm; the result uses the case of the preferred locale identifier:
>>> negotiate_locale(['de_DE', 'en_US'], ['de_de', 'de_at'])
'de_DE'
By default, some web browsers unfortunately do not include the territory in the locale identifier for many locales, and some do not even let the user add the territory easily. So while you may prefer qualified locale identifiers in your web application, they would not normally match the language-only locale sent by such browsers. To work around that, this function uses a default mapping of commonly used language-only locale identifiers to identifiers that include the territory:
>>> negotiate_locale(['ja', 'en_US'], ['ja_JP', 'en_US'])
'ja_JP'
Some browsers even use an incorrect or outdated language code, such as "no" for Norwegian, where the correct locale identifier would actually be "nb_NO" (Bokmål) or "nn_NO" (Nynorsk). The aliases are intended to take care of such cases, too:
>>> negotiate_locale(['no', 'sv'], ['nb_NO', 'sv_SE'])
'nb_NO'
You can override this default mapping by passing a different aliases dictionary to this function, or you can bypass the behavior altogether by setting the aliases parameter to None.
Parameters | |
preferred | the list of locale strings preferred by the user |
available | the list of locale strings available |
sep | character that separates the different parts of the locale strings |
aliases | a dictionary of aliases for locale identifiers |
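The matching rules described above (case-insensitive exact match, then alias lookup, then language-only fallback) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `sketch_negotiate` and `SKETCH_ALIASES` are hypothetical names, and the two-entry alias table stands in for the real LOCALE_ALIASES constant, whose contents are not reproduced here.

```python
# Hypothetical, deliberately tiny alias table for illustration only.
SKETCH_ALIASES = {"ja": "ja_JP", "no": "nb_NO"}


def sketch_negotiate(preferred, available, sep="_", aliases=SKETCH_ALIASES):
    """Hypothetical helper: pick the first preferred locale that is available."""
    available_lower = {a.lower() for a in available if a}
    for locale in preferred:
        lowered = locale.lower()
        # 1. Case-insensitive exact match; keep the preferred spelling.
        if lowered in available_lower:
            return locale
        # 2. Alias lookup, e.g. "ja" -> "ja_JP".
        if aliases:
            alias = aliases.get(lowered)
            if alias:
                alias = alias.replace("_", sep)
                if alias.lower() in available_lower:
                    return alias
        # 3. Fall back to the bare language, e.g. "de_DE" -> "de".
        parts = locale.split(sep)
        if len(parts) > 1 and parts[0].lower() in available_lower:
            return parts[0]
    return None
```

Passing `aliases=None` skips step 2 entirely, which corresponds to bypassing the default mapping.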
Parse a locale identifier into a tuple of the form (language, territory, script, variant).
>>> parse_locale('zh_CN')
('zh', 'CN', None, None)
>>> parse_locale('zh_Hans_CN')
('zh', 'CN', 'Hans', None)
The default component separator is "_", but a different separator can be specified using the sep parameter:
>>> parse_locale('zh-CN', sep='-')
('zh', 'CN', None, None)
If the identifier cannot be parsed into a locale, a ValueError exception is raised:
>>> parse_locale('not_a_LOCALE_String')
Traceback (most recent call last):
  ...
ValueError: 'not_a_LOCALE_String' is not a valid locale identifier
Encoding information and locale modifiers are removed from the identifier:
>>> parse_locale('it_IT@euro')
('it', 'IT', None, None)
>>> parse_locale('en_US.UTF-8')
('en', 'US', None, None)
>>> parse_locale('de_DE.iso885915@euro')
('de', 'DE', None, None)
See RFC 4646 for more information.
Parameters | |
identifier | the locale identifier string |
sep | character that separates the different components of the locale identifier |
Raises | |
ValueError | if the string does not appear to be a valid locale identifier |
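The parsing steps can be sketched in plain Python. The name `sketch_parse_locale` is hypothetical, and the component rules used here (a 4-letter script, a 2-letter or 3-digit territory, a variant that is either 4 characters starting with a digit or at least 5 characters starting with a letter) are assumptions chosen to reproduce the examples above, not a statement of the library's exact rules.

```python
def sketch_parse_locale(identifier, sep="_"):
    """Hypothetical helper: split an identifier into
    (language, territory, script, variant)."""
    # Remove encoding (".UTF-8") and modifier ("@euro") suffixes first.
    cleaned = identifier.split(".", 1)[0].split("@", 1)[0]
    parts = cleaned.split(sep)
    language = parts.pop(0).lower()
    if not language.isalpha():
        raise ValueError("expected only letters, got %r" % language)

    script = territory = variant = None
    # Assumed rule: a script is exactly four letters, e.g. "Hans".
    if parts and len(parts[0]) == 4 and parts[0].isalpha():
        script = parts.pop(0).title()
    # Assumed rule: a territory is two letters or three digits.
    if parts:
        if len(parts[0]) == 2 and parts[0].isalpha():
            territory = parts.pop(0).upper()
        elif len(parts[0]) == 3 and parts[0].isdigit():
            territory = parts.pop(0)
    # Assumed rule: a variant is 4 chars starting with a digit,
    # or 5+ chars starting with a letter.
    if parts:
        head = parts[0]
        if (len(head) == 4 and head[0].isdigit()) or (
            len(head) >= 5 and head[0].isalpha()
        ):
            variant = parts.pop(0)
    # Anything left over means the identifier was not understood.
    if parts:
        raise ValueError("%r is not a valid locale identifier" % identifier)
    return language, territory, script, variant
```

Note that the returned tuple lists the territory before the script, even though the script precedes the territory in the identifier string.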