Collection of utilities to manipulate structured arrays.
Most of these functions were initially implemented by John Hunter for matplotlib. They have been rewritten and extended for convenience.
Function | append_fields | Add new fields to an existing array.
Function | apply_along_fields | Apply function 'func' as a reduction across fields of a structured array.
Function | assign_fields_by_name | Assigns values from one structured array to another by field name.
Function | drop_fields | Return a new array with fields in drop_names dropped.
Function | find_duplicates | Find the duplicates in a structured array along a given key.
Function | flatten_descr | Flatten a structured data-type description.
Function | get_fieldstructure | Returns a dictionary with fields indexing lists of their parent fields.
Function | get_names | Returns the field names of the input datatype as a tuple.
Function | get_names_flat | Returns the field names of the input datatype as a tuple. Nested structures are flattened beforehand.
Function | join_by | Join arrays r1 and r2 on key key.
Function | merge_arrays | Merge arrays field by field.
Function | rec_append_fields | Add new fields to an existing array.
Function | rec_drop_fields | Returns a new numpy.recarray with fields in drop_names dropped.
Function | rec_join | Join arrays r1 and r2 on keys. Alternative to join_by, that always returns a np.recarray.
Function | recursive_fill_fields | Fills fields from output with fields from input, with support for nested structures.
Function | rename_fields | Rename the fields from a flexible-datatype ndarray or recarray.
Function | repack_fields | Re-pack the fields of a structured array or dtype in memory.
Function | require_fields | Casts a structured array to a new dtype using assignment by field-name.
Function | stack_arrays | Superposes arrays fields by fields.
Function | structured_to_unstructured | Converts an n-D structured array into an (n+1)-D unstructured array.
Function | unstructured_to_structured | Converts an n-D unstructured array into an (n-1)-D structured array.
Function | _append_fields_dispatcher | Undocumented
Function | _apply_along_fields_dispatcher | Undocumented
Function | _assign_fields_by_name_dispatcher | Undocumented
Function | _drop_fields_dispatcher | Undocumented
Function | _find_duplicates_dispatcher | Undocumented
Function | _fix_defaults | Update the fill_value and masked data of output from the default given in a dictionary defaults.
Function | _fix_output | Private function: return a recarray, a ndarray, a MaskedArray or a MaskedRecords depending on the input parameters.
Function | _get_fields_and_offsets | Returns a flat list of (dtype, count, offset) tuples of all the scalar fields in the dtype "dt", including nested fields, in left to right order.
Function | _get_fieldspec | Produce a list of name/dtype pairs corresponding to the dtype fields.
Function | _izip_fields | Returns an iterator of concatenated fields from a sequence of arrays.
Function | _izip_fields_flat | Returns an iterator of concatenated fields from a sequence of arrays, collapsing any nested structure.
Function | _izip_records | Returns an iterator of concatenated items from a sequence of arrays.
Function | _join_by_dispatcher | Undocumented
Function | _keep_fields | Return a new array keeping only the fields in keep_names, and preserving the order of those fields.
Function | _merge_arrays_dispatcher | Undocumented
Function | _rec_append_fields_dispatcher | Undocumented
Function | _rec_drop_fields_dispatcher | Undocumented
Function | _rec_join_dispatcher | Undocumented
Function | _recursive_fill_fields_dispatcher | Undocumented
Function | _rename_fields_dispatcher | Undocumented
Function | _repack_fields_dispatcher | Undocumented
Function | _require_fields_dispatcher | Undocumented
Function | _stack_arrays_dispatcher | Undocumented
Function | _structured_to_unstructured_dispatcher | Undocumented
Function | _unstructured_to_structured_dispatcher | Undocumented
Function | _zip_descr | Combine the dtype description of a series of arrays.
Function | _zip_dtype | Undocumented
Add new fields to an existing array.
The names of the fields are given with the names arguments, the corresponding values with the data arguments. If a single field is appended, names, data and dtypes do not have to be lists but just values.
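The two call styles described above can be sketched as follows. This is a minimal illustration, not part of the original docstring: the array contents and the new field names 'c' and 'd' are made up, and usemask=False is passed so that a plain ndarray is returned.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical base array with two fields.
base = np.array([(1, 2.0), (3, 4.0)], dtype=[('a', 'i4'), ('b', 'f8')])

# Single field: names and data are plain values rather than lists.
one = rfn.append_fields(base, 'c', np.array([10, 20]), usemask=False)

# Several fields: names and data are parallel lists.
two = rfn.append_fields(
    base,
    ['c', 'd'],
    [np.array([10, 20]), np.array([0.5, 1.5])],
    usemask=False,
)

print(one.dtype.names)
print(two.dtype.names)
```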
Apply function 'func' as a reduction across fields of a structured array.
This is similar to apply_along_axis, but treats the fields of a structured array as an extra axis. The fields are all first cast to a common type following the type-promotion rules from numpy.result_type applied to the fields' dtypes.
The function func must support an axis argument, like np.mean, np.sum, etc.
>>> from numpy.lib import recfunctions as rfn
>>> b = np.array([(1, 2, 5), (4, 5, 7), (7, 8, 11), (10, 11, 12)],
...              dtype=[('x', 'i4'), ('y', 'f4'), ('z', 'f8')])
>>> rfn.apply_along_fields(np.mean, b)
array([ 2.66666667, 5.33333333, 8.66666667, 11. ])
>>> rfn.apply_along_fields(np.mean, b[['x', 'z']])
array([ 3. , 5.5, 9. , 11. ])
Assigns values from one structured array to another by field name.
Normally in numpy >= 1.14, assignment of one structured array to another copies fields "by position", meaning that the first field from the src is copied to the first field of the dst, and so on, regardless of field name.
This function instead copies "by field name", such that fields in the dst are assigned from the identically named field in the src. This applies recursively for nested structures. This is how structure assignment worked in numpy >= 1.6 to <= 1.13.
dst : ndarray
src : ndarray
The source and destination arrays during assignment.
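The difference between "by position" and "by field name" can be seen in a small sketch. The arrays here are hypothetical; both destination arrays list the same two fields as the source, but in the opposite order.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same field names in src and dst, but listed in a different order.
src = np.array([(1.0, 2.0)], dtype=[('a', 'f8'), ('b', 'f8')])
dst_by_position = np.zeros(1, dtype=[('b', 'f8'), ('a', 'f8')])
dst_by_name = np.zeros(1, dtype=[('b', 'f8'), ('a', 'f8')])

# Plain assignment copies by position: the first field of src ('a')
# lands in the first field of dst ('b').
dst_by_position[...] = src

# assign_fields_by_name matches on the field name instead.
rfn.assign_fields_by_name(dst_by_name, src)

print(dst_by_position['a'], dst_by_position['b'])
print(dst_by_name['a'], dst_by_name['b'])
```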
Return a new array with fields in drop_names dropped.
Nested fields are supported.
drop_fields returns an array with 0 fields if all fields are dropped, rather than returning None as it did previously.
asrecarray: whether to return a recarray or a mrecarray (asrecarray=True) or a plain ndarray or masked array with flexible dtype. The default is False.
>>> from numpy.lib import recfunctions as rfn
>>> a = np.array([(1, (2, 3.0)), (4, (5, 6.0))],
...   dtype=[('a', np.int64), ('b', [('ba', np.double), ('bb', np.int64)])])
>>> rfn.drop_fields(a, 'a')
array([((2., 3),), ((5., 6),)],
      dtype=[('b', [('ba', '<f8'), ('bb', '<i8')])])
>>> rfn.drop_fields(a, 'ba')
array([(1, (3,)), (4, (6,))], dtype=[('a', '<i8'), ('b', [('bb', '<i8')])])
>>> rfn.drop_fields(a, ['ba', 'bb'])
array([(1,), (4,)], dtype=[('a', '<i8')])
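The case where every field is dropped is not covered by the examples above. A minimal sketch, assuming a NumPy version with the behaviour described in the note (an array with 0 fields rather than None); the input array is made up for illustration:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

a = np.array([(1, 2.0), (3, 4.0)], dtype=[('a', np.int64), ('b', np.double)])

# Drop every field: the result is still an array of the same shape,
# but its dtype has no fields left.
empty = rfn.drop_fields(a, ['a', 'b'])

print(empty is None)
print(empty.shape, empty.dtype)
```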
Find the duplicates in a structured array along a given key.
>>> from numpy.lib import recfunctions as rfn
>>> ndtype = [('a', int)]
>>> a = np.ma.array([1, 1, 1, 2, 2, 3, 3],
...         mask=[0, 0, 1, 0, 0, 0, 1]).view(ndtype)
>>> rfn.find_duplicates(a, ignoremask=True, return_index=True)
(masked_array(data=[(1,), (1,), (2,), (2,)],
             mask=[(False,), (False,), (False,), (False,)],
       fill_value=(999999,),
            dtype=[('a', '<i8')]), array([0, 1, 3, 4]))
Flatten a structured data-type description.
>>> from numpy.lib import recfunctions as rfn
>>> ndtype = np.dtype([('a', '<i4'), ('b', [('ba', '<f8'), ('bb', '<i4')])])
>>> rfn.flatten_descr(ndtype)
(('a', dtype('int32')), ('ba', dtype('float64')), ('bb', dtype('int32')))
Returns a dictionary with fields indexing lists of their parent fields.
This function is used to simplify access to fields nested in other fields.
>>> from numpy.lib import recfunctions as rfn
>>> ndtype = np.dtype([('A', int),
...                    ('B', [('BA', int),
...                           ('BB', [('BBA', int), ('BBB', int)])])])
>>> rfn.get_fieldstructure(ndtype)
... # XXX: possible regression, order of BBA and BBB is swapped
{'A': [], 'B': [], 'BA': ['B'], 'BB': ['B'], 'BBA': ['B', 'BB'], 'BBB': ['B', 'BB']}
Returns the field names of the input datatype as a tuple.
>>> from numpy.lib import recfunctions as rfn
>>> rfn.get_names(np.empty((1,), dtype=int))
Traceback (most recent call last):
    ...
AttributeError: 'numpy.ndarray' object has no attribute 'names'
>>> rfn.get_names(np.empty((1,), dtype=[('A',int), ('B', float)]))
Traceback (most recent call last):
    ...
AttributeError: 'numpy.ndarray' object has no attribute 'names'
>>> adtype = np.dtype([('a', int), ('b', [('ba', int), ('bb', int)])])
>>> rfn.get_names(adtype)
('a', ('b', ('ba', 'bb')))
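The tracebacks above arise because an array, not a datatype, is passed in. A short sketch of the working call, using the array's dtype attribute (the array itself is a made-up example):

```python
import numpy as np
from numpy.lib import recfunctions as rfn

arr = np.empty((1,), dtype=[('A', int), ('B', float)])

# get_names expects the datatype, so pass arr.dtype rather than arr.
names = rfn.get_names(arr.dtype)
print(names)
```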
Returns the field names of the input datatype as a tuple. Nested structures are flattened beforehand.
>>> from numpy.lib import recfunctions as rfn
>>> rfn.get_names_flat(np.empty((1,), dtype=int)) is None
Traceback (most recent call last):
    ...
AttributeError: 'numpy.ndarray' object has no attribute 'names'
>>> rfn.get_names_flat(np.empty((1,), dtype=[('A',int), ('B', float)]))
Traceback (most recent call last):
    ...
AttributeError: 'numpy.ndarray' object has no attribute 'names'
>>> adtype = np.dtype([('a', int), ('b', [('ba', int), ('bb', int)])])
>>> rfn.get_names_flat(adtype)
('a', 'b', 'ba', 'bb')
Join arrays r1 and r2 on key key.
The key should be either a string or a sequence of strings corresponding to the fields used to join the array. An exception is raised if the key field cannot be found in the two input arrays. Neither r1 nor r2 should have any duplicates along key: the presence of duplicates will make the output quite unreliable. Note that duplicates are not looked for by the algorithm.
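A minimal sketch of an inner join on a single key follows. The two input arrays and their field names are hypothetical, neither contains duplicate keys, and usemask=False is passed so that a plain ndarray is returned.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two hypothetical tables sharing the key field 'k'; no duplicate keys.
r1 = np.array([(1, 10.0), (2, 20.0), (3, 30.0)],
              dtype=[('k', 'i8'), ('x', 'f8')])
r2 = np.array([(2, 200.0), (3, 300.0), (4, 400.0)],
              dtype=[('k', 'i8'), ('y', 'f8')])

# An inner join keeps only the keys present in both inputs.
inner = rfn.join_by('k', r1, r2, jointype='inner', usemask=False)

print(inner.dtype.names)
print(inner)
```

Because duplicates are not checked for, it may be worth running find_duplicates on each input first when the data source is not trusted.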
usemask: whether to return a MaskedArray (or MaskedRecords if asrecarray==True) or a ndarray.
asrecarray: whether to return a recarray (or MaskedRecords if usemask==True) or just a flexible-type ndarray.
Merge arrays field by field.
>>> from numpy.lib import recfunctions as rfn
>>> rfn.merge_arrays((np.array([1, 2]), np.array([10., 20., 30.])))
array([( 1, 10.), ( 2, 20.), (-1, 30.)],
      dtype=[('f0', '<i8'), ('f1', '<f8')])
>>> rfn.merge_arrays((np.array([1, 2], dtype=np.int64),
...         np.array([10., 20., 30.])), usemask=False)
array([(1, 10.0), (2, 20.0), (-1, 30.0)],
      dtype=[('f0', '<i8'), ('f1', '<f8')])
>>> rfn.merge_arrays((np.array([1, 2]).view([('a', np.int64)]),
...               np.array([10., 20., 30.])),
...              usemask=False, asrecarray=True)
rec.array([( 1, 10.), ( 2, 20.), (-1, 30.)],
          dtype=[('a', '<i8'), ('f1', '<f8')])
Add new fields to an existing array.
The names of the fields are given with the names arguments, the corresponding values with the data arguments. If a single field is appended, names, data and dtypes do not have to be lists but just values.
See also: append_fields
Returns: appended_array : np.recarray
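As a brief sketch of the recarray variant (the base array and the new field name 'c' are made up for illustration), the result supports attribute access to its fields:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

base = np.array([(1, 2.0), (3, 4.0)], dtype=[('a', 'i8'), ('b', 'f8')])

# Append a single field; the result is a np.recarray.
out = rfn.rec_append_fields(base, 'c', np.array([10, 20]))

print(type(out).__name__)
print(out.c)  # recarray fields are reachable as attributes
```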
Join arrays r1 and r2 on keys.
Alternative to join_by, that always returns a np.recarray.
join_by : equivalent function
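A small sketch of rec_join, assuming the default postfixes '1' and '2' for non-key fields that appear in both inputs. The arrays and the field names 'k' and 'v' are hypothetical.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Both inputs have a non-key field named 'v', so the joined result
# needs postfixes to tell the two apart.
r1 = np.array([(1, 10.0), (2, 20.0), (3, 30.0)],
              dtype=[('k', 'i8'), ('v', 'f8')])
r2 = np.array([(2, 200.0), (3, 300.0), (4, 400.0)],
              dtype=[('k', 'i8'), ('v', 'f8')])

joined = rfn.rec_join('k', r1, r2)

print(type(joined).__name__)
print(joined.dtype.names)
print(joined.k, joined.v1, joined.v2)
```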
Fills fields from output with fields from input, with support for nested structures.
output should be at least the same size as input.
>>> from numpy.lib import recfunctions as rfn
>>> a = np.array([(1, 10.), (2, 20.)], dtype=[('A', np.int64), ('B', np.float64)])
>>> b = np.zeros((3,), dtype=a.dtype)
>>> rfn.recursive_fill_fields(a, b)
array([(1, 10.), (2, 20.), (0, 0.)], dtype=[('A', '<i8'), ('B', '<f8')])
Rename the fields from a flexible-datatype ndarray or recarray.
Nested fields are supported.
>>> from numpy.lib import recfunctions as rfn
>>> a = np.array([(1, (2, [3.0, 30.])), (4, (5, [6.0, 60.]))],
...   dtype=[('a', int),('b', [('ba', float), ('bb', (float, 2))])])
>>> rfn.rename_fields(a, {'a':'A', 'bb':'BB'})
array([(1, (2., [ 3., 30.])), (4, (5., [ 6., 60.]))],
      dtype=[('A', '<i8'), ('b', [('ba', '<f8'), ('BB', '<f8', (2,))])])
Re-pack the fields of a structured array or dtype in memory.
The memory layout of structured datatypes allows fields at arbitrary byte offsets. This means the fields can be separated by padding bytes, their offsets can be non-monotonically increasing, and they can overlap.
This method removes any overlaps and reorders the fields in memory so they have increasing byte offsets, and adds or removes padding bytes depending on the align option, which behaves like the align option to np.dtype.
If align=False, this method produces a "packed" memory layout in which each field starts at the byte the previous field ended, and any padding bytes are removed.
If align=True, this method produces an "aligned" memory layout in which each field's offset is a multiple of its alignment, and the total itemsize is a multiple of the largest alignment, by adding padding bytes as needed.
Returns a copy of a with fields repacked, or a itself if no repacking was needed.
>>> from numpy.lib import recfunctions as rfn
>>> def print_offsets(d):
...     print("offsets:", [d.fields[name][1] for name in d.names])
...     print("itemsize:", d.itemsize)
...
>>> dt = np.dtype('u1, <i8, <f8', align=True)
>>> dt
dtype({'names': ['f0', 'f1', 'f2'], 'formats': ['u1', '<i8', '<f8'], 'offsets': [0, 8, 16], 'itemsize': 24}, align=True)
>>> print_offsets(dt)
offsets: [0, 8, 16]
itemsize: 24
>>> packed_dt = rfn.repack_fields(dt)
>>> packed_dt
dtype([('f0', 'u1'), ('f1', '<i8'), ('f2', '<f8')])
>>> print_offsets(packed_dt)
offsets: [0, 1, 9]
itemsize: 17
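The example above goes from an aligned layout to a packed one. The opposite direction, align=True, can be sketched as follows; the exact offsets depend on the platform's alignment rules, so the code checks the stated property instead of printing fixed numbers.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Start from the default packed layout.
packed = np.dtype('u1, <i8, <f8')

# Ask for an aligned layout instead.
aligned = rfn.repack_fields(packed, align=True)

offsets = [aligned.fields[name][1] for name in aligned.names]
alignments = [aligned.fields[name][0].alignment for name in aligned.names]

# Each field offset should be a multiple of that field's alignment.
offsets_ok = all(off % al == 0 for off, al in zip(offsets, alignments))

print("offsets:", offsets)
print("itemsize:", aligned.itemsize)
print("offsets aligned:", offsets_ok)
```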
Casts a structured array to a new dtype using assignment by field-name.
This function assigns from the old to the new array by name, so the value of a field in the output array is the value of the field with the same name in the source array. This has the effect of creating a new ndarray containing only the fields "required" by the required_dtype.
If a field name in the required_dtype does not exist in the input array, that field is created and set to 0 in the output array.
>>> from numpy.lib import recfunctions as rfn
>>> a = np.ones(4, dtype=[('a', 'i4'), ('b', 'f8'), ('c', 'u1')])
>>> rfn.require_fields(a, [('b', 'f4'), ('c', 'u1')])
array([(1., 1), (1., 1), (1., 1), (1., 1)],
  dtype=[('b', '<f4'), ('c', 'u1')])
>>> rfn.require_fields(a, [('b', 'f4'), ('newf', 'u1')])
array([(1., 0), (1., 0), (1., 0), (1., 0)],
  dtype=[('b', '<f4'), ('newf', 'u1')])
Superposes arrays fields by fields.
usemask: whether to return a MaskedArray (or MaskedRecords if asrecarray==True) or a ndarray.
asrecarray: whether to return a recarray (or MaskedRecords if usemask==True) or just a flexible-type ndarray.
>>> from numpy.lib import recfunctions as rfn
>>> x = np.array([1, 2,])
>>> rfn.stack_arrays(x) is x
True
>>> z = np.array([('A', 1), ('B', 2)], dtype=[('A', '|S3'), ('B', float)])
>>> zz = np.array([('a', 10., 100.), ('b', 20., 200.), ('c', 30., 300.)],
...   dtype=[('A', '|S3'), ('B', np.double), ('C', np.double)])
>>> test = rfn.stack_arrays((z,zz))
>>> test
masked_array(data=[(b'A', 1.0, --), (b'B', 2.0, --), (b'a', 10.0, 100.0),
                   (b'b', 20.0, 200.0), (b'c', 30.0, 300.0)],
             mask=[(False, False, True), (False, False, True),
                   (False, False, False), (False, False, False),
                   (False, False, False)],
       fill_value=(b'N/A', 1.e+20, 1.e+20),
            dtype=[('A', 'S3'), ('B', '<f8'), ('C', '<f8')])
Converts an n-D structured array into an (n+1)-D unstructured array.
The new array will have a new last dimension equal in size to the number of field-elements of the input array. If not supplied, the output datatype is determined from the numpy type promotion rules applied to all the field datatypes.
Nested fields, as well as each element of any subarray fields, each count as a single field-element.
copy: see the copy argument to ndarray.astype. If true, always return a copy. If false, and dtype requirements are satisfied, a view is returned.
casting: see the casting argument of ndarray.astype. Controls what kind of data casting may occur.
>>> from numpy.lib import recfunctions as rfn
>>> a = np.zeros(4, dtype=[('a', 'i4'), ('b', 'f4,u2'), ('c', 'f4', 2)])
>>> a
array([(0, (0., 0), [0., 0.]), (0, (0., 0), [0., 0.]),
       (0, (0., 0), [0., 0.]), (0, (0., 0), [0., 0.])],
      dtype=[('a', '<i4'), ('b', [('f0', '<f4'), ('f1', '<u2')]), ('c', '<f4', (2,))])
>>> rfn.structured_to_unstructured(a)
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
>>> b = np.array([(1, 2, 5), (4, 5, 7), (7, 8, 11), (10, 11, 12)],
...              dtype=[('x', 'i4'), ('y', 'f4'), ('z', 'f8')])
>>> np.mean(rfn.structured_to_unstructured(b[['x', 'z']]), axis=-1)
array([ 3. , 5.5, 9. , 11. ])
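The view-versus-copy behaviour can be checked with np.shares_memory. This is a sketch under the assumption that all fields already have the output dtype (here two float64 fields in a made-up array), which is the case where a view is possible:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# All fields share one dtype, so no casting is needed.
pts = np.zeros(3, dtype=[('x', 'f8'), ('y', 'f8')])

# Default copy=False: a view on the same memory can be returned.
flat = rfn.structured_to_unstructured(pts)
shares = np.shares_memory(pts, flat)

# copy=True always produces an independent array.
forced = rfn.structured_to_unstructured(pts, copy=True)
shares_forced = np.shares_memory(pts, forced)

print(flat.shape, shares, shares_forced)
```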
Converts an n-D unstructured array into an (n-1)-D structured array.
The last dimension of the input array is converted into a structure, with number of field-elements equal to the size of the last dimension of the input array. By default all output fields have the input array's dtype, but an output structured dtype with an equal number of field-elements can be supplied instead.
Nested fields, as well as each element of any subarray fields, all count towards the number of field-elements.
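The default case, where no output dtype is supplied, can be sketched as follows. The input array and the field names passed through the names argument are made up for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

arr = np.arange(6.0).reshape(2, 3)

# No dtype given: every field takes the input dtype and a default name.
default = rfn.unstructured_to_structured(arr)

# Field names can be chosen without spelling out a full dtype.
named = rfn.unstructured_to_structured(arr, names=['x', 'y', 'z'])

# Converting back recovers the original 2-D array.
back = rfn.structured_to_unstructured(named)

print(default.dtype.names)
print(named.dtype.names, named.shape)
print(np.array_equal(back, arr))
```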
copy: see the copy argument to ndarray.astype. If true, always return a copy. If false, and dtype requirements are satisfied, a view is returned.
casting: see the casting argument of ndarray.astype. Controls what kind of data casting may occur.
>>> from numpy.lib import recfunctions as rfn
>>> dt = np.dtype([('a', 'i4'), ('b', 'f4,u2'), ('c', 'f4', 2)])
>>> a = np.arange(20).reshape((4,5))
>>> a
array([[ 0, 1, 2, 3, 4],
       [ 5, 6, 7, 8, 9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
>>> rfn.unstructured_to_structured(a, dt)
array([( 0, ( 1., 2), [ 3., 4.]), ( 5, ( 6., 7), [ 8., 9.]),
       (10, (11., 12), [13., 14.]), (15, (16., 17), [18., 19.])],
      dtype=[('a', '<i4'), ('b', [('f0', '<f4'), ('f1', '<u2')]), ('c', '<f4', (2,))])
Update the fill_value and masked data of output from the default given in a dictionary defaults.
Produce a list of name/dtype pairs corresponding to the dtype fields.
Similar to dtype.descr, but the second item of each tuple is a dtype, not a string. As a result, this handles subarray dtypes.
Can be passed to the dtype constructor to reconstruct the dtype, noting that this (deliberately) discards field offsets.
>>> dt = np.dtype([(('a', 'A'), np.int64), ('b', np.double, 3)])
>>> dt.descr
[(('a', 'A'), '<i8'), ('b', '<f8', (3,))]
>>> _get_fieldspec(dt)
[(('a', 'A'), dtype('int64')), ('b', dtype(('<f8', (3,))))]
Returns an iterator of concatenated items from a sequence of arrays.
Return a new array keeping only the fields in keep_names, and preserving the order of those fields.
asrecarray: whether to return a recarray or a mrecarray (asrecarray=True) or a plain ndarray or masked array with flexible dtype. The default is False.