Declarative classes¶
This section describes a series of classes meant to declare the datatypes required for creating primary PyTables datasets.
The Atom class and its descendants¶
- class tables.Atom(nptype: str | dtype, shape: tuple[int64, ...], dflt: Any)[source]¶
Defines the type of atomic cells stored in a dataset.
The meaning of atomic is that individual elements of a cell can not be extracted directly by indexing (i.e. __getitem__()) the dataset; e.g. if a dataset has shape (2, 2) and its atoms have shape (3,), to get the third element of the cell at (1, 0) one should use dataset[1,0][2] instead of dataset[1,0,2].
The Atom class is meant to declare the different properties of the base element (also known as atom) of CArray, EArray and VLArray datasets, although they are also used to describe the base elements of Array datasets. Atoms have the property that their length is always the same. However, you can grow datasets along the extensible dimension in the case of EArray or put a variable number of them on a VLArray row. Moreover, they are not restricted to scalar values, and they can be fully multidimensional objects.
- Parameters:
nptype (str or np.dtype) – Sets the Numpy data type of the atom.
shape (tuple) – Sets the shape of the atom. An integer shape of N is equivalent to the tuple (N,).
dflt (Any) – Sets the default value for the atom.
(The following are the public methods and attributes of the Atom class.)
Notes
A series of descendant classes are offered in order to make the use of these element descriptions easier. You should use a particular Atom descendant class whenever you know the exact type you will need when writing your code. Otherwise, you may use one of the Atom.from_*() factory methods.
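As a minimal sketch of the point made in the class description (atoms have a fixed size, while the dataset holding them may grow), the following creates an EArray and a VLArray from the same scalar Int32Atom. The file name, the node names and the use of the in-memory H5FD_CORE driver are assumptions chosen for illustration only.

```python
import tables as tb

# In-memory HDF5 file (core driver with no backing store); the file
# name is only a label and nothing is written to disk.
with tb.open_file("atoms_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as h5:
    # EArray: the dimension declared with length 0 is the extensible one.
    earray = h5.create_earray(h5.root, "readings", atom=tb.Int32Atom(),
                              shape=(0, 3))
    earray.append([[1, 2, 3]])
    earray.append([[4, 5, 6], [7, 8, 9]])
    earray_shape = tuple(int(n) for n in earray.shape)

    # VLArray: each row holds a variable number of Int32 atoms.
    vlarray = h5.create_vlarray(h5.root, "ragged", atom=tb.Int32Atom())
    vlarray.append([1, 2])
    vlarray.append([3, 4, 5])
    row_lengths = [len(row) for row in vlarray]
```

In both cases the atom itself never changes size; only the number of atoms stored in the dataset (EArray) or in each row (VLArray) varies.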
Atom attributes
- dflt¶
The default value of the atom.
If the user does not supply a value for an element while filling a dataset, this default value will be written to disk. If the user supplies a scalar value for a multidimensional atom, this value is automatically broadcast to all the items in the atom cell. If dflt is not supplied, an appropriate zero value (or null string) will be chosen by default. Please note that default values are kept internally as NumPy objects.
- dtype¶
The NumPy dtype that most closely matches this atom.
- itemsize¶
Size in bytes of a single item in the atom. Especially useful for atoms of the string kind.
- kind¶
The PyTables kind of the atom (a string).
- shape¶
The shape of the atom (an empty tuple, (), for scalar atoms).
- type¶
The PyTables type of the atom (a string).
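The behaviour of the dflt attribute can be sketched with a CArray whose cells are only partly written. This is an illustrative example, not part of the reference: the file name, the node name and the in-memory H5FD_CORE driver are assumptions, and the cells that were never written are expected to read back as the atom's default value.

```python
import tables as tb

with tb.open_file("dflt_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as h5:
    atom = tb.Int32Atom(dflt=7)
    carray = h5.create_carray(h5.root, "grid", atom=atom, shape=(2, 3))
    carray[0, 0] = 1         # only one cell is written explicitly
    values = carray.read()   # unwritten cells are read back as dflt
```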
Atoms can be compared with atoms and other objects for strict (in)equality without having to compare individual attributes:
>>> atom1 = StringAtom(itemsize=10)  # same as ``atom2``
>>> atom2 = Atom.from_kind('string', 10)  # same as ``atom1``
>>> atom3 = IntAtom()
>>> bool(atom1 == 'foo')
False
>>> bool(atom1 == atom2)
True
>>> bool(atom2 != atom1)
False
>>> bool(atom1 == atom3)
False
>>> bool(atom3 != atom2)
True
Atom properties¶
- Atom.ndim¶
The number of dimensions of the atom.
Added in version 2.4.
- Atom.recarrtype¶
String type to be used in numpy.rec.array().
- Atom.size¶
Total size in bytes of the atom.
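To see how the attributes and properties above relate to each other, here is a small sketch that inspects a multidimensional atom. The summary dictionary is only a convenience for this example.

```python
import tables as tb

atom = tb.Float64Atom(shape=(2, 3))

summary = {
    "kind": atom.kind,          # PyTables kind
    "type": atom.type,          # PyTables type
    "shape": atom.shape,        # shape of one atom
    "ndim": atom.ndim,          # number of dimensions of the atom
    "itemsize": atom.itemsize,  # bytes per single item
    "size": atom.size,          # bytes per whole atom
}
```

For this atom, size is itemsize multiplied by the number of items in the atom (8 bytes times 6 items).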
Atom methods¶
- Atom.copy(**override) Atom [source]¶
Get a copy of the atom, possibly overriding some arguments.
Constructor arguments to be overridden must be passed as keyword arguments:
>>> atom1 = Int32Atom(shape=12)
>>> atom2 = atom1.copy()
>>> print(atom1)
Int32Atom(shape=(12,), dflt=0)
>>> print(atom2)
Int32Atom(shape=(12,), dflt=0)
>>> atom1 is atom2
False
>>> atom3 = atom1.copy(shape=(2, 2))
>>> print(atom3)
Int32Atom(shape=(2, 2), dflt=0)
>>> atom1.copy(foobar=42)
Traceback (most recent call last):
...
TypeError: ...__init__() got an unexpected keyword argument 'foobar'
Atom factory methods¶
- classmethod Atom.from_dtype(dtype: dtype, dflt: Any = None) Atom [source]¶
Create an Atom from a NumPy dtype.
An optional default value may be specified as the dflt argument. Information in the dtype not represented in an Atom is ignored:
>>> import numpy as np
>>> Atom.from_dtype(np.dtype((np.int16, (2, 2))))
Int16Atom(shape=(2, 2), dflt=0)
>>> Atom.from_dtype(np.dtype('float64'))
Float64Atom(shape=(), dflt=0.0)
Note: for easier use in Python 3, where all strings lead to the Unicode dtype, this dtype will also generate a StringAtom. Since this is only viable for strings that are castable as ascii, a warning is issued.
>>> Atom.from_dtype(np.dtype('U20'))
Atom.py:392: FlavorWarning: support for unicode type is very limited, and only works for strings that can be cast as ascii
StringAtom(itemsize=20, shape=(), dflt=b'')
- classmethod Atom.from_kind(kind: str, itemsize: int | None = None, shape: tuple[int64, ...] = (), dflt: Any = None) Atom [source]¶
Create an Atom from a PyTables kind.
Optional item size, shape and default value may be specified as the itemsize, shape and dflt arguments, respectively. Bear in mind that not all atoms support a default item size:
>>> Atom.from_kind('int', itemsize=2, shape=(2, 2))
Int16Atom(shape=(2, 2), dflt=0)
>>> Atom.from_kind('int', shape=(2, 2))
Int32Atom(shape=(2, 2), dflt=0)
>>> Atom.from_kind('int', shape=1)
Int32Atom(shape=(1,), dflt=0)
>>> Atom.from_kind('string', dflt=b'hello')
Traceback (most recent call last):
...
ValueError: no default item size for kind ``string``
>>> Atom.from_kind('Float')
Traceback (most recent call last):
...
ValueError: unknown kind: 'Float'
Moreover, some kinds with atypical constructor signatures are not supported; you need to use the proper constructor:
>>> Atom.from_kind('enum')
Traceback (most recent call last):
...
ValueError: the ``enum`` kind is not supported...
- classmethod Atom.from_sctype(sctype: str | dtype, shape: tuple[int64, ...] = (), dflt: Any = None) Atom [source]¶
Create an Atom from a NumPy scalar type sctype.
Optional shape and default value may be specified as the shape and dflt arguments, respectively. Information in the sctype not represented in an Atom is ignored:
>>> import numpy as np
>>> Atom.from_sctype(np.int16, shape=(2, 2))
Int16Atom(shape=(2, 2), dflt=0)
>>> Atom.from_sctype('S5', dflt='hello')
Traceback (most recent call last):
...
ValueError: unknown NumPy scalar type: 'S5'
>>> Atom.from_sctype('float64')
Float64Atom(shape=(), dflt=0.0)
- classmethod Atom.from_type(type: str, shape: tuple[int64, ...] = (), dflt: Any = None) Atom [source]¶
Create an Atom from a PyTables type.
Optional shape and default value may be specified as the shape and dflt arguments, respectively:
>>> Atom.from_type('bool')
BoolAtom(shape=(), dflt=False)
>>> Atom.from_type('int16', shape=(2, 2))
Int16Atom(shape=(2, 2), dflt=0)
>>> Atom.from_type('string40', dflt='hello')
Traceback (most recent call last):
...
ValueError: unknown type: 'string40'
>>> Atom.from_type('Float64')
Traceback (most recent call last):
...
ValueError: unknown type: 'Float64'
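The factory methods are mostly useful when the element type is only known at run time. The sketch below shows one possible way to pick a factory depending on what the caller supplies; atom_from_spec is a hypothetical helper written for this example and is not part of PyTables.

```python
import numpy as np
import tables as tb


def atom_from_spec(spec, shape=()):
    """Build an Atom from a NumPy dtype or a PyTables type string.

    Hypothetical helper for illustration only.
    """
    if isinstance(spec, np.dtype):
        # The shape is taken from the dtype itself.
        return tb.Atom.from_dtype(spec)
    return tb.Atom.from_type(spec, shape=shape)


from_type_name = atom_from_spec("int16", shape=(2, 2))
from_numpy_dtype = atom_from_spec(np.dtype((np.int16, (2, 2))))
from_kind = tb.Atom.from_kind("int", itemsize=2, shape=(2, 2))
```

All three calls are expected to describe the same atom as Int16Atom(shape=(2, 2)), so they compare equal to each other.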
Atom Sub-classes¶
- class tables.StringAtom(itemsize: int, shape: tuple[int64, ...] = (), dflt: str | bytes = b'')[source]¶
Defines an atom of type string.
The item size is the maximum length in characters of strings.
- property itemsize: int¶
Size in bytes of a single item in the atom.
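Because the item size is a hard upper bound, longer values do not fit in a StringAtom cell. The sketch below only demonstrates the NumPy side of this, converting byte strings to the atom's dtype; it assumes that data written to a dataset is coerced to that dtype in the same way.

```python
import numpy as np
import tables as tb

atom = tb.StringAtom(itemsize=5)

# Converting to the atom's dtype keeps at most ``itemsize`` bytes per item.
stored = np.array([b"abc", b"hello world"], dtype=atom.dtype)
stored_items = stored.tolist()
```

Choosing an itemsize that covers the longest expected value avoids this kind of silent truncation.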
- class tables.BoolAtom(shape: tuple[int64, ...] = (), dflt: bool = False)[source]¶
Defines an atom of type bool.
- class tables.IntAtom(itemsize: int | None = 4, shape: tuple[int64, ...] = (), dflt: Any = 0)[source]¶
Defines an atom of a signed integral type (int kind).
- class tables.UIntAtom(itemsize: int | None = 4, shape: tuple[int64, ...] = (), dflt: Any = 0)[source]¶
Defines an atom of an unsigned integral type (uint kind).
- class tables.FloatAtom(itemsize: int | None = 8, shape: tuple[int64, ...] = (), dflt: Any = 0.0)[source]¶
Defines an atom of a floating point type (float kind).
- class tables.ComplexAtom(itemsize: int, shape: tuple[int64, ...] = (), dflt: Any = 0j)[source]¶
Defines an atom of kind complex.
Allowed item sizes are 8 (single precision) and 16 (double precision). This class must be used instead of more concrete ones to avoid confusion with the numarray-like precision specifications used in PyTables 1.X.
- dflt: Any¶
The default value of the atom.
If the user does not supply a value for an element while filling a dataset, this default value will be written to disk. If the user supplies a scalar value for a multidimensional atom, this value is automatically broadcast to all the items in the atom cell. If dflt is not supplied, an appropriate zero value (or null string) will be chosen by default. Please note that default values are kept internally as NumPy objects.
- dtype: dtype¶
The NumPy dtype that most closely matches this atom.
- property itemsize: int¶
Size in bytes of a single item in the atom.
- shape: tuple[int64, ...]¶
The shape of the atom (an empty tuple, (), for scalar atoms).
- class tables.Time32Atom(shape: tuple[int64, ...] = (), dflt=0)[source]¶
Defines an atom of type time32.
- class tables.Time64Atom(shape: tuple[int64, ...] = (), dflt: float = 0.0)[source]¶
Defines an atom of type time64.
- class tables.EnumAtom(enum: Enum | Any, dflt: Any, base: Atom | str, shape: tuple[int64, ...] = ())[source]¶
Description of an atom of an enumerated type.
Instances of this class describe the atom type used to store enumerated values. Those values belong to an enumerated type, defined by the first argument (enum) in the constructor of the atom, which accepts the same kinds of arguments as the Enum class (see The Enum class). The enumerated type is stored in the enum attribute of the atom.
A default value must be specified as the second argument (dflt) in the constructor; it must be the name (a string) of one of the enumerated values in the enumerated type. When the atom is created, the corresponding concrete value is broadcast and stored in the dflt attribute (setting different default values for items in a multidimensional atom is not supported yet). If the name does not match any value in the enumerated type, a KeyError is raised.
Another atom must be specified as the base argument in order to determine the base type used for storing the enumerated values in memory and on disk. This storage atom is kept in the base attribute of the created atom. As a shorthand, you may specify a PyTables type instead of the storage atom, implying that this has a scalar shape.
The storage atom should be able to represent each and every concrete value in the enumeration. If it is not, a TypeError is raised. The default value of the storage atom is ignored.
The type attribute of enumerated atoms is always enum.
Enumerated atoms also support comparisons with other objects:
>>> enum = ['T0', 'T1', 'T2']
>>> atom1 = EnumAtom(enum, 'T0', 'int8')  # same as ``atom2``
>>> atom2 = EnumAtom(enum, 'T0', Int8Atom())  # same as ``atom1``
>>> atom3 = EnumAtom(enum, 'T0', 'int16')
>>> atom4 = Int8Atom()
>>> atom1 == enum
False
>>> atom1 == atom2
True
>>> atom2 != atom1
False
>>> atom1 == atom3
False
>>> atom1 == atom4
False
>>> atom4 != atom1
True
Examples
The following C enum construction:
enum myEnum { T0, T1, T2 };
would correspond to the following PyTables declaration:
>>> my_enum_atom = EnumAtom(['T0', 'T1', 'T2'], 'T0', 'int32')
Please note the dflt argument with a value of ‘T0’. Since the concrete value matching T0 is unknown right now (we have not used explicit concrete values), using the name is the only option left for defining a default value for the atom.
The chosen representation of values for this enumerated atom uses signed 32-bit integers (the 'int32' base given above), which surely wastes quite a lot of memory. Another size could be selected by using the base argument (this time with a full-blown storage atom):
>>> my_enum_atom = EnumAtom(['T0', 'T1', 'T2'], 'T0', UInt8Atom())
You can also define multidimensional arrays for data elements:
>>> my_enum_atom = EnumAtom(
...     ['T0', 'T1', 'T2'], 'T0', base='uint32', shape=(3,2))
for 3x2 arrays of uint32.
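The relationship between the dflt name, the concrete default value and the enum and base attributes can be sketched as follows. The variable names are chosen for this example, and the last call deliberately uses a name that is not in the enumeration to show the KeyError mentioned above.

```python
import tables as tb

names = ["T0", "T1", "T2"]
atom = tb.EnumAtom(names, "T1", "uint8")

concrete_default = atom.dflt    # concrete value matching the name 'T1'
value_of_t2 = atom.enum["T2"]   # look up a concrete value by name
storage_type = atom.base.type   # PyTables type of the storage atom

try:
    tb.EnumAtom(names, "T9", "uint8")   # 'T9' is not an enumerated name
    bad_name_error = None
except KeyError as exc:
    bad_name_error = exc
```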
- dflt: Any¶
The default value of the atom.
If the user does not supply a value for an element while filling a dataset, this default value will be written to disk. If the user supplies a scalar value for a multidimensional atom, this value is automatically broadcast to all the items in the atom cell. If dflt is not supplied, an appropriate zero value (or null string) will be chosen by default. Please note that default values are kept internally as NumPy objects.
- dtype: dtype¶
The NumPy dtype that most closely matches this atom.
- property itemsize: int¶
Size in bytes of a single item in the atom.
- shape: tuple[int64, ...]¶
The shape of the atom (an empty tuple, (), for scalar atoms).
Pseudo atoms¶
Three special classes, ObjectAtom, VLStringAtom and VLUnicodeAtom, do not actually descend from Atom, but their goal is so similar that they are described here. Pseudo-atoms can only be used with VLArray datasets (see The VLArray class), and they do not support multidimensional values, nor multiple values per row.
They can be recognised because they also have kind, type and shape attributes, but no size, itemsize or dflt ones. Instead, they have a base atom which defines the elements used for storage.
See examples/vlarray1.py and examples/vlarray2.py for further examples on VLArray datasets, including object serialization and string management.
ObjectAtom¶
- class tables.ObjectAtom[source]¶
Defines an atom of type object.
This class is meant to fit any kind of Python object in a row of a VLArray dataset by using pickle behind the scenes. Because you cannot foresee how long the output of the pickle serialization will be (i.e. the atom already has a variable length), you can only fit one object per row. However, you can still group several objects in a single tuple or list and pass it to the VLArray.append() method.
Object atoms do not accept parameters and they cause the reads of rows to always return Python objects. You can regard object atoms as an easy way to save an arbitrary number of generic Python objects in a VLArray dataset.
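A minimal sketch of the one-object-per-row rule, using an in-memory file. The file name, node name and the H5FD_CORE driver are assumptions made for this example.

```python
import tables as tb

with tb.open_file("objects_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as h5:
    vlarray = h5.create_vlarray(h5.root, "objects", atom=tb.ObjectAtom())
    vlarray.append({"name": "run-1", "ok": True})  # one object per row
    vlarray.append([1, "two", 3.0])                # several objects grouped in a list
    first_row = vlarray[0]
    second_row = vlarray[1]
    nrows = int(vlarray.nrows)
```

Each read returns the original Python object, rebuilt from its pickled form.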
VLStringAtom¶
- class tables.VLStringAtom[source]¶
Defines an atom of type vlstring.
This class describes a row of the VLArray class, rather than an atom. It differs from the StringAtom class in that you can only add one instance of it to one specific row, i.e. the VLArray.append() method only accepts one object when the base atom is of this type.
This class stores bytestrings. It does not make assumptions on the encoding of the string, and raw bytes are stored as is. To store a string you will need to explicitly convert it to a bytestring before you can save it:
>>> s = 'A unicode string: hbar = ℏ'
>>> bytestring = s.encode('utf-8')
>>> vlarray.append(bytestring)  # vlarray: an existing VLArray created with VLStringAtom()
For full Unicode support, using VLUnicodeAtom (see VLUnicodeAtom) is recommended.
Variable-length string atoms do not accept parameters and they cause the reads of rows to always return Python bytestrings. You can regard vlstring atoms as an easy way to save generic variable length strings.
VLUnicodeAtom¶
- class tables.VLUnicodeAtom[source]¶
Defines an atom of type vlunicode.
This class describes a row of the VLArray class, rather than an atom. It is very similar to VLStringAtom (see VLStringAtom), but it stores Unicode strings (using 32-bit characters a la UCS-4, so all strings of the same length also take up the same space).
This class does not make assumptions on the encoding of plain input strings. Plain strings are supported as long as no character is out of the ASCII set; otherwise, you will need to explicitly convert them to Unicode before you can save them.
Variable-length Unicode atoms do not accept parameters and they cause the reads of rows to always return Python Unicode strings. You can regard vlunicode atoms as an easy way to save variable length Unicode strings.
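The difference between the two variable-length string pseudo-atoms can be sketched with a round trip of the same text through each of them. The file name, node names and the in-memory H5FD_CORE driver are assumptions made for this example.

```python
import tables as tb

text = "A unicode string: hbar = \u210f"

with tb.open_file("strings_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as h5:
    # VLStringAtom stores raw bytes, so the text is encoded explicitly.
    raw = h5.create_vlarray(h5.root, "raw", atom=tb.VLStringAtom())
    raw.append(text.encode("utf-8"))
    raw_value = raw[0]

    # VLUnicodeAtom accepts and returns Python Unicode strings.
    uni = h5.create_vlarray(h5.root, "uni", atom=tb.VLUnicodeAtom())
    uni.append(text)
    uni_value = uni[0]
```

With VLStringAtom the caller is responsible for decoding the bytes again; with VLUnicodeAtom the string comes back as it was stored.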
The Col class and its descendants¶
- class tables.Col(nptype: str | dtype, shape: tuple[int64, ...], dflt: Any)[source]¶
Defines a non-nested column.
Col instances are used as a means to declare the different properties of a non-nested column in a table or nested column. Col classes are descendants of their equivalent Atom classes (see The Atom class and its descendants), but their instances have an additional _v_pos attribute that is used to decide the position of the column inside its parent table or nested column (see the IsDescription class in The IsDescription class for more information on column positions).
In the same fashion as Atom, you should use a particular Col descendant class whenever you know the exact type you will need when writing your code. Otherwise, you may use one of the Col.from_*() factory methods.
Each factory method inherited from the Atom class is available with the same signature, plus an additional pos parameter (placed in last position) which defaults to None and that may take an integer value. This parameter might be used to specify the position of the column in the table.
In addition, the following factory methods are available only for Col objects.
The following parameters are available for most Col-derived constructors.
- Parameters:
itemsize (int) – For types with a non-fixed size, this sets the size in bytes of individual items in the column.
shape (tuple) – Sets the shape of the column. An integer shape of N is equivalent to the tuple (N,).
dflt (Any) – Sets the default value for the column.
pos (int) – Sets the position of the column in the table. If unspecified, the column is placed after the columns with explicit positions, in alphanumeric order (see The IsDescription class).
attrs (dict) – Attribute metadata stored in the column (see The AttributeSet class).
Col instance variables¶
In addition to the variables that they inherit from the Atom class, Col instances have the following attributes.
- Col._v_pos¶
The relative position of this column with regard to its column siblings.
- Col._v_col_attrs¶
Additional metadata information. See The AttributeSet class.
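As a brief sketch of how the pos argument ends up in _v_pos, both through a concrete Col subclass and through one of the inherited factory methods (the variable names are illustrative):

```python
import tables as tb

# A concrete Col subclass with an explicit position.
direct = tb.Int32Col(shape=(2,), pos=3)

# The same idea through an inherited factory method, using the
# additional ``pos`` parameter.
from_factory = tb.Col.from_type("float64", pos=1)

positions = (direct._v_pos, from_factory._v_pos)
is_atom_subclass = isinstance(direct, tb.Int32Atom)
```

Since Col classes descend from their equivalent Atom classes, the usual atom attributes such as shape and type remain available on the column objects.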
Col factory methods¶
Col sub-classes¶
- class tables.StringCol(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.BoolCol(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.IntCol(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Int8Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Int16Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Int32Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Int64Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.UIntCol(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.UInt8Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.UInt16Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.UInt32Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.UInt64Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Float32Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Float64Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.ComplexCol(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.TimeCol(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Time32Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.Time64Col(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
- class tables.EnumCol(*args, **kwargs)¶
Defines a non-nested column of a particular type.
The constructor accepts the same arguments as the equivalent Atom class, plus an additional pos argument for position information, which is assigned to the _v_pos attribute, and an attrs argument for storing additional metadata similar to table.attrs, which is assigned to the _v_col_attrs attribute.
The IsDescription class¶
- class tables.IsDescription[source]¶
Description of the structure of a table or nested column.
This class is designed to be used as an easy, yet meaningful way to describe the structure of new Table (see The Table class) datasets or nested columns through the definition of derived classes. In order to define such a class, you must declare it as descendant of IsDescription, with as many attributes as columns you want in your table. The name of each attribute will become the name of a column, and its value will hold a description of it.
Ordinary columns can be described using instances of the Col class (see The Col class and its descendants). Nested columns can be described by using classes derived from IsDescription, instances of it, or name-description dictionaries. Derived classes can be declared in place (in which case the column takes the name of the class) or referenced by name.
Nested columns can have a _v_pos special attribute which sets the relative position of the column among sibling columns also having explicit positions. The pos constructor argument of Col instances is used for the same purpose. Columns with no explicit position will be placed afterwards in alphanumeric order.
Once you have created a description object, you can pass it to the Table constructor, where all the information it contains will be used to define the table structure.
IsDescription attributes
- _v_pos¶
Sets the position of a possible nested column description among its sibling columns. This attribute can be specified when declaring an IsDescription subclass to complement its metadata.
- columns¶
Maps the name of each column in the description to its own descriptive object. This attribute is automatically created when an IsDescription subclass is declared. Please note that declared columns can no longer be accessed as normal class variables after the class is created.
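The following sketch puts the pieces of this section together: a description class with explicit and implicit column positions, a table created from it, and a row appended with one column left at its default. The class name, column names, file name and the in-memory H5FD_CORE driver are all assumptions chosen for illustration.

```python
import tables as tb


class Particle(tb.IsDescription):
    name = tb.StringCol(16, pos=1)     # explicit position 1
    idnumber = tb.Int64Col(pos=0)      # explicit position 0
    energy = tb.Float64Col(dflt=1.5)   # no explicit position


with tb.open_file("table_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as h5:
    table = h5.create_table(h5.root, "particles", Particle)
    colnames = list(table.colnames)

    row = table.row
    row["idnumber"] = 1
    row["name"] = b"alpha"
    row.append()                       # 'energy' is left unset
    table.flush()

    first = table[0]
    stored_name = first["name"]
    stored_energy = float(first["energy"])

declared_columns = set(Particle.columns)
```

The columns with explicit positions come first, ordered by position, and the column without one follows; the unset energy field takes the column's default value.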
Description helper functions¶
- tables.description.descr_from_dtype(dtype_: dtype, ptparams: dict[str, Any] | None = None) tuple[Description, str] [source]¶
Get a description instance and byteorder from a (nested) NumPy dtype.
- tables.description.dtype_from_descr(descr: dict | Type[IsDescription] | IsDescription, byteorder: str | None = None, ptparams: dict[str, Any] | None = None) dtype [source]¶
Get a (nested) NumPy dtype from a description instance and byteorder.
The descr parameter can be a Description or IsDescription instance, sub-class of IsDescription or a dictionary.
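A short sketch of both helper functions, going from a NumPy dtype to a description and back, and from a name-description dictionary to a dtype. The field names are illustrative.

```python
import numpy as np
import tables as tb
from tables.description import descr_from_dtype, dtype_from_descr

# From a structured NumPy dtype to a description (and byteorder) ...
dt = np.dtype([("x", "i4"), ("y", "f8", (2,))])
descr, byteorder = descr_from_dtype(dt)

# ... and back to a NumPy dtype.
rebuilt = dtype_from_descr(descr)

# A plain dictionary of Col instances is also accepted.
dict_dtype = dtype_from_descr(
    {"a": tb.Int32Col(pos=0), "b": tb.Float64Col(pos=1)}
)
```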
The AttributeSet class¶
- class tables.attributeset.AttributeSet(node: Node)[source]¶
Container for the HDF5 attributes of a Node.
This class provides methods to create new HDF5 node attributes, and to get, rename or delete existing ones.
Like in Group instances (see The Group class), AttributeSet instances make use of the natural naming convention, i.e. you can access the attributes on disk as if they were normal Python attributes of the AttributeSet instance.
This offers the user a very convenient way to access HDF5 node attributes. However, for this reason and in order not to pollute the object namespace, one cannot assign normal attributes to AttributeSet instances, and their members use names which start with special prefixes, as happens with Group objects.
Notes on native and pickled attributes
The values of most basic types are saved as HDF5 native data in the HDF5 file. This includes Python bool, int, float, complex and str (but not long nor unicode) values, as well as their NumPy scalar versions and homogeneous or structured NumPy arrays of them. When read, these values are always loaded as NumPy scalar or array objects, as needed.
For that reason, attributes in native HDF5 files will always be mapped into NumPy objects. Specifically, a multidimensional attribute will be mapped into a multidimensional ndarray and a scalar will be mapped into a NumPy scalar object (for example, a scalar H5T_NATIVE_LLONG will be read and returned as a numpy.int64 scalar).
However, other kinds of values are serialized using pickle, so you will only be able to correctly retrieve them using a Python-aware HDF5 library. Thus, if you want to save Python scalar values and make sure you are able to read them with generic HDF5 tools, you should make use of scalar or homogeneous/structured array NumPy objects (for example, numpy.int64(1) or numpy.array([1, 2, 3], dtype='int16')).
One more piece of advice: because of the various potential difficulties in restoring a Python object stored in an attribute, you may end up getting a pickle string where a Python object is expected. If this is the case, you may wish to run pickle.loads() on that string to get an idea of where things went wrong, as shown in this example:
>>> import os, tempfile
>>> import tables as tb
>>>
>>> class MyClass:
...     foo = 'bar'
...
>>> myObject = MyClass()  # save object of custom class in HDF5 attr
>>> h5fname = tempfile.mktemp(suffix='.h5')
>>> h5f = tb.open_file(h5fname, 'w')
>>> h5f.root._v_attrs.obj = myObject  # store the object
>>> print(h5f.root._v_attrs.obj.foo)  # retrieve it
bar
>>> h5f.close()
>>>
>>> del MyClass, myObject  # delete class of object and reopen file
>>> h5f = tb.open_file(h5fname, 'r')
>>> print(repr(h5f.root._v_attrs.obj))
b'ccopy_reg\n_reconstructor...
>>> import pickle  # let's unpickle that to see what went wrong
>>> pickle.loads(h5f.root._v_attrs.obj)
Traceback (most recent call last):
...
AttributeError: Can't get attribute 'MyClass' ...
>>> # So the problem was not in the stored object,
... # but in the *environment* where it was restored.
... h5f.close()
>>> os.remove(h5fname)
Notes on AttributeSet methods
Note that this class overrides the __getattr__(), __setattr__(), __delattr__() and __dir__() special methods. This allows you to read, assign or delete attributes on disk by just using the following constructs:
leaf.attrs.myattr = 'str attr'    # set a string (native support)
leaf.attrs.myattr2 = 3            # set an integer (native support)
leaf.attrs.myattr3 = [3, (1, 2)]  # a generic object (Pickled)
attrib = leaf.attrs.myattr        # get the attribute ``myattr``
del leaf.attrs.myattr             # delete the attribute ``myattr``
In addition, the dictionary-like __getitem__(), __setitem__() and __delitem__() methods are available, so you may write things like this:
for name in node._v_attrs._f_list():
    print("name: %s, value: %s" % (name, node._v_attrs[name]))
Use whatever idiom you prefer to access the attributes.
Finally, in interactive Python sessions you can get autocompletion of attributes whose names are valid Python identifiers by pressing the [Tab] key, or list them with the dir() global function.
If an attribute is set on a target node that already has a large number of attributes, a PerformanceWarning will be issued.
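The access idioms described in these notes can be combined in one runnable sketch. The group name, attribute names, file name and the in-memory H5FD_CORE driver are assumptions made for this example.

```python
import tables as tb

with tb.open_file("attrs_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as h5:
    group = h5.create_group(h5.root, "run")
    attrs = group._v_attrs

    attrs.units = "K"             # natural naming (native string)
    attrs["run_id"] = 3           # dictionary-style access (native integer)
    attrs.payload = [3, (1, 2)]   # generic object, stored pickled

    units = attrs["units"]
    run_id = attrs.run_id
    payload = attrs.payload
    names_before = sorted(attrs._f_list())

    del attrs.payload             # delete an attribute on disk
    names_after = sorted(attrs._f_list())
```

Natively stored values come back as NumPy scalars, which still compare equal to the Python values that were assigned.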
AttributeSet attributes
- _v_attrnames¶
A list with all attribute names.
- _v_attrnamessys¶
A list with system attribute names.
- _v_attrnamesuser¶
A list with user attribute names.
- _v_unimplemented¶
A list of attribute names with unimplemented native HDF5 types.
AttributeSet properties¶
- AttributeSet._v_node¶
The Node instance this attribute set is associated with.
AttributeSet methods¶
- AttributeSet._f_copy(where: Node) None [source]¶
Copy attributes to the where node.
Copies all user and certain system attributes to the given where node (a Node instance - see The Node class), replacing the existing ones.
- AttributeSet._f_list(attrset: Literal['all', 'sys', 'user'] = 'user') list[str] [source]¶
Get a list of attribute names.
The attrset string selects the attribute set to be used. A ‘user’ value returns only user attributes (this is the default). A ‘sys’ value returns only system attributes. Finally, ‘all’ returns both system and user attributes.
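A minimal sketch of both methods: attributes set on one group are copied to another, and the three attrset selections are listed on the target. The group and attribute names, the file name and the in-memory H5FD_CORE driver are assumptions made for this example.

```python
import tables as tb

with tb.open_file("copy_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as h5:
    source = h5.create_group(h5.root, "source")
    target = h5.create_group(h5.root, "target")

    source._v_attrs.units = "K"
    source._v_attrs.run_id = 3

    # Copy the attributes of ``source`` onto the ``target`` node.
    source._v_attrs._f_copy(target)

    user_names = target._v_attrs._f_list()       # 'user' is the default
    sys_names = target._v_attrs._f_list("sys")
    all_names = target._v_attrs._f_list("all")
    copied_units = target._v_attrs.units
```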