Helper classes¶
This section describes some classes that do not fit in any other section and that serve mainly ancillary purposes.
The Filters class¶
- class tables.Filters(complevel: int = 0, complib: Literal['zlib', 'lzo', 'bzip2', 'blosc', 'blosc2'] = 'zlib', shuffle: bool = True, bitshuffle: bool = False, fletcher32: bool = False, least_significant_digit: int | None = None, _new: bool = True)[source]¶
Container for filter properties.
This class is meant to serve as a container that keeps information about the filter properties associated with the chunked leaves, that is Table, CArray, EArray and VLArray.
Instances of this class can be directly compared for equality.
- Parameters:
complevel (int) – Specifies a compression level for data. The allowed range is 0-9. A value of 0 (the default) disables compression.
complib (str) – Specifies the compression library to be used. Right now, ‘zlib’ (the default), ‘lzo’, ‘bzip2’, ‘blosc’ and ‘blosc2’ are supported. Additional compressors for Blosc like ‘blosc:blosclz’ (‘blosclz’ is the default in case the additional compressor is not specified), ‘blosc:lz4’, ‘blosc:lz4hc’, ‘blosc:zlib’ and ‘blosc:zstd’ are supported too. Also, additional compressors for Blosc2 like ‘blosc2:blosclz’ (‘blosclz’ is the default in case the additional compressor is not specified), ‘blosc2:lz4’, ‘blosc2:lz4hc’, ‘blosc2:zlib’ and ‘blosc2:zstd’ are supported too. Specifying a compression library which is not available in the system issues a FiltersWarning and sets the library to the default one.
shuffle (bool) – Whether to use the Shuffle filter in the HDF5 library. This is normally used to improve the compression ratio. A false value disables shuffling and a true one enables it. The default value depends on whether compression is enabled or not; if compression is enabled, shuffling defaults to be enabled, else shuffling is disabled. Shuffling can only be used when compression is enabled.
bitshuffle (bool) – Whether to use the BitShuffle filter in the Blosc/Blosc2 libraries. This is normally used to improve the compression ratio. A false value disables bitshuffling and a true one enables it. The default value is disabled.
fletcher32 (bool) – Whether to use the Fletcher32 filter in the HDF5 library. This is used to add a checksum on each data chunk. A false value (the default) disables the checksum.
least_significant_digit (int) – If specified, data will be truncated (quantized). In conjunction with enabling compression, this produces ‘lossy’, but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). Default is None, or no quantization.

Note

Quantization is only applied if some form of compression is enabled.
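As a hedged illustration of that rule, here is a plain NumPy sketch (independent of PyTables, which applies the truncation internally when writing) that derives bits and scale for least_significant_digit=1 as described above:

import numpy as np

# Reproduce the quantization rule quoted above for least_significant_digit=1.
least_significant_digit = 1
bits = int(np.ceil(np.log2(10.0 ** least_significant_digit)))   # -> 4
scale = 2.0 ** bits                                             # -> 16.0

data = np.array([3.14159, 2.71828])
print(np.around(scale * data) / scale)   # [3.125 2.6875], both within 0.1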
Examples
This is a small example on using the Filters class:
import numpy as np
import tables as tb

fileh = tb.open_file('test5.h5', mode='w')
atom = tb.Float32Atom()
filters = tb.Filters(complevel=1, complib='blosc', fletcher32=True)
arr = fileh.create_earray(fileh.root, 'earray', atom, (0, 2),
                          "A growable array", filters=filters)

# Append several rows in only one call
arr.append(np.array([[1., 2.],
                     [2., 3.],
                     [3., 4.]], dtype=np.float32))

# Print information on that enlargeable array
print("Result Array:")
print(repr(arr))
fileh.close()
This enforces the use of the Blosc library, a compression level of 1 and a Fletcher32 checksum filter as well. See the output of this example:
Result Array:
/earray (EArray(3, 2), fletcher32, shuffle, blosc(1)) 'A growable array'
  type = float32
  shape = (3, 2)
  itemsize = 4
  nrows = 3
  extdim = 0
  flavor = 'numpy'
  byteorder = 'little'
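The filters actually applied to a leaf can be checked after the fact through the leaf’s filters attribute; a short sketch reusing the file created above:

import tables as tb

# Reopen the file from the example above and inspect the stored filters.
with tb.open_file('test5.h5', mode='r') as fileh:
    flt = fileh.root.earray.filters
    print(flt.complib, flt.complevel, flt.fletcher32)   # blosc 1 True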
Filters attributes¶
- fletcher32¶
Whether the Fletcher32 filter is active or not.
- complevel¶
The compression level (0 disables compression).
- complib¶
The compression filter used (irrelevant when compression is not enabled).
- shuffle¶
Whether the Shuffle filter is active or not.
- bitshuffle¶
Whether the BitShuffle filter is active or not (Blosc/Blosc2 only).
Filters methods¶
- Filters.copy(**override) Filters [source]¶
Get a copy of the filters, possibly overriding some arguments.
Constructor arguments to be overridden must be passed as keyword arguments.
Using this method is recommended over replacing the attributes of an instance, since instances of this class may become immutable in the future:
>>> filters1 = Filters()
>>> filters2 = filters1.copy()
>>> filters1 == filters2
True
>>> filters1 is filters2
False
>>> filters3 = filters1.copy(complevel=1)
Traceback (most recent call last):
...
ValueError: compression library ``None`` is not supported...
>>> filters3 = filters1.copy(complevel=1, complib='zlib')
>>> print(filters1)
Filters(complevel=0, shuffle=False, bitshuffle=False, fletcher32=False, least_significant_digit=None)
>>> print(filters3)
Filters(complevel=1, complib='zlib', shuffle=False, bitshuffle=False, fletcher32=False, least_significant_digit=None)
>>> filters1.copy(foobar=42)
Traceback (most recent call last):
...
TypeError: ...__init__() got an unexpected keyword argument 'foobar'
The Index class¶
- class tables.index.Index(parentnode: Group, name: str, atom: Atom | None = None, title: str = '', kind: Literal['ultralight', 'light', 'medium', 'full'] | None = None, optlevel: int | None = None, filters: Filters | None = None, tmp_dir: str | None = None, expectedrows: int = 0, byteorder: str | None = None, blocksizes: tuple[int, int, int, int] | None = None, new: bool = True)[source]¶
Represents the index of a column in a table.
This class is used to keep the indexing information for columns in a Table dataset (see The Table class). It is actually a descendant of the Group class (see The Group class), with some added functionality. An Index is always associated with one and only one column in the table.
Note
This class is mainly intended for internal use, but some of its documented attributes and methods may be interesting for the programmer.
- Parameters:
parentnode (Group) – The parent Group object.

Changed in version 3.0: Renamed from parentNode to parentnode.
name (str) – The name of this node in its parent group.
atom (Atom) – An Atom object representing the shape and type of the atomic objects to be saved. Only scalar atoms are supported.
title – Sets a TITLE attribute of the Index entity.
kind – The desired kind for this index. The ‘full’ kind specifies a complete track of the row position (64-bit), while the ‘medium’, ‘light’ or ‘ultralight’ kinds only specify in which chunk the row is (using 32-bit, 16-bit and 8-bit respectively).
optlevel – The desired optimization level for this index.
filters (Filters) – An instance of the Filters class that provides information about the desired I/O filters to be applied during the life of this object.
tmp_dir – The directory for the temporary files.
expectedrows – A user estimate of the number of row slices that will be added to the growable dimension in the IndexArray object.
byteorder – The byteorder of the index datasets on-disk.
blocksizes – The four main sizes of the compound blocks in index datasets (a low level parameter).
new – Whether this Index is new or has to be read from disk.
Index instance variables¶
- Index.column¶
The Column (see The Column class) instance for the indexed column.
- Index.dirty¶
Whether the index is dirty or not. Dirty indexes are out of sync with column data, so they exist but they are not usable.
- Index.filters¶
Filter properties for this index - see Filters in The Filters class.
- Index.is_csi¶
Whether the index is completely sorted or not.
Changed in version 3.0: The is_CSI property has been renamed into is_csi.
- Index.nelements¶
The number of currently indexed rows for this column.
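Although Index objects are documented here, they are normally created and retrieved through the indexed column rather than instantiated directly; a small hedged sketch (the file, table and column names are illustrative):

import tables as tb

class Particle(tb.IsDescription):
    name = tb.StringCol(16)
    energy = tb.Float64Col()

with tb.open_file('indexed.h5', mode='w') as fileh:
    table = fileh.create_table('/', 'particles', Particle)
    row = table.row
    for i in range(100):
        row['name'] = 'p%d' % i
        row['energy'] = float(i)
        row.append()
    table.flush()
    # Indexes are created through the column, not by instantiating Index.
    table.cols.energy.create_index(kind='medium', optlevel=6)
    idx = table.cols.energy.index          # the Index instance
    print(idx.dirty, idx.is_csi, idx.nelements)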
Index methods¶
Index special methods¶
- Index.__getitem__(key: int | slice) int | ndarray [source]¶
Return the index values in the specified range.

If the key argument is an integer, the corresponding index value is returned. If key is a slice, the range of indices determined by it is returned. A negative step in the slice is supported, meaning that the results will be returned in reverse order.

This method is equivalent to Index.read_indices().
The IndexArray class¶
- class tables.indexes.IndexArray(parentnode: Group, name: str, atom: Atom | None = None, title: str = '', filters: Filters | None = None, byteorder: str | None = None)[source]¶
Represent the index (sorted or reverse index) dataset in HDF5 file.
All NumPy typecodes are supported except for complex datatypes.
- Parameters:
parentnode – The Index object from which this object hangs.
Changed in version 3.0: Renamed from parentNode to parentnode.
name (str) – The name of this node in its parent group.
atom – An Atom object representing the shape and type of the atomic objects to be saved. Only scalar atoms are supported.
title – Sets a TITLE attribute on the array entity.
filters (Filters) – An instance of the Filters class that provides information about the desired I/O filters to be applied during the life of this object.
byteorder – The byteorder of the data on disk.
- property chunksize: int¶
The chunksize for this object.
- property slicesize: int¶
The slicesize for this object.
The Enum class¶
- class tables.misc.enum.Enum(enum: list[str] | tuple[str, ...] | dict[str, Any] | Enum)[source]¶
Enumerated type.
Each instance of this class represents an enumerated type. The values of the type must be declared exhaustively and named with strings, and they might be given explicit concrete values, though this is not compulsory. Once the type is defined, it can not be modified.
There are three ways of defining an enumerated type. Each one of them corresponds to the type of the only argument in the constructor of Enum:
Sequence of names: each enumerated value is named using a string, and its order is determined by its position in the sequence; the concrete value is assigned automatically:
>>> boolEnum = Enum(['True', 'False'])
Mapping of names: each enumerated value is named by a string and given an explicit concrete value. All of the concrete values must be different, or a ValueError will be raised:
>>> priority = Enum({'red': 20, 'orange': 10, 'green': 0})
>>> colors = Enum({'red': 1, 'blue': 1})
Traceback (most recent call last):
...
ValueError: enumerated values contain duplicate concrete values: 1
Enumerated type: in that case, a copy of the original enumerated type is created. Both enumerated types are considered equal:
>>> prio2 = Enum(priority)
>>> priority == prio2
True
Please note that names starting with _ are not allowed, since they are reserved for internal usage:
>>> prio2 = Enum(['_xx'])
Traceback (most recent call last):
...
ValueError: name of enumerated value can not start with ``_``: '_xx'
The concrete value of an enumerated value is obtained by getting its name as an attribute of the Enum instance (see __getattr__()) or as an item (see __getitem__()). This allows comparisons between enumerated values and assigning them to ordinary Python variables:
>>> redv = priority.red
>>> redv == priority['red']
True
>>> redv > priority.green
True
>>> priority.red == priority.orange
False
The name of the enumerated value corresponding to a concrete value can also be obtained by using the __call__() method of the enumerated type. In this way you get the symbolic name to use it later with __getitem__():
>>> priority(redv)
'red'
>>> priority.red == priority[priority(priority.red)]
True
(If you ask, the __getitem__() method is not used for this purpose to avoid ambiguity in the case of using strings as concrete values.)
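Enumerated types are typically consumed by enumerated table columns; a brief hedged sketch using EnumCol and Table.get_enum() (the file and column names are illustrative):

import tables as tb

colors = tb.Enum(['red', 'green', 'blue'])

class Ball(tb.IsDescription):
    color = tb.EnumCol(colors, 'red', base='uint8')

with tb.open_file('enum_demo.h5', mode='w') as fileh:
    table = fileh.create_table('/', 'balls', Ball)
    row = table.row
    row['color'] = colors.green    # store the concrete value
    row.append()
    table.flush()
    enum = table.get_enum('color')         # recover the Enum from the column
    print(enum(table[0]['color']))         # -> 'green'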
Enum special methods¶
- Enum.__call__(value: Any, *default: Any) Any [source]¶
Get the name of the enumerated value with that concrete value.
If there is no value with that concrete value in the enumeration and a second argument is given as a default, this is returned. Else, a ValueError is raised.
This method can be used for checking that a concrete value belongs to the set of concrete values in an enumerated type.
Examples
Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> enum(5)
'T2'
>>> enum(42, None) is None
True
>>> enum(42)
Traceback (most recent call last):
...
ValueError: no enumerated value with that concrete value: 42
- Enum.__contains__(name: str) bool [source]¶
Is there an enumerated value with that name in the type?
If the enumerated type has an enumerated value with that name, True is returned. Otherwise, False is returned. The name must be a string.
This method does not check for concrete values matching a value in an enumerated type. For that, please use the Enum.__call__() method.

Examples

Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> 'T1' in enum
True
>>> 'foo' in enum
False
>>> 0 in enum
Traceback (most recent call last):
...
TypeError: name of enumerated value is not a string: 0
>>> enum.T1 in enum  # Be careful with this!
Traceback (most recent call last):
...
TypeError: name of enumerated value is not a string: 2
- Enum.__eq__(other: Enum) bool [source]¶
Is the other enumerated type equivalent to this one?
Two enumerated types are equivalent if they have exactly the same enumerated values (i.e. with the same names and concrete values).
Examples
Let enum* be enumerated types defined as:

>>> enum1 = Enum({'T0': 0, 'T1': 2})
>>> enum2 = Enum(enum1)
>>> enum3 = Enum({'T1': 2, 'T0': 0})
>>> enum4 = Enum({'T0': 0, 'T1': 2, 'T2': 5})
>>> enum5 = Enum({'T0': 0})
>>> enum6 = Enum({'T0': 10, 'T1': 20})
then:
>>> enum1 == enum1
True
>>> enum1 == enum2 == enum3
True
>>> enum1 == enum4
False
>>> enum5 == enum1
False
>>> enum1 == enum6
False
Comparing enumerated types with other kinds of objects produces a false result:
>>> enum1 == {'T0': 0, 'T1': 2}
False
>>> enum1 == ['T0', 'T1']
False
>>> enum1 == 2
False
- Enum.__getattr__(name: str) Any [source]¶
Get the concrete value of the enumerated value with that name.
The name of the enumerated value must be a string. If there is no value with that name in the enumeration, an AttributeError is raised.
Examples
Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> enum.T1
2
>>> enum.foo
Traceback (most recent call last):
...
AttributeError: no enumerated value with that name: 'foo'
- Enum.__getitem__(name: str) Any [source]¶
Get the concrete value of the enumerated value with that name.
The name of the enumerated value must be a string. If there is no value with that name in the enumeration, a KeyError is raised.
Examples
Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> enum['T1']
2
>>> enum['foo']
Traceback (most recent call last):
...
KeyError: "no enumerated value with that name: 'foo'"
- Enum.__iter__() Generator[Any, None, None] [source]¶
Iterate over the enumerated values.
Enumerated values are returned as (name, value) pairs in no particular order.
Examples
>>> enumvals = {'red': 4, 'green': 2, 'blue': 1}
>>> enum = Enum(enumvals)
>>> enumdict = dict([(name, value) for (name, value) in enum])
>>> enumvals == enumdict
True
The UnImplemented class¶
- class tables.UnImplemented(parentnode: Group, name: str)[source]¶
This class represents datasets not supported by PyTables in an HDF5 file.
When reading a generic HDF5 file (i.e. one that has not been created with PyTables, but with some other HDF5 library based tool), chances are that the specific combination of datatypes or dataspaces in some dataset might not be supported by PyTables yet. In such a case, this dataset will be mapped into an UnImplemented instance and the user will still be able to access the complete object tree of the generic HDF5 file. The user will also be able to read and write the attributes of the dataset, access some of its metadata, and perform certain hierarchy manipulation operations like deleting or moving (but not copying) the node. Of course, the user will not be able to read the actual data stored in it.

This is an elegant way to allow users to work with generic HDF5 files despite the fact that some of their datasets are not supported by PyTables. However, if you are really interested in having full access to an unimplemented dataset, please get in contact with the developer team.
This class does not have any public instance variables or methods, except those inherited from the Leaf class (see The Leaf class).
- byteorder: str | None¶
The endianness of data in memory (‘big’, ‘little’ or ‘irrelevant’).
- nrows¶
The length of the first dimension of the data.
- shape¶
The shape of the stored data.
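Unsupported datasets in a foreign file can be spotted by walking the object tree; a short hedged sketch (‘foreign.h5’ is a placeholder for a file produced by some other HDF5 tool):

import tables as tb

with tb.open_file('foreign.h5', mode='r') as fileh:
    for node in fileh.walk_nodes('/'):
        if isinstance(node, tb.UnImplemented):
            print('unsupported dataset:', node._v_pathname)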
The Unknown class¶
The ChunkInfo class¶
- class tables.ChunkInfo(start: tuple[int, ...] | None, filter_mask: int | None, offset: int | None, size: int | None)[source]¶
Information about storage for a given chunk.
It may also refer to a chunk which is within the dataset’s shape but that does not exist in storage, i.e. a missing chunk.
An instance of this named tuple class contains the following information, in field order:
- start¶
The coordinates in dataset items where the chunk starts, a tuple of integers with the same rank as the dataset. These coordinates are always aligned with chunk boundaries. Also present for missing chunks.
- filter_mask¶
An integer where each active bit signals that the filter in its position in the pipeline was disabled when storing the chunk. For instance, 0b10 disables shuffling, 0b100 disables szip, and so on. None for missing chunks.
- offset¶
An integer which indicates the offset in bytes of chunk data as it exists in storage. None for missing chunks.
- size¶
An integer which indicates the size in bytes of chunk data as it exists in storage. None for missing chunks.
Exceptions module¶
The exceptions module declares the exceptions and warnings that are specific to PyTables.
- exception tables.HDF5ExtError(*args, **kargs)[source]¶
A low level HDF5 operation failed.
This exception is raised by the low level PyTables components used for accessing HDF5 files. It usually signals that something is not going well in the HDF5 library or even at the Input/Output level.
Errors in the HDF5 C library may be accompanied by an extensive HDF5 back trace on standard error (see also tables.silence_hdf5_messages()).

Changed in version 2.4.
- Parameters:
message – error message
h5bt –
This parameter (keyword only) controls the HDF5 back trace handling. Any keyword arguments other than h5bt are ignored.

- if set to False the HDF5 back trace is ignored and the HDF5ExtError.h5backtrace attribute is set to None
- if set to True the back trace is retrieved from the HDF5 library and stored in the HDF5ExtError.h5backtrace attribute as a list of tuples
- if set to “VERBOSE” (default) the HDF5 back trace is stored in the HDF5ExtError.h5backtrace attribute and also included in the string representation of the exception
- if not set (or set to None) the default policy is used (see HDF5ExtError.DEFAULT_H5_BACKTRACE_POLICY)
- format_h5_backtrace(backtrace: list[tuple[str, int, str, str]] | None = None) str [source]¶
Convert the HDF5 back trace, represented as a list of tuples (see HDF5ExtError.h5backtrace), into a string.

Added in version 2.4.
- DEFAULT_H5_BACKTRACE_POLICY = 'VERBOSE'¶
Default policy for HDF5 backtrace handling
- if set to False the HDF5 back trace is ignored and the HDF5ExtError.h5backtrace attribute is set to None
- if set to True the back trace is retrieved from the HDF5 library and stored in the HDF5ExtError.h5backtrace attribute as a list of tuples
- if set to “VERBOSE” (default) the HDF5 back trace is stored in the HDF5ExtError.h5backtrace attribute and also included in the string representation of the exception

This parameter can be set using the PT_DEFAULT_H5_BACKTRACE_POLICY environment variable. Allowed values are “IGNORE” (or “FALSE”), “SAVE” (or “TRUE”) and “VERBOSE” to set the policy to False, True and “VERBOSE” respectively. The special value “DEFAULT” can be used to reset the policy to the default value.

Added in version 2.4.
- h5backtrace¶
HDF5 back trace.
Contains the HDF5 back trace as a (possibly empty) list of tuples. Each tuple has the following format:
(filename, line number, function name, text)
Depending on the value of the h5bt parameter passed to the initializer the h5backtrace attribute can be set to None. This means that the HDF5 back trace has been simply ignored (not retrieved from the HDF5 C library error stack) or that there has been an error (silently ignored) during the HDF5 back trace retrieval.
Added in version 2.4.
See also

traceback.format_list()
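A hedged sketch of inspecting the back trace after catching the exception (the damaged file name is hypothetical):

import tables as tb

try:
    fileh = tb.open_file('damaged.h5', mode='r')   # a hypothetical corrupt file
except tb.HDF5ExtError as exc:
    # h5backtrace may be None, depending on the active back trace policy.
    if exc.h5backtrace is not None:
        print(exc.format_h5_backtrace())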
- exception tables.ClosedNodeError[source]¶
The operation can not be completed because the node is closed.
For instance, listing the children of a closed group is not allowed.
- exception tables.ClosedFileError[source]¶
The operation can not be completed because the hosting file is closed.
For instance, getting an existing node from a closed file is not allowed.
- exception tables.FileModeError[source]¶
The operation can not be carried out because the mode in which the hosting file is opened is not adequate.
For instance, removing an existing leaf from a read-only file is not allowed.
- exception tables.NodeError[source]¶
Invalid hierarchy manipulation operation requested.
This exception is raised when the user requests an operation on the hierarchy which can not be run because of the current layout of the tree. This includes accessing nonexistent nodes, moving or copying or creating over an existing node, non-recursively removing groups with children, and other similarly invalid operations.
A node in a PyTables database cannot be simply overwritten by replacing it. Instead, the old node must be removed explicitly before another one can take its place. This is done to protect interactive users from inadvertently deleting whole trees of data by a single erroneous command.
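For instance, replacing an existing array requires an explicit removal first; a minimal sketch (the file and node names are illustrative):

import tables as tb

with tb.open_file('replace_demo.h5', mode='w') as fileh:
    fileh.create_array('/', 'x', [1, 2, 3])
    try:
        fileh.create_array('/', 'x', [4, 5, 6])    # path already taken
    except tb.NodeError:
        fileh.remove_node('/', 'x')                # explicit removal first
        fileh.create_array('/', 'x', [4, 5, 6])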
- exception tables.NoSuchNodeError[source]¶
An operation was requested on a node that does not exist.
This exception is raised when an operation gets a path name or a (where, name) pair leading to a nonexistent node.
- exception tables.UndoRedoError[source]¶
Problems with doing/redoing actions with Undo/Redo feature.
This exception indicates a problem related to the Undo/Redo mechanism, such as trying to undo or redo actions with this mechanism disabled, or going to a nonexistent mark.
- exception tables.UndoRedoWarning[source]¶
Issued when an action not supporting Undo/Redo is run.
This warning is only shown when the Undo/Redo mechanism is enabled.
- exception tables.NaturalNameWarning[source]¶
Issued when a non-pythonic name is given for a node.
This is not an error and may even be very useful in certain contexts, but one should be aware that such nodes cannot be accessed using natural naming (instead, getattr() must be used explicitly).
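For example, a node whose name starts with a digit can only be reached with getattr() or File.get_node(); a short sketch (the file and node names are illustrative):

import tables as tb

with tb.open_file('names_demo.h5', mode='w') as fileh:
    # Issues a NaturalNameWarning: '2col' is not a valid Python identifier.
    fileh.create_array('/', '2col', [1, 2, 3])
    # fileh.root.2col would be a SyntaxError, so use getattr() or get_node().
    arr = getattr(fileh.root, '2col')
    print(arr.read())    # [1, 2, 3]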
- exception tables.PerformanceWarning[source]¶
Warning for operations which may cause a performance drop.
This warning is issued when an operation is made on the database which may cause it to slow down on future operations (i.e. making the node tree grow too much).
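Like any Python warning, it can be filtered with the standard warnings machinery; a short sketch (the bulk-load section here is only a stand-in for an operation known to trigger the warning):

import warnings
import tables as tb

# Temporarily silence PerformanceWarning around a noisy section,
# e.g. a bulk load that creates many sibling nodes.
with warnings.catch_warnings():
    warnings.simplefilter('ignore', tb.PerformanceWarning)
    with tb.open_file('bulk.h5', mode='w') as fileh:
        for i in range(10):
            fileh.create_array('/', 'a%d' % i, [i])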
- exception tables.FlavorError[source]¶
Unsupported or unavailable flavor or flavor conversion.
This exception is raised when an unsupported or unavailable flavor is given to a dataset, or when a conversion of data between two given flavors is neither supported nor available.
- exception tables.FlavorWarning[source]¶
Unsupported or unavailable flavor conversion.
This warning is issued when a conversion of data between two given flavors is neither supported nor available, and raising an error would render the data inaccessible (e.g. on a dataset of an unavailable flavor in a read-only file).
See the FlavorError class for more information.
- exception tables.FiltersWarning[source]¶
Unavailable filters.
This warning is issued when a valid filter is specified but it is not available in the system. It may mean that an available default filter is to be used instead.
- exception tables.OldIndexWarning[source]¶
Unsupported index format.
This warning is issued when an index in an unsupported format is found. The index will be marked as invalid and will behave as if it doesn’t exist.
- exception tables.DataTypeWarning[source]¶
Unsupported data type.
This warning is issued when an unsupported HDF5 data type is found (normally in a file created with a tool other than PyTables).
- exception tables.ExperimentalFeatureWarning[source]¶
Generic warning for experimental features.
This warning is issued when using a functionality that is still experimental and that users have to use with care.
- exception tables.ChunkError[source]¶
An operation related to direct chunk access failed.
This exception may be related to the properties of the dataset or the chunk being accessed, or to how the chunk is being accessed. It is a base for more specific exceptions.
- exception tables.NotChunkedError[source]¶
A direct chunking operation was attempted on a non-chunked dataset.
For instance, chunk information was requested for a plain Array instance.