Release notes for PyTables 3.9 series

Author:

PyTables Developers

Contact:

pytables-dev@googlegroups.com

Changes from 3.9.1 to 3.9.2

Bugfixes

  • Fix the assembly of returned slice data in Blosc2 NDim optimized slice reads by using Blosc2’s b2nd_copy_buffer (gh-1078). The bug only showed up when the chunk did not fully cover the innermost dimension. Add unit tests (including foreign-generated files) to check for regressions, and enable and fix Blosc2 NDim tests that were not being run. Thanks to Ivan Vilata.

Improvements

  • PyTables wheels now use a threadsafe build of the HDF5 library (gh-1075 and gh-1077). Concurrent reads should be possible with no need for additional locking or monkey-patching of the file open function. Thanks to Kiet Pham.

  • Partial support for the future NumPy 2, with some tests still failing (gh-1068). Thanks to Thomas Grainger.

  • Relax the reading of Blosc2 NDim to cope with datasets stored with other tools (gh-1072), e.g. missing chunk rank/shape in filter values, having HDF5 chunks where the Blosc2 super-chunk contains more than one inner chunk, or chunks with data not padded to the full chunk size (example script and tests included). Also enhance checks, comments and logged messages. Thanks to Ivan Vilata.
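
As an illustration of the concurrent reads mentioned in the thread-safety item above, the following is a minimal sketch rather than an official recipe. It assumes PyTables was installed from a wheel built against a threadsafe HDF5 library (3.9.2 or later); the file name, node name and helper function are made up for the example. Each worker thread opens its own read-only handle and reads a different slice, with no extra locking.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import tables

# Build a small file to read back (path and node name are illustrative).
path = os.path.join(tempfile.mkdtemp(), "concurrent_reads.h5")
data = np.arange(100_000, dtype=np.int64)
with tables.open_file(path, mode="w") as h5f:
    h5f.create_array("/", "values", obj=data)


def read_block(bounds):
    """Open a private read-only handle and sum one slice of the array."""
    start, stop = bounds
    # Assumption: a threadsafe HDF5 build, so no lock is taken here.
    with tables.open_file(path, mode="r") as h5f:
        return int(h5f.root.values[start:stop].sum())


# Four non-overlapping slices, each read from a different thread.
all_bounds = [(i, i + 25_000) for i in range(0, 100_000, 25_000)]
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(read_block, all_bounds))

total = sum(partial_sums)
```

With a non-threadsafe HDF5 build (for instance some self-compiled or distribution-provided libraries), the older advice of guarding file access with a lock may still apply.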

Other changes

  • Drop compatibility with the obsolete HDF5 1.8 API. PyTables now requires at least the 1.10 API (gh-1080). Thanks to Antonio Valentino.

  • Require python-blosc2 >= 2.3.0 or c-blosc2 >= 2.11.0 (which adds support for the b2nd_copy_buffer function).

  • Use the main Conda Forge channel for Python 3.12 (gh-1066). Thanks to Thomas Grainger.

  • Assorted fixes to the b2nd slicing benchmark. Thanks to Ivan Vilata.

  • Assorted fixes to b2nd slicing optimization tips (gh-1069). Thanks to Ivan Vilata.

Thanks

In alphabetical order:

  • Antonio Valentino

  • Ivan Vilata

  • Kiet Pham

  • Thomas Grainger

Changes from 3.9.0 to 3.9.1

  • Minimum supported version for Python is 3.9 (see gh-1062).

Changes from 3.8.0 to 3.9.0

New features

  • Apply optimized slice read to Blosc2-compressed CArray and EArray, with Blosc2 NDim 2-level partitioning for multidimensional arrays (gh-1056). See “Multidimensional slicing and chunk/block sizes” in the User’s Guide. Thanks to Marta Iborra and Ivan Vilata. This development was funded by a NumFOCUS grant.

  • Add basic API for column-level attributes as Col._v_col_attrs (gh-893 and gh-821). Thanks to Jonathan Wheeler, Thorben Menne, Ezequiel Cimadevilla Alvarez, odidev, Sander Roet, Antonio Valentino, Munehiro Nishida, Zbigniew Jędrzejewski-Szmek, Laurent Repiton, xmatthias, Logan Kilpatrick.

Other changes

  • Add support for the forthcoming Python 3.12 with binary wheels and automated testing.

  • Drop wheels and automated testing for Python 3.8; users or distributions may still build and test with Python 3.8 on their own (see commit ae1e60e and commit 47f5946).

  • New benchmark for ERA5 climate data. Thanks to Óscar Guiñón.

  • New “100 trillion baby” benchmark. Thanks to Francesc Alted.

  • New benchmark for querying meteorological data. Thanks to Francesc Alted.

Improvements

  • Use H5Dchunk_iter (when available) to speed up walking over many chunks in a very large table, as well as with random reads (gh-991, gh-997, gh-999). Thanks to Francesc Alted and Mark Kittisopikul.

  • Improve setup.py (now using pyproject.toml as per PEP 518) and the blosc2 discovery mechanism. Blosc2 may be used either via python-blosc2 or via a system c-blosc2 (gh-987, gh-1000, gh-998, gh-1017, gh-1045). Thanks to Antonio Valentino, Ben Greiner, Iwo-KX, nega.

  • Enable compatibility with Cython 3 (gh-1008 and gh-1003). Thanks to Matus Valo and Michał Górny.

  • Set GitHub workflow permissions to least privileges (gh-1007). Thanks to Joyce Brum.

  • Add SECURITY.md with security policy (gh-1012 and gh-1011). Thanks to Joyce Brum.

  • Handle py-cpuinfo being unavailable on some platforms (gh-1013). Thanks to Sam James.

  • Avoid NumPy >= 1.25 deprecations by using numpy.all, numpy.any, etc. instead of deprecated aliases such as numpy.alltrue and numpy.sometrue. Thanks to Antonio Valentino.

  • Avoid C-related build warnings. Thanks to Antonio Valentino.

  • Streamline CI wheel building and testing with cibuildwheel, with a clearer distinction between build and runtime dependencies.

  • Update included c-blosc to v1.21.5 (fixes SSE2/AVX build issue).

  • Require python-blosc2 >= 2.2.8 or c-blosc2 >= 2.10.4 (Python 3.12 support and assorted fixes).

  • Update external libraries for CI-based wheel builds (gh-1018 and gh-967):

    • hdf5 v1.14.2

    • lz4 v1.9.4

    • zlib v1.2.13
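
For code that follows the same NumPy >= 1.25 deprecation cleanup as the item above, a small sketch of the replacement calls is shown here. The array is made up for the example, and the actual call sites changed inside PyTables are not listed in these notes.

```python
import numpy as np

# Illustrative boolean mask.
mask = np.array([[True, False], [True, True]])

# numpy.all replaces the deprecated numpy.alltrue alias.
every_value_set = np.all(mask)

# numpy.any replaces the deprecated numpy.sometrue alias.
any_value_set = np.any(mask)

# Both functions accept an axis argument for per-row or per-column reductions.
rows_fully_set = np.all(mask, axis=1)
```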

Bugfixes

  • Fix crash in Blosc2 optimized path with large tables (gh-995 and gh-996). Thanks to Francesc Alted.

  • Fix compatibility with NumExpr v2.8.5 (gh-1046). Thanks to Antonio Valentino.

  • Fix build errors on Windows ARM64 (gh-989). Thanks to Christoph Gohlke.

  • Fix ptrepack failures with external links (gh-938 and gh-990). Thanks to Adrian Altenhoff.

  • Replace stderr messages with Python warnings (gh-992 and gh-993). Thanks to Maximilian Linhoff.

  • Fixes to CI workflow and wheel building (gh-1009, gh-1047). Thanks to Antonio Valentino.

  • Fix garbled rendering of File.get_node docstring (gh-1021). Thanks to Steffen Rehberg.

  • Fix unclosed extern “C” block (gh-1026). Thanks to Ivan Vilata.

  • Fix Cython slice indexing under Python 3.12 (gh-1033). Thanks to Zbigniew Jędrzejewski-Szmek.

  • Fix unsafe temporary file creation in benchmark (gh-1053). Thanks to Al Arafat Tanin (Project Alpha-Omega).

Thanks

In alphabetical order:

  • Adrian Altenhoff

  • Al Arafat Tanin

  • Antonio Valentino

  • Ben Greiner

  • Christoph Gohlke

  • Ezequiel Cimadevilla Alvarez

  • Francesc Alted

  • Ivan Vilata

  • Iwo-KX

  • Jonathan Wheeler

  • Joyce Brum

  • Laurent Repiton

  • Logan Kilpatrick

  • Mark Kittisopikul

  • Marta Iborra

  • Matus Valo

  • Maximilian Linhoff

  • Michał Górny

  • Munehiro Nishida

  • nega

  • odidev

  • Óscar Guiñón

  • Sam James

  • Sander Roet

  • Seth Troisi

  • Steffen Rehberg

  • Thorben Menne

  • xmatthias

  • Zbigniew Jędrzejewski-Szmek