Contents
User's Manual
You can access the online documentation (including tutorials) in HTML or PDF.
Frequently Asked Questions
Questions about PyTables? Be sure to read the FAQ section before asking on the list.
Mailing Lists
If you have a problem or question that the FAQ or manual can't answer, you may want to ask on the users' mailing list to see if other users can help you. But first, check the list archives to see whether your question has already been discussed.
You might be interested in subscribing only to the PyTables' announcement list, a very low-traffic list for announcing new releases of PyTables and related software.
Videos
These videos belong to a series that introduces the main features of PyTables in a visual, easy-to-grasp manner. More videos will be made available over time:
PyTables, part I: Introduction: HDF5 file creation, the object tree, homogeneous array storage, natural naming, working with attributes.
PyTables, part II: Working with tables: Creation of tables with multidimensional and nested columns, and how to efficiently query them.
Hints for SQL users
If you are a seasoned user of SQL or relational databases, you may be interested in HintsForSQLUsers, a gentle introduction and cookbook to PyTables based on the concepts of SQL and RDBMS.
Presentations
Here are the slides of some presentations about PyTables that you may find useful:
An on-disk binary data container, query engine and computational kernel. Tutorial given at the PyData Conference 2012, New York, NY, USA (October 2012).
An on-disk binary data container. Talk given at the Austin Python Meetup, Austin, TX, USA (May 2012).
Large Data Analysis with Python. Seminar given at the German Neuroinformatics Node, Munich, Germany (November 2010).
Highly Efficient Computations In Python: Well Beyond NumPy. Tutorial given at EuroSciPy 2010 conference in Paris, France (July 2010).
Starving CPUs (and coping with that in PyTables). Seminar given at FOM Institute for Plasma Physics Rijnhuizen, The Netherlands (September 2009).
On The Data Access Issue (or Why Modern CPUs Are Starving). Keynote presented at EuroSciPy 2009 conference in Leipzig, Germany (July 2009).
An Overview of Future Improvements to OPSI. Informal talk given at the THG headquarters in Urbana-Champaign, Illinois, USA (October 2007).
Finding Needles in a Huge DataStack. Talk given at the EuroPython 2006 Conference, held at CERN, Genève, Switzerland (July 2006).
Presentation given at the HDF Workshop 2005, held at San Francisco, USA (December 2005).
I and II Workshop in Free Software and Scientific Computing given at the Universitat Jaume I, Castelló, Spain (October 2004). In Catalan.
Presentation given at the SciPy Workshop 2004, held at Caltech, Pasadena, USA (September 2004).
Slides of presentation given at EuroPython Conference in Charleroi, Belgium (June 2003).
Presentation for the iParty5 held at Castelló, Spain (May 2003). In Spanish.
Talk on PyTables given at the PyCon 2003 Convention held at Washington, USA (March 2003).
Reports
White Paper on OPSI indexes, explaining the powerful new indexing engine in PyTables Pro.
Performance study on how the new object tree cache introduced in PyTables 1.2 can accelerate the opening of files with a large number of objects, while using considerably less memory.
Paper version of the presentation at PyCon2003.
User Contributed Documents
Also, you may want to check the UserDocuments page, where PyTables users explain how they have dealt with various problems. If you want to add your own documents, you are more than welcome!
Usage Examples
Besides the tutorials in the User Manual (above), you can see several simple examples here.
Getting Started
Here is some simple code to create a table in a group.
from tables import *

# Define a user record to characterize some kind of particles
class Particle(IsDescription):
    name = StringCol(16)        # 16-character String
    idnumber = Int64Col()       # Signed 64-bit integer
    ADCcount = UInt16Col()      # Unsigned short integer
    TDCcount = UInt8Col()       # unsigned byte
    grid_i = Int32Col()         # integer
    grid_j = Int32Col()         # integer
    pressure = Float32Col()     # float (single-precision)
    energy = FloatCol()         # double (double-precision)

filename = "test.h5"
# Open a file in "w"rite mode
h5file = openFile(filename, mode = "w", title = "Test file")
# Create a new group under "/" (root)
group = h5file.createGroup("/", 'detector', 'Detector information')
# Create one table on it
table = h5file.createTable(group, 'readout', Particle, "Readout example")
# Fill the table with 10 particles
particle = table.row
for i in xrange(10):
    particle['name'] = 'Particle: %6d' % (i)
    particle['TDCcount'] = i % 256
    particle['ADCcount'] = (i * 256) % (1 << 16)
    particle['grid_i'] = i
    particle['grid_j'] = 10 - i
    particle['pressure'] = float(i*i)
    particle['energy'] = float(particle['pressure'] ** 4)
    particle['idnumber'] = i * (2 ** 34)
    # Insert a new particle record
    particle.append()
# Close (and flush) the file
h5file.close()
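The listing above uses the camelCase names of the PyTables 2.x API and Python 2 constructs such as xrange. As a rough sketch only, assuming PyTables 3.x on Python 3 (where the same calls are spelled open_file, create_group and create_table), the file could be written and read back as follows. The helper names write_particles and read_energies and the temporary directory are illustrative and not part of the original example.

```python
import os
import tempfile

import tables


class Particle(tables.IsDescription):
    name = tables.StringCol(16)      # 16-character string
    idnumber = tables.Int64Col()     # signed 64-bit integer
    ADCcount = tables.UInt16Col()    # unsigned short integer
    TDCcount = tables.UInt8Col()     # unsigned byte
    grid_i = tables.Int32Col()       # integer
    grid_j = tables.Int32Col()       # integer
    pressure = tables.Float32Col()   # single-precision float
    energy = tables.Float64Col()     # double-precision float


def write_particles(filename, nrows=10):
    """Create /detector/readout and fill it with nrows particles (hypothetical helper)."""
    with tables.open_file(filename, mode="w", title="Test file") as h5file:
        group = h5file.create_group("/", "detector", "Detector information")
        table = h5file.create_table(group, "readout", Particle, "Readout example")
        particle = table.row
        for i in range(nrows):
            pressure = float(i * i)
            particle["name"] = "Particle: %6d" % i
            particle["TDCcount"] = i % 256
            particle["ADCcount"] = (i * 256) % (1 << 16)
            particle["grid_i"] = i
            particle["grid_j"] = 10 - i
            particle["pressure"] = pressure
            # Computed in double precision here, before storing
            particle["energy"] = pressure ** 4
            particle["idnumber"] = i * (2 ** 34)
            particle.append()
        # Push buffered rows to disk before the file is closed
        table.flush()


def read_energies(filename):
    """Return the energy column as a plain Python list (hypothetical helper)."""
    with tables.open_file(filename, mode="r") as h5file:
        return h5file.root.detector.readout.cols.energy[:].tolist()


path = os.path.join(tempfile.mkdtemp(), "test.h5")
write_particles(path)
energies = read_energies(path)
print(energies[:3])
```

Opening the file in a with block closes it even if filling the table raises an exception, which the explicit close() call in the original listing does not guarantee.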
Browsing the object tree
You can browse the contents of the file that we have created above. For this, we will use the convenient IPython shell.
In [1]:import tables
In [2]:f=tables.openFile("test.h5")
In [3]:f.root
Out[3]:
/ (RootGroup) 'Test file'
children := ['detector' (Group)]
In [4]:f.root.detector
Out[4]:
/detector (Group) 'Detector information'
children := ['readout' (Table)]
In [5]:f.root.detector.readout
Out[5]:
/detector/readout (Table(10L,)) 'Readout example'
description := {
"ADCcount": Col(dtype='UInt16', shape=1, dflt=0, pos=0, indexed=False),
"TDCcount": Col(dtype='UInt8', shape=1, dflt=0, pos=1, indexed=False),
"energy": Col(dtype='Float64', shape=1, dflt=0.0, pos=2, indexed=False),
"grid_i": Col(dtype='Int32', shape=1, dflt=0, pos=3, indexed=False),
"grid_j": Col(dtype='Int32', shape=1, dflt=0, pos=4, indexed=False),
"idnumber": Col(dtype='Int64', shape=1, dflt=0L, pos=5, indexed=False),
"name": StringCol(length=16, dflt=CharArray(['']), shape=1, pos=6, indexed=False),
"pressure": Col(dtype='Float32', shape=1, dflt=0.0, pos=7, indexed=False)}
byteorder := little
In [6]:f.root.detector.readout.attrs.TITLE
Out[6]:'Readout example'
Getting actual data
Here you can see how to get actual data from the readout table. Slicing and field selection are shown.
In [7]:p f.root.detector.readout[1]
(256, 1, 1.0, 1, 9, 17179869184L, 'Particle: 1', 1.0)
In [8]:p f.root.detector.readout[1:3]
NestedRecArray[
(256, 1, 1.0, 1, 9, 17179869184L, 'Particle: 1', 1.0),
(512, 2, 256.0, 2, 8, 34359738368L, 'Particle: 2', 4.0)
]
In [9]:p f.root.detector.readout[1::3]
NestedRecArray[
(256, 1, 1.0, 1, 9, 17179869184L, 'Particle: 1', 1.0),
(1024, 4, 65536.0, 4, 6, 68719476736L, 'Particle: 4', 16.0),
(1792, 7, 5764801.0, 7, 3, 120259084288L, 'Particle: 7', 49.0)
]
In [10]:p f.root.detector.readout[1::3].field('energy')
[ 1.00000000e+00 6.55360000e+04 5.76480100e+06]
In [11]:f.root.detector.readout.cols.energy[:]
Out[11]:
array([ 0.00000000e+00, 1.00000000e+00, 2.56000000e+02,
6.56100000e+03, 6.55360000e+04, 3.90625000e+05,
1.67961600e+06, 5.76480100e+06, 1.67772160e+07,
4.30467210e+07])
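The printed records can be checked by hand against the fill loop from Getting Started. The sketch below rebuilds each record in plain Python, with the fields in the order given by the table description (pos=0 to pos=7), and applies the same [1::3] slice. The names make_row, rows and ENERGY are illustrative only; no PyTables call is involved.

```python
def make_row(i):
    """Rebuild record i with the same arithmetic as the fill loop."""
    pressure = float(i * i)
    return (
        (i * 256) % (1 << 16),  # ADCcount (pos=0)
        i % 256,                # TDCcount (pos=1)
        pressure ** 4,          # energy   (pos=2)
        i,                      # grid_i   (pos=3)
        10 - i,                 # grid_j   (pos=4)
        i * (2 ** 34),          # idnumber (pos=5)
        "Particle: %6d" % i,    # name     (pos=6)
        pressure,               # pressure (pos=7)
    )


rows = [make_row(i) for i in range(10)]
ENERGY = 2  # position of the energy field inside each record

# Same selection as readout[1::3]: records 1, 4 and 7
selected = rows[1::3]
print([record[ENERGY] for record in selected])  # [1.0, 65536.0, 5764801.0]
```

Since pressure is i*i and energy is pressure to the fourth power, each energy value is simply i**8, which is why the column grows so quickly.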
Selecting values
In the session below, ro refers to the readout table (f.root.detector.readout).
In [12]:p [row['energy'] for row in ro.where('pressure > 10')]
[65536.0, 390625.0, 1679616.0, 5764801.0, 16777216.0, 43046721.0]
In [13]:p [row['name'] for row in ro.where('energy < 10**6')]
['Particle: 0', 'Particle: 1', 'Particle: 2', 'Particle: 3', 'Particle: 4', 'Particle: 5']
In [14]:p [row['energy'] for row in ro.where('pressure > 10')]
[65536.0, 390625.0, 1679616.0, 5764801.0, 16777216.0, 43046721.0]
In [15]:sum(row['energy'] for row in ro.where('pressure > 10'))
Out[15]:67724515.0
In [16]:[row['energy'] for row in ro.where('pressure > 10')
....: if row['energy'] < 10**7 and row['TDCcount'] < 6 ]
Out[16]:[65536.0, 390625.0]
In [17]:sum(row['energy'] for row in ro.where('(pressure > 10) & (energy < 10**7)')
....: if row['TDCcount'] < 6 )
Out[17]:456161.0
In [18]:[row.nrow() for row in ro.where('(pressure > 10) & (energy < 10**7) & (TDCcount < 6)')]
....:
Out[18]:[4L, 5L]
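The results of these selections can be reproduced without opening the file. The following sketch re-evaluates the same conditions over plain Python dictionaries. The names rows and select are hypothetical helpers; a real where() call evaluates the condition string inside PyTables instead of testing rows one by one in Python.

```python
# Plain-Python stand-in for the readout table: one dict per row,
# holding only the columns used in the queries.
rows = []
for i in range(10):
    pressure = float(i * i)
    rows.append({
        "nrow": i,
        "TDCcount": i % 256,
        "pressure": pressure,
        "energy": pressure ** 4,
    })


def select(table_rows, predicate):
    """Return the rows for which predicate(row) is true (illustrative helper)."""
    return [row for row in table_rows if predicate(row)]


# Counterpart of where('pressure > 10')
high_pressure = select(rows, lambda r: r["pressure"] > 10)
total_energy = sum(r["energy"] for r in high_pressure)

# All three conditions combined into one selection
combined = select(
    rows,
    lambda r: r["pressure"] > 10 and r["energy"] < 10**7 and r["TDCcount"] < 6,
)

print(total_energy)                        # 67724515.0
print([r["energy"] for r in combined])     # [65536.0, 390625.0]
print([r["nrow"] for r in combined])       # [4, 5]
```

Only rows 4 and 5 satisfy all three conditions: rows 0 to 3 fail pressure > 10, rows 6 and above fail TDCcount < 6.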
Other sources for examples
The examples presented above show only a small fraction of the capabilities of PyTables. Please check the documentation and the examples/ directory in the source package for more.
