Recording backend sionlib - Store data to an efficient binary format
Description
Availability
This recording backend is only available if NEST was compiled with support for MPI and SIONlib.
The sionlib recording backend writes collected data persistently to a binary container file (or to a rather small set of such files). This is especially useful for large-scale simulations running in a distributed way on many MPI processes/OpenMP threads. In such usage scenarios, writing to plain text files (see recording backend for ASCII files) would cause a large overhead because of the huge number of generated files and thus be very inefficient.
The implementation of the sionlib backend is based on the SIONlib library. Depending on the I/O architecture of the compute cluster or supercomputer and the global settings of the sionlib recording backend, either a single container file or a set of these files is created. In case of a single file, it is named according to the following pattern:
<data_path>/<data_prefix><filename>
In case of multiple files, this name is extended for each file by a dot followed by a consecutive number. The properties data_path and data_prefix are global kernel properties. They can, for example, be set during repetitive simulation protocols to separate the data originating from individual runs.
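The naming pattern can be illustrated with a small sketch. The helper container_file_names is hypothetical and not part of NEST. The starting index and the unpadded form of the numeric suffix are assumptions, since the actual suffix format is determined by SIONlib.

```python
import os


def container_file_names(data_path, data_prefix, filename="output.sion",
                         n_files=1, first_index=1):
    """Return the expected container file names for one recording.

    The base name follows <data_path>/<data_prefix><filename>. For more
    than one file, a dot and a consecutive number are appended. The
    starting index (first_index) and the lack of zero padding are
    assumptions; SIONlib decides the actual suffix format.
    """
    base = os.path.join(data_path, data_prefix + filename)
    if n_files == 1:
        return [base]
    return [f"{base}.{i}" for i in range(first_index, first_index + n_files)]


# Separate the data of individual runs through the prefix.
for run in range(2):
    print(container_file_names("results", f"run{run}_"))
```

Changing only data_prefix between runs is enough to keep the file sets of repeated simulations apart.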
The life of a set of associated container files starts with the call to Prepare and ends with the call to Cleanup. Data that is produced during successive calls to Run in between a pair of Prepare and Cleanup calls will be written to the same file set. When creating a new recording, if the filename already exists, the Prepare call will fail with a corresponding error message. To overwrite the old file set instead, the kernel property overwrite_files can be set to True using the corresponding kernel attribute. An alternative way of avoiding name clashes is to set the kernel attributes data_path or data_prefix so that the data is written to a different file.
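The life cycle described above might be driven as in the following sketch. The function record_in_segments is a hypothetical helper. It takes the imported NEST module as an argument, so the call sequence can be read (and exercised with a stand-in object) without a NEST installation. SetKernelStatus is used here as one way of setting the overwrite_files property.

```python
def record_in_segments(nest, segment_durations_ms, overwrite=False):
    """Run several simulation segments that share one container file set.

    `nest` is the imported NEST module (for example from `import nest`).
    Everything simulated between Prepare and Cleanup ends up in the same
    file set of the sionlib backend.
    """
    # Allow Prepare to replace an existing file set instead of failing.
    nest.SetKernelStatus({"overwrite_files": overwrite})
    nest.Prepare()
    try:
        for duration in segment_durations_ms:
            nest.Run(duration)
    finally:
        # Cleanup ends the life of the container file set.
        nest.Cleanup()
```

With NEST installed, a call such as record_in_segments(nest, [100.0, 100.0], overwrite=True) after import nest would write both segments to the same file set.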
Data format
In contrast to other recording backends, the sionlib backend writes the data from all recorders using it to a single container file or file set. The file(s) contain the data in a custom binary format, which is composed of a series of blocks in the following order:
- The body block contains the actual data records; the layout of an individual record depends on the type of the device and is described by a corresponding entry in the device info block.
- The file info block keeps the file’s metadata, such as version information.
- The device info block stores the properties and a data layout description for each device that uses the sionlib backend.
- The tail block contains pointers to the file info block.
The data layout of the NEST SIONlib file format v2 is shown in the following figure.

Figure: NEST SIONlib binary file format.
Reading the data
As the binary format of the files produced by the sionlib backend does not conform to any standard, parsing them manually can be cumbersome. To ease this task, we provide a reader module for Python that makes the files available in a convenient way. The source code and further documentation for this module can be found in its own repository.
Recorder-specific parameters
- label
A recorder-specific string (default: “”) that serves as alias name for the recording device, and which is stored in the metadata section of the container files.
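As a minimal sketch, the parameters of a recorder that uses this backend could be assembled as follows. The helper sionlib_recorder_params is hypothetical. The record_to key and the spike_recorder model name are taken from the general NEST recording-device interface rather than from this page, and the model name may differ between NEST versions.

```python
def sionlib_recorder_params(label=""):
    """Build the parameter dictionary for a recorder that uses sionlib."""
    if not isinstance(label, str):
        raise TypeError("label must be a string")
    return {"record_to": "sionlib", "label": label}


params = sionlib_recorder_params("excitatory_spikes")
print(params)
# With NEST available, the dictionary could be passed on like this:
# import nest
# recorder = nest.Create("spike_recorder", params=params)
```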
Global parameters
These parameters can be set by assigning a nested dictionary to the kernel attribute recording_backends. The dictionary has to have the form {'sionlib': {k_1: v_1, …, k_n: v_n}} with k_i being from the following list:
- filename
The filename (default: “output.sion”) part of the pattern according to which the full filename (incl. path) is generated (see above).
- sion_n_files
The number of container files (default: 1) used for storing the results of a single call to Simulate (or of a single Prepare-Run-Cleanup cycle). Using multiple files may have a performance advantage on large computing clusters, depending on how the (parallel) file system is accessed from the compute nodes.
- sion_chunksize
In SIONlib nomenclature, a single OpenMP thread running on a single MPI process is called a task. For each task, a specific number of bytes is allocated in the container file(s) from the beginning. This number is set by the parameter sion_chunksize (default: 262144). If the number of bytes written by each task during the simulation is known in advance, it is advantageous to set the chunk size to this value. In this way, SIONlib does not have to adjust the size of the container files during the simulation, which yields a slight performance advantage. Choosing a value for sion_chunksize that is too large does little harm, because SIONlib container files are sparse files (if supported by the underlying file system), which only use the disk space actually required by the stored data.
- buffer_size
The size of task-specific buffers (default: 1024) within the sionlib recording backend in bytes. These buffers are used to temporarily store data generated by the recording devices on each task. As soon as a buffer is full, its contents are written to the respective container file. To achieve optimum performance, the size of these buffers should at least amount to the size of the file system blocks.
- sion_collective
Flag (default: false) to enable the collective mode of SIONlib. In collective mode, recorded data is buffered completely during Run and only written to the container files at the very end of Run, with all tasks acting synchronously. Furthermore, within SIONlib so-called collectors aggregate data from a specific number of tasks, and only these collectors directly access the container files, which minimizes the load on the file system. The number of tasks per collector is determined automatically by SIONlib. However, the collector size can also be set explicitly by the user via the environment variable SION_COLLSIZE before the start of NEST. In large simulations that also generate a large amount of data, collective mode can offer a performance advantage.
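The nested dictionary for the global parameters listed above could be put together as in this sketch. The names SIONLIB_DEFAULTS and sionlib_backend_settings are hypothetical. The default values are copied from the parameter descriptions, and the commented call at the end assumes that the kernel attribute can be set through SetKernelStatus.

```python
# Defaults as listed in the parameter descriptions above.
SIONLIB_DEFAULTS = {
    "filename": "output.sion",
    "sion_n_files": 1,
    "sion_chunksize": 262144,
    "buffer_size": 1024,
    "sion_collective": False,
}


def sionlib_backend_settings(**overrides):
    """Return the nested dictionary for the recording_backends attribute."""
    unknown = set(overrides) - set(SIONLIB_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown sionlib parameters: {sorted(unknown)}")
    settings = dict(SIONLIB_DEFAULTS)
    settings.update(overrides)
    return {"sionlib": settings}


settings = sionlib_backend_settings(sion_n_files=4, sion_collective=True)
print(settings)
# With NEST available:
# import nest
# nest.SetKernelStatus({"recording_backends": settings})
```

Rejecting unknown keys early catches misspelled parameter names before the simulation starts.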