This is A PREVIEW for NEST 3.0 and NOT an OFFICIAL RELEASE! Some functionality may not be available and information may be incomplete!
Store data to an efficient binary format
This recording backend is only available if NEST was compiled with support for MPI and SIONlib.
The sionlib recording backend writes collected data persistently to a binary container file (or to a rather small set of such files). This is especially useful for large-scale simulations running in a distributed way on many MPI processes/OpenMP threads. In such usage scenarios, writing to plain text files (see recording backend for ASCII files) would cause a large overhead because of the huge number of generated files and thus be very inefficient.
The implementation of the sionlib backend is based on the SIONlib library. Depending on the I/O architecture of the compute cluster or supercomputer and the global settings of the sionlib recording backend, either a single container file or a set of these files is created. In case of a single file, it is named according to the following pattern:
In case of multiple files, this name is extended for each file by a
dot followed by a consecutive number. The properties
data_prefix are global kernel properties. They can for example be
set during repetitive simulation protocols to separate the data
originating from individual runs.
The life of a set of associated container files starts with the call to Prepare and ends with the call to Cleanup. Data that is produced during successive calls to Run in between a pair of Prepare and Cleanup calls will be written to the same file set. When creating a new recording, if the filename already exists, the Prepare call will fail with a corresponding error message. To overwrite the old file set instead, set the kernel property overwrite_files to true. An alternative way of avoiding name clashes is to re-set the kernel property data_prefix, so that a different full filename is composed.
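The name-clash rule can be illustrated with a toy model in plain Python. This is only a sketch of the behaviour described above and not NEST code: the class `ToyFileSet` and its methods are hypothetical stand-ins for the Prepare, Run and Cleanup calls, and a single ordinary file stands in for the container file set.

```python
import os
import tempfile


class ToyFileSet:
    """Toy stand-in for one set of container files (hypothetical, not NEST)."""

    def __init__(self, path, overwrite_files=False):
        self.path = path
        self.overwrite_files = overwrite_files
        self._handle = None

    def prepare(self):
        # Refuse to clobber an existing file unless overwriting is enabled.
        if os.path.exists(self.path) and not self.overwrite_files:
            raise FileExistsError(self.path + " exists and overwrite_files is false")
        self._handle = open(self.path, "wb")

    def run(self, payload):
        # Every run between prepare() and cleanup() appends to the same file.
        self._handle.write(payload)

    def cleanup(self):
        self._handle.close()
        self._handle = None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.sion")

        # Two runs inside one prepare/cleanup pair end up in the same file.
        first = ToyFileSet(path)
        first.prepare()
        first.run(b"run-1;")
        first.run(b"run-2;")
        first.cleanup()
        with open(path, "rb") as f:
            content = f.read()

        # A second recording with the same name fails by default.
        try:
            ToyFileSet(path).prepare()
            clash = False
        except FileExistsError:
            clash = True

        # With overwrite_files enabled the old file is replaced.
        second = ToyFileSet(path, overwrite_files=True)
        second.prepare()
        second.cleanup()
        size_after_overwrite = os.path.getsize(path)

    return content, clash, size_after_overwrite
```

Calling `demo()` exercises all three cases: shared output across runs, the failing second prepare, and the explicit overwrite.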
In contrast to other recording backends, the sionlib backend writes the data from all recorders using it to the same container file(s). The file(s) contain the data in a custom binary format, which is composed of a series of blocks in the following order:
The body block contains the actual data records; the layout of an individual record depends on the type of the device and is described by a corresponding entry in the device info block.
The file info block keeps the file's metadata, such as version information.
The device info block stores the properties and a data layout description for each device that uses the sionlib backend.
The tail block contains pointers to the file info block.
The data layout of the NEST SIONlib file format v2 is shown in the following figure.
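To show why a trailing block of pointers is useful, here is a toy container written with Python's `struct` module. It is explicitly not the NEST SIONlib file format: the record layout, the JSON-encoded info blocks and all function names are assumptions made for illustration. It only mirrors the block order described above (body, file info, device info, tail), so that a reader can jump from the end of the file to the metadata and then interpret the body.

```python
import io
import json
import struct

RECORD = struct.Struct("<qd")  # toy record: sender id (int64), time (float64)
TAIL = struct.Struct("<q")     # toy tail: byte offset of the file info block


def _write_json_block(buf, obj):
    # Length-prefixed JSON payload (toy encoding, not the real format).
    payload = json.dumps(obj).encode("utf-8")
    buf.write(struct.pack("<q", len(payload)))
    buf.write(payload)


def _read_json_block(buf):
    (length,) = struct.unpack("<q", buf.read(8))
    return json.loads(buf.read(length).decode("utf-8"))


def write_toy_container(records, file_info, device_info):
    buf = io.BytesIO()
    for sender, time in records:          # body block
        buf.write(RECORD.pack(sender, time))
    info_offset = buf.tell()
    _write_json_block(buf, file_info)     # file info block
    _write_json_block(buf, device_info)   # device info block
    buf.write(TAIL.pack(info_offset))     # tail block
    return buf.getvalue()


def read_toy_container(blob):
    buf = io.BytesIO(blob)
    # Start at the end: the tail says where the metadata begins.
    buf.seek(-TAIL.size, io.SEEK_END)
    (info_offset,) = TAIL.unpack(buf.read(TAIL.size))
    buf.seek(info_offset)
    file_info = _read_json_block(buf)
    device_info = _read_json_block(buf)
    # Everything before the metadata is the body.
    body = blob[:info_offset]
    records = [RECORD.unpack_from(body, pos)
               for pos in range(0, len(body), RECORD.size)]
    return records, file_info, device_info
```

Because the metadata is written after the body, the writer does not need to know in advance how much data will be recorded; the pointer in the tail is filled in last.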
Reading the data
As the binary format of the files produced by the sionlib backend does not conform to any standard, parsing them manually can be cumbersome. To ease this task, we provide a reader module for Python that makes the files available in a convenient way. The source code and further documentation for this module can be found in its own repository.
A recorder-specific string (default: “”) that serves as alias name for the recording device, and which is stored in the metadata section of the container files.
Global parameters (to be set via
The filename (default: “output.sion”) part of the pattern according to which the full filename (incl. path) is generated (see above).
The number of container files (default: 1) used for storing the results of a single call to Simulate (or of a single Prepare/Run/Cleanup cycle). Using multiple files may have a performance advantage on large computing clusters, depending on how the (parallel) file system is accessed from the compute nodes.
In SIONlib nomenclature, a single OpenMP thread running on a single MPI process is called a task. For each task, a specific number of bytes is allocated in the container file(s) from the beginning. This number is set by the parameter sion_chunksize (default: 262144). If the number of bytes written by each task during the simulation is known in advance, it is advantageous to set the chunk size to this value. The size of the container files then does not have to be adjusted by SIONlib during the simulation, which yields a slight performance advantage. Choosing a value for sion_chunksize that is too large does little harm, because SIONlib container files are sparse files (if supported by the underlying file system) and only use the disk space actually required by the stored data.
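As a rough aid for picking a value, the expected number of bytes per task can be estimated from the number of records a task will write and the size of one record. The helper below is a hypothetical sketch: `estimate_chunksize`, the record size of 24 bytes in the usage note and the optional rounding to a file system block size are all assumptions for illustration, not values taken from NEST or SIONlib.

```python
def estimate_chunksize(records_per_task, bytes_per_record, fs_block_size=None):
    """Estimate the bytes one task writes, as a candidate for sion_chunksize.

    If fs_block_size is given, the result is rounded up to a whole number
    of blocks (an assumption of this sketch, not a documented requirement).
    """
    needed = records_per_task * bytes_per_record
    if fs_block_size:
        # Integer ceiling division, then back to bytes.
        n_blocks = -(-needed // fs_block_size)
        needed = n_blocks * fs_block_size
    return needed
```

For example, `estimate_chunksize(10000, 24)` gives 240000 bytes, and `estimate_chunksize(10000, 24, fs_block_size=4096)` rounds this up to 241664 bytes (59 blocks of 4096 bytes).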
The size of task-specific buffers (default: 1024) within the sionlib recording backend in bytes. These buffers are used to temporarily store data generated by the recording devices on each task. As soon as a buffer is full, its contents are written to the respective container file. To achieve optimum performance, the size of these buffers should at least amount to the size of the file system blocks.
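The effect of the buffer size on the number of write operations can be seen in a small toy model. The class below is a hypothetical illustration in plain Python and does not reproduce the backend's actual buffering code; in particular, it flushes when the next record would no longer fit, and it assumes every record is smaller than the buffer.

```python
class ToyTaskBuffer:
    """Toy per-task buffer that counts how often it writes to 'disk'."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.pending = bytearray()
        self.written = bytearray()  # stands in for the container file
        self.flushes = 0

    def record(self, payload):
        # Flush first if the new payload would overflow the buffer.
        if len(self.pending) + len(payload) > self.buffer_size:
            self.flush()
        self.pending += payload

    def flush(self):
        if self.pending:
            self.written += self.pending
            self.pending.clear()
            self.flushes += 1


def count_writes(n_events, event_bytes, buffer_size):
    buf = ToyTaskBuffer(buffer_size)
    for _ in range(n_events):
        buf.record(bytes(event_bytes))
    buf.flush()  # write whatever is left at the end
    return buf.flushes, len(buf.written)
```

In this model, 1000 events of 16 bytes each need 16 writes with a 1024-byte buffer but only 4 writes with a 4096-byte buffer, while the total amount of data written stays the same.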
Flag (default: false) to enable the collective mode of SIONlib. In collective mode, recorded data is buffered completely during Run and only written to the container files at the very end of Run, with all tasks acting synchronously. Furthermore, within SIONlib so-called collectors aggregate data from a specific number of tasks, and only these collectors access the container files directly, which minimizes the load on the file system. The number of tasks per collector is determined automatically by SIONlib. However, the collector size can also be set explicitly by the user via the environment variable SION_COLLSIZE before the start of NEST. For large simulations that also generate a large amount of data, collective mode can offer a performance advantage.
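The idea of collectors can be pictured with a short sketch. The functions below are hypothetical and only illustrate the concept: the grouping of consecutive task numbers into one collector is an assumption of this example, and SIONlib may assign tasks differently. Only the environment variable name SION_COLLSIZE is taken from the text above.

```python
import os


def collector_groups(n_tasks, collsize):
    """Split task numbers 0..n_tasks-1 into groups of at most collsize tasks.

    Each group stands for the tasks whose data one collector would write.
    """
    return [list(range(start, min(start + collsize, n_tasks)))
            for start in range(0, n_tasks, collsize)]


def collsize_from_env(default=None):
    """Read SION_COLLSIZE from the environment, if it is set."""
    raw = os.environ.get("SION_COLLSIZE")
    return int(raw) if raw else default
```

With 10 tasks and a collector size of 4, `collector_groups(10, 4)` yields three groups, so only three writers touch the file system instead of ten. When driving NEST from a Python script, the variable would have to be placed in the environment (for example via `os.environ`) before NEST is started.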