This process comprises two stages: creating the tape volumes off-site, and importing these into our tape library.
Tapes that are to be added to Enstore must be properly labelled and written in a compatible format. In addition, metadata about the tape volumes and the files they contain (such as checksums) must be collected at the time the tapes are written and submitted to the Enstore administrators along with the tape volumes themselves.
A simple standalone program has been developed to facilitate the
process of creating the Enstore volumes and associated metadata files.
It is written in ANSI C for portability and to eliminate the
dependence on system utility programs (e.g. tar, mt, cpio), the
behavior of which can vary from system to system. It is available as
a binary for Fermi-supported UNIX systems and as C source code for
other systems.
You must use approved tapes with barcode labels assigned by Fermilab.
The volume import software needs a directory to store its tape database. This database accumulates information about files and volumes, and persists until the volumes are shipped to us. Because the metadata is stored persistently, files can be added to a tape that was started at an earlier date; it is not necessary to write all the files to the volume at one time.
The tape device and tape database directory can be specified on the
command line or as the environment variables TAPE_DEVICE and TAPE_DB.
Specifying this information as environment variables makes the
software somewhat easier to use, since these values then do not need
to be typed on every command line.
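
The choice between a command-line option and an environment variable can be pictured with a short Python sketch. It is an illustration only, not the enstore_tape source (which is ANSI C). The function name resolve_setting is hypothetical, and the rule that a command-line value wins over the environment is an assumption; the text above says only that either may be used.

```python
import os


def resolve_setting(cli_value, env_name, environ=None):
    """Return the command-line value if given, else the environment value.

    Assumption: the command line takes priority over the environment.
    """
    environ = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    value = environ.get(env_name)
    if not value:
        raise ValueError(f"give the option explicitly or set ${env_name}")
    return value
```

For example, resolve_setting(None, "TAPE_DB") would return the value of $TAPE_DB when no --tape-db option was given.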
If the specified tape database directory is not present, it will be created when the software is run for the first time.
enstore_tape
All operations are performed by a single program: enstore_tape. It has four main modes of operation,
selected by the first (non-optional) command-line argument, which
must be one of --init, --write, --dump-db, or --read (the last of
which is planned but not yet implemented). The use of each of these
options is explained below.
enstore_tape --init
Usage:
enstore_tape --init [--tape-device=devname] [--tape-db=dbdir] [--verbose] [--erase] --volume-label=label
If TAPE_DEVICE or TAPE_DB is set in the environment, the
corresponding command-line argument is not necessary. $TAPE_DEVICE
must be a non-rewinding device, and $TAPE_DB must be a path to a
directory (which will be created if needed) where the user has write
permission.
The volume label must be a legal volume label, matching the external barcode label on the tape.
If the tape is already labelled with a VOL1 header, or if the local
tape database already has an entry for the given volume label, then
enstore_tape --init will refuse to relabel the tape. To override
this, use the --erase option, which erases both the existing tape
label and the local database entries. Use this option with caution.
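
The refusal rule for --init amounts to a small decision, sketched here in Python. This is a reading of the paragraph above, not the program's actual code; the function name may_label and its boolean arguments are hypothetical.

```python
def may_label(tape_has_vol1, label_in_db, erase=False):
    """Return True if labelling may proceed.

    Labelling is refused when the tape already carries a VOL1 header or the
    local database already knows the label, unless --erase was given.
    """
    if erase:
        return True
    return not (tape_has_vol1 or label_in_db)
```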
enstore_tape --write
Files are written to tape using the --write mode of the enstore_tape
program. Usage:
enstore_tape --write [--tape-device=devname] [--tape-db=dbdir] [--verbose] --volume-label=label file_list [file_list...]
tape-device, tape-db, and volume-label are as described above.
volume-label must match a label already existing in the local database
(i.e. the tape must have been labelled by enstore_tape --init).
Each file_list takes the form:
--pnfs-dir=path [--strip-path=path] filename [filename...]
--pnfs-dir specifies the directory in the PNFS file space (i.e. the
file namespace within Enstore itself) where the files are to appear
when the tape is actually added to the Enstore library. These paths
must start with /pnfs. The --pnfs-dir argument is "sticky"; that is,
it applies to all subsequent files until another --pnfs-dir argument
is specified. --strip-path specifies a leading pathname component
which is to be stripped from the filenames when they are imported into
Enstore. (This argument may be omitted.) Finally, one or more
filenames are specified.
A few examples may clarify this usage:
To specify all local files in the directory /tmp/sim/data starting with "MC", and cause them to be imported into the PNFS filesystem in the directory /pnfs/test/data, use
--pnfs-dir=/pnfs/test --strip-path=/tmp/sim/ /tmp/sim/data/MC*
To specify all files in the current directory, and insert them into the PNFS filesystem in /pnfs/test, use
--pnfs-dir=/pnfs/test *
Multiple file_lists may be specified.
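
How a sequence of file_list arguments maps local files to PNFS destinations can be sketched in Python. This is an illustration of the rules described above, not the enstore_tape parser. The function name parse_file_lists is hypothetical, and two behaviours are assumptions the text does not settle: that a new --pnfs-dir resets --strip-path, and that a filename with nothing stripped is appended to the PNFS directory as a relative path. The shell expands wildcards such as MC* before the program sees them, so the sketch receives concrete filenames.

```python
import posixpath


def parse_file_lists(args):
    """Return (local_path, pnfs_path) pairs for a list of file_list arguments."""
    pnfs_dir = None
    strip = ""
    result = []
    for arg in args:
        if arg.startswith("--pnfs-dir="):
            pnfs_dir = arg.split("=", 1)[1]
            if not pnfs_dir.startswith("/pnfs"):
                raise ValueError("--pnfs-dir must start with /pnfs")
            strip = ""  # assumption: a new --pnfs-dir resets --strip-path
        elif arg.startswith("--strip-path="):
            strip = arg.split("=", 1)[1]
        else:
            if pnfs_dir is None:
                raise ValueError("filename given before any --pnfs-dir")
            # Remove the leading component only if the filename starts with it.
            rel = arg[len(strip):] if strip and arg.startswith(strip) else arg
            result.append((arg, posixpath.join(pnfs_dir, rel.lstrip("/"))))
    return result
```

With the first example above, /tmp/sim/data/MC01 would map to /pnfs/test/data/MC01 (MC01 being a made-up filename).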
Tapes need not be rewound after writing. This is convenient in the case that further files are to be appended to the tape.
Note that enstore_tape --write currently does not descend into
subdirectories. All of the filenames specified on the command line
must be files rather than directories. If any of the filename
arguments are directories, they will not be written to tape (and an
error message will be printed). This may be changed in a future
version of the program.
enstore_tape --dump-db
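Before the tape volumes are submitted, the metadata accumulated in the
local tape database must be written to a file so that it can be sent
along with them. The --dump-db option of enstore_tape
accomplishes this task. Usage:
enstore_tape --dump-db [--tape-db=dbdir] > output_file
If tape-db is not specified, the value of the environment variable
TAPE_DB is used.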
Currently, the standard utilities mt, dd, and cpio can be used to
read Enstore tapes. The GNU version of cpio is suggested, although
other versions will probably work (the cpio flags in the example below
will need to be changed if you use a non-GNU cpio).
Assuming that the tape device is /dev/tape, to read back
the third file from a tape, you would use the following commands:
# rewind the tape
mt -f /dev/tape rewind
# skip the VOL1 header and the first two files
mt -f /dev/tape fsf 3
# extract the cpio archive contents
dd bs=32768 if=/dev/tape | cpio -idv --no-absolute-filenames
After performing these steps, the tape will be positioned and ready for extracting the fourth file. To reposition the tape to read the n'th file, repeat the mt rewind and mt fsf commands, giving fsf a count of n (the VOL1 header plus the n-1 preceding files).
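
The same three steps can be generated for any file number with a small Python helper. This is only a convenience sketch built from the commands shown above; the function name read_commands is hypothetical, and the default device /dev/tape is the one assumed in the example. It returns the command lines as strings and does not touch the tape drive itself.

```python
def read_commands(n, device="/dev/tape", blocksize=32768):
    """Return the shell commands that extract the n'th file (1-based)."""
    if n < 1:
        raise ValueError("file numbers start at 1")
    return [
        f"mt -f {device} rewind",
        # Skip the VOL1 header plus the n-1 files before the one wanted.
        f"mt -f {device} fsf {n}",
        f"dd bs={blocksize} if={device} | cpio -idv --no-absolute-filenames",
    ]
```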
For simplicity, and to reduce the dependence on external utility
programs, a --read option to enstore_tape, similar to the --write
option, is planned. This option is not yet implemented.
The TAPE_DB database simply uses the Unix directory structure to
arrange keys, subkeys, and values as directories, subdirectories, and
files. This allows simple shell scripts to be written to query the
local database.
Tape volumes begin with a modified ANSI VOL1 header: 80 bytes of data, starting with "VOL1", followed by the volume label, padded by space characters up to a final character of ASCII "0" (not NUL). Files are written in variable-blocksize mode, with a default blocksize of 32768, in POSIX cpio-odc format. Files are separated by a standard EOF marker. At the end of the tape come 2 EOF markers.
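
The VOL1 header layout described above can be made concrete with a few lines of Python. This is a sketch of the byte layout as stated in the text, not code taken from enstore_tape; the function name build_vol1_header and the sample label TEST01 are hypothetical.

```python
def build_vol1_header(label):
    """Return the 80-byte header: "VOL1", label, spaces, final ASCII "0"."""
    raw = b"VOL1" + label.encode("ascii")
    if len(raw) > 79:
        raise ValueError("volume label too long for an 80-byte header")
    # Pad with spaces to 79 bytes, then end with the character "0" (0x30).
    return raw.ljust(79, b" ") + b"0"
```

For the label TEST01, the result begins with the bytes VOL1TEST01, continues with 69 spaces, and ends with the character "0".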
In order to make it easy to add files to an existing tape without rewinding and seeking to the end of data, an EOT header is written *after* the 2 EOF markers. The EOT header is similar to the VOL1 header, except that it starts with EOT and 7 ASCII digits giving the number of files already written to this tape. Following this is the volume label as in the VOL1 header.
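
A matching sketch for the EOT header follows. The text states only that it starts with EOT and 7 ASCII digits followed by the volume label; that the count is zero-padded, and that the header is also 80 bytes with space padding and a final "0", are assumptions drawn from "similar to the VOL1 header". The function name build_eot_header is hypothetical.

```python
def build_eot_header(file_count, label):
    """Return an EOT header: "EOT", 7-digit file count, label, padding."""
    if not 0 <= file_count <= 9_999_999:
        raise ValueError("file count must fit in 7 digits")
    raw = b"EOT" + b"%07d" % file_count + label.encode("ascii")
    if len(raw) > 79:
        raise ValueError("volume label too long for an 80-byte header")
    # Assumed to mirror VOL1: space padding, then a final ASCII "0".
    return raw.ljust(79, b" ") + b"0"
```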
After files are written to tape, the EOT header is written, and the
tape drive backspaces to the beginning of the EOT header. On
subsequent writes, the enstore_tape program will check to see if
such an EOT header is present at the current tape location. If it is
(and the volume names and file counts match), it is safe to continue
writing to this tape without rewinding and seeking to end of tape: we
simply skip backwards over one of the EOF markers preceding the EOT
header. If the EOT header is not found, the tape is rewound and the
VOL1 header is sought.
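
The check made before appending can be sketched in Python as follows. It assumes the EOT header is an 80-byte block laid out as "EOT", 7 digits, the label, space padding, and a final "0"; the exact padding is an assumption, and both function names are hypothetical. The sketch works on a block of bytes already read from the current tape position and performs no tape operations.

```python
def parse_eot_header(block):
    """Return (file_count, label), or None if block is not an EOT header."""
    if len(block) != 80 or not block.startswith(b"EOT"):
        return None
    digits = block[3:10]
    if not digits.isdigit():
        return None
    label = block[10:79].rstrip(b" ").decode("ascii", "replace")
    return int(digits), label


def can_append_in_place(block, expected_label, expected_count):
    """True if the block is an EOT header matching the local database."""
    return parse_eot_header(block) == (expected_count, expected_label)
```

When can_append_in_place returns False, the fallback described above applies: rewind and look for the VOL1 header.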
Checksums are generated using the Adler32 algorithm. In addition to a checksum of the entire file, a "sanity" checksum (for early error-detection) is generated using the first 65536 bytes of the file (if the length of the file exceeds 65536 bytes).
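
As an illustration, both checksums can be computed with Python's standard zlib module. This is a sketch, not the enstore_tape implementation: it uses zlib's default Adler-32 starting value of 1, which may or may not match the convention Enstore expects, and it assumes that for files of 65536 bytes or fewer the sanity checksum simply covers the whole file. The function name checksums is hypothetical.

```python
import zlib

SANITY_BYTES = 65536  # size of the leading region used for the sanity checksum


def checksums(data):
    """Return (full, sanity) Adler-32 checksums for a bytes object."""
    full = zlib.adler32(data) & 0xFFFFFFFF
    # Slicing past the end is harmless, so short inputs are covered whole.
    sanity = zlib.adler32(data[:SANITY_BYTES]) & 0xFFFFFFFF
    return full, sanity
```

For large files, the same result can be obtained by reading the file in chunks and passing the running value as the second argument to zlib.adler32.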