Set up local waveform archiving

You will …

  • Set up slarchive with its necessary bindings

  • Set up purge_datafiles in crontab

Pre-requisites for this tutorial:

Afterwards/Results/Outcomes:

  • Save real-time data in a local archive for later processing.

  • See miniSEED day files for GE stations in your local waveform archive.

Time range estimate:

  • 5 minutes

Related tutorial(s):


Motivation: Without activating archiving, your local Seedlink server will only keep waveforms for a short time. This makes it hard to review old events, for example.

In this example, we’ll keep waveforms for one week. Before starting, you’ll need bindings for your stations; see the tutorials “Add real-time stations from GEOFON” or “Get real-time data from a remote Seedlink server (single station)”.

slarchive collects data and archives it locally in an SDS (SeisComP Data Structure) file system layout of nested subdirectories and systematically named files.
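
As a rough illustration of that naming scheme, the sketch below builds the path of one day file. It assumes the usual SDS layout, ROOT/YEAR/NET/STA/CHA.TYPE/NET.STA.LOC.CHA.TYPE.YEAR.DOY, with data type D for waveform data; the helper name sds_path is hypothetical and not part of SeisComP.

```shell
# Sketch: build the SDS path of one miniSEED day file.
# Assumed layout: ROOT/YEAR/NET/STA/CHA.D/NET.STA.LOC.CHA.D.YEAR.DOY
# sds_path is a hypothetical helper, not a SeisComP command.
sds_path() {
    root=$1 net=$2 sta=$3 loc=$4 cha=$5 year=$6
    # Strip up to two leading zeros so printf does not read e.g. 091 as octal.
    doy=${7#0}; doy=${doy#0}
    # The day of year is zero-padded to three digits in the file name.
    printf '%s/%s/%s/%s/%s.D/%s.%s.%s.%s.D.%s.%03d\n' \
        "$root" "$year" "$net" "$sta" "$cha" \
        "$net" "$sta" "$loc" "$cha" "$year" "$doy"
}

# Station GR.CLL, empty location code, channel HHZ, day 91 of 2020.
sds_path seiscomp/var/lib/archive GR CLL "" HHZ 2020 91
```

An empty location code leaves two consecutive dots in the file name, as in GR.CLL..HHZ.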

In scconfig

  1. Under the Modules tab, go to Acquisition, and select slarchive. Here you can see the default parameters used. By default, slarchive connects to your local Seedlink server, and archives to your local disk.

  2. Under the System tab, select the line for slarchive, and click the “Enable module(s)” button at the top.

  3. Under Bindings: on the right-hand side, right-click “slarchive” to add an slarchive profile. Name it ‘week’, since it will keep waveforms for 7 days, and click ‘Ok’. The new profile appears in the bottom right corner of scconfig. Double-click the profile to open its settings. Unlock the box labeled “keep”, and change the value from the default of 30 to 7.

    Once you have a binding profile, drag it over all the stations it should apply to, under “Networks” on the left-hand side of the bindings tool.

Warning

The name ‘week’ is just a label. Its functionality comes from changing the value of the keep parameter. Changing the name of a binding profile does not change its function.

Note

You can also choose which channels should be archived, using the “selectors” box. For instance, you may collect data at several sample rates, and only wish to archive the highest rate. If you collect LH, BH, HH streams at 0.1, 20, and 100 samples per second, respectively, you might retain only the HH streams, by setting “selectors” to “HH”.

  4. Then return to System, and click ‘Update configuration’. Make sure that either the slarchive module or no module is selected.

  5. Restart slarchive.

  6. Adjust the RecordStream so that GUIs and automatic data processing modules can use the archived waveforms.

Command line

You will need to edit each of your top-level key files to refer to the new binding profile. For example:

$ cd ~/seiscomp/etc/key
$ vi station_GR_CLL

Add the line slarchive:week to whatever lines are already there. Afterwards it will look something like this:

# Binding references
global:BH
scautopick:default
seedlink:geofon
slarchive:week

Repeat this for the top-level key file of each station you wish this binding to apply to. Now create the binding profile in the key directory. This is a file whose name corresponds to the binding profile name, here ‘week’:

$ cd ~/seiscomp/etc/key
$ mkdir slarchive
$ vi slarchive/profile_week
# Number of days the data is kept in the archive. This requires purge_datafiles
# to be run as cronjob.
keep = 7

$ seiscomp enable slarchive
$ seiscomp update-config slarchive
$ seiscomp restart slarchive
slarchive is not running
starting slarchive
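
With many stations, the key file edits described above can be scripted. The following is a minimal sketch, not a SeisComP tool: the helper name add_binding is hypothetical, and it assumes the key files are named station_NET_STA as in the example. It appends the binding reference only where it is missing, so it can be run repeatedly.

```shell
# Sketch: append a binding reference to every station key file in a directory,
# skipping files that already contain it. add_binding is a hypothetical helper.
add_binding() {
    keydir=$1 binding=$2
    for keyfile in "$keydir"/station_*; do
        [ -f "$keyfile" ] || continue     # glob matched nothing, or not a regular file
        # -x matches whole lines only, -F treats the binding as a literal string.
        if ! grep -qxF "$binding" "$keyfile"; then
            # If the file does not end in a newline, add one before appending.
            [ -z "$(tail -c 1 "$keyfile")" ] || printf '\n' >> "$keyfile"
            printf '%s\n' "$binding" >> "$keyfile"
        fi
    done
}

# Example call (uncomment to apply it to your own installation):
# add_binding "$HOME/seiscomp/etc/key" slarchive:week
```

After changing key files this way, run the update-config and restart commands again so that the new bindings take effect.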

Note

Left unattended, your disk will eventually fill up with archived data. To prevent this you will need a script like purge_datafiles, which is provided with SeisComP. This can be run once per day using the cron feature of your system. The command:

$ seiscomp print crontab

will print a number of lines to the terminal. Type crontab -e and insert these lines into the crontab file for your user (typically sysop). Exit your crontab editor. Displaying your crontab should now show a line for purge_datafiles:

$ crontab -l
20 3 * * * /home/sysop/seiscomp/var/lib/slarchive/purge_datafiles >/dev/null 2>&1
[There may be other lines too.]

This shows you that the purge_datafiles script will run every day at 3:20 a.m.

Note

If you examine the purge_datafiles script, you will see that all it does is look for files whose last-modified time is more than a given number of days in the past. The number of days to keep can be set station-by-station using the ARCH_KEEP feature. A convenient way to do this for many stations is with multiple binding profiles, one for each length of time desired.
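
To preview which files such an age-based rule would select, a dry run along the following lines may help. This is a sketch under stated assumptions, not the purge_datafiles script itself: list_expired is a hypothetical helper, and it only prints candidate files without deleting anything.

```shell
# Sketch: list archive files whose last-modified time is more than KEEP days old.
# list_expired is a hypothetical helper; it prints candidates and deletes nothing.
list_expired() {
    archive=$1 keep=$2
    # -mtime +N selects files last modified more than N full 24-hour periods ago.
    find "$archive" -type f -mtime +"$keep" -print
}

# Example call (uncomment to inspect your own archive with a 7-day limit):
# list_expired "$HOME/seiscomp/var/lib/archive" 7
```

Because find counts whole 24-hour periods, a file is listed with a limit of 7 only once it is at least 8 days old.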

Checking archiving is functioning

  • If seedlink is configured correctly, a new station’s streams appear in the output from slinktool:

    $ slinktool -Q : | grep CLL
    GR CLL      HHZ D 2020/04/01 01:11:57.6649  -  2020/04/01 07:28:49.0299
    GR CLL      HHE D 2020/04/01 01:11:57.6649  -  2020/04/01 07:28:45.0299
    GR CLL      HHN D 2020/04/01 01:11:57.6649  -  2020/04/01 07:28:39.2299
    

    This shows three streams being acquired from station ‘CLL’. The second time shown is the time of the most recent data for each stream.

  • If slarchive is configured correctly, waveform data for the station appears in slarchive’s SDS archive directory:

    $ ls -l seiscomp/var/lib/archive/2020/GR/CLL
    total 12
    drwxr-xr-x 2 user user 4096 Apr  1 06:30 HHE.D
    drwxr-xr-x 2 user user 4096 Apr  1 06:30 HHN.D
    drwxr-xr-x 2 user user 4096 Apr  1 06:30 HHZ.D
    
    $ ls -l seiscomp/var/lib/archive/2020/GR/CLL/HHZ.D/
    total 12728
    -rw-r--r-- 1 user user 5492224 Mar 31 00:04 GR.CLL..HHZ.D.2020.090
    -rw-r--r-- 1 user user 7531008 Apr  1 00:03 GR.CLL..HHZ.D.2020.091
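
The last two fields of each day file name are the year and the day of year. When checking whether the archive is up to date, it can help to convert them to a calendar date. The sketch below assumes GNU date (its -d option is not available in BSD or macOS date); doy_to_date is a hypothetical helper name.

```shell
# Sketch: convert the YEAR and DOY fields of a day file name to a calendar date.
# Assumes GNU date; doy_to_date is a hypothetical helper, not a SeisComP command.
doy_to_date() {
    year=$1
    # Strip up to two leading zeros so 091 is not read as an octal number.
    doy=${2#0}; doy=${doy#0}
    # Day 1 is 1 January, so add (doy - 1) days to the start of the year.
    date -u -d "${year}-01-01 +$((doy - 1)) days" +%Y-%m-%d
}

# Day 091 of the leap year 2020 is 31 March.
doy_to_date 2020 091
```

A day file is normally last written shortly after midnight UTC of the following day, which matches the modification times in the listing above.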