Command line interfaces

Graphical user interface

!dss gui --help
usage: dss gui [-h] [--song-types-string SONG_TYPES_STRING]
               [--spec-freq-min SPEC_FREQ_MIN] [--spec-freq-max SPEC_FREQ_MAX]
               [--skip-dialog] [--no-skip-dialog]
               [source]

GUI for annotating song and training and using DeepSS networks.

positional arguments:
  source                Data source to load.
                        Optional - will open an empty GUI if omitted.
                        Source can be the path to:
                        - an audio file,
                        - a numpy file (npy or npz),
                        - an h5 file,
                        - an xarray-behave dataset constructed from an ethodrome data folder saved as a zarr file,
                        - an ethodrome data folder (e.g. 'dat/localhost-xxx').

optional arguments:
  -h, --help            show this help message and exit
  --song-types-string SONG_TYPES_STRING
                        Initialize song types for annotations.
                        String of the form "song_name,song_category;song_name,song_category".
                        Avoid spaces or trailing ';'.
                        Wrap the string in "..." when passing it in the terminal.
                        "song_name" can be any string w/o space, ",", or ";"
                        "song_category" can be "event" (e.g. pulse) or "segment" (sine, syllable)
  --spec-freq-min SPEC_FREQ_MIN
                        Smallest frequency displayed in the spectrogram view. Defaults to 0 Hz.
  --spec-freq-max SPEC_FREQ_MAX
                        Largest frequency displayed in the spectrogram view. Defaults to samplerate/2.
  --skip-dialog         If True, skips the loading dialog and goes straight to the data view.
  --no-skip-dialog
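
The format of `--song-types-string` described above is easy to get wrong by hand, so here is a minimal sketch that assembles and checks it before building the `dss gui` call. The helper `build_song_types_string`, the file name `recording.wav`, and the frequency limits are hypothetical and chosen for illustration only; they are not part of DeepSS.

```python
import shlex

# Categories named in the help text above.
VALID_CATEGORIES = {"event", "segment"}


def build_song_types_string(song_types):
    """Join (name, category) pairs into "name,category;name,category".

    Hypothetical helper (not part of DeepSS). It enforces the rules from
    the help text: names contain no space, "," or ";", and the result has
    no trailing ";".
    """
    parts = []
    for name, category in song_types:
        if not name or any(ch in name for ch in " ,;"):
            raise ValueError(f"invalid song name: {name!r}")
        if category not in VALID_CATEGORIES:
            raise ValueError(f"invalid song category: {category!r}")
        parts.append(f"{name},{category}")
    return ";".join(parts)


song_types = [("pulse", "event"), ("sine", "segment")]
song_types_string = build_song_types_string(song_types)

# 'recording.wav' and the frequency limits (in Hz) are placeholders.
command = [
    "dss", "gui", "recording.wav",
    "--song-types-string", song_types_string,
    "--spec-freq-min", "50",
    "--spec-freq-max", "1000",
    "--skip-dialog",
]

# shlex.join quotes the song-types string so the shell does not
# interpret the ";" as a command separator.
print(shlex.join(command))
```

The printed line can be pasted into a terminal, or the `command` list can be passed to `subprocess.run` as is, in which case no shell quoting is needed.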

Train

!dss train --help
usage: dss train [-h] -d DATA_DIR [-y Y_SUFFIX] [--save-dir SAVE_DIR]
                 [--save-prefix SAVE_PREFIX] [-m MODEL_NAME]
                 [--nb-filters NB_FILTERS] [-k KERNEL_SIZE]
                 [--nb-conv NB_CONV] [-u [USE_SEPARABLE [USE_SEPARABLE ...]]]
                 [--nb-hist NB_HIST] [-i] [--no-ignore-boundaries]
                 [--batch-norm] [--no-batch-norm] [--nb-pre-conv NB_PRE_CONV]
                 [--pre-kernel-size PRE_KERNEL_SIZE]
                 [--pre-nb-filters PRE_NB_FILTERS] [--pre-nb-conv PRE_NB_CONV]
                 [-v VERBOSE] [--batch-size BATCH_SIZE] [--nb-epoch NB_EPOCH]
                 [--learning-rate LEARNING_RATE] [--reduce-lr]
                 [--no-reduce-lr] [--reduce-lr-patience REDUCE_LR_PATIENCE]
                 [-f FRACTION_DATA] [--seed SEED] [--batch-level-subsampling]
                 [--no-batch-level-subsampling] [-t] [--no-tensorboard]
                 [--log-messages] [--no-log-messages] [--nb-stacks NB_STACKS]
                 [-w] [--no-with-y-hist] [-x X_SUFFIX]

Train a DeepSS network.

optional arguments:
  -h, --help            show this help message and exit
  -d DATA_DIR, --data-dir DATA_DIR
                        Path to the directory or file with the dataset for training.
                        Accepts npy-dirs (recommended), h5 files or zarr files.
                        See documentation for how the dataset should be organized.
  -y Y_SUFFIX, --y-suffix Y_SUFFIX
                        Select training target by suffix.
                        Song-type specific targets can be created with a training dataset.
                        Defaults to '' (will use the standard target 'y')
  --save-dir SAVE_DIR   Directory to save training outputs.
                        The path of output files will be constructed from the SAVE_DIR, an optional prefix, and the time stamp of the start of training.
                        Defaults to current directory ('./').
  --save-prefix SAVE_PREFIX
                        Prepend to timestamp.
                        Name of files created will be SAVE_DIR/SAVE_PREFIX + "_" + TIMESTAMP
                        or SAVE_DIR/ TIMESTAMP if SAVE_PREFIX is empty.
                        Defaults to '' (empty).
  -m MODEL_NAME, --model-name MODEL_NAME
                        Network architecture to use.
                        Use "tcn" (TCN) or "tcn_stft" (TCN with STFT frontend).
                        See dss.models for a description of all models.
                        Defaults to 'tcn'.
  --nb-filters NB_FILTERS
                        Number of filters per layer.
                        Defaults to 16.
  -k KERNEL_SIZE, --kernel-size KERNEL_SIZE
                        Duration of the filters (=kernels) in samples.
                        Defaults to 16.
  --nb-conv NB_CONV     Number of TCN blocks in the network.
                        Defaults to 3.
  -u [USE_SEPARABLE [USE_SEPARABLE ...]], --use-separable [USE_SEPARABLE [USE_SEPARABLE ...]]
                        Specify which TCN blocks should use separable convolutions.
                        Provide as a space-separated sequence of "False" or "True".
                        For instance: "True False False" will set the first block in a
                        three-block (as given by nb_conv) network to use separable convolutions.
                        Defaults to False (no block uses separable convolution).
  --nb-hist NB_HIST     Number of samples processed at once by the network (a.k.a chunk size).
                        Defaults to 1024.
  -i, --ignore-boundaries
                        Minimize edge effects by discarding predictions at the edges of chunks.
                        Defaults to True.
  --no-ignore-boundaries
  --batch-norm          Batch normalize.
                        Defaults to True.
  --no-batch-norm
  --nb-pre-conv NB_PRE_CONV
                        Adds downsampling frontend.
                        TCN: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN - useful for reducing the sampling rate for USV.
                        TCN_STFT: stft
                        Defaults to 0 (no frontend).
  --pre-kernel-size PRE_KERNEL_SIZE
                        [description]. Defaults to 3.
  --pre-nb-filters PRE_NB_FILTERS
                        [description]. Defaults to 16.
  --pre-nb-conv PRE_NB_CONV
                        [description]. Defaults to 3.
  -v VERBOSE, --verbose VERBOSE
                        Verbosity of training output (0 - no output, 1 - progress bar, 2 - one line per epoch).
                        Defaults to 2.
  --batch-size BATCH_SIZE
                        Batch size
                        Defaults to 32.
  --nb-epoch NB_EPOCH   Maximal number of training epochs.
                        Training will stop early if validation loss did not decrease in the last 20 epochs.
                        Defaults to 400.
  --learning-rate LEARNING_RATE
                        Learning rate of the model. Defaults should work in most cases.
                        Values typically range between 0.1 and 0.00001.
                        If None, uses per-model defaults ("tcn": 0.0001, "tcn_stft": 0.0005).
                        Defaults to None.
  --reduce-lr           Reduce learning rate on plateau.
                        Defaults to False.
  --no-reduce-lr
  --reduce-lr-patience REDUCE_LR_PATIENCE
                        Number of epochs w/o a reduction in validation loss after which to trigger a reduction in learning rate.
                        Defaults to 5.
  -f FRACTION_DATA, --fraction-data FRACTION_DATA
                        Fraction of the training and validation data to use.
                        Defaults to 1.0.
  --seed SEED           Random seed to reproducibly select fractions of the data.
                        Defaults to None (no seed).
  --batch-level-subsampling
                        Select fraction of data for training from random subset of shuffled batches.
                        If False, select a continuous chunk of the recording.
                        Defaults to False.
  --no-batch-level-subsampling
  -t, --tensorboard     Write tensorboard logs to save_dir.
                        Defaults to False.
  --no-tensorboard
  --log-messages        Sets logging level to INFO.
                        Defaults to False (will follow existing settings).
  --no-log-messages
  --nb-stacks NB_STACKS
                        Unused if model name is "tcn" or "tcn_stft". Defaults to 2.
  -w, --with-y-hist     Unused if model name is "tcn" or "tcn_stft". Defaults to True.
  --no-with-y-hist
  -x X_SUFFIX, --x-suffix X_SUFFIX
                        Select specific training data based on suffix (e.g. x_suffix).
                        Defaults to '' (will use the standard data 'x')
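
Two details of the training options are worth spelling out: how the output name is assembled from `--save-dir` and `--save-prefix`, and how `--use-separable` expects one value per TCN block. The sketch below mirrors the rules stated in the help text. The helpers `output_trunk` and `use_separable_args`, the timestamp format, and the paths `dat/dataset.npy` and `res` are assumptions made for illustration; the timestamp format actually used by DeepSS is not given in the help text.

```python
import os
import shlex
import time


def output_trunk(save_dir="./", save_prefix="", timestamp=None):
    """Mimic the documented naming rule (hypothetical helper).

    SAVE_DIR/SAVE_PREFIX + "_" + TIMESTAMP, or SAVE_DIR/TIMESTAMP
    if SAVE_PREFIX is empty.
    """
    if timestamp is None:
        # Assumed format; the help text does not specify it.
        timestamp = time.strftime("%Y%m%d_%H%M%S")
    name = f"{save_prefix}_{timestamp}" if save_prefix else timestamp
    return os.path.join(save_dir, name)


def use_separable_args(nb_conv, separable_blocks):
    """Return one "True"/"False" string per TCN block (hypothetical helper).

    separable_blocks holds the zero-based indices of the blocks that
    should use separable convolutions.
    """
    return [str(i in separable_blocks) for i in range(nb_conv)]


nb_conv = 3
command = [
    "dss", "train",
    "--data-dir", "dat/dataset.npy",  # placeholder dataset path
    "--save-dir", "res",              # placeholder output directory
    "--save-prefix", "run1",
    "--nb-conv", str(nb_conv),
    "--use-separable", *use_separable_args(nb_conv, {0}),
    "--nb-hist", "1024",
    "--reduce-lr",
    "--reduce-lr-patience", "5",
]

print(shlex.join(command))
print(output_trunk("res", "run1", "20200101_120000"))
```

With `separable_blocks={0}`, only the first of the three blocks uses separable convolutions, which matches the "True False False" example in the help text.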

Predict

!dss predict --help
usage: dss predict [-h] [--save-filename SAVE_FILENAME] [-v VERBOSE]
                   [-b BATCH_SIZE] [--event-thres EVENT_THRES]
                   [--event-dist EVENT_DIST] [--event-dist-min EVENT_DIST_MIN]
                   [--event-dist-max EVENT_DIST_MAX]
                   [--segment-thres SEGMENT_THRES]
                   [--segment-minlen SEGMENT_MINLEN]
                   [--segment-fillgap SEGMENT_FILLGAP]
                   recording_filename model_save_name

Predict song labels in a wav file. 
 
Saves hdf5 file with keys: events, segments, class_probabilities

positional arguments:
  recording_filename    path to the WAV file with the audio data.
  model_save_name       path with the trunk name of the model.

optional arguments:
  -h, --help            show this help message and exit
  --save-filename SAVE_FILENAME
                        path to save annotations to. [Optional] - will strip extension from recording_filename and add '_dss.h5'.
  -v VERBOSE, --verbose VERBOSE
                        display progress bar during prediction. Defaults to 1.
  -b BATCH_SIZE, --batch-size BATCH_SIZE
                        number of chunks processed at once. Defaults to None (the default used during training).
                        Larger batches lead to faster inference. Limited by memory size, in particular for GPUs which typically have 8GB.
  --event-thres EVENT_THRES
                        Confidence threshold for detecting peaks. Range 0..1. Defaults to 0.5.
  --event-dist EVENT_DIST
                        Minimal distance between adjacent events during thresholding.
                        Prevents detecting duplicate events when the confidence trace is a little noisy.
                        Defaults to 0.01.
  --event-dist-min EVENT_DIST_MIN
                        MINimal inter-event interval for the event filter run during post processing.
                        Defaults to 0.
  --event-dist-max EVENT_DIST_MAX
                        MAXimal inter-event interval for the event filter run during post processing.
                        Defaults to None (no upper limit).
  --segment-thres SEGMENT_THRES
                        Confidence threshold for detecting segments. Range 0..1. Defaults to 0.5.
  --segment-minlen SEGMENT_MINLEN
                        Minimal duration of a segment used for filtering out spurious detections. Defaults to None.
  --segment-fillgap SEGMENT_FILLGAP
                        Gap between adjacent segments to be filled. Useful for correcting brief lapses. Defaults to None.
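
As a rough illustration of the options above, the following sketch derives the default output name from the recording name and assembles a `dss predict` call. The helper `default_save_filename` is hypothetical and only mirrors the rule stated for `--save-filename`. The paths are placeholders, and the values for `--segment-minlen` and `--segment-fillgap` are arbitrary examples, since the help text does not state their units.

```python
import os
import shlex


def default_save_filename(recording_filename):
    """Strip the extension and append '_dss.h5' (hypothetical helper
    mirroring the documented default for --save-filename)."""
    return os.path.splitext(recording_filename)[0] + "_dss.h5"


recording = "dat/recording.wav"       # placeholder recording
model = "res/run1_20200101_120000"    # placeholder model trunk name

command = [
    "dss", "predict", recording, model,
    "--event-thres", "0.5",
    "--event-dist", "0.01",
    "--segment-thres", "0.5",
    "--segment-minlen", "0.02",   # example value, units not documented above
    "--segment-fillgap", "0.02",  # example value, units not documented above
]

print(shlex.join(command))
print(default_save_filename(recording))
```

According to the help text, the resulting hdf5 file contains the keys `events`, `segments`, and `class_probabilities`.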