Command line interfaces

Graphical user interface

!dss gui --help
usage: dss gui [-h] [--song-types-string SONG_TYPES_STRING]
               [--spec-freq-min SPEC_FREQ_MIN] [--spec-freq-max SPEC_FREQ_MAX]
               [--skip-dialog] [--no-skip-dialog]

GUI for annotating song and training and using DeepSS networks.

positional arguments:
  source                Data source to load.
                        Optional - will open an empty GUI if omitted.
                        Source can be the path to:
                        - an audio file,
                        - a numpy file (npy or npz),
                        - an h5 file
                        - an xarray-behave dataset constructed from an ethodrome data folder saved as a zarr file,
                        - an ethodrome data folder (e.g. 'dat/localhost-xxx').

optional arguments:
  -h, --help            show this help message and exit
  --song-types-string SONG_TYPES_STRING
                        Initialize song types for annotations.
                        String of the form "song_name,song_category;song_name,song_category".
                        Avoid spaces or trailing ';'.
                        Need to wrap the string in "..." in the terminal
                        "song_name" can be any string w/o space, ",", or ";"
                        "song_category" can be "event" (e.g. pulse) or "segment" (sine, syllable)
  --spec-freq-min SPEC_FREQ_MIN
                        Smallest frequency displayed in the spectrogram view. Defaults to 0 Hz.
  --spec-freq-max SPEC_FREQ_MAX
                        Largest frequency displayed in the spectrogram view. Defaults to samplerate/2.
  --skip-dialog         If True, skips the loading dialog and goes straight to the data view.
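
The format expected by `--song-types-string` is easy to get wrong by hand, so a small helper can assemble it. The following is a minimal sketch, not part of dss: the function `make_song_types_string` and the file name `recording.wav` are hypothetical, and the validation rules are taken only from the help text above.

```python
import shlex

VALID_CATEGORIES = {"event", "segment"}
FORBIDDEN_CHARS = (" ", ",", ";")


def make_song_types_string(song_types):
    """Join {song_name: song_category} into "name,category;name,category".

    Hypothetical helper (not part of dss); rules follow the help text.
    """
    parts = []
    for name, category in song_types.items():
        if not name or any(ch in name for ch in FORBIDDEN_CHARS):
            raise ValueError(f"invalid song name: {name!r}")
        if category not in VALID_CATEGORIES:
            raise ValueError(f"invalid song category: {category!r}")
        parts.append(f"{name},{category}")
    # ";" is placed only between entries: no spaces, no trailing ";".
    return ";".join(parts)


song_types_string = make_song_types_string({"pulse": "event", "sine": "segment"})

# "recording.wav" is a placeholder for your own data source.
command = ["dss", "gui", "recording.wav", "--song-types-string", song_types_string]

# shlex.join quotes the argument so the shell does not interpret the ";".
print(shlex.join(command))
```

The printed command can be pasted into a terminal, or the list can be passed to `subprocess.run` directly, in which case no shell quoting is needed.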


Train

!dss train --help
usage: dss train [-h] -d DATA_DIR [-y Y_SUFFIX] [--save-dir SAVE_DIR]
                 [--save-prefix SAVE_PREFIX] [-m MODEL_NAME]
                 [--nb-filters NB_FILTERS] [-k KERNEL_SIZE]
                 [--nb-conv NB_CONV] [-u [USE_SEPARABLE [USE_SEPARABLE ...]]]
                 [--nb-hist NB_HIST] [-i] [--no-ignore-boundaries]
                 [--batch-norm] [--no-batch-norm] [--nb-pre-conv NB_PRE_CONV]
                 [--pre-kernel-size PRE_KERNEL_SIZE]
                 [--pre-nb-filters PRE_NB_FILTERS] [--pre-nb-conv PRE_NB_CONV]
                 [-v VERBOSE] [--batch-size BATCH_SIZE] [--nb-epoch NB_EPOCH]
                 [--learning-rate LEARNING_RATE] [--reduce-lr]
                 [--no-reduce-lr] [--reduce-lr-patience REDUCE_LR_PATIENCE]
                 [-f FRACTION_DATA] [--seed SEED] [--batch-level-subsampling]
                 [--no-batch-level-subsampling] [-t] [--no-tensorboard]
                 [--log-messages] [--no-log-messages] [--nb-stacks NB_STACKS]
                 [-w] [--no-with-y-hist] [-x X_SUFFIX]

Train a DeepSS network.

optional arguments:
  -h, --help            show this help message and exit
  -d DATA_DIR, --data-dir DATA_DIR
                        Path to the directory or file with the dataset for training.
                        Accepts npy-dirs (recommended), h5 files or zarr files.
                        See documentation for how the dataset should be organized.
  -y Y_SUFFIX, --y-suffix Y_SUFFIX
                        Select training target by suffix.
                        Song-type specific targets can be created with a training dataset.
                        Defaults to '' (will use the standard target 'y')
  --save-dir SAVE_DIR   Directory to save training outputs.
                        The path of output files will be constructed from SAVE_DIR, an optional prefix, and the timestamp of the start of training.
                        Defaults to current directory ('./').
  --save-prefix SAVE_PREFIX
                        Prefix prepended to the timestamp.
                        Files will be named SAVE_DIR/SAVE_PREFIX + "_" + TIMESTAMP,
                        or SAVE_DIR/TIMESTAMP if SAVE_PREFIX is empty.
                        Defaults to '' (empty).
  -m MODEL_NAME, --model-name MODEL_NAME
                        Network architecture to use.
                        Use "tcn" (TCN) or "tcn_stft" (TCN with STFT frontend).
                        See dss.models for a description of all models.
                        Defaults to 'tcn'.
  --nb-filters NB_FILTERS
                        Number of filters per layer.
                        Defaults to 16.
  -k KERNEL_SIZE, --kernel-size KERNEL_SIZE
                        Duration of the filters (=kernels) in samples.
                        Defaults to 16.
  --nb-conv NB_CONV     Number of TCN blocks in the network.
                        Defaults to 3.
  -u [USE_SEPARABLE [USE_SEPARABLE ...]], --use-separable [USE_SEPARABLE [USE_SEPARABLE ...]]
                        Specify which TCN blocks should use separable convolutions.
                        Provide as a space-separated sequence of "False" or "True".
                        For instance: "True False False" will set the first block in a
                        three-block (as given by nb_conv) network to use separable convolutions.
                        Defaults to False (no block uses separable convolution).
  --nb-hist NB_HIST     Number of samples processed at once by the network (a.k.a chunk size).
                        Defaults to 1024.
  -i, --ignore-boundaries
                        Minimize edge effects by discarding predictions at the edges of chunks.
                        Defaults to True.
  --batch-norm          Batch normalize.
                        Defaults to True.
  --nb-pre-conv NB_PRE_CONV
                        Adds downsampling frontend.
                        TCN: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN - useful for reducing the sampling rate for USV.
                        TCN_STFT: stft
                        Defaults to 0 (no frontend).
  --pre-kernel-size PRE_KERNEL_SIZE
                        [description]. Defaults to 3.
  --pre-nb-filters PRE_NB_FILTERS
                        [description]. Defaults to 16.
  --pre-nb-conv PRE_NB_CONV
                        [description]. Defaults to 3.
  -v VERBOSE, --verbose VERBOSE
                        Verbosity of training output (0 - no output, 1 - progress bar, 2 - one line per epoch).
                        Defaults to 2.
  --batch-size BATCH_SIZE
                        Batch size.
                        Defaults to 32.
  --nb-epoch NB_EPOCH   Maximal number of training epochs.
                        Training will stop early if validation loss did not decrease in the last 20 epochs.
                        Defaults to 400.
  --learning-rate LEARNING_RATE
                        Learning rate of the model. Defaults should work in most cases.
                        Values typically range between 0.1 and 0.00001.
                        If None, uses per-model defaults ("tcn": 0.0001, "tcn_stft": 0.0005).
                        Defaults to None.
  --reduce-lr           Reduce learning rate on plateau.
                        Defaults to False.
  --reduce-lr-patience REDUCE_LR_PATIENCE
                        Number of epochs w/o a reduction in validation loss after which to trigger a reduction in learning rate.
                        Defaults to 5.
  -f FRACTION_DATA, --fraction-data FRACTION_DATA
                        Fraction of training and validation data to use for training.
                        Defaults to 1.0.
  --seed SEED           Random seed to reproducibly select fractions of the data.
                        Defaults to None (no seed).
  --batch-level-subsampling
                        Select the fraction of data for training from a random subset of shuffled batches.
                        If False, select a continuous chunk of the recording.
                        Defaults to False.
  -t, --tensorboard     Write tensorboard logs to save_dir.
                        Defaults to False.
  --log-messages        Sets logging level to INFO.
                        Defaults to False (will follow existing settings).
  --nb-stacks NB_STACKS
                        Unused if model name is "tcn" or "tcn_stft". Defaults to 2.
  -w, --with-y-hist     Unused if model name is "tcn" or "tcn_stft". Defaults to True.
  -x X_SUFFIX, --x-suffix X_SUFFIX
                        Select specific training data based on suffix (e.g. x_suffix).
                        Defaults to '' (will use the standard data 'x')
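
The naming rule for training outputs and the space-separated form of `-u` can be sketched in a few lines. This is an illustration under stated assumptions, not code from dss: the helpers `output_trunk` and `train_args` are hypothetical, the directory and prefix names are placeholders, and the timestamp format shown is made up for the example since the help text does not specify it.

```python
import posixpath


def output_trunk(save_dir, save_prefix, timestamp):
    """Return SAVE_DIR/SAVE_PREFIX_TIMESTAMP, or SAVE_DIR/TIMESTAMP if the prefix is empty.

    Hypothetical helper mirroring the rule described for --save-prefix.
    """
    name = f"{save_prefix}_{timestamp}" if save_prefix else timestamp
    return posixpath.join(save_dir, name)


def train_args(data_dir, save_dir="./", save_prefix="", use_separable=None):
    """Build an argument list for `dss train` (hypothetical helper)."""
    args = ["dss", "train", "--data-dir", data_dir, "--save-dir", save_dir]
    if save_prefix:
        args += ["--save-prefix", save_prefix]
    if use_separable is not None:
        # One "True"/"False" token per TCN block, separated by spaces on the command line.
        args += ["-u"] + ["True" if flag else "False" for flag in use_separable]
    return args


# The timestamp format below is an assumption made for illustration only.
print(output_trunk("res", "flies", "20200101_120000"))
print(output_trunk("res", "", "20200101_120000"))

# First of three blocks uses separable convolutions, as in the help text example.
print(train_args("dat/flies.npy", save_dir="res", save_prefix="flies",
                 use_separable=[True, False, False]))
```

The list returned by `train_args` can be passed to `subprocess.run`; the trunk returned by `output_trunk` shows where to look for the resulting files once the actual timestamp is known.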


Predict

!dss predict --help
usage: dss predict [-h] [--save-filename SAVE_FILENAME] [-v VERBOSE]
                   [-b BATCH_SIZE] [--event-thres EVENT_THRES]
                   [--event-dist EVENT_DIST] [--event-dist-min EVENT_DIST_MIN]
                   [--event-dist-max EVENT_DIST_MAX]
                   [--segment-thres SEGMENT_THRES]
                   [--segment-minlen SEGMENT_MINLEN]
                   [--segment-fillgap SEGMENT_FILLGAP]
                   recording_filename model_save_name

Predict song labels in a wav file.
Saves hdf5 file with keys: events, segments, class_probabilities

positional arguments:
  recording_filename    path to the WAV file with the audio data.
  model_save_name       path with the trunk name of the model.

optional arguments:
  -h, --help            show this help message and exit
  --save-filename SAVE_FILENAME
                        Path to save annotations to. Optional - if omitted, will strip the extension from recording_filename and add '_dss.h5'.
  -v VERBOSE, --verbose VERBOSE
                        display progress bar during prediction. Defaults to 1.
  -b BATCH_SIZE, --batch-size BATCH_SIZE
                        Number of chunks processed at once. Defaults to None (the default used during training).
                        Larger batches lead to faster inference but are limited by memory size, in particular on GPUs, which typically have 8GB.
  --event-thres EVENT_THRES
                        Confidence threshold for detecting peaks. Range 0..1. Defaults to 0.5.
  --event-dist EVENT_DIST
                        Minimal distance between adjacent events during thresholding.
                        Prevents detecting duplicate events when the confidence trace is a little noisy.
                        Defaults to 0.01.
  --event-dist-min EVENT_DIST_MIN
                        MINimal inter-event interval for the event filter run during post processing.
                        Defaults to 0.
  --event-dist-max EVENT_DIST_MAX
                        MAXimal inter-event interval for the event filter run during post processing.
                        Defaults to None (no upper limit).
  --segment-thres SEGMENT_THRES
                        Confidence threshold for detecting segments. Range 0..1. Defaults to 0.5.
  --segment-minlen SEGMENT_MINLEN
                        Minimal duration of a segment used for filtering out spurious detections. Defaults to None.
  --segment-fillgap SEGMENT_FILLGAP
                        Gap between adjacent segments to be filled. Useful for correcting brief lapses. Defaults to None.
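
To find the output of a prediction run, it helps to know the default file name derived from the recording. The sketch below reproduces the rule stated for `--save-filename` and assembles an argument list for `dss predict`. Both helpers, `default_save_filename` and `predict_args`, are hypothetical, and the recording and model paths are placeholders.

```python
import os


def default_save_filename(recording_filename):
    """Strip the extension from the recording name and append '_dss.h5'.

    Hypothetical helper mirroring the default described for --save-filename.
    """
    return os.path.splitext(recording_filename)[0] + "_dss.h5"


def predict_args(recording_filename, model_save_name, save_filename=None,
                 event_thres=0.5, segment_thres=0.5):
    """Build an argument list for `dss predict` (hypothetical helper)."""
    args = ["dss", "predict", recording_filename, model_save_name]
    if save_filename is not None:
        args += ["--save-filename", save_filename]
    # Thresholds are confidence values in the range 0..1.
    args += ["--event-thres", str(event_thres), "--segment-thres", str(segment_thres)]
    return args


# "dat/recording.wav" and "res/model" are placeholder paths.
print(default_save_filename("dat/recording.wav"))
print(predict_args("dat/recording.wav", "res/model", event_thres=0.7))
```

According to the help text, the saved hdf5 file contains the keys events, segments, and class_probabilities, so the file named by `default_save_filename` is where to look for them when `--save-filename` is not given.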