Command line interfaces
Graphical user interface
!das gui --help
usage: das gui [-h] [--song-types-string SONG_TYPES_STRING]
[--spec-freq-min SPEC_FREQ_MIN] [--spec-freq-max SPEC_FREQ_MAX]
[--skip-dialog | --no-skip-dialog]
[source]
GUI for annotating song, and for training and using DAS networks.
positional arguments:
source Data source to load.
Optional - will open an empty GUI if omitted.
Source can be the path to:
- an audio file,
- a numpy file (npy or npz),
- an h5 file,
- an xarray-behave dataset constructed from an ethodrome data folder saved as a zarr file,
- an ethodrome data folder (e.g. 'dat/localhost-xxx').
optional arguments:
-h, --help show this help message and exit
--song-types-string SONG_TYPES_STRING
Initialize song types for annotations.
String of the form "song_name,song_category;song_name,song_category".
Avoid spaces or trailing ';'.
In the terminal, wrap the full string in "...".
"song_name" can be any string without spaces, ",", or ";".
"song_category" can be "event" (e.g. pulse) or "segment" (e.g. sine, syllable).
--spec-freq-min SPEC_FREQ_MIN
Smallest frequency displayed in the spectrogram view. Defaults to 0 Hz.
--spec-freq-max SPEC_FREQ_MAX
Largest frequency displayed in the spectrogram view. Defaults to samplerate/2.
--skip-dialog, --no-skip-dialog
If True, skips the loading dialog and goes straight to the data view.
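For example, the following invocation opens a recording with two song types pre-defined, the event type pulse and the segment type sine; the file name recording.wav is a placeholder for your own audio:
!das gui --song-types-string "pulse,event;sine,segment" recording.wav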
Train
!das train --help
usage: das train [-h] --data-dir DATA_DIR [-x X_SUFFIX] [-y Y_SUFFIX]
[--save-dir SAVE_DIR] [--save-prefix SAVE_PREFIX]
[--save-name SAVE_NAME] [--model-name MODEL_NAME]
[--nb-filters NB_FILTERS] [-k KERNEL_SIZE]
[--nb-conv NB_CONV] [--use-separable [USE_SEPARABLE ...]]
[--nb-hist NB_HIST]
[-i | --ignore-boundaries | --no-ignore-boundaries]
[--batch-norm | --no-batch-norm] [--nb-pre-conv NB_PRE_CONV]
[--pre-nb-dft PRE_NB_DFT] [--pre-kernel-size PRE_KERNEL_SIZE]
[--pre-nb-filters PRE_NB_FILTERS] [--pre-nb-conv PRE_NB_CONV]
[--upsample | --no-upsample] [--dilations [DILATIONS ...]]
[--nb-lstm-units NB_LSTM_UNITS] [--verbose VERBOSE]
[--batch-size BATCH_SIZE] [--nb-epoch NB_EPOCH]
[--learning-rate LEARNING_RATE]
[--reduce-lr | --no-reduce-lr]
[--reduce-lr-patience REDUCE_LR_PATIENCE]
[--fraction-data FRACTION_DATA]
[--first-sample-train FIRST_SAMPLE_TRAIN]
[--last-sample-train LAST_SAMPLE_TRAIN]
[--first-sample-val FIRST_SAMPLE_VAL]
[--last-sample-val LAST_SAMPLE_VAL] [--seed SEED]
[--batch-level-subsampling | --no-batch-level-subsampling]
[-a AUGMENTATIONS] [-t | --tensorboard | --no-tensorboard]
[--wandb-api-token WANDB_API_TOKEN]
[--wandb-project WANDB_PROJECT] [--wandb-entity WANDB_ENTITY]
[--log-messages | --no-log-messages] [--nb-stacks NB_STACKS]
[--with-y-hist | --no-with-y-hist] [--balance | --no-balance]
[--version-data | --no-version-data]
[--post-opt | --no-post-opt] [--fill-gaps-min FILL_GAPS_MIN]
[--fill-gaps-max FILL_GAPS_MAX]
[--fill-gaps-steps FILL_GAPS_STEPS]
[--min-len-min MIN_LEN_MIN] [--min-len-max MIN_LEN_MAX]
[--min-len-steps MIN_LEN_STEPS]
[--resnet-compute | --no-resnet-compute]
[--resnet-train | --no-resnet-train]
Train a DAS network.
optional arguments:
-h, --help show this help message and exit
--data-dir DATA_DIR Path to the directory or file with the dataset for training.
Accepts npy-dirs (recommended), h5 files or zarr files.
See documentation for how the dataset should be organized.
-x X_SUFFIX, --x-suffix X_SUFFIX
Select dataset used for training in the data_dir by suffix (x_ + X_SUFFIX).
Defaults to '' (will use the standard data 'x').
-y Y_SUFFIX, --y-suffix Y_SUFFIX
Select dataset used as the training target in the data_dir by suffix (y_ + Y_SUFFIX).
Song-type-specific targets can be created when making the training dataset.
Defaults to '' (will use the standard target 'y').
--save-dir SAVE_DIR Directory to save training outputs.
The path of output files will be constructed from the SAVE_DIR, an optional SAVE_PREFIX,
and the time stamp of the start of training.
Defaults to the current directory ('./').
--save-prefix SAVE_PREFIX
Prepend to timestamp.
Names of the files created will start with SAVE_DIR/SAVE_PREFIX + "_" + TIMESTAMP
or with SAVE_DIR/TIMESTAMP if SAVE_PREFIX is empty.
Defaults to '' (empty).
--save-name SAVE_NAME
Append to prefix.
Names of the files created will start with SAVE_DIR/SAVE_PREFIX + "_" + SAVE_NAME
or with SAVE_DIR/SAVE_NAME if SAVE_PREFIX is empty.
Defaults to the timestamp YYYYMMDD_hhmmss.
--model-name MODEL_NAME
Network architecture to use.
See das.models for a description of all models.
Defaults to tcn.
--nb-filters NB_FILTERS
Number of filters per layer.
Defaults to 16.
-k KERNEL_SIZE, --kernel-size KERNEL_SIZE
Duration of the filters (=kernels) in samples.
Defaults to 16.
--nb-conv NB_CONV Number of TCN blocks in the network.
Defaults to 3.
--use-separable [USE_SEPARABLE ...]
Specify which TCN blocks should use separable convolutions.
Provide as a space-separated sequence of "False" or "True".
For instance: "True False False" will set the first block in a
three-block (as given by nb_conv) network to use separable convolutions.
Defaults to False (no block uses separable convolutions).
--nb-hist NB_HIST Number of samples processed at once by the network (a.k.a. chunk duration).
Defaults to 1024 samples.
-i, --ignore-boundaries, --no-ignore-boundaries
Minimize edge effects by discarding predictions at the edges of chunks.
Defaults to True.
--batch-norm, --no-batch-norm
Batch normalize.
Defaults to True.
--nb-pre-conv NB_PRE_CONV
Adds a frontend with downsampling. The downsampling factor is 2**nb_pre_conv.
The type of frontend depends on the model:
if model is tcn: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN.
if model is tcn_tcn: adds a frontend of N TCN blocks to the TCN.
if model is tcn_stft: adds a trainable STFT frontend.
Defaults to 0 (no frontend, no downsampling).
--pre-nb-dft PRE_NB_DFT
Duration of filters (in samples) for the STFT frontend.
Number of filters is pre_nb_dft // 2 + 1.
Defaults to 64.
--pre-kernel-size PRE_KERNEL_SIZE
Duration of filters (=kernels) in samples in the pre-processing TCN.
Defaults to 3.
--pre-nb-filters PRE_NB_FILTERS
Number of filters per layer in the pre-processing TCN.
Defaults to 16.
--pre-nb-conv PRE_NB_CONV
--upsample, --no-upsample
Whether to restore the model output to the input samplerate.
Should generally be True during training and evaluation; False may speed up inference.
Defaults to True.
--dilations [DILATIONS ...]
List of dilation rates. Defaults to [1, 2, 4, 8, 16] (5 layers with 2x dilation per TCN block).
--nb-lstm-units NB_LSTM_UNITS
If >0, adds LSTM with nb_lstm_units LSTM units to the output of the stack of TCN blocks.
Defaults to 0 (no LSTM layer).
--verbose VERBOSE Verbosity of training output (0 - no output during training, 1 - progress bar, 2 - one line per epoch).
Defaults to 2.
--batch-size BATCH_SIZE
Batch size.
Defaults to 32.
--nb-epoch NB_EPOCH Maximal number of training epochs.
Training will stop early if validation loss did not decrease in the last 20 epochs.
Defaults to 400.
--learning-rate LEARNING_RATE
Learning rate of the model. Defaults should work in most cases.
Values typically range between 0.1 and 0.00001.
If None, uses model specific defaults: tcn 0.0001, tcn_stft and tcn_tcn 0.0005.
Defaults to None.
--reduce-lr, --no-reduce-lr
Reduce learning rate when the validation loss plateaus.
Defaults to False.
--reduce-lr-patience REDUCE_LR_PATIENCE
Number of epochs w/o a reduction in validation loss after which
to trigger a reduction in learning rate.
Defaults to 5 epochs.
--fraction-data FRACTION_DATA
Fraction of training and validation data to use.
Defaults to 1.0.
Overridden by setting all four _sample_ args.
--first-sample-train FIRST_SAMPLE_TRAIN
Defaults to 0 (first sample in training dataset).
Note 1: all four _sample_ args must be set - otherwise they will be ignored.
Note 2: Overrides fraction_data.
--last-sample-train LAST_SAMPLE_TRAIN
Defaults to None (use last sample in training dataset).
--first-sample-val FIRST_SAMPLE_VAL
Defaults to 0 (first sample in validation dataset).
--last-sample-val LAST_SAMPLE_VAL
Defaults to None (use last sample in validation dataset).
--seed SEED Random seed to reproducibly select fractions of the data.
Defaults to None (no seed).
--batch-level-subsampling, --no-batch-level-subsampling
Select fraction of data for training from random subset of shuffled batches.
If False, select a continuous chunk of the recording.
Defaults to False.
-a AUGMENTATIONS, --augmentations AUGMENTATIONS
yaml file with augmentations. Defaults to None (no augmentations).
-t, --tensorboard, --no-tensorboard
Write tensorboard logs to save_dir. Defaults to False.
--wandb-api-token WANDB_API_TOKEN
API token for logging to wandb.
Defaults to None (no logging to wandb).
--wandb-project WANDB_PROJECT
Project to log to for wandb.
Defaults to None (no logging to wandb).
--wandb-entity WANDB_ENTITY
Entity to log to for wandb.
Defaults to None (no logging to wandb).
--log-messages, --no-log-messages
Sets terminal logging level to INFO.
Defaults to False (will follow existing settings).
--nb-stacks NB_STACKS
Unused if model name is tcn, tcn_tcn, or tcn_stft. Defaults to 2.
--with-y-hist, --no-with-y-hist
Unused if model name is tcn, tcn_tcn, or tcn_stft. Defaults to True.
--balance, --no-balance
Balance data. Weights class-wise errors by the inverse of the class frequencies.
Defaults to False.
--version-data, --no-version-data
Save MD5 hash of the data_dir to log and params.yaml.
Defaults to True (set to False for large datasets since it can be slow).
--post-opt, --no-post-opt
Optimize post processing (delete short detections, fill brief gaps).
Defaults to False.
--fill-gaps-min FILL_GAPS_MIN
Defaults to 0.0005 seconds.
--fill-gaps-max FILL_GAPS_MAX
Defaults to 1 second.
--fill-gaps-steps FILL_GAPS_STEPS
Defaults to 20.
--min-len-min MIN_LEN_MIN
Defaults to 0.0005 seconds.
--min-len-max MIN_LEN_MAX
Defaults to 1 second.
--min-len-steps MIN_LEN_STEPS
Defaults to 20.
--resnet-compute, --no-resnet-compute
Defaults to False.
--resnet-train, --no-resnet-train
Defaults to False.
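As a minimal sketch, a training run on an npy-dir dataset could look like the line below; tutorial_dataset.npy is a hypothetical dataset folder, and all flags are documented above:
!das train --data-dir tutorial_dataset.npy --save-dir res --model-name tcn --nb-filters 32 --kernel-size 32
Since neither SAVE_PREFIX nor SAVE_NAME is set, the model, parameters, and logs are written to res/ with the YYYYMMDD_hhmmss timestamp as the file-name stem.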
Tune
!das tune --help
usage: das tune [-h] --data-dir DATA_DIR [-x X_SUFFIX] [-y Y_SUFFIX]
[--save-dir SAVE_DIR] [--save-prefix SAVE_PREFIX]
[--save-name SAVE_NAME] [-m MODEL_NAME]
[--nb-filters NB_FILTERS] [-k KERNEL_SIZE] [--nb-conv NB_CONV]
[--use-separable [USE_SEPARABLE ...]] [--nb-hist NB_HIST]
[-i | --ignore-boundaries | --no-ignore-boundaries]
[--batch-norm | --no-batch-norm] [--nb-pre-conv NB_PRE_CONV]
[--pre-nb-dft PRE_NB_DFT] [--pre-kernel-size PRE_KERNEL_SIZE]
[--pre-nb-filters PRE_NB_FILTERS] [--pre-nb-conv PRE_NB_CONV]
[--upsample | --no-upsample] [--dilations [DILATIONS ...]]
[--nb-lstm-units NB_LSTM_UNITS] [--verbose VERBOSE]
[--batch-size BATCH_SIZE] [--nb-epoch NB_EPOCH]
[--learning-rate LEARNING_RATE] [--reduce-lr | --no-reduce-lr]
[--reduce-lr-patience REDUCE_LR_PATIENCE] [-f FRACTION_DATA]
[--seed SEED]
[--batch-level-subsampling | --no-batch-level-subsampling]
[-a AUGMENTATIONS] [--tensorboard | --no-tensorboard]
[--wandb-api-token WANDB_API_TOKEN]
[--wandb-project WANDB_PROJECT] [--wandb-entity WANDB_ENTITY]
[--log-messages | --no-log-messages] [--nb-stacks NB_STACKS]
[--with-y-hist | --no-with-y-hist] [--balance | --no-balance]
[--version-data | --no-version-data]
[--tune-config TUNE_CONFIG] [--nb-tune-trials NB_TUNE_TRIALS]
Tune the hyperparameters of a DAS network.
optional arguments:
-h, --help show this help message and exit
--data-dir DATA_DIR Path to the directory or file with the dataset for training.
Accepts npy-dirs (recommended), h5 files or zarr files.
See documentation for how the dataset should be organized.
-x X_SUFFIX, --x-suffix X_SUFFIX
Select dataset used for training in the data_dir by suffix (x_ + X_SUFFIX).
Defaults to '' (will use the standard data 'x').
-y Y_SUFFIX, --y-suffix Y_SUFFIX
Select dataset used as the training target in the data_dir by suffix (y_ + Y_SUFFIX).
Song-type-specific targets can be created when making the training dataset.
Defaults to '' (will use the standard target 'y').
--save-dir SAVE_DIR Directory to save training outputs.
The path of output files will be constructed from the SAVE_DIR, an optional SAVE_PREFIX,
and the time stamp of the start of training.
Defaults to the current directory ('./').
--save-prefix SAVE_PREFIX
Prepend to timestamp.
Names of the files created will start with SAVE_DIR/SAVE_PREFIX + "_" + TIMESTAMP
or with SAVE_DIR/TIMESTAMP if SAVE_PREFIX is empty.
Defaults to '' (empty).
--save-name SAVE_NAME
Append to prefix.
Names of the files created will start with SAVE_DIR/SAVE_PREFIX + "_" + SAVE_NAME
or with SAVE_DIR/SAVE_NAME if SAVE_PREFIX is empty.
Defaults to TIMESTAMP.
-m MODEL_NAME, --model-name MODEL_NAME
Network architecture to use.
Use tcn (TCN) or tcn_stft (TCN with STFT frontend).
See das.models for a description of all models.
Defaults to tcn.
--nb-filters NB_FILTERS
Number of filters per layer.
Defaults to 16.
-k KERNEL_SIZE, --kernel-size KERNEL_SIZE
Duration of the filters (=kernels) in samples.
Defaults to 16.
--nb-conv NB_CONV Number of TCN blocks in the network.
Defaults to 3.
--use-separable [USE_SEPARABLE ...]
Specify which TCN blocks should use separable convolutions.
Provide as a space-separated sequence of "False" or "True".
For instance: "True False False" will set the first block in a
three-block (as given by nb_conv) network to use separable convolutions.
Defaults to False (no block uses separable convolutions).
--nb-hist NB_HIST Number of samples processed at once by the network (a.k.a. chunk duration).
Defaults to 1024 samples.
-i, --ignore-boundaries, --no-ignore-boundaries
Minimize edge effects by discarding predictions at the edges of chunks.
Defaults to True.
--batch-norm, --no-batch-norm
Batch normalize.
Defaults to True.
--nb-pre-conv NB_PRE_CONV
Adds a frontend with downsampling. The downsampling factor is 2**nb_pre_conv.
The type of frontend depends on the model:
if model is tcn: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN.
if model is tcn_tcn: adds a frontend of N TCN blocks to the TCN.
if model is tcn_stft: adds a trainable STFT frontend.
Defaults to 0 (no frontend, no downsampling).
--pre-nb-dft PRE_NB_DFT
Duration of filters (in samples) for the STFT frontend.
Number of filters is pre_nb_dft // 2 + 1.
Defaults to 64.
--pre-kernel-size PRE_KERNEL_SIZE
Duration of filters (=kernels) in samples in the pre-processing TCN.
Defaults to 3.
--pre-nb-filters PRE_NB_FILTERS
Number of filters per layer in the pre-processing TCN.
Defaults to 16.
--pre-nb-conv PRE_NB_CONV
--upsample, --no-upsample
Whether to restore the model output to the input samplerate.
Should generally be True during training and evaluation; False may speed up inference.
Defaults to True.
--dilations [DILATIONS ...]
List of dilation rates. Defaults to [1, 2, 4, 8, 16] (5 layers with 2x dilation per TCN block).
--nb-lstm-units NB_LSTM_UNITS
If >0, adds LSTM with nb_lstm_units LSTM units to the output of the stack of TCN blocks.
Defaults to 0 (no LSTM layer).
--verbose VERBOSE Verbosity of training output (0 - no output during training, 1 - progress bar, 2 - one line per epoch).
Defaults to 2.
--batch-size BATCH_SIZE
Batch size.
Defaults to 32.
--nb-epoch NB_EPOCH Maximal number of training epochs.
Training will stop early if validation loss did not decrease in the last 20 epochs.
Defaults to 400.
--learning-rate LEARNING_RATE
Learning rate of the model. Defaults should work in most cases.
Values typically range between 0.1 and 0.00001.
If None, uses model specific defaults: tcn 0.0001, tcn_stft and tcn_tcn 0.0005.
Defaults to None.
--reduce-lr, --no-reduce-lr
Reduce learning rate when the validation loss plateaus.
Defaults to False.
--reduce-lr-patience REDUCE_LR_PATIENCE
Number of epochs w/o a reduction in validation loss after which
to trigger a reduction in learning rate.
Defaults to 5 epochs.
-f FRACTION_DATA, --fraction-data FRACTION_DATA
Fraction of training and validation data to use.
Defaults to 1.0.
--seed SEED Random seed to reproducibly select fractions of the data.
Defaults to None (no seed).
--batch-level-subsampling, --no-batch-level-subsampling
Select fraction of data for training from random subset of shuffled batches.
If False, select a continuous chunk of the recording.
Defaults to False.
-a AUGMENTATIONS, --augmentations AUGMENTATIONS
yaml file with augmentations. Defaults to None (no augmentations).
--tensorboard, --no-tensorboard
Write tensorboard logs to save_dir. Defaults to False.
--wandb-api-token WANDB_API_TOKEN
API token for logging to wandb.
Defaults to None (no logging to wandb).
--wandb-project WANDB_PROJECT
Project to log to for wandb.
Defaults to None (no logging to wandb).
--wandb-entity WANDB_ENTITY
Entity (user or team) to log to for wandb.
Defaults to None (no logging to wandb).
--log-messages, --no-log-messages
Sets terminal logging level to INFO.
Defaults to False (will follow existing settings).
--nb-stacks NB_STACKS
Unused if model name is tcn, tcn_tcn, or tcn_stft. Defaults to 2.
--with-y-hist, --no-with-y-hist
Unused if model name is tcn, tcn_tcn, or tcn_stft. Defaults to True.
--balance, --no-balance
Balance data. Weights class-wise errors by the inverse of the class frequencies.
Defaults to False.
--version-data, --no-version-data
Save MD5 hash of the data_dir to log and params.yaml.
Defaults to True (set to False for large datasets since it can be slow).
--tune-config TUNE_CONFIG
Yaml file with key:value pairs defining the search space for tuning.
Keys are parameter names, values are lists of possible parameter values.
--nb-tune-trials NB_TUNE_TRIALS
Number of model variants to test during hyperparameter tuning. Defaults to 1_000.
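For instance, a search space could be sketched in a small YAML file; the parameter names below mirror the CLI options and are assumptions, not a canonical schema:
# tune.yaml (hypothetical): one list of candidate values per parameter
nb_filters: [16, 32, 64]
kernel_size: [16, 32]
nb_hist: [512, 1024, 2048]
The file is then passed to the tuner:
!das tune --data-dir tutorial_dataset.npy --tune-config tune.yaml --nb-tune-trials 100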
Predict
!das predict --help
usage: das predict [-h] [--save-filename SAVE_FILENAME]
[--save-format SAVE_FORMAT] [-v VERBOSE] [-b BATCH_SIZE]
[--event-thres EVENT_THRES] [--event-dist EVENT_DIST]
[--event-dist-min EVENT_DIST_MIN]
[--event-dist-max EVENT_DIST_MAX]
[--segment-thres SEGMENT_THRES]
[--segment-use-optimized | --no-segment-use-optimized]
[--segment-minlen SEGMENT_MINLEN]
[--segment-fillgap SEGMENT_FILLGAP]
[-r | --resample | --no-resample]
path model_save_name
Predict song labels for a wav file or a folder of wav files.
Saves hdf5 files with keys: events, segments, class_probabilities
OR csv files with columns: label/start_seconds/stop_seconds
positional arguments:
path Path to a single WAV file with the audio data or to a folder with WAV files.
model_save_name Stem of the path for the model (and parameters). File to load will be MODEL_SAVE_NAME + _model.h5.
optional arguments:
-h, --help show this help message and exit
--save-filename SAVE_FILENAME
Path to save annotations to.
If omitted, will construct save_filename by
stripping the extension from recording_filename and adding '_das.h5' or '_annotations.csv'.
Will be ignored if path is a folder.
--save-format SAVE_FORMAT
'csv' or 'h5'.
csv: tabular text file with label, start and end seconds for each predicted song.
h5: same information as in csv plus confidence values for each sample and song type.
Defaults to 'csv'.
-v VERBOSE, --verbose VERBOSE
Display progress bar during prediction. Defaults to 1.
-b BATCH_SIZE, --batch-size BATCH_SIZE
Number of chunks processed at once.
Defaults to None (the default used during training).
--event-thres EVENT_THRES
Confidence threshold for detecting events. Range 0..1. Defaults to 0.5.
--event-dist EVENT_DIST
Minimal distance between adjacent events during thresholding.
Prevents detecting duplicate events when the confidence trace is a little noisy.
Defaults to 0.01.
--event-dist-min EVENT_DIST_MIN
MINimal inter-event interval for the event filter run during post processing.
Defaults to 0.
--event-dist-max EVENT_DIST_MAX
MAXimal inter-event interval for the event filter run during post processing.
Defaults to None (no upper limit).
--segment-thres SEGMENT_THRES
Confidence threshold for detecting segments. Range 0..1. Defaults to 0.5.
--segment-use-optimized, --no-segment-use-optimized
Use minlen and fillgap values from param file if they exist.
If segment_minlen and segment_fillgap are provided,
then they will override the values from the param file.
Defaults to True.
--segment-minlen SEGMENT_MINLEN
Minimal duration of a segment used for filtering out spurious detections.
Defaults to None (keep all segments).
--segment-fillgap SEGMENT_FILLGAP
Gap between adjacent segments to be filled. Useful for correcting brief lapses.
Defaults to None (do not fill gaps).
-r, --resample, --no-resample
Resample audio data to the rate expected by the model. Defaults to True.
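For example, assuming a model was saved with the stem res/20221122_121212 (so res/20221122_121212_model.h5 exists), the following annotates recording.wav and, following the default naming rule above, writes the result to recording_annotations.csv; both file names are placeholders:
!das predict recording.wav res/20221122_121212 --save-format csv --segment-minlen 0.02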
Version information
The output of this will depend on the specifics of your system and installation.
!das version
INFO:das.cli: macOS-12.4-arm64-arm-64bit
INFO:das.cli: DAS v0.26.9
INFO:das.cli: GUI is available.
INFO:das.cli: xarray-behave v0.33.1
INFO:das.cli: pyqtgraph v0.12.4
INFO:das.cli: PyQt5 vNone
INFO:das.cli: Qt vNone
INFO:das.cli:
INFO:das.cli: tensorflow v2.8.0
INFO:das.cli: keras v2.8.0
INFO:das.cli: GPU is available.
INFO:das.cli:
INFO:das.cli: python v3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:00:33)
[Clang 13.0.1 ]
INFO:das.cli: pandas v1.4.2
INFO:das.cli: numpy v1.22.4
INFO:das.cli: h5py v3.6.0
INFO:das.cli: scipy v1.8.1
INFO:das.cli: scikit-learn v1.1.1
INFO:das.cli: xarray v2022.3.0