{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Command line interfaces\n", "\n", "## Graphical user interface" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: das gui [-h] [--song-types-string SONG_TYPES_STRING]\n", " [--spec-freq-min SPEC_FREQ_MIN] [--spec-freq-max SPEC_FREQ_MAX]\n", " [--skip-dialog | --no-skip-dialog]\n", " [source]\n", "\n", "GUI for annotating song and training and using das networks.\n", "\n", "positional arguments:\n", " source Data source to load.\n", " Optional - will open an empty GUI if omitted.\n", " Source can be the path to:\n", " - an audio file,\n", " - a numpy file (npy or npz),\n", " - an h5 file\n", " - an xarray-behave dataset constructed from an ethodrome data folder saved as a zarr file,\n", " - an ethodrome data folder (e.g. 'dat/localhost-xxx').\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", " --song-types-string SONG_TYPES_STRING\n", " Initialize song types for annotations.\n", " String of the form \"song_name,song_category;song_name,song_category\".\n", " Avoid spaces or trailing ';'.\n", " Need to wrap the string in \"...\" in the terminal\n", " \"song_name\" can be any string w/o space, \",\", or \";\"\n", " \"song_category\" can be \"event\" (e.g. pulse) or \"segment\" (sine, syllable)\n", " --spec-freq-min SPEC_FREQ_MIN\n", " Smallest frequency displayed in the spectrogram view. Defaults to 0 Hz.\n", " --spec-freq-max SPEC_FREQ_MAX\n", " Largest frequency displayed in the spectrogram view. Defaults to samplerate/2.\n", " --skip-dialog, --no-skip-dialog\n", " If True, skips the loading dialog and goes straight to the data view.\n", "\u001b[0m" ] } ], "source": [ "!das gui --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: das train [-h] --data-dir DATA_DIR [-x X_SUFFIX] [-y Y_SUFFIX]\n", " [--save-dir SAVE_DIR] [--save-prefix SAVE_PREFIX]\n", " [--save-name SAVE_NAME] [--model-name MODEL_NAME]\n", " [--nb-filters NB_FILTERS] [-k KERNEL_SIZE]\n", " [--nb-conv NB_CONV] [--use-separable [USE_SEPARABLE ...]]\n", " [--nb-hist NB_HIST]\n", " [-i | --ignore-boundaries | --no-ignore-boundaries]\n", " [--batch-norm | --no-batch-norm] [--nb-pre-conv NB_PRE_CONV]\n", " [--pre-nb-dft PRE_NB_DFT] [--pre-kernel-size PRE_KERNEL_SIZE]\n", " [--pre-nb-filters PRE_NB_FILTERS] [--pre-nb-conv PRE_NB_CONV]\n", " [--upsample | --no-upsample] [--dilations [DILATIONS ...]]\n", " [--nb-lstm-units NB_LSTM_UNITS] [--verbose VERBOSE]\n", " [--batch-size BATCH_SIZE] [--nb-epoch NB_EPOCH]\n", " [--learning-rate LEARNING_RATE]\n", " [--reduce-lr | --no-reduce-lr]\n", " [--reduce-lr-patience REDUCE_LR_PATIENCE]\n", " [--fraction-data FRACTION_DATA]\n", " [--first-sample-train FIRST_SAMPLE_TRAIN]\n", " [--last-sample-train LAST_SAMPLE_TRAIN]\n", " [--first-sample-val FIRST_SAMPLE_VAL]\n", " [--last-sample-val LAST_SAMPLE_VAL] [--seed SEED]\n", " [--batch-level-subsampling | --no-batch-level-subsampling]\n", " [-a AUGMENTATIONS] [-t | --tensorboard | --no-tensorboard]\n", " [--wandb-api-token WANDB_API_TOKEN]\n", " [--wandb-project WANDB_PROJECT] [--wandb-entity WANDB_ENTITY]\n", " [--log-messages | --no-log-messages] [--nb-stacks NB_STACKS]\n", " [--with-y-hist | --no-with-y-hist] [--balance | --no-balance]\n", " [--version-data | --no-version-data]\n", " [--post-opt | --no-post-opt] [--fill-gaps-min FILL_GAPS_MIN]\n", " [--fill-gaps-max FILL_GAPS_MAX]\n", " [--fill-gaps-steps FILL_GAPS_STEPS]\n", " [--min-len-min MIN_LEN_MIN] [--min-len-max MIN_LEN_MAX]\n", " [--min-len-steps MIN_LEN_STEPS]\n", " [--resnet-compute | --no-resnet-compute]\n", " [--resnet-train | --no-resnet-train]\n", "\n", "Train a DAS network.\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", " --data-dir DATA_DIR Path to the directory or file with the dataset for training.\n", " Accepts npy-dirs (recommended), h5 files or zarr files.\n", " See documentation for how the dataset should be organized.\n", " -x X_SUFFIX, --x-suffix X_SUFFIX\n", " Select dataset used for training in the data_dir by suffix (\u001b[4my_\u001b[0m + X_SUFFIX).\n", " Defaults to '' (will use the standard data 'x')\n", " -y Y_SUFFIX, --y-suffix Y_SUFFIX\n", " Select dataset used as a training target in the data_dir by suffix (\u001b[4my_\u001b[0m + Y_SUFFIX).\n", " Song-type specific targets can be created with a training dataset,\n", " Defaults to '' (will use the standard target 'y')\n", " --save-dir SAVE_DIR Directory to save training outputs.\n", " The path of output files will constructed from the SAVE_DIR, an optional SAVE_PREFIX,\n", " and the time stamp of the start of training.\n", " Defaults to the current directory ('./').\n", " --save-prefix SAVE_PREFIX\n", " Prepend to timestamp.\n", " Name of files created will be start with SAVE_DIR/SAVE_PREFIX + \"_\" + TIMESTAMP\n", " or with SAVE_DIR/TIMESTAMP if SAVE_PREFIX is empty.\n", " Defaults to '' (empty).\n", " --save-name SAVE_NAME\n", " Append to prefix.\n", " Name of files created will be start with SAVE_DIR/SAVE_PREFIX + \"_\" + SAVE_NAME\n", " or with SAVE_DIR/SAVE_NAME if SAVE_PREFIX is empty.\n", " Defaults to the timestamp YYYYMMDD_hhmmss.\n", " --model-name MODEL_NAME\n", " Network architecture to use.\n", " See das.models for a description of all models.\n", " Defaults to \u001b[4mtcn\u001b[0m.\n", " --nb-filters NB_FILTERS\n", " Number of filters per layer.\n", " Defaults to 16.\n", " -k KERNEL_SIZE, --kernel-size KERNEL_SIZE\n", " Duration of the filters (=kernels) in samples.\n", " Defaults to 16.\n", " --nb-conv NB_CONV Number of TCN blocks in the network.\n", " Defaults to 3.\n", " --use-separable [USE_SEPARABLE ...]\n", " Specify which TCN blocks should use separable convolutions.\n", " Provide as a space-separated sequence of \"False\" or \"True.\n", " For instance: \"True False False\" will set the first block in a\n", " three-block (as given by nb_conv) network to use separable convolutions.\n", " Defaults to False (no block uses separable convolutions).\n", " --nb-hist NB_HIST Number of samples processed at once by the network (a.k.a chunk duration).\n", " Defaults to 1024 samples.\n", " -i, --ignore-boundaries, --no-ignore-boundaries\n", " Minimize edge effects by discarding predictions at the edges of chunks.\n", " Defaults to True.\n", " --batch-norm, --no-batch-norm\n", " Batch normalize.\n", " Defaults to True.\n", " --nb-pre-conv NB_PRE_CONV\n", " Adds fronted with downsampling. The downsampling factor is \u001b[4m2**nb_pre_conv\u001b[0m.\n", " The type of frontend depends on the model:\n", " if model is \u001b[4mtcn\u001b[0m: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN.\n", " if model is \u001b[4mtcn_tcn\u001b[0m: adds a frontend of N TCN blocks to the TCN.\n", " if model is \u001b[4mtcn_stft\u001b[0m: adds a trainable STFT frontend.\n", " Defaults to 0 (no frontend, no downsampling).\n", " --pre-nb-dft PRE_NB_DFT\n", " Duration of filters (in samples) for the STFT frontend.\n", " Number of filters is pre_nb_dft // 2 + 1.\n", " Defaults to 64.\n", " --pre-kernel-size PRE_KERNEL_SIZE\n", " Duration of filters (=kernels) in samples in the pre-processing TCN.\n", " Defaults to 3.\n", " --pre-nb-filters PRE_NB_FILTERS\n", " Number of filters per layer in the pre-processing TCN.\n", " Defaults to 16.\n", " --pre-nb-conv PRE_NB_CONV\n", " --upsample, --no-upsample\n", " whether or not to restore the model output to the input samplerate.\n", " Should generally be True during training and evaluation but my speed up inference.\n", " Defaults to True.\n", " --dilations [DILATIONS ...]\n", " List of dilation rate, defaults to [1, 2, 4, 8, 16] (5 layer with 2x dilation per TCN block)\n", " --nb-lstm-units NB_LSTM_UNITS\n", " If >0, adds LSTM with \u001b[4mnb_lstm_units\u001b[0m LSTM units to the output of the stack of TCN blocks.\n", " Defaults to 0 (no LSTM layer).\n", " --verbose VERBOSE Verbosity of training output (0 - no output during training, 1 - progress bar, 2 - one line per epoch).\n", " Defaults to 2.\n", " --batch-size BATCH_SIZE\n", " Batch size\n", " Defaults to 32.\n", " --nb-epoch NB_EPOCH Maximal number of training epochs.\n", " Training will stop early if validation loss did not decrease in the last 20 epochs.\n", " Defaults to 400.\n", " --learning-rate LEARNING_RATE\n", " Learning rate of the model. Defaults should work in most cases.\n", " Values typically range between 0.1 and 0.00001.\n", " If None, uses model specific defaults: \u001b[4mtcn\u001b[0m 0.0001, \u001b[4mtcn_stft\u001b[0m and \u001b[4mtcn_tcn\u001b[0m 0.0005.\n", " Defaults to None.\n", " --reduce-lr, --no-reduce-lr\n", " Reduce learning rate when the validation loss plateaus.\n", " Defaults to False.\n", " --reduce-lr-patience REDUCE_LR_PATIENCE\n", " Number of epochs w/o a reduction in validation loss after which\n", " to trigger a reduction in learning rate.\n", " Defaults to 5 epochs.\n", " --fraction-data FRACTION_DATA\n", " Fraction of training and validation data to use.\n", " Defaults to 1.0.\n", " Overriden by setting all four \u001b[3m_sample_\u001b[0m args.\n", " --first-sample-train FIRST_SAMPLE_TRAIN\n", " Defaults to 0 (first sample in training dataset).\n", " Note 1: all four \u001b[3m_sample_\u001b[0m args must be set - otherwise they will be ignored.\n", " Note 2: Overrides fraction_data.\n", " --last-sample-train LAST_SAMPLE_TRAIN\n", " Defaults to None (use last sample in training dataset).\n", " --first-sample-val FIRST_SAMPLE_VAL\n", " Defaults to 0 (first sample in validation dataset).\n", " --last-sample-val LAST_SAMPLE_VAL\n", " Defaults to None (use last sample in validation dataset).\n", " --seed SEED Random seed to reproducibly select fractions of the data.\n", " Defaults to None (no seed).\n", " --batch-level-subsampling, --no-batch-level-subsampling\n", " Select fraction of data for training from random subset of shuffled batches.\n", " If False, select a continuous chunk of the recording.\n", " Defaults to False.\n", " -a AUGMENTATIONS, --augmentations AUGMENTATIONS\n", " yaml file with augmentations. Defaults to None (no augmentations).\n", " -t, --tensorboard, --no-tensorboard\n", " Write tensorboard logs to save_dir. Defaults to False.\n", " --wandb-api-token WANDB_API_TOKEN\n", " API token for logging to wandb.\n", " Defaults to None (no logging to wandb).\n", " --wandb-project WANDB_PROJECT\n", " Project to log to for wandb.\n", " Defaults to None (no logging to wandb).\n", " --wandb-entity WANDB_ENTITY\n", " Entity to log to for wandb.\n", " Defaults to None (no logging to wandb).\n", " --log-messages, --no-log-messages\n", " Sets terminal logging level to INFO.\n", " Defaults to False (will follow existing settings).\n", " --nb-stacks NB_STACKS\n", " Unused if model name is \u001b[4mtcn\u001b[0m, \u001b[4mtcn_tcn\u001b[0m, or \u001b[4mtcn_stft\u001b[0m. Defaults to 2.\n", " --with-y-hist, --no-with-y-hist\n", " Unused if model name is \u001b[4mtcn\u001b[0m, \u001b[4mtcn_tcn\u001b[0m, or \u001b[4mtcn_stft\u001b[0m. Defaults to True.\n", " --balance, --no-balance\n", " Balance data. Weights class-wise errors by the inverse of the class frequencies.\n", " Defaults to False.\n", " --version-data, --no-version-data\n", " Save MD5 hash of the data_dir to log and params.yaml.\n", " Defaults to True (set to False for large datasets since it can be slow).\n", " --post-opt, --no-post-opt\n", " Optimize post processing (delete short detections, fill brief gaps).\n", " Defaults to False.\n", " --fill-gaps-min FILL_GAPS_MIN\n", " Defaults to 0.0005 seconds.\n", " --fill-gaps-max FILL_GAPS_MAX\n", " Defaults to 1 second.\n", " --fill-gaps-steps FILL_GAPS_STEPS\n", " Defaults to 20.\n", " --min-len-min MIN_LEN_MIN\n", " Defaults to 0.0005 seconds.\n", " --min-len-max MIN_LEN_MAX\n", " Defaults to 1 second.\n", " --min-len-steps MIN_LEN_STEPS\n", " Defaults to 20.\n", " --resnet-compute, --no-resnet-compute\n", " Defaults to False.\n", " --resnet-train, --no-resnet-train\n", " Defaults to False.\n", "\u001b[0m" ] } ], "source": [ "!das train --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tune" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: das tune [-h] --data-dir DATA_DIR [-x X_SUFFIX] [-y Y_SUFFIX]\n", " [--save-dir SAVE_DIR] [--save-prefix SAVE_PREFIX]\n", " [--save-name SAVE_NAME] [-m MODEL_NAME]\n", " [--nb-filters NB_FILTERS] [-k KERNEL_SIZE] [--nb-conv NB_CONV]\n", " [--use-separable [USE_SEPARABLE ...]] [--nb-hist NB_HIST]\n", " [-i | --ignore-boundaries | --no-ignore-boundaries]\n", " [--batch-norm | --no-batch-norm] [--nb-pre-conv NB_PRE_CONV]\n", " [--pre-nb-dft PRE_NB_DFT] [--pre-kernel-size PRE_KERNEL_SIZE]\n", " [--pre-nb-filters PRE_NB_FILTERS] [--pre-nb-conv PRE_NB_CONV]\n", " [--upsample | --no-upsample] [--dilations [DILATIONS ...]]\n", " [--nb-lstm-units NB_LSTM_UNITS] [--verbose VERBOSE]\n", " [--batch-size BATCH_SIZE] [--nb-epoch NB_EPOCH]\n", " [--learning-rate LEARNING_RATE] [--reduce-lr | --no-reduce-lr]\n", " [--reduce-lr-patience REDUCE_LR_PATIENCE] [-f FRACTION_DATA]\n", " [--seed SEED]\n", " [--batch-level-subsampling | --no-batch-level-subsampling]\n", " [-a AUGMENTATIONS] [--tensorboard | --no-tensorboard]\n", " [--wandb-api-token WANDB_API_TOKEN]\n", " [--wandb-project WANDB_PROJECT] [--wandb-entity WANDB_ENTITY]\n", " [--log-messages | --no-log-messages] [--nb-stacks NB_STACKS]\n", " [--with-y-hist | --no-with-y-hist] [--balance | --no-balance]\n", " [--version-data | --no-version-data]\n", " [--tune-config TUNE_CONFIG] [--nb-tune-trials NB_TUNE_TRIALS]\n", "\n", "Tune the hyperparameters of a DAS network.\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", " --data-dir DATA_DIR Path to the directory or file with the dataset for training.\n", " Accepts npy-dirs (recommended), h5 files or zarr files.\n", " See documentation for how the dataset should be organized.\n", " -x X_SUFFIX, --x-suffix X_SUFFIX\n", " Select dataset used for training in the data_dir by suffix (\u001b[4my_\u001b[0m + X_SUFFIX).\n", " Defaults to '' (will use the standard data 'x')\n", " -y Y_SUFFIX, --y-suffix Y_SUFFIX\n", " Select dataset used as a training target in the data_dir by suffix (\u001b[4my_\u001b[0m + Y_SUFFIX).\n", " Song-type specific targets can be created with a training dataset,\n", " Defaults to '' (will use the standard target 'y')\n", " --save-dir SAVE_DIR Directory to save training outputs.\n", " The path of output files will constructed from the SAVE_DIR, an optional SAVE_PREFIX,\n", " and the time stamp of the start of training.\n", " Defaults to the current directory ('./').\n", " --save-prefix SAVE_PREFIX\n", " Prepend to timestamp.\n", " Name of files created will be start with SAVE_DIR/SAVE_PREFIX + \"_\" + TIMESTAMP\n", " or with SAVE_DIR/TIMESTAMP if SAVE_PREFIX is empty.\n", " Defaults to '' (empty).\n", " --save-name SAVE_NAME\n", " Append to prefix.\n", " Name of files created will be start with SAVE_DIR/SAVE_PREFIX + \"_\" + SAVE_NAME\n", " or with SAVE_DIR/SAVE_NAME if SAVE_PREFIX is empty.\n", " Defaults to TIMESTAMP.\n", " -m MODEL_NAME, --model-name MODEL_NAME\n", " Network architecture to use.\n", " Use \u001b[4mtcn\u001b[0m (TCN) or \u001b[4mtcn_stft\u001b[0m (TCN with STFT frontend).\n", " See das.models for a description of all models.\n", " Defaults to \u001b[4mtcn\u001b[0m.\n", " --nb-filters NB_FILTERS\n", " Number of filters per layer.\n", " Defaults to 16.\n", " -k KERNEL_SIZE, --kernel-size KERNEL_SIZE\n", " Duration of the filters (=kernels) in samples.\n", " Defaults to 16.\n", " --nb-conv NB_CONV Number of TCN blocks in the network.\n", " Defaults to 3.\n", " --use-separable [USE_SEPARABLE ...]\n", " Specify which TCN blocks should use separable convolutions.\n", " Provide as a space-separated sequence of \"False\" or \"True.\n", " For instance: \"True False False\" will set the first block in a\n", " three-block (as given by nb_conv) network to use separable convolutions.\n", " Defaults to False (no block uses separable convolutions).\n", " --nb-hist NB_HIST Number of samples processed at once by the network (a.k.a chunk duration).\n", " Defaults to 1024 samples.\n", " -i, --ignore-boundaries, --no-ignore-boundaries\n", " Minimize edge effects by discarding predictions at the edges of chunks.\n", " Defaults to True.\n", " --batch-norm, --no-batch-norm\n", " Batch normalize.\n", " Defaults to True.\n", " --nb-pre-conv NB_PRE_CONV\n", " Adds fronted with downsampling. The downsampling factor is \u001b[4m2**nb_pre_conv\u001b[0m.\n", " The type of frontend depends on the model:\n", " if model is \u001b[4mtcn\u001b[0m: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN.\n", " if model is \u001b[4mtcn_tcn\u001b[0m: adds a frontend of N TCN blocks to the TCN.\n", " if model is \u001b[4mtcn_stft\u001b[0m: adds a trainable STFT frontend.\n", " Defaults to 0 (no frontend, no downsampling).\n", " --pre-nb-dft PRE_NB_DFT\n", " Duration of filters (in samples) for the STFT frontend.\n", " Number of filters is pre_nb_dft // 2 + 1.\n", " Defaults to 64.\n", " --pre-kernel-size PRE_KERNEL_SIZE\n", " Duration of filters (=kernels) in samples in the pre-processing TCN.\n", " Defaults to 3.\n", " --pre-nb-filters PRE_NB_FILTERS\n", " Number of filters per layer in the pre-processing TCN.\n", " Defaults to 16.\n", " --pre-nb-conv PRE_NB_CONV\n", " --upsample, --no-upsample\n", " whether or not to restore the model output to the input samplerate.\n", " Should generally be True during training and evaluation but my speed up inference.\n", " Defaults to True.\n", " --dilations [DILATIONS ...]\n", " List of dilation rate, defaults to [1, 2, 4, 8, 16] (5 layer with 2x dilation per TCN block)\n", " --nb-lstm-units NB_LSTM_UNITS\n", " If >0, adds LSTM with \u001b[4mnb_lstm_units\u001b[0m LSTM units to the output of the stack of TCN blocks.\n", " Defaults to 0 (no LSTM layer).\n", " --verbose VERBOSE Verbosity of training output (0 - no output during training, 1 - progress bar, 2 - one line per epoch).\n", " Defaults to 2.\n", " --batch-size BATCH_SIZE\n", " Batch size\n", " Defaults to 32.\n", " --nb-epoch NB_EPOCH Maximal number of training epochs.\n", " Training will stop early if validation loss did not decrease in the last 20 epochs.\n", " Defaults to 400.\n", " --learning-rate LEARNING_RATE\n", " Learning rate of the model. Defaults should work in most cases.\n", " Values typically range between 0.1 and 0.00001.\n", " If None, uses model specific defaults: \u001b[4mtcn\u001b[0m 0.0001, \u001b[4mtcn_stft\u001b[0m and \u001b[4mtcn_tcn\u001b[0m 0.0005.\n", " Defaults to None.\n", " --reduce-lr, --no-reduce-lr\n", " Reduce learning rate when the validation loss plateaus.\n", " Defaults to False.\n", " --reduce-lr-patience REDUCE_LR_PATIENCE\n", " Number of epochs w/o a reduction in validation loss after which\n", " to trigger a reduction in learning rate.\n", " Defaults to 5 epochs.\n", " -f FRACTION_DATA, --fraction-data FRACTION_DATA\n", " Fraction of training and validation data to use.\n", " Defaults to 1.0.\n", " --seed SEED Random seed to reproducibly select fractions of the data.\n", " Defaults to None (no seed).\n", " --batch-level-subsampling, --no-batch-level-subsampling\n", " Select fraction of data for training from random subset of shuffled batches.\n", " If False, select a continuous chunk of the recording.\n", " Defaults to False.\n", " -a AUGMENTATIONS, --augmentations AUGMENTATIONS\n", " --tensorboard, --no-tensorboard\n", " Write tensorboard logs to save_dir. Defaults to False.\n", " --wandb-api-token WANDB_API_TOKEN\n", " API token for logging to wandb.\n", " Defaults to None (no logging to wandb).\n", " --wandb-project WANDB_PROJECT\n", " Project to log to for wandb.\n", " Defaults to None (no logging to wandb).\n", " --wandb-entity WANDB_ENTITY\n", " Entity (user or team) to log to for wandb.\n", " Defaults to None (no logging to wandb).\n", " --log-messages, --no-log-messages\n", " Sets terminal logging level to INFO.\n", " Defaults to False (will follow existing settings).\n", " --nb-stacks NB_STACKS\n", " Unused if model name is \u001b[4mtcn\u001b[0m, \u001b[4mtcn_tcn\u001b[0m, or \u001b[4mtcn_stft\u001b[0m. Defaults to 2.\n", " --with-y-hist, --no-with-y-hist\n", " Unused if model name is \u001b[4mtcn\u001b[0m, \u001b[4mtcn_tcn\u001b[0m, or \u001b[4mtcn_stft\u001b[0m. Defaults to True.\n", " --balance, --no-balance\n", " Balance data. Weights class-wise errors by the inverse of the class frequencies.\n", " Defaults to False.\n", " --version-data, --no-version-data\n", " Save MD5 hash of the data_dir to log and params.yaml.\n", " Defaults to True (set to False for large datasets since it can be slow).\n", " --tune-config TUNE_CONFIG\n", " Yaml file with key:value pairs defining the search space for tuning.\n", " Keys are parameter names, values are lists of possible parameter values.\n", " --nb-tune-trials NB_TUNE_TRIALS\n", " Number of model variants to test during hyper parameter tuning. Defaults to 1_000.\n", "\u001b[0m" ] } ], "source": [ "!das tune --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Predict" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: das predict [-h] [--save-filename SAVE_FILENAME]\n", " [--save-format SAVE_FORMAT] [-v VERBOSE] [-b BATCH_SIZE]\n", " [--event-thres EVENT_THRES] [--event-dist EVENT_DIST]\n", " [--event-dist-min EVENT_DIST_MIN]\n", " [--event-dist-max EVENT_DIST_MAX]\n", " [--segment-thres SEGMENT_THRES]\n", " [--segment-use-optimized | --no-segment-use-optimized]\n", " [--segment-minlen SEGMENT_MINLEN]\n", " [--segment-fillgap SEGMENT_FILLGAP]\n", " [-r | --resample | --no-resample]\n", " path model_save_name\n", "\n", "Predict song labels for a wav file or a folder of wav files.\n", "\n", "Saves hdf5 files with keys: events, segments, class_probabilities\n", "OR csv files with columns: label/start_seconds/stop_seconds\n", "\n", "positional arguments:\n", " path Path to a single WAV file with the audio data or to a folder with WAV files.\n", " model_save_name Stem of the path for the model (and parameters). File to load will be MODEL_SAVE_NAME + _model.h5.\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", " --save-filename SAVE_FILENAME\n", " Path to save annotations to.\n", " If omitted, will construct save_filename by\n", " stripping the extension from recording_filename and adding '_das.h5' or '_annotations.csv'.\n", " Will be ignored if \u001b[4mpath\u001b[0m is a folder.\n", " --save-format SAVE_FORMAT\n", " 'csv' or 'h5'.\n", " csv: tabular text file with label, start and end seconds for each predicted song.\n", " h5: same information as in csv plus confidence values for each sample and song type.\n", " Defaults to 'csv'.\n", " -v VERBOSE, --verbose VERBOSE\n", " Display progress bar during prediction. Defaults to 1.\n", " -b BATCH_SIZE, --batch-size BATCH_SIZE\n", " Number of chunks processed at once.\n", " Defaults to None (the default used during training).\n", " --event-thres EVENT_THRES\n", " Confidence threshold for detecting events. Range 0..1. Defaults to 0.5.\n", " --event-dist EVENT_DIST\n", " Minimal distance between adjacent events during thresholding.\n", " Prevents detecting duplicate events when the confidence trace is a little noisy.\n", " Defaults to 0.01.\n", " --event-dist-min EVENT_DIST_MIN\n", " MINimal inter-event interval for the event filter run during post processing.\n", " Defaults to 0.\n", " --event-dist-max EVENT_DIST_MAX\n", " MAXimal inter-event interval for the event filter run during post processing.\n", " Defaults to None (no upper limit).\n", " --segment-thres SEGMENT_THRES\n", " Confidence threshold for detecting segments. Range 0..1. Defaults to 0.5.\n", " --segment-use-optimized, --no-segment-use-optimized\n", " Use minlen and fillgap values from param file if they exist.\n", " If segment_minlen and segment_fillgap are provided,\n", " then they will override the values from the param file.\n", " Defaults to True.\n", " --segment-minlen SEGMENT_MINLEN\n", " Minimal duration of a segment used for filtering out spurious detections.\n", " Defaults to None (keep all segments).\n", " --segment-fillgap SEGMENT_FILLGAP\n", " Gap between adjacent segments to be filled. Useful for correcting brief lapses.\n", " Defaults to None (do not fill gaps).\n", " -r, --resample, --no-resample\n", " Resample audio data to the rate expected by the model. Defaults to True.\n", "\u001b[0m" ] } ], "source": [ "!das predict --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Version information\n", "The output of this will depend on the specifics of your system and installation." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:das.cli: macOS-12.4-arm64-arm-64bit\n", "INFO:das.cli: DAS v0.26.9\n", "INFO:das.cli: GUI is available.\n", "INFO:das.cli: xarray-behave v0.33.1\n", "INFO:das.cli: pyqtgraph v0.12.4\n", "INFO:das.cli: PyQt5 vNone\n", "INFO:das.cli: Qt vNone\n", "INFO:das.cli:\n", "INFO:das.cli: tensorflow v2.8.0\n", "INFO:das.cli: keras v2.8.0\n", "INFO:das.cli: GPU is available.\n", "INFO:das.cli:\n", "INFO:das.cli: python v3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:00:33) \n", "[Clang 13.0.1 ]\n", "INFO:das.cli: pandas v1.4.2\n", "INFO:das.cli: numpy v1.22.4\n", "INFO:das.cli: h5py v3.6.0\n", "INFO:das.cli: scipy v1.8.1\n", "INFO:das.cli: scikit-learn v1.1.1\n", "INFO:das.cli: xarray v2022.3.0\n", "\u001b[0m" ] } ], "source": [ "!das version" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.13 ('dev')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "vscode": { "interpreter": { "hash": "97ead24bb190f3c846c8ccaf2b3dc07d0b052bfb8e69b5aeffb27504f0f68adb" } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }