das.train
Code for training networks.
- das.train.train(*, data_dir: str, x_suffix: str = '', y_suffix: str = '', save_dir: str = './', save_prefix: Optional[str] = None, save_name: Optional[str] = None, model_name: str = 'tcn', nb_filters: int = 16, nb_kernels: Optional[int] = None, kernel_size: int = 16, nb_conv: int = 3, use_separable: List[bool] = False, nb_hist: int = 1024, ignore_boundaries: bool = True, batch_norm: bool = True, nb_pre_conv: int = 0, pre_nb_conv: Optional[int] = None, pre_nb_dft: int = 64, pre_kernel_size: int = 3, pre_nb_filters: int = 16, pre_nb_kernels: Optional[int] = None, upsample: bool = True, dilations: Optional[List[int]] = None, nb_lstm_units: int = 0, verbose: int = 2, batch_size: int = 32, nb_epoch: int = 400, learning_rate: Optional[float] = None, reduce_lr: bool = False, reduce_lr_patience: int = 5, fraction_data: Optional[float] = None, first_sample_train: Optional[int] = 0, last_sample_train: Optional[int] = None, first_sample_val: Optional[int] = 0, last_sample_val: Optional[int] = None, seed: Optional[int] = None, batch_level_subsampling: bool = False, augmentations: Optional[str] = None, tensorboard: bool = False, wandb_api_token: Optional[str] = None, wandb_project: Optional[str] = None, wandb_entity: Optional[str] = None, log_messages: bool = False, nb_stacks: int = 2, with_y_hist: bool = True, balance: bool = False, version_data: bool = True, post_opt: bool = False, post_opt_nb_workers: int = -1, post_opt_fill_gaps_min: float = 0.0005, post_opt_fill_gaps_max: float = 1.0, post_opt_fill_gaps_steps: int = 20, post_opt_min_len_min: float = 0.0005, post_opt_min_len_max: float = 1.0, post_opt_min_len_steps: int = 20, morph_kernel_duration: int = 32, morph_nb_kernels: int = 0, resnet_compute: bool = False, resnet_train: bool = False, tmse_weight: float = 0.0, _qt_progress: bool = False) -> Tuple[keras.src.engine.training.Model, Dict[str, Any], keras.src.callbacks.History]
Train a DAS network.
- Parameters
data_dir (str) – Path to the directory or file with the dataset for training. Accepts npy-dirs (recommended), h5 files or zarr files. See documentation for how the dataset should be organized.
x_suffix (str) – Select dataset used for training in the data_dir by suffix (y_ + X_SUFFIX). Defaults to ‘’ (will use the standard data ‘x’).
y_suffix (str) – Select dataset used as a training target in the data_dir by suffix (y_ + Y_SUFFIX). Song-type specific targets can be created with a training dataset. Defaults to ‘’ (will use the standard target ‘y’).
save_dir (str) – Directory to save training outputs. The path of output files will be constructed from the SAVE_DIR, an optional SAVE_PREFIX, and the time stamp of the start of training. Defaults to the current directory (‘./’).
save_prefix (Optional[str]) – Prepend to timestamp. Names of created files will start with SAVE_DIR/SAVE_PREFIX + “_” + TIMESTAMP, or with SAVE_DIR/TIMESTAMP if SAVE_PREFIX is empty. Defaults to ‘’ (empty).
save_name (Optional[str]) – Append to prefix. Names of created files will start with SAVE_DIR/SAVE_PREFIX + “_” + SAVE_NAME, or with SAVE_DIR/SAVE_NAME if SAVE_PREFIX is empty. Defaults to the timestamp YYYYMMDD_hhmmss.
model_name (str) – Network architecture to use. See das.models for a description of all models. Defaults to tcn.
nb_filters (int) – Number of filters per layer. Defaults to 16.
kernel_size (int) – Duration of the filters (=kernels) in samples. Defaults to 16.
nb_conv (int) – Number of TCN blocks in the network. Defaults to 3.
use_separable (List[bool]) – Specify which TCN blocks should use separable convolutions. Provide as a space-separated sequence of “False” or “True”. For instance, “True False False” will set the first block in a three-block network (as given by nb_conv) to use separable convolutions. Defaults to False (no block uses separable convolutions).
nb_hist (int) – Number of samples processed at once by the network (a.k.a. chunk duration). Defaults to 1024 samples.
ignore_boundaries (bool) – Minimize edge effects by discarding predictions at the edges of chunks. Defaults to True.
batch_norm (bool) – Batch normalize. Defaults to True.
nb_pre_conv (int) – Adds a frontend with downsampling. The downsampling factor is 2**nb_pre_conv. The type of frontend depends on the model. If model is tcn: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN. If model is tcn_tcn: adds a frontend of N TCN blocks to the TCN. If model is tcn_stft: adds a trainable STFT frontend. Defaults to 0 (no frontend, no downsampling).
pre_nb_dft (int) – Duration of filters (in samples) for the STFT frontend. Number of filters is pre_nb_dft // 2 + 1. Defaults to 64.
pre_nb_filters (int) – Number of filters per layer in the pre-processing TCN. Defaults to 16. Deprecated.
pre_kernel_size (int) – Duration of filters (=kernels) in samples in the pre-processing TCN. Defaults to 3. Deprecated.
upsample (bool) – Whether or not to restore the model output to the input sample rate. Should generally be True during training and evaluation, but setting it to False may speed up inference. Defaults to True.
dilations (List[int]) – List of dilation rates. Defaults to [1, 2, 4, 8, 16] (5 layers with 2x dilation per TCN block).
nb_lstm_units (int) – If >0, adds LSTM with nb_lstm_units LSTM units to the output of the stack of TCN blocks. Defaults to 0 (no LSTM layer).
verbose (int) – Verbosity of training output (0 - no output during training, 1 - progress bar, 2 - one line per epoch). Defaults to 2.
batch_size (int) – Batch size. Defaults to 32.
nb_epoch (int) – Maximal number of training epochs. Training will stop early if validation loss did not decrease in the last 20 epochs. Defaults to 400.
learning_rate (Optional[float]) – Learning rate of the model. Defaults should work in most cases. Values typically range between 0.1 and 0.00001. If None, uses model specific defaults: tcn: 0.0001; tcn_stft and tcn_tcn: 0.0005. Defaults to None.
reduce_lr (bool) – Reduce learning rate when the validation loss plateaus. Defaults to False.
reduce_lr_patience (int) – Number of epochs without a reduction in validation loss after which to trigger a reduction in learning rate. Defaults to 5 epochs.
fraction_data (Optional[float]) – Fraction of training and validation data to use. Defaults to 1.0. Overridden by setting all four _sample_ args.
first_sample_train (Optional[int]) – Defaults to 0 (first sample in training dataset). Note 1: all four _sample_ args must be set - otherwise they will be ignored. Note 2: Overrides fraction_data.
last_sample_train (Optional[int]) – Defaults to None (use last sample in training dataset).
first_sample_val (Optional[int]) – Defaults to 0 (first sample in validation dataset).
last_sample_val (Optional[int]) – Defaults to None (use last sample in validation dataset).
seed (Optional[int]) – Random seed to reproducibly select fractions of the data. Defaults to None (no seed).
batch_level_subsampling (bool) – Select fraction of data for training from random subset of shuffled batches. If False, select a continuous chunk of the recording. Defaults to False.
augmentations (Optional[str]) – Path to yaml file or dictionary with the specification of augmentations. Defaults to None (no augmentations).
tensorboard (bool) – Write tensorboard logs to save_dir. Defaults to False.
wandb_api_token (Optional[str]) – API token for logging to wandb. Defaults to None (no logging to wandb).
wandb_project (Optional[str]) – Project to log to for wandb. Defaults to None (no logging to wandb).
wandb_entity (Optional[str]) – Entity to log to for wandb. Defaults to None (no logging to wandb).
log_messages (bool) – Sets terminal logging level to INFO. Defaults to False (will follow existing settings).
nb_stacks (int) – Unused if model name is tcn, tcn_tcn, or tcn_stft. Defaults to 2.
with_y_hist (bool) – Unused if model name is tcn, tcn_tcn, or tcn_stft. Defaults to True.
balance (bool) – Balance data. Weights class-wise errors by the inverse of the class frequencies. Defaults to False.
version_data (bool) – Save MD5 hash of the data_dir to log and params.yaml. Defaults to True (set to False for large datasets since it can be slow).
post_opt (bool) – Optimize post processing (delete short detections, fill brief gaps). Defaults to False.
post_opt_nb_workers (int) – Number of parallel processes during post_opt. Defaults to -1 (same number as cores).
post_opt_fill_gaps_min (float) – Defaults to 0.0005 seconds.
post_opt_fill_gaps_max (float) – Defaults to 1 second.
post_opt_fill_gaps_steps (int) – Defaults to 20.
post_opt_min_len_min (float) – Defaults to 0.0005 seconds.
post_opt_min_len_max (float) – Defaults to 1 second.
post_opt_min_len_steps (int) – Defaults to 20.
morph_nb_kernels (int) – Defaults to 0 (do not add morphological kernels).
morph_kernel_duration (int) – Defaults to 32.
resnet_compute (bool) – Defaults to False.
resnet_train (bool) – Defaults to False.
tmse_weight (float) – Defaults to 0.0.
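Several of the parameters above imply simple derived quantities. The following is a minimal sketch of those relationships, not code from DAS itself: the helper names (downsampling_factor, nb_stft_filters, output_prefix) are hypothetical, and the way the output prefix is joined is an assumption based only on the descriptions of save_dir, save_prefix and save_name.

```python
import os
from datetime import datetime
from typing import Optional


def downsampling_factor(nb_pre_conv: int) -> int:
    """Downsampling factor of the frontend: 2**nb_pre_conv (as documented)."""
    return 2 ** nb_pre_conv


def nb_stft_filters(pre_nb_dft: int) -> int:
    """Number of filters in the STFT frontend: pre_nb_dft // 2 + 1 (as documented)."""
    return pre_nb_dft // 2 + 1


def output_prefix(save_dir: str = "./",
                  save_prefix: Optional[str] = None,
                  save_name: Optional[str] = None) -> str:
    """Hypothetical reconstruction of the prefix shared by all output files.

    Assumption: SAVE_DIR/SAVE_PREFIX + "_" + SAVE_NAME, or SAVE_DIR/SAVE_NAME
    if no prefix is given; SAVE_NAME falls back to a YYYYMMDD_hhmmss timestamp.
    """
    if save_name is None:
        save_name = datetime.now().strftime("%Y%m%d_%H%M%S")
    if save_prefix:
        save_name = f"{save_prefix}_{save_name}"
    return os.path.join(save_dir, save_name)


# Default settings: no frontend, hence no downsampling.
factor_default = downsampling_factor(0)
# Three frontend blocks downsample by a factor of 8.
factor_three = downsampling_factor(3)
# The default pre_nb_dft of 64 yields 33 STFT filters.
filters_default = nb_stft_filters(64)
# Example prefix with a fixed, made-up save_name.
prefix = output_prefix("res", save_prefix="flies", save_name="20240101_120000")
```

With nb_pre_conv > 0, the network output is shorter than the input by the downsampling factor unless upsample is True.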
- Returns
model (keras.Model)
params (Dict[str, Any])
history (keras.callbacks.History)
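A call might look like the sketch below. It relies only on the signature documented above: all arguments are keyword-only, and the function returns the model, the parameter dictionary, and the training history. The dataset and output paths are placeholders, and the helper names (make_train_kwargs, run_training) are hypothetical. DAS is imported lazily so that the sketch can be loaded without DAS installed.

```python
from typing import Any, Dict


def make_train_kwargs(data_dir: str, save_dir: str) -> Dict[str, Any]:
    """Collect keyword arguments for das.train.train (all are keyword-only)."""
    return {
        "data_dir": data_dir,
        "save_dir": save_dir,
        "model_name": "tcn",   # documented default architecture
        "nb_filters": 16,
        "kernel_size": 16,
        "nb_conv": 3,
        "nb_hist": 1024,       # chunk duration in samples
        "nb_epoch": 400,       # upper bound; training may stop early
        "seed": 1,             # fixed seed for reproducible data selection
    }


def run_training(kwargs: Dict[str, Any]):
    """Train a network and unpack the three documented return values."""
    import das.train  # lazy import: requires DAS to be installed

    model, params, history = das.train.train(**kwargs)
    return model, params, history


# Placeholder paths: replace with your own dataset and output directory.
kwargs = make_train_kwargs(data_dir="my_dataset.npy", save_dir="my_results")
# model, params, history = run_training(kwargs)  # uncomment to start training
```

Keeping the arguments in a dictionary makes it straightforward to log a run's settings or reuse them across runs.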