{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Train \n", "The network can be trained using three interfaces:\n", "\n", "- python, via `das.train.train`\n", "- the command-line interface `das train`.\n", "- the GUI - see the [GUI tutorial](/tutorials_gui/train)\n", "\n", "Training will:\n", "\n", "- load train/val/test data form a dataset\n", "- initialize the network\n", "- save all parameters for reproducibility\n", "- train the network and save the best network to disk\n", "- run inference and evaluate the network using the test data.\n", "\n", "The names of files created during training start with an optional prefix and the time stamp of the start time of training, as in `my-awesome-prefix_20192310_091032`. Typically, three files are created:\n", "- `*_params.yaml` - training parameters etc.\n", "- `*_model.h5` - model architecture and weights\n", "- `*_results.h5` - predictions and evaluation results for the test set (only created if the training dataset contains a test set)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training using python\n", "Training is done using the `train` function in the `das.train` module:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function train in module das.train:\n", "\n", "train(*, data_dir: str, y_suffix: str = '', save_dir: str = './', save_prefix: Union[str, NoneType] = None, model_name: str = 'tcn', nb_filters: int = 16, kernel_size: int = 16, nb_conv: int = 3, use_separable: List[bool] = False, nb_hist: int = 1024, ignore_boundaries: bool = True, batch_norm: bool = True, nb_pre_conv: int = 0, pre_nb_dft: int = 64, pre_kernel_size: int = 3, pre_nb_filters: int = 16, pre_nb_conv: int = 2, nb_lstm_units: int = 0, verbose: int = 2, batch_size: int = 32, nb_epoch: int = 400, learning_rate: Union[float, NoneType] = None, reduce_lr: bool = False, reduce_lr_patience: int = 5, fraction_data: Union[float, NoneType] = None, seed: Union[int, NoneType] = None, batch_level_subsampling: bool = False, tensorboard: bool = False, neptune_api_token: Union[str, NoneType] = None, neptune_project: Union[str, NoneType] = None, log_messages: bool = False, nb_stacks: int = 2, with_y_hist: bool = True, x_suffix: str = '', balance: bool = False, version_data: bool = True, _qt_progress: bool = False) -> Tuple[keras.engine.training.Model, Dict[str, Any]]\n", " Train a DeepSS network.\n", " \n", " Args:\n", " data_dir (str): Path to the directory or file with the dataset for training.\n", " Accepts npy-dirs (recommended), h5 files or zarr files.\n", " See documentation for how the dataset should be organized.\n", " y_suffix (str): Select training target by suffix.\n", " Song-type specific targets can be created with a training dataset,\n", " Defaults to '' (will use the standard target 'y')\n", " save_dir (str): Directory to save training outputs.\n", " The path of output files will constructed from the SAVE_DIR, an optional prefix, and the time stamp of the start of training.\n", " Defaults to current directory ('./').\n", " save_prefix (Optional[str]): Prepend to timestamp.\n", " Name of files created will be SAVE_DIR/SAVE_PREFIX + \"_\" + TIMESTAMP\n", " or SAVE_DIR/ TIMESTAMP if SAVE_PREFIX is empty.\n", " Defaults to '' (empty).\n", " model_name (str): Network architecture to use.\n", " Use \"tcn\" (TCN) or \"tcn_stft\" (TCN with STFT frontend).\n", " See das.models for a description of all models.\n", " Defaults to 'tcn'.\n", " 
nb_filters (int): Number of filters per layer.\n", " Defaults to 16.\n", " kernel_size (int): Duration of the filters (=kernels) in samples.\n", " Defaults to 16.\n", " nb_conv (int): Number of TCN blocks in the network.\n", " Defaults to 3.\n", " use_separable (List[bool]): Specify which TCN blocks should use separable convolutions.\n", " Provide as a space-separated sequence of \"False\" or \"True.\n", " For instance: \"True False False\" will set the first block in a\n", " three-block (as given by nb_conv) network to use separable convolutions.\n", " Defaults to False (no block uses separable convolution).\n", " nb_hist (int): Number of samples processed at once by the network (a.k.a chunk size).\n", " Defaults to 1024.\n", " ignore_boundaries (bool): Minimize edge effects by discarding predictions at the edges of chunks.\n", " Defaults to True.\n", " batch_norm (bool): Batch normalize.\n", " Defaults to True.\n", " nb_pre_conv (int): Downsampling rate. Adds downsampling frontend if not 0.\n", " TCN_TCN: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN.\n", " TCN_STFT: adds a trainable STFT frontend.\n", " Defaults to 0 (no frontend).\n", " pre_nb_dft (int): Number of filters (roughly corresponding to filters) in the STFT frontend.\n", " Defaults to 64.\n", " pre_nb_filters (int): Number of filters per layer in the pre-processing TCN.\n", " Defaults to 16.\n", " pre_kernel_size (int): Duration of filters (=kernels) in samples in the pre-processing TCN.\n", " Defaults to 3.\n", " nb_lstm_units (int): If >0, adds LSTM with given number of units to the output of the stack of TCN blocks.\n", " Defaults to 0 (no LSTM layer).\n", " verbose (int): Verbosity of training output (0 - no output(?), 1 - progress bar, 2 - one line per epoch).\n", " Defaults to 2.\n", " batch_size (int): Batch size\n", " Defaults to 32.\n", " nb_epoch (int): Maximal number of training epochs.\n", " Training will stop early if validation loss did not decrease in the last 20 epochs.\n", " Defaults to 400.\n", " learning_rate (Optional[float]): Learning rate of the model. Defaults should work in most cases.\n", " Values typically range between 0.1 and 0.00001.\n", " If None, uses per model defaults: \"tcn\" 0.0001, \"tcn_stft\" 0.0005).\n", " Defaults to None.\n", " reduce_lr (bool): Reduce learning rate on plateau.\n", " Defaults to False.\n", " reduce_lr_patience (int): Number of epochs w/o a reduction in validation loss after which to trigger a reduction in learning rate.\n", " Defaults to 5.\n", " fraction_data (Optional[float]): Fraction of training and validation to use for training.\n", " Defaults to 1.0.\n", " seed (Optional[int]): Random seed to reproducible select fractions of the data.\n", " Defaults to None (no seed).\n", " batch_level_subsampling (bool): Select fraction of data for training from random subset of shuffled batches.\n", " If False, select a continuous chunk of the recording.\n", " Defaults to False.\n", " tensorboard (bool): Write tensorboard logs to save_dir. Defaults to False.\n", " neptune_api_token (Optional[str]): API token for logging to neptune.ai. Defaults to None (no logging).\n", " neptune_project (Optional[str]): Project to log to for neptune.ai. Defaults to None (no logging).\n", " log_messages (bool): Sets logging level to INFO.\n", " Defaults to False (will follow existing settings).\n", " nb_stacks (int): Unused if model name is \"tcn\" or \"tcn_stft\". Defaults to 2.\n", " with_y_hist (bool): Unused if model name is \"tcn\" or \"tcn_stft\". 
Defaults to True.\n", " x_suffix (str): Select specific training data based on suffix (e.g. x_suffix).\n", " Defaults to '' (will use the standard data 'x')\n", " balance (bool): Balance data. Weights class-wise errors by the inverse of the class frequencies.\n", " Defaults to False.\n", " version_data (bool): Save MD5 hash of the data_dir to log and params.yaml.\n", " Defaults to True (set to False for large datasets since it can be slow).\n", " \n", " Returns\n", " model (keras.Model)\n", " params (Dict[str, Any])\n", "\n" ] } ], "source": [ "import das.train\n", "help(das.train.train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calling the `train` function produces fairly verbose logging messages to help with troubleshooting:\n", "\n", "- runtime parameters\n", "- information on the size of the training and validation data\n", "- network architecture\n", "- training progress (training and validation loss)\n", "- after training, a classification report for the test data (if test data exist in the dataset)\n", "\n", "When done, `train` returns the trained keras model and a parameter dictionary with all arguments required to reproduce the model.\n", "\n", "To demonstrate the outputs of `train`, the following trains a small network on a small dataset to annotate pulse and sine song from _Drosophila melanogaster_. Expected performance (f1-score) is about 75%." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:root:Loading data from tutorial_dataset.npy.\n", "INFO:root:Version of the data:\n", "INFO:root: MD5 hash of tutorial_dataset.npy is\n", "INFO:root: 34876fb30412a444e444a8e1f5312126\n", "INFO:root:Parameters:\n", "INFO:root:{'data_dir': 'tutorial_dataset.npy', 'y_suffix': '', 'save_dir': 'res', 'save_prefix': '', 'model_name': 'tcn', 'nb_filters': 16, 'kernel_size': 16, 'nb_conv': 3, 'use_separable': False, 'nb_hist': 256, 'ignore_boundaries': True, 'batch_norm': True, 'nb_pre_conv': 0, 'pre_nb_dft': 64, 'pre_kernel_size': 3, 'pre_nb_filters': 16, 'pre_nb_conv': 2, 'nb_lstm_units': 0, 'verbose': 1, 'batch_size': 32, 'nb_epoch': 4, 'reduce_lr': False, 'reduce_lr_patience': 5, 'fraction_data': None, 'seed': None, 'batch_level_subsampling': False, 'tensorboard': False, 'neptune_api_token': None, 'neptune_project': None, 'log_messages': True, 'nb_stacks': 2, 'with_y_hist': True, 'x_suffix': '', 'balance': False, 'version_data': True, 'sample_weight_mode': 'temporal', 'data_padding': 48, 'return_sequences': True, 'stride': 160, 'y_offset': 0, 'output_stride': 1, 'class_names': ['noise', 'pulse', 'sine'], 'class_names_pulse': ['noise', 'pulse'], 'class_names_sine': ['noise', 'sine'], 'class_types': ['segment', 'event', 'segment'], 'class_types_pulse': ['segment', 'event'], 'class_types_sine': ['segment', 'segment'], 'filename_endsample_test': [], 'filename_endsample_train': [], 'filename_endsample_val': [], 'filename_startsample_test': [], 'filename_startsample_train': [], 'filename_startsample_val': [], 'filename_train': [], 'filename_val': [], 'samplerate_x_Hz': 10000, 'samplerate_y_Hz': 10000, 'filename_test': [], 'data_hash': '34876fb30412a444e444a8e1f5312126', 'nb_freq': 1, 'nb_channels': 1, 'nb_classes': 3, 'first_sample_train': 0, 'last_sample_train': None, 'first_sample_val': 0, 'last_sample_val': None}\n", "INFO:root:Preparing data\n", "INFO:root:Training data:\n", "INFO:root: AudioSequence with 3992 batches each with 32 items.\n", " Total of 20440005 samples 
with\n", " each x=(1,) and\n", " each y=(3,)\n", "INFO:root:Validation data:\n", "INFO:root: AudioSequence with 812 batches each with 32 items.\n", " Total of 4160001 samples with\n", " each x=(1,) and\n", " each y=(3,)\n", "INFO:root:building network\n", "/Users/janc/miniconda3/lib/python3.8/site-packages/keras/optimizer_v2/optimizer_v2.py:355: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.\n", " warnings.warn(\n", "INFO:root:None\n", "INFO:root:Will save to res/20210924_220702.\n", "INFO:root:start training\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Model: \"TCN\"\n", "__________________________________________________________________________________________________\n", "Layer (type) Output Shape Param # Connected to \n", "==================================================================================================\n", "input_1 (InputLayer) [(None, 256, 1)] 0 \n", "__________________________________________________________________________________________________\n", "conv1d (Conv1D) (None, 256, 16) 32 input_1[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_1 (Conv1D) (None, 256, 16) 4112 conv1d[0][0] \n", "__________________________________________________________________________________________________\n", "activation (Activation) (None, 256, 16) 0 conv1d_1[0][0] \n", "__________________________________________________________________________________________________\n", "lambda (Lambda) (None, 256, 16) 0 activation[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d (SpatialDropo (None, 256, 16) 0 lambda[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_2 (Conv1D) (None, 256, 16) 272 spatial_dropout1d[0][0] \n", "__________________________________________________________________________________________________\n", "add (Add) (None, 256, 16) 0 conv1d[0][0] \n", " conv1d_2[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_3 (Conv1D) (None, 256, 16) 4112 add[0][0] \n", "__________________________________________________________________________________________________\n", "activation_1 (Activation) (None, 256, 16) 0 conv1d_3[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_1 (Lambda) (None, 256, 16) 0 activation_1[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_1 (SpatialDro (None, 256, 16) 0 lambda_1[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_4 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_1[0][0] \n", "__________________________________________________________________________________________________\n", "add_1 (Add) (None, 256, 16) 0 add[0][0] \n", " conv1d_4[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_5 (Conv1D) (None, 256, 16) 4112 add_1[0][0] \n", "__________________________________________________________________________________________________\n", "activation_2 (Activation) (None, 256, 16) 0 conv1d_5[0][0] \n", "__________________________________________________________________________________________________\n", 
"lambda_2 (Lambda) (None, 256, 16) 0 activation_2[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_2 (SpatialDro (None, 256, 16) 0 lambda_2[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_6 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_2[0][0] \n", "__________________________________________________________________________________________________\n", "add_2 (Add) (None, 256, 16) 0 add_1[0][0] \n", " conv1d_6[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_7 (Conv1D) (None, 256, 16) 4112 add_2[0][0] \n", "__________________________________________________________________________________________________\n", "activation_3 (Activation) (None, 256, 16) 0 conv1d_7[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_3 (Lambda) (None, 256, 16) 0 activation_3[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_3 (SpatialDro (None, 256, 16) 0 lambda_3[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_8 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_3[0][0] \n", "__________________________________________________________________________________________________\n", "add_3 (Add) (None, 256, 16) 0 add_2[0][0] \n", " conv1d_8[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_9 (Conv1D) (None, 256, 16) 4112 add_3[0][0] \n", "__________________________________________________________________________________________________\n", "activation_4 (Activation) (None, 256, 16) 0 conv1d_9[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_4 (Lambda) (None, 256, 16) 0 activation_4[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_4 (SpatialDro (None, 256, 16) 0 lambda_4[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_10 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_4[0][0] \n", "__________________________________________________________________________________________________\n", "add_4 (Add) (None, 256, 16) 0 add_3[0][0] \n", " conv1d_10[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_11 (Conv1D) (None, 256, 16) 4112 add_4[0][0] \n", "__________________________________________________________________________________________________\n", "activation_5 (Activation) (None, 256, 16) 0 conv1d_11[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_5 (Lambda) (None, 256, 16) 0 activation_5[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_5 (SpatialDro (None, 256, 16) 0 lambda_5[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_12 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_5[0][0] \n", 
"__________________________________________________________________________________________________\n", "add_5 (Add) (None, 256, 16) 0 add_4[0][0] \n", " conv1d_12[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_13 (Conv1D) (None, 256, 16) 4112 add_5[0][0] \n", "__________________________________________________________________________________________________\n", "activation_6 (Activation) (None, 256, 16) 0 conv1d_13[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_6 (Lambda) (None, 256, 16) 0 activation_6[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_6 (SpatialDro (None, 256, 16) 0 lambda_6[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_14 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_6[0][0] \n", "__________________________________________________________________________________________________\n", "add_6 (Add) (None, 256, 16) 0 add_5[0][0] \n", " conv1d_14[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_15 (Conv1D) (None, 256, 16) 4112 add_6[0][0] \n", "__________________________________________________________________________________________________\n", "activation_7 (Activation) (None, 256, 16) 0 conv1d_15[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_7 (Lambda) (None, 256, 16) 0 activation_7[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_7 (SpatialDro (None, 256, 16) 0 lambda_7[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_16 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_7[0][0] \n", "__________________________________________________________________________________________________\n", "add_7 (Add) (None, 256, 16) 0 add_6[0][0] \n", " conv1d_16[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_17 (Conv1D) (None, 256, 16) 4112 add_7[0][0] \n", "__________________________________________________________________________________________________\n", "activation_8 (Activation) (None, 256, 16) 0 conv1d_17[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_8 (Lambda) (None, 256, 16) 0 activation_8[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_8 (SpatialDro (None, 256, 16) 0 lambda_8[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_18 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_8[0][0] \n", "__________________________________________________________________________________________________\n", "add_8 (Add) (None, 256, 16) 0 add_7[0][0] \n", " conv1d_18[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_19 (Conv1D) (None, 256, 16) 4112 add_8[0][0] \n", "__________________________________________________________________________________________________\n", "activation_9 (Activation) (None, 256, 16) 0 
conv1d_19[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_9 (Lambda) (None, 256, 16) 0 activation_9[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_9 (SpatialDro (None, 256, 16) 0 lambda_9[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_20 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_9[0][0] \n", "__________________________________________________________________________________________________\n", "add_9 (Add) (None, 256, 16) 0 add_8[0][0] \n", " conv1d_20[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_21 (Conv1D) (None, 256, 16) 4112 add_9[0][0] \n", "__________________________________________________________________________________________________\n", "activation_10 (Activation) (None, 256, 16) 0 conv1d_21[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_10 (Lambda) (None, 256, 16) 0 activation_10[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_10 (SpatialDr (None, 256, 16) 0 lambda_10[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_22 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_10[0][0] \n", "__________________________________________________________________________________________________\n", "add_10 (Add) (None, 256, 16) 0 add_9[0][0] \n", " conv1d_22[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_23 (Conv1D) (None, 256, 16) 4112 add_10[0][0] \n", "__________________________________________________________________________________________________\n", "activation_11 (Activation) (None, 256, 16) 0 conv1d_23[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_11 (Lambda) (None, 256, 16) 0 activation_11[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_11 (SpatialDr (None, 256, 16) 0 lambda_11[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_24 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_11[0][0] \n", "__________________________________________________________________________________________________\n", "add_11 (Add) (None, 256, 16) 0 add_10[0][0] \n", " conv1d_24[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_25 (Conv1D) (None, 256, 16) 4112 add_11[0][0] \n", "__________________________________________________________________________________________________\n", "activation_12 (Activation) (None, 256, 16) 0 conv1d_25[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_12 (Lambda) (None, 256, 16) 0 activation_12[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_12 (SpatialDr (None, 256, 16) 0 lambda_12[0][0] \n", "__________________________________________________________________________________________________\n", 
"conv1d_26 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_12[0][0] \n", "__________________________________________________________________________________________________\n", "add_12 (Add) (None, 256, 16) 0 add_11[0][0] \n", " conv1d_26[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_27 (Conv1D) (None, 256, 16) 4112 add_12[0][0] \n", "__________________________________________________________________________________________________\n", "activation_13 (Activation) (None, 256, 16) 0 conv1d_27[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_13 (Lambda) (None, 256, 16) 0 activation_13[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_13 (SpatialDr (None, 256, 16) 0 lambda_13[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_28 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_13[0][0] \n", "__________________________________________________________________________________________________\n", "add_13 (Add) (None, 256, 16) 0 add_12[0][0] \n", " conv1d_28[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_29 (Conv1D) (None, 256, 16) 4112 add_13[0][0] \n", "__________________________________________________________________________________________________\n", "activation_14 (Activation) (None, 256, 16) 0 conv1d_29[0][0] \n", "__________________________________________________________________________________________________\n", "lambda_14 (Lambda) (None, 256, 16) 0 activation_14[0][0] \n", "__________________________________________________________________________________________________\n", "spatial_dropout1d_14 (SpatialDr (None, 256, 16) 0 lambda_14[0][0] \n", "__________________________________________________________________________________________________\n", "conv1d_30 (Conv1D) (None, 256, 16) 272 spatial_dropout1d_14[0][0] \n", "__________________________________________________________________________________________________\n", "add_15 (Add) (None, 256, 16) 0 conv1d_2[0][0] \n", " conv1d_4[0][0] \n", " conv1d_6[0][0] \n", " conv1d_8[0][0] \n", " conv1d_10[0][0] \n", " conv1d_12[0][0] \n", " conv1d_14[0][0] \n", " conv1d_16[0][0] \n", " conv1d_18[0][0] \n", " conv1d_20[0][0] \n", " conv1d_22[0][0] \n", " conv1d_24[0][0] \n", " conv1d_26[0][0] \n", " conv1d_28[0][0] \n", " conv1d_30[0][0] \n", "__________________________________________________________________________________________________\n", "activation_15 (Activation) (None, 256, 16) 0 add_15[0][0] \n", "__________________________________________________________________________________________________\n", "dense (Dense) (None, 256, 3) 51 activation_15[0][0] \n", "__________________________________________________________________________________________________\n", "activation_16 (Activation) (None, 256, 3) 0 dense[0][0] \n", "==================================================================================================\n", "Total params: 65,843\n", "Trainable params: 65,843\n", "Non-trainable params: 0\n", "__________________________________________________________________________________________________\n", "Epoch 1/4\n", "1000/1000 [==============================] - ETA: 0s - batch: 499.5000 - size: 32.0000 - loss: 0.1143" ] }, { "name": "stderr", "output_type": 
"stream", "text": [ "/Users/janc/miniconda3/lib/python3.8/site-packages/keras/engine/training.py:2470: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.\n", " warnings.warn('`Model.state_updates` will be removed in a future version. '\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Epoch 00001: val_loss improved from inf to 0.11043, saving model to res/20210924_220702_model.h5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/janc/miniconda3/lib/python3.8/site-packages/keras/utils/generic_utils.py:494: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.\n", " warnings.warn('Custom mask layers require a config and must override '\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "1000/1000 [==============================] - 241s 236ms/step - batch: 499.5000 - size: 32.0000 - loss: 0.1143 - val_loss: 0.1104\n", "Epoch 2/4\n", "1000/1000 [==============================] - ETA: 0s - batch: 499.5000 - size: 32.0000 - loss: 0.0841\n", "Epoch 00002: val_loss improved from 0.11043 to 0.10770, saving model to res/20210924_220702_model.h5\n", "1000/1000 [==============================] - 226s 226ms/step - batch: 499.5000 - size: 32.0000 - loss: 0.0841 - val_loss: 0.1077\n", "Epoch 3/4\n", "1000/1000 [==============================] - ETA: 0s - batch: 499.5000 - size: 32.0000 - loss: 0.0867\n", "Epoch 00003: val_loss did not improve from 0.10770\n", "1000/1000 [==============================] - 215s 215ms/step - batch: 499.5000 - size: 32.0000 - loss: 0.0867 - val_loss: 0.1103\n", "Epoch 4/4\n", "1000/1000 [==============================] - ETA: 0s - batch: 499.5000 - size: 32.0000 - loss: 0.0823\n", "Epoch 00004: val_loss improved from 0.10770 to 0.09988, saving model to res/20210924_220702_model.h5\n", "1000/1000 [==============================] - 218s 218ms/step - batch: 499.5000 - size: 32.0000 - loss: 0.0823 - val_loss: 0.0999\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:root:re-loading last best model\n", "INFO:root:predicting\n", "INFO:root:evaluating\n", "INFO:root:[[3545939 7799 38820]\n", " [ 10658 33510 140]\n", " [ 99569 58 241747]]\n", "INFO:root:{'noise': {'precision': 0.9698517518077681, 'recall': 0.9870234523701497, 'f1-score': 0.9783622607234045, 'support': 3592558}, 'pulse': {'precision': 0.8100659946334035, 'recall': 0.7562968312720051, 'f1-score': 0.7822585351619492, 'support': 44308}, 'sine': {'precision': 0.8612075936830931, 'recall': 0.7081587935812335, 'f1-score': 0.7772203298284307, 'support': 341374}, 'accuracy': 0.960524251930502, 'macro avg': {'precision': 0.8803751133747548, 'recall': 0.8171596924077962, 'f1-score': 0.8459470419045948, 'support': 3978240}, 'weighted avg': {'precision': 0.9587493351198522, 'recall': 0.960524251930502, 'f1-score': 0.958918087071358, 'support': 3978240}}\n", "INFO:root:saving to res/20210924_220702_results.h5.\n", "/Users/janc/miniconda3/lib/python3.8/site-packages/tables/attributeset.py:464: NaturalNameWarning: object name is not a valid Python identifier: 'f1-score'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though\n", " check_attribute_name(name)\n", 
"/Users/janc/miniconda3/lib/python3.8/site-packages/tables/path.py:155: NaturalNameWarning: object name is not a valid Python identifier: 'macro avg'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though\n", " check_attribute_name(name)\n", "/Users/janc/miniconda3/lib/python3.8/site-packages/tables/path.py:155: NaturalNameWarning: object name is not a valid Python identifier: 'weighted avg'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though\n", " check_attribute_name(name)\n", "INFO:root:DONE.\n" ] } ], "source": [ "model, params = das.train.train(model_name='tcn', # see `das.models` for valid model_names\n", " data_dir='tutorial_dataset.npy', \n", " save_dir='res',\n", " nb_hist=256,\n", " kernel_size=16,\n", " nb_filters=16,\n", " ignore_boundaries=True,\n", " verbose=1,\n", " nb_epoch=4,\n", " log_messages=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training using command-line scripts\n", "The training function `das.train.train` and all its arguments are also accessible from the command line via `das train` for use on the terminal. See [here](/technical/cli#train) for a description of all command-line arguments. The command-line interface is generated with [defopt](https://defopt.readthedocs.io/en/stable/index.html).\n", "\n", "For instance, training command above can be invoked from the command line:\n", "```shell\n", "das train --data-dir dat/dmel_single_raw.npy --save-dir res --model-name tcn --kernel-size 16 --nb-filters 16 --nb-hist 512 --nb-epoch 20 -i\n", "```\n", "\n", "Shell scripts are particularly useful if you want to fit the network with with different configurations to optimize [structural parameters](/tutorials/structparams). 
For instance, this script will fit networks with different numbers of TCN blocks (`nb_conv`) and filters (`nb_filters`):\n", "```shell\n", "#!/bin/bash\n", "conda activate das\n", "\n", "YSUFFIX=\"pulse\"\n", "MODELNAME='tcn'\n", "DATADIR='../dat/dmel_single.npy'\n", "SAVEDIR=\"res\"\n", "\n", "NB_HIST=2048\n", "KERNEL_SIZE=32\n", "NB_FILTERS=32\n", "NB_CONV=3\n", "\n", "for NB_CONV in 2 3 4\n", "do\n", " for NB_FILTERS in 16 32 64\n", " do\n", " das train -i --nb-filters $NB_FILTERS --kernel-size $KERNEL_SIZE --nb-conv $NB_CONV --nb-hist $NB_HIST --save-dir $SAVEDIR --y-suffix $YSUFFIX --data-dir $DATADIR --model-name $MODELNAME\n", " done\n", "done\n", "```\n", "\n", "A description of all command line arguments can be obtained by typing `das train --help` in a terminal:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: das train [-h] -d DATA_DIR [-y Y_SUFFIX] [--save-dir SAVE_DIR]\n", " [--save-prefix SAVE_PREFIX] [-m MODEL_NAME]\n", " [--nb-filters NB_FILTERS] [-k KERNEL_SIZE]\n", " [--nb-conv NB_CONV] [-u [USE_SEPARABLE [USE_SEPARABLE ...]]]\n", " [--nb-hist NB_HIST]\n", " [-i | --ignore-boundaries | --no-ignore-boundaries]\n", " [--batch-norm | --no-batch-norm] [--nb-pre-conv NB_PRE_CONV]\n", " [--pre-nb-dft PRE_NB_DFT] [--pre-kernel-size PRE_KERNEL_SIZE]\n", " [--pre-nb-filters PRE_NB_FILTERS] [--pre-nb-conv PRE_NB_CONV]\n", " [--nb-lstm-units NB_LSTM_UNITS] [--verbose VERBOSE]\n", " [--batch-size BATCH_SIZE] [--nb-epoch NB_EPOCH]\n", " [--learning-rate LEARNING_RATE]\n", " [--reduce-lr | --no-reduce-lr]\n", " [--reduce-lr-patience REDUCE_LR_PATIENCE] [-f FRACTION_DATA]\n", " [--seed SEED]\n", " [--batch-level-subsampling | --no-batch-level-subsampling]\n", " [-t | --tensorboard | --no-tensorboard]\n", " [--neptune-api-token NEPTUNE_API_TOKEN]\n", " [--neptune-project NEPTUNE_PROJECT]\n", " [--log-messages | --no-log-messages] [--nb-stacks NB_STACKS]\n", " [-w | --with-y-hist | --no-with-y-hist] [-x X_SUFFIX]\n", " [--balance | --no-balance]\n", " [--version-data | --no-version-data]\n", "\n", "Train a DeepSS network.\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", " -d DATA_DIR, --data-dir DATA_DIR\n", " Path to the directory or file with the dataset for training.\n", " Accepts npy-dirs (recommended), h5 files or zarr files.\n", " See documentation for how the dataset should be organized.\n", " -y Y_SUFFIX, --y-suffix Y_SUFFIX\n", " Select training target by suffix.\n", " Song-type specific targets can be created with a training dataset,\n", " Defaults to '' (will use the standard target 'y')\n", " --save-dir SAVE_DIR Directory to save training outputs.\n", " The path of output files will constructed from the SAVE_DIR, an optional prefix, and the time stamp of the start of training.\n", " Defaults to current directory ('./').\n", " --save-prefix SAVE_PREFIX\n", " Prepend to timestamp.\n", " Name of files created will be SAVE_DIR/SAVE_PREFIX + \"_\" + TIMESTAMP\n", " or SAVE_DIR/ TIMESTAMP if SAVE_PREFIX is empty.\n", " Defaults to '' (empty).\n", " -m MODEL_NAME, --model-name MODEL_NAME\n", " Network architecture to use.\n", " Use \"tcn\" (TCN) or \"tcn_stft\" (TCN with STFT frontend).\n", " See das.models for a description of all models.\n", " Defaults to 'tcn'.\n", " --nb-filters NB_FILTERS\n", " Number of filters per layer.\n", " Defaults to 16.\n", " -k KERNEL_SIZE, --kernel-size KERNEL_SIZE\n", " Duration of the filters (=kernels) in samples.\n", " 
Defaults to 16.\n", " --nb-conv NB_CONV Number of TCN blocks in the network.\n", " Defaults to 3.\n", " -u [USE_SEPARABLE [USE_SEPARABLE ...]], --use-separable [USE_SEPARABLE [USE_SEPARABLE ...]]\n", " Specify which TCN blocks should use separable convolutions.\n", " Provide as a space-separated sequence of \"False\" or \"True.\n", " For instance: \"True False False\" will set the first block in a\n", " three-block (as given by nb_conv) network to use separable convolutions.\n", " Defaults to False (no block uses separable convolution).\n", " --nb-hist NB_HIST Number of samples processed at once by the network (a.k.a chunk size).\n", " Defaults to 1024.\n", " -i, --ignore-boundaries, --no-ignore-boundaries\n", " Minimize edge effects by discarding predictions at the edges of chunks.\n", " Defaults to True.\n", " --batch-norm, --no-batch-norm\n", " Batch normalize.\n", " Defaults to True.\n", " --nb-pre-conv NB_PRE_CONV\n", " Downsampling rate. Adds downsampling frontend if not 0.\n", " TCN_TCN: adds a frontend of N conv blocks (conv-relu-batchnorm-maxpool2) to the TCN.\n", " TCN_STFT: adds a trainable STFT frontend.\n", " Defaults to 0 (no frontend).\n", " --pre-nb-dft PRE_NB_DFT\n", " Number of filters (roughly corresponding to filters) in the STFT frontend.\n", " Defaults to 64.\n", " --pre-kernel-size PRE_KERNEL_SIZE\n", " Duration of filters (=kernels) in samples in the pre-processing TCN.\n", " Defaults to 3.\n", " --pre-nb-filters PRE_NB_FILTERS\n", " Number of filters per layer in the pre-processing TCN.\n", " Defaults to 16.\n", " --pre-nb-conv PRE_NB_CONV\n", " --nb-lstm-units NB_LSTM_UNITS\n", " If >0, adds LSTM with given number of units to the output of the stack of TCN blocks.\n", " Defaults to 0 (no LSTM layer).\n", " --verbose VERBOSE Verbosity of training output (0 - no output(?), 1 - progress bar, 2 - one line per epoch).\n", " Defaults to 2.\n", " --batch-size BATCH_SIZE\n", " Batch size\n", " Defaults to 32.\n", " --nb-epoch NB_EPOCH Maximal number of training epochs.\n", " Training will stop early if validation loss did not decrease in the last 20 epochs.\n", " Defaults to 400.\n", " --learning-rate LEARNING_RATE\n", " Learning rate of the model. Defaults should work in most cases.\n", " Values typically range between 0.1 and 0.00001.\n", " If None, uses per model defaults: \"tcn\" 0.0001, \"tcn_stft\" 0.0005).\n", " Defaults to None.\n", " --reduce-lr, --no-reduce-lr\n", " Reduce learning rate on plateau.\n", " Defaults to False.\n", " --reduce-lr-patience REDUCE_LR_PATIENCE\n", " Number of epochs w/o a reduction in validation loss after which to trigger a reduction in learning rate.\n", " Defaults to 5.\n", " -f FRACTION_DATA, --fraction-data FRACTION_DATA\n", " Fraction of training and validation to use for training.\n", " Defaults to 1.0.\n", " --seed SEED Random seed to reproducible select fractions of the data.\n", " Defaults to None (no seed).\n", " --batch-level-subsampling, --no-batch-level-subsampling\n", " Select fraction of data for training from random subset of shuffled batches.\n", " If False, select a continuous chunk of the recording.\n", " Defaults to False.\n", " -t, --tensorboard, --no-tensorboard\n", " Write tensorboard logs to save_dir. Defaults to False.\n", " --neptune-api-token NEPTUNE_API_TOKEN\n", " API token for logging to neptune.ai. Defaults to None (no logging).\n", " --neptune-project NEPTUNE_PROJECT\n", " Project to log to for neptune.ai. 
Defaults to None (no logging).\n", " --log-messages, --no-log-messages\n", " Sets logging level to INFO.\n", " Defaults to False (will follow existing settings).\n", " --nb-stacks NB_STACKS\n", " Unused if model name is \"tcn\" or \"tcn_stft\". Defaults to 2.\n", " -w, --with-y-hist, --no-with-y-hist\n", " Unused if model name is \"tcn\" or \"tcn_stft\". Defaults to True.\n", " -x X_SUFFIX, --x-suffix X_SUFFIX\n", " Select specific training data based on suffix (e.g. x_suffix).\n", " Defaults to '' (will use the standard data 'x')\n", " --balance, --no-balance\n", " Balance data. Weights class-wise errors by the inverse of the class frequencies.\n", " Defaults to False.\n", " --version-data, --no-version-data\n", " Save MD5 hash of the data_dir to log and params.yaml.\n", " Defaults to True (set to False for large datasets since it can be slow).\n", "\u001b[0m" ] } ], "source": [ "!das train --help" ] } ], "metadata": { "interpreter": { "hash": "97e399eb41f39eece155bd3046c19e1bcac896c178036c5a0e917146c5ea4385" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.11" } }, "nbformat": 4, "nbformat_minor": 4 }