das.make_dataset

das.make_dataset.blur_events(event_trace: numpy.ndarray, event_std_seconds: float, samplerate: float) numpy.ndarray[source]

Blur an event trace with a Gaussian.

Parameters
  • event_trace (np.ndarray) – shape (N,)

  • event_std_seconds (float) – Width (standard deviation) of the Gaussian, in seconds.

  • samplerate (float) – Sample rate of event_trace in Hz.

Returns

blurred event trace

Return type

np.ndarray
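The docstring does not specify the blur beyond "Gaussian", but a minimal numpy sketch of what such a blur plausibly does is below. The kernel construction and the 4-sigma truncation radius are assumptions for illustration, not taken from das:

```python
import numpy as np

def blur_events(event_trace, event_std_seconds, samplerate):
    """Convolve a binary event trace with a normalized Gaussian kernel."""
    sigma = event_std_seconds * samplerate  # Gaussian std in samples
    radius = int(4 * sigma)  # truncate the kernel at 4 sigma (assumption)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t**2 / (2 * sigma**2))
    kernel /= kernel.sum()  # normalize so total event mass is preserved
    return np.convolve(event_trace, kernel, mode='same')

trace = np.zeros(1000)
trace[500] = 1.0  # single event at sample 500
blurred = blur_events(trace, event_std_seconds=0.01, samplerate=1000.0)
```

Normalizing the kernel keeps the integral of the trace unchanged, so the blurred trace still sums to the number of events.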

das.make_dataset.events_to_probabilities(eventsamples: List[int], desired_len: Optional[int] = None, extent: int = 61)[source]

Converts list of events to one-hot-encoded probability vectors.

Parameters
  • eventsamples (List[int]) – List of event “times” in samples.

  • desired_len (int, optional) – Length of the probability vector. Events exceeding desired_len will be ignored. Defaults to max(eventsamples) + extent.

  • extent (int, optional) – Temporal extent of an event in the probability vector. Each event will be represented as a box of extent samples centered on the event. Defaults to 61 samples (+/-30 samples).

Returns

np.array with shape [desired_len, 2]

where probabilities[:, 0] corresponds to the probability of no event and probabilities[:, 1] corresponds to the probability of an event.

Return type

probabilities
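A self-contained sketch of the documented contract (boxes of extent samples centered on each event, events beyond desired_len ignored). This illustrates the behavior described above, not the actual das implementation:

```python
import numpy as np

def events_to_probabilities(eventsamples, desired_len=None, extent=61):
    """One-hot encode event samples as [no-event, event] probabilities."""
    if desired_len is None:
        desired_len = max(eventsamples) + extent
    probabilities = np.zeros((desired_len, 2))
    probabilities[:, 0] = 1.0  # no-event probability everywhere by default
    half = extent // 2
    for event in eventsamples:
        if event >= desired_len:
            continue  # events exceeding desired_len are ignored
        lo, hi = max(0, event - half), min(desired_len, event + half + 1)
        probabilities[lo:hi, 1] = 1.0
        probabilities[lo:hi, 0] = 0.0
    return probabilities

probs = events_to_probabilities([100], desired_len=200)
```

With the default extent of 61, a single event at sample 100 marks samples 70 through 130 as event samples, and the two columns sum to one everywhere.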

das.make_dataset.infer_class_info(df: pandas.core.frame.DataFrame)[source]

Infer class names and class types from a DataFrame of annotations.

Parameters

df (pd.DataFrame) – DataFrame of annotations.

Returns

[description]

Return type

[type]

das.make_dataset.init_store(nb_channels: int, nb_classes: int, samplerate: Optional[float] = None, make_single_class_datasets: bool = False, class_names: Optional[List[str]] = None, class_types: Optional[List[str]] = None, store_type=<class 'zarr.storage.TempStore'>, store_name: str = 'store.zarr', chunk_len: int = 1000000)[source]

Initialize the data store that will hold the assembled dataset.

Parameters
  • nb_channels (int) – Number of audio channels.

  • nb_classes (int) – Number of classes. Note: could be inferred from class_names.

  • samplerate (float, optional) – Sample rate of the audio data in Hz. Defaults to None.

  • make_single_class_datasets (bool, optional) – Make y_suffix datasets and attrs['class_names/types_suffix']. Defaults to False.

  • class_names (List[str], optional) – Names of the classes. Defaults to None.

  • class_types (List[str], optional) – 'event' or 'segment'. Defaults to None.

  • store_type (type, optional) – Class of the store (a zarr store class). Defaults to zarr.storage.TempStore.

  • store_name (str, optional) – Name of the store. Defaults to 'store.zarr'.

  • chunk_len (int, optional) – Chunk length of the store's datasets. Defaults to 1_000_000.

Raises
  • ValueError – [description]

  • ValueError – [description]

Returns

[description]

Return type

[type]

das.make_dataset.make_annotation_matrix(df: pandas.core.frame.DataFrame, nb_samples: int, samplerate: float, class_names: Optional[List[str]] = None) numpy.ndarray[source]

One-hot encode a list of song timings to a binary matrix.

Parameters
  • df (pd.DataFrame) – DataFrame with the following columns:
    – name: class name of the syllable/song event.
    – start_seconds: start of the song event in the audio recording, in seconds.
    – stop_seconds: stop of the song event in the audio recording, in seconds.

  • nb_samples (int) – Length of the annotation matrix in samples.

  • samplerate (float) – Sample rate for the annotation matrix in Hz.

  • class_names (List[str], optional) – List of class names. If provided, the annotation matrix will be built only for the events in class_names. Otherwise, the matrix will be built for all class names in the df. Order in class_names determines the column order in the annotation matrix.

Returns

Binary matrix [nb_samples, nb_classes]

with 1 indicating the presence of a class at a specific sample.

Return type

np.ndarray
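The one-hot encoding described above can be sketched in a few lines of numpy/pandas. This is an illustration of the documented behavior under the stated column names, not the das implementation itself:

```python
import numpy as np
import pandas as pd

def make_annotation_matrix(df, nb_samples, samplerate, class_names=None):
    """One-hot encode start/stop annotations into a [nb_samples, nb_classes] matrix."""
    if class_names is None:
        class_names = sorted(df['name'].unique())  # all classes in the df
    mat = np.zeros((nb_samples, len(class_names)))
    for _, row in df.iterrows():
        if row['name'] not in class_names:
            continue  # only encode the requested classes
        col = class_names.index(row['name'])  # column order follows class_names
        start = int(row['start_seconds'] * samplerate)
        stop = int(row['stop_seconds'] * samplerate)
        mat[start:stop, col] = 1.0
    return mat

df = pd.DataFrame({'name': ['pulse', 'sine'],
                   'start_seconds': [0.1, 0.5],
                   'stop_seconds': [0.2, 0.8]})
annot = make_annotation_matrix(df, nb_samples=1000, samplerate=1000.0,
                               class_names=['pulse', 'sine'])
```

At a sample rate of 1000 Hz, the 'pulse' annotation from 0.1 s to 0.2 s fills rows 100–199 of column 0, and the 'sine' annotation fills rows 500–799 of column 1.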

das.make_dataset.make_gaps(y: numpy.ndarray, gap_seconds: float, samplerate: float, start_seconds: Optional[List[float]] = None, stop_seconds: Optional[List[float]] = None) numpy.ndarray[source]

Introduce short gaps between adjacent segments of different classes by setting the samples around each boundary to the no-song class:

0011112222000111100 -> 0011100222000111100 (gap_fullwidth=2)

Parameters
  • y (np.ndarray) – One-hot encoded labels [T, nb_labels]

  • gap_seconds (float) – Duration of the gap, in seconds.

  • samplerate (float) – Sample rate of y in Hz.

  • start_seconds (List[float], optional) – Segment starts, in seconds. Defaults to None.

  • stop_seconds (List[float], optional) – Segment stops, in seconds. Defaults to None.

Returns

One-hot encoded labels with gaps, [T, nb_labels].

Return type

np.ndarray
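The docstring's worked example can be reproduced with a short sketch. For clarity this operates on the integer-label form used in the example rather than on the one-hot y the function actually takes; the centering of the gap on the boundary and the gap_fullwidth name are read off the example, so treat the details as assumptions:

```python
import numpy as np

def make_gaps_labels(labels, gap_fullwidth):
    """Zero out gap_fullwidth samples centered on each boundary between
    two different non-zero classes (0 = no song)."""
    out = labels.copy()
    half = gap_fullwidth // 2
    boundaries = np.nonzero(np.diff(labels))[0]  # last index before each change
    for b in boundaries:
        if labels[b] != 0 and labels[b + 1] != 0:  # only between two song classes
            out[b - half + 1:b + half + 1] = 0
    return out

labels = np.array([int(c) for c in '0011112222000111100'])
gapped = make_gaps_labels(labels, gap_fullwidth=2)
```

Only the 1→2 boundary gets a gap; boundaries to or from the no-song class are left untouched, matching the example string above.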

das.make_dataset.normalize_probabilities(p: numpy.ndarray) numpy.ndarray[source]

Normalize probabilities so they sum to 1 for each sample.

Parameters

p (np.ndarray) – Probabilities to normalize.

Returns

Normalized probabilities.

Return type

np.ndarray
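Assuming this renormalizes the class probabilities at each sample (consistent with the [T, 2] vectors produced by events_to_probabilities), a one-line sketch:

```python
import numpy as np

def normalize_probabilities(p):
    # divide each row by its sum so the class probabilities of every sample sum to 1
    return p / p.sum(axis=-1, keepdims=True)

p = np.array([[0.2, 0.2],
              [1.0, 3.0]])
p_norm = normalize_probabilities(p)
```

Broadcasting with keepdims=True makes this work for any trailing class axis, e.g. the [T, 2] outputs of events_to_probabilities.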