das.make_dataset

das.make_dataset.blur_events(event_trace: numpy.ndarray, event_std_seconds: float, samplerate: float) → numpy.ndarray

Blur an event trace with a Gaussian.

Parameters

event_trace (np.ndarray) – shape (N,)
event_std_seconds (float) – Width (standard deviation) of the Gaussian in seconds
samplerate (float) – sample rate of event_trace

Returns

blurred event trace

Return type

np.ndarray
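
The blurring step can be illustrated with plain NumPy. This is a minimal sketch, not the das implementation: the function name blur_events_sketch, the kernel cutoff at four standard deviations, and the unit-area normalization are all assumptions made for illustration.

```python
import numpy as np


def blur_events_sketch(event_trace, event_std_seconds, samplerate):
    """Illustrative Gaussian blur of a 1-D event trace (not the das source)."""
    # Convert the Gaussian width from seconds to samples.
    std_samples = event_std_seconds * samplerate
    # Kernel spanning +/- 4 standard deviations (assumed cutoff).
    half_width = max(int(np.ceil(4 * std_samples)), 1)
    t = np.arange(-half_width, half_width + 1)
    kernel = np.exp(-0.5 * (t / std_samples) ** 2)
    kernel /= kernel.sum()  # unit area, an assumption of this sketch
    # mode="same" keeps the output the same length as the input, shape (N,).
    return np.convolve(event_trace, kernel, mode="same")


trace = np.zeros(1000)
trace[500] = 1.0  # a single event at sample 500
blurred = blur_events_sketch(trace, event_std_seconds=0.002, samplerate=10_000)
```

With a sample rate of 10 kHz, a width of 0.002 seconds corresponds to a standard deviation of 20 samples, so the single event becomes a smooth bump centered on sample 500.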

das.make_dataset.events_to_probabilities(eventsamples: List[int], desired_len: Optional[int] = None, extent: int = 61)

Converts list of events to one-hot-encoded probability vectors.

Parameters

eventsamples (List[int]) – List of event “times” in samples.
desired_len (int, optional) – Length of the probability vector. Events exceeding desired_len will be ignored. Defaults to max(eventsamples) + extent.
extent (int, optional) – Temporal extent of an event in the probability vector. Each event will be represented as a box with a duration of extent samples centered on the event. Defaults to 61 samples (+/-30 samples).

Returns

np.array with shape [desired_len, 2], where probabilities[:, 0] corresponds to the probability of no event and probabilities[:, 1] corresponds to the probability of an event.

Return type

np.ndarray
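
The box encoding described above can be sketched as follows. This is an illustration written against the parameter descriptions, not the das source: the function name events_to_probabilities_sketch is hypothetical, and clipping boxes at the array edges is an assumption.

```python
import numpy as np


def events_to_probabilities_sketch(eventsamples, desired_len=None, extent=61):
    """Illustrative box encoding of event times (not the das source)."""
    if desired_len is None:
        desired_len = max(eventsamples) + extent
    probabilities = np.zeros((desired_len, 2))
    half = extent // 2  # 30 samples on each side for extent=61
    for sample in eventsamples:
        if sample >= desired_len:
            continue  # events beyond the vector are ignored
        # Clip the box at the array edges (assumed behavior).
        start = max(sample - half, 0)
        stop = min(sample + half + 1, desired_len)
        probabilities[start:stop, 1] = 1.0
    # Column 0 is the complement: the probability of no event.
    probabilities[:, 0] = 1.0 - probabilities[:, 1]
    return probabilities


p = events_to_probabilities_sketch([100, 300], desired_len=400, extent=61)
```

Here each of the two events occupies samples event - 30 through event + 30 in column 1, and every row sums to 1.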

das.make_dataset.infer_class_info(df: pandas.core.frame.DataFrame)

[summary]

Parameters: df ([type]) – [description]
Returns: [description]
Return type: [type]

das.make_dataset.init_store(nb_channels: int, nb_classes: int, store, samplerate: Optional[float] = None, make_single_class_datasets: bool = False, class_names: List[str] = None, class_types: List[str] = None, chunk_len: int = 1000000)

[summary]

Parameters

nb_channels (int) – [description]
nb_classes (int) – [description] <- should infer from class_names!
store – zarr store
samplerate (float, optional) – [description]. Defaults to None.
make_single_class_datasets (bool, optional) – make y_suffix and attrs[‘class_names/types_suffix’]. Defaults to False.
class_names (List[str], optional) – [description]. Defaults to None.
class_types (List[str], optional) – ‘event’ or ‘segment’. Defaults to None.
chunk_len (int, optional) – [description]. Defaults to 1_000_000.

Raises

ValueError – [description]
ValueError – [description]

Returns

[description]

Return type

[type]

das.make_dataset.make_annotation_matrix(df: pandas.core.frame.DataFrame, nb_samples: int, samplerate: float, class_names: Optional[List[str]] = None) → numpy.ndarray

One-hot encode a list of song timings to a binary matrix.

Parameters

df (pd.DataFrame) – DataFrame with the following columns:
- name: class name of the syllable/song event.
- start_seconds: start of the song event in the audio recording in seconds.
- stop_seconds: stop of the song event in the audio recording in seconds.
nb_samples (int) – Length of the annotation matrix in samples.
samplerate (float) – Sample rate for the annotation matrix in Hz.
class_names (List[str], optional) – List of class names. If provided, the annotation matrix will be built only for the events in class_names. Otherwise, the matrix will be built for all class names in the df. The order of class_names determines the column order of the annotation matrix.

Returns

Binary matrix of shape [nb_samples, nb_classes], with 1 indicating the presence of a class at a specific sample.

Return type

np.ndarray
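
The encoding can be sketched in NumPy as follows. This is an illustration, not the das source: the function name annotation_matrix_sketch is hypothetical, a list of (name, start_seconds, stop_seconds) tuples stands in for the rows of the DataFrame, and the rounding of times to samples, the inclusive stop sample, and the default class order are assumptions.

```python
import numpy as np


def annotation_matrix_sketch(annotations, nb_samples, samplerate, class_names=None):
    """Illustrative binary annotation matrix (not the das source).

    `annotations` is a list of (name, start_seconds, stop_seconds) tuples that
    stands in for the rows of the DataFrame described above.
    """
    if class_names is None:
        # Assumed default: all names, in order of first appearance.
        class_names = list(dict.fromkeys(name for name, _, _ in annotations))
    matrix = np.zeros((nb_samples, len(class_names)), dtype=np.uint8)
    for name, start_seconds, stop_seconds in annotations:
        if name not in class_names:
            continue  # only classes listed in class_names are encoded
        column = class_names.index(name)
        # Convert seconds to sample indices and clip to the matrix length.
        start = max(int(round(start_seconds * samplerate)), 0)
        stop = min(int(round(stop_seconds * samplerate)), nb_samples - 1)
        matrix[start:stop + 1, column] = 1  # inclusive stop, an assumption
    return matrix


annotations = [("pulse", 0.5, 0.5), ("sine", 1.0, 2.0), ("noise", 3.0, 3.5)]
matrix = annotation_matrix_sketch(
    annotations, nb_samples=40, samplerate=10, class_names=["sine", "pulse"]
)
```

Because class_names lists "sine" first, the sine segment lands in column 0 and the pulse event in column 1; the "noise" annotation is skipped because it is not in class_names.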

das.make_dataset.make_gaps(y: numpy.ndarray, gap_seconds: float, samplerate: float, start_seconds: Optional[List[float]] = None, stop_seconds: Optional[List[float]] = None) → numpy.ndarray

[summary]

0011112222000111100 -> 0011100222000111100 (gap_fullwidth=2)

Parameters

y (np.ndarray) – One-hot encoded labels [T, nb_labels]
gap_seconds (float) – [description]
samplerate (float) – [description]
start_seconds –
stop_seconds –

Returns

[description]

Return type

np.ndarray
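
The example line above (0011112222000111100 -> 0011100222000111100) can be reproduced with a short sketch. This is an interpretation of that example only, not the das source: the function name make_gaps_sketch is hypothetical, it works on an integer label vector rather than a one-hot matrix, it ignores start_seconds and stop_seconds, and it assumes that gaps are inserted only where two different non-zero labels touch.

```python
import numpy as np


def make_gaps_sketch(labels, gap_seconds, samplerate):
    """Illustrative gap insertion on an integer label vector (not the das source)."""
    labels = np.asarray(labels)
    out = labels.copy()
    # Full gap width in samples; half of it is removed on each side.
    half = int(round(gap_seconds * samplerate)) // 2
    prev, nxt = labels[:-1], labels[1:]
    # Boundaries where one non-zero label switches directly to another.
    touching = (prev != nxt) & (prev != 0) & (nxt != 0)
    for boundary in np.flatnonzero(touching) + 1:
        out[max(boundary - half, 0):boundary + half] = 0
    return out


before = np.array([int(c) for c in "0011112222000111100"])
after = make_gaps_sketch(before, gap_seconds=2.0, samplerate=1.0)
print("".join(str(v) for v in after))  # 0011100222000111100
```

Onsets and offsets (transitions from or to label 0) are left untouched in this sketch, which is why the isolated 1111 segment near the end is unchanged.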

das.make_dataset.normalize_probabilities(p: numpy.ndarray) → numpy.ndarray

[summary]

Parameters: p (np.ndarray) – [description]
Returns: [description]
Return type: np.ndarray