das.make_dataset
- das.make_dataset.blur_events(event_trace: numpy.ndarray, event_std_seconds: float, samplerate: float) -> numpy.ndarray
Blur event trace with a gaussian.
- Parameters
event_trace (np.ndarray) – shape (N,)
event_std_seconds (float) – Width of the Gaussian in seconds
samplerate (float) – sample rate of event_trace
- Returns
blurred event trace
- Return type
np.ndarray
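As a rough illustration of what blurring an event trace with a Gaussian means, the sketch below convolves a 1-D trace with a Gaussian kernel using only NumPy. The function name `blur_events_sketch`, the kernel support of about four standard deviations, and the peak normalisation are assumptions made for this example. They are not taken from the das source, which may differ.

```python
import numpy as np


def blur_events_sketch(event_trace, event_std_seconds, samplerate):
    """Illustrative only: convolve a 1-D event trace with a Gaussian kernel."""
    # Convert the Gaussian width from seconds to samples.
    std_samples = event_std_seconds * samplerate
    # Assumption: a kernel spanning about +/- 4 standard deviations.
    half_width = int(np.ceil(4 * std_samples))
    t = np.arange(-half_width, half_width + 1)
    kernel = np.exp(-0.5 * (t / std_samples) ** 2)
    # Assumption: normalise so that an isolated event keeps a peak value of 1.
    kernel /= kernel.max()
    # mode="same" keeps the output the same length as the input trace.
    return np.convolve(event_trace, kernel, mode="same")


# Two isolated events in a 1000-sample trace sampled at 10 kHz.
trace = np.zeros(1000)
trace[[200, 600]] = 1.0
blurred = blur_events_sketch(trace, event_std_seconds=0.002, samplerate=10_000)
```

The blurred trace has the same shape as the input, with a smooth bump centered on each event.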
- das.make_dataset.events_to_probabilities(eventsamples: List[int], desired_len: Optional[int] = None, extent: int = 61)
Converts list of events to one-hot-encoded probability vectors.
- Parameters
eventsamples (List[int]) – List of event “times” in samples.
desired_len (int, optional) – Length of the probability vector. Events exceeding desired_len will be ignored. Defaults to max(eventsamples) + extent.
extent (int, optional) – Temporal extent of an event in the probability vector. Each event will be represented as a box with a duration of extent samples centered on the event. Defaults to 61 samples (+/-30 samples).
- Returns
probabilities, an np.ndarray with shape [desired_len, 2], where probabilities[:, 0] corresponds to the probability of no event and probabilities[:, 1] corresponds to the probability of an event.
- Return type
np.ndarray
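The box encoding described for this function can be sketched as follows. This is a minimal re-implementation written for illustration, not the das source: the name `events_to_probabilities_sketch` is hypothetical, and the clipping of boxes at the array edges is an assumption.

```python
import numpy as np


def events_to_probabilities_sketch(eventsamples, desired_len=None, extent=61):
    """Illustrative only: encode events as boxes in a two-column array."""
    if desired_len is None:
        desired_len = max(eventsamples) + extent
    probabilities = np.zeros((desired_len, 2))
    probabilities[:, 0] = 1.0  # column 0: no event
    half = extent // 2  # 30 samples on either side for extent=61
    for sample in eventsamples:
        if sample >= desired_len:
            continue  # events beyond desired_len are ignored
        # Assumption: boxes are clipped at the array boundaries.
        start = max(sample - half, 0)
        stop = min(sample + half + 1, desired_len)
        probabilities[start:stop, 1] = 1.0  # column 1: event
        probabilities[start:stop, 0] = 0.0
    return probabilities


probs = events_to_probabilities_sketch([100, 300], desired_len=400)
```

With the default `extent=61`, the event at sample 100 occupies samples 70 to 130 inclusive in column 1, and every row sums to 1.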
- das.make_dataset.infer_class_info(df: pandas.core.frame.DataFrame)
[summary]
- Parameters
df ([type]) – [description]
- Returns
[description]
- Return type
[type]
- das.make_dataset.init_store(nb_channels: int, nb_classes: int, store, samplerate: Optional[float] = None, make_single_class_datasets: bool = False, class_names: List[str] = None, class_types: List[str] = None, chunk_len: int = 1000000)
[summary]
- Parameters
nb_channels (int) – [description]
nb_classes (int) – [description] <- should infer from class_names!
store – zarr store
samplerate (float, optional) – [description]. Defaults to None.
make_single_class_datasets (bool, optional) – make y_suffix and attrs[‘class_names/types_suffix’]. Defaults to False.
class_names (List[str], optional) – [description]. Defaults to None.
class_types (List[str], optional) – ‘event’ or ‘segment’. Defaults to None.
chunk_len (int, optional) – [description]. Defaults to 1_000_000.
- Raises
ValueError – [description]
ValueError – [description]
- Returns
[description]
- Return type
[type]
- das.make_dataset.make_annotation_matrix(df: pandas.core.frame.DataFrame, nb_samples: int, samplerate: float, class_names: Optional[List[str]] = None) -> numpy.ndarray
One-hot encode a list of song timings to a binary matrix.
- Parameters
df (pd.DataFrame) – DataFrame with the following columns:
  - name: class name of the syllable/song event
  - start_seconds: start of the song event in the audio recording in seconds.
  - stop_seconds: stop of the song event in the audio recording in seconds.
nb_samples (int) – Length of the annotation matrix in samples.
samplerate (float) – Sample rate for the annotation matrix in Hz.
class_names (List[str], optional) – List of class names. If provided, the annotation matrix will be built only for the events in class_names. Otherwise, the matrix will be built for all class names in the df. The order in class_names determines the column order of the annotation matrix.
- Returns
Binary matrix [nb_samples, nb_classes] with 1 indicating the presence of a class at a specific sample.
- Return type
np.ndarray
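A minimal sketch of this kind of one-hot encoding is shown below. It is not the das implementation: the name `make_annotation_matrix_sketch` is hypothetical, and two behaviours are assumptions made for the example, namely that times are rounded to the nearest sample and that an event with identical start and stop times marks a single sample.

```python
import numpy as np
import pandas as pd


def make_annotation_matrix_sketch(df, nb_samples, samplerate, class_names=None):
    """Illustrative only: mark each annotated interval in its class column."""
    if class_names is None:
        # Assumption: fall back to all class names in the df, sorted.
        class_names = sorted(df["name"].unique())
    matrix = np.zeros((nb_samples, len(class_names)), dtype=np.uint8)
    for _, row in df.iterrows():
        if row["name"] not in class_names:
            continue  # skip classes that were not requested
        col = class_names.index(row["name"])
        start = int(round(row["start_seconds"] * samplerate))
        stop = int(round(row["stop_seconds"] * samplerate))
        # Assumption: events (start == stop) occupy one sample.
        stop = max(stop, start + 1)
        matrix[start:stop, col] = 1
    return matrix


df = pd.DataFrame(
    {
        "name": ["pulse", "sine", "pulse"],
        "start_seconds": [0.10, 0.30, 0.70],
        "stop_seconds": [0.10, 0.50, 0.70],
    }
)
annotations = make_annotation_matrix_sketch(
    df, nb_samples=1000, samplerate=1000.0, class_names=["pulse", "sine"]
)
```

Here column 0 holds the two pulse events and column 1 holds the sine segment, following the order given in `class_names`.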
- das.make_dataset.make_gaps(y: numpy.ndarray, gap_seconds: float, samplerate: float, start_seconds: Optional[List[float]] = None, stop_seconds: Optional[List[float]] = None) -> numpy.ndarray
[summary]
0011112222000111100 -> 0011100222000111100 (gap_fullwidth=2)
- Parameters
y (np.ndarray) – One-hot encoded labels [T, nb_labels]
gap_seconds (float) – [description]
samplerate (float) – [description]
start_seconds –
stop_seconds –
- Returns
[description]
- Return type
np.ndarray
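The label example given for make_gaps (0011112222000111100 -> 0011100222000111100) can be reproduced with the sketch below. It is an interpretation of that single example rather than the das implementation: it operates on an integer label vector instead of one-hot labels, takes the gap width in samples instead of seconds, and assumes that gaps are inserted only where two different non-zero labels touch. The name `make_gaps_sketch` is hypothetical.

```python
import numpy as np


def make_gaps_sketch(labels, gap_fullwidth):
    """Illustrative only: zero out samples around label-to-label transitions."""
    labels = np.asarray(labels).copy()
    half = gap_fullwidth // 2
    # Indices i where labels[i] and labels[i + 1] are different non-zero classes.
    touching = (labels[:-1] != labels[1:]) & (labels[:-1] > 0) & (labels[1:] > 0)
    boundaries = np.where(touching)[0]
    for b in boundaries:
        # Remove `half` samples on each side of the transition.
        labels[max(b - half + 1, 0) : b + half + 1] = 0
    return labels


original = [int(c) for c in "0011112222000111100"]
gapped = make_gaps_sketch(original, gap_fullwidth=2)
gapped_string = "".join(str(v) for v in gapped)
```

Segments that are already separated by background, such as the final run of 1s in the example, are left unchanged under this interpretation.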