das.segment_utils
Segment (syllable) utilities.
- das.segment_utils.fill_gaps(labels: numpy.ndarray, gap_dur: int = 100) -> numpy.ndarray
Fill short gaps in a sequence of labelled samples.
---111111-1111---
->---11111111111---
- Parameters
labels (np.ndarray) – Sequence of labelled samples.
gap_dur (int, optional) – Minimal gap duration, in samples; shorter gaps are filled. Defaults to 100.
- Returns
Labelled samples with short gaps filled.
- Return type
np.ndarray
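The gap-filling idea can be sketched in plain NumPy. This is an illustrative re-implementation, not the code in das: the name `fill_gaps_sketch`, the helpers `to_labels` and `to_text`, the restriction to binary (0/1) labels, and the strict "shorter than `gap_dur`" comparison are all assumptions made for the example.

```python
import numpy as np


def fill_gaps_sketch(labels, gap_dur=100):
    """Fill interior runs of 0 that are shorter than gap_dur (binary labels assumed)."""
    filled = np.asarray(labels).astype(int).copy()
    # Pad with zeros so every run of 1s has a detectable onset and offset.
    change = np.diff(np.concatenate(([0], filled, [0])))
    onsets = np.where(change == 1)[0]    # index of the first 1 in each run
    offsets = np.where(change == -1)[0]  # index one past the last 1 in each run
    # A gap spans from the offset of one run to the onset of the next run.
    for gap_start, gap_end in zip(offsets[:-1], onsets[1:]):
        if gap_end - gap_start < gap_dur:
            filled[gap_start:gap_end] = 1
    return filled


def to_labels(text):
    """Hypothetical helper: '--11-' -> array([0, 0, 1, 1, 0])."""
    return np.array([int(char == "1") for char in text])


def to_text(labels):
    """Hypothetical helper: inverse of to_labels."""
    return "".join("1" if value else "-" for value in labels)


print(to_text(fill_gaps_sketch(to_labels("---111111-1111---"), gap_dur=3)))
```

With `gap_dur=3` the single-sample gap from the diagram above is filled, while the leading and trailing silences are left alone because they are not enclosed by two segments.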
- das.segment_utils.label_syllables_by_majority(labels: numpy.ndarray, onsets_seconds: numpy.ndarray, offsets_seconds: numpy.ndarray, samplerate: float) -> Tuple[numpy.ndarray, numpy.ndarray]
Label syllables by a majority vote.
- Parameters
labels (np.ndarray) – Sequence of dirty (noisy) per-sample labels.
onsets_seconds (List[float]) – Onset of each syllable in labels, in seconds.
offsets_seconds (List[float]) – Offset of each syllable in labels, in seconds.
samplerate (float) – Samplerate of labels, in Hz.
- Returns
Sequence of syllables, clean sequence of per-sample labels.
- Return type
Tuple[np.ndarray, np.ndarray]
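A majority vote over per-sample labels might look like the following sketch. It is not the das implementation: the name `majority_labels_sketch`, the rounding of seconds to sample indices, the tie-breaking rule (smallest label wins), and the choice to skip empty segments are assumptions for illustration.

```python
import numpy as np


def majority_labels_sketch(labels, onsets_seconds, offsets_seconds, samplerate):
    """Assign each syllable the most frequent label between its onset and offset."""
    labels = np.asarray(labels)
    clean = np.zeros_like(labels)
    syllables = []
    for onset, offset in zip(onsets_seconds, offsets_seconds):
        # Convert times in seconds to sample indices (rounding is an assumption).
        start = int(round(onset * samplerate))
        stop = int(round(offset * samplerate))
        segment = labels[start:stop]
        if segment.size == 0:
            continue  # nothing to vote on
        values, counts = np.unique(segment, return_counts=True)
        winner = values[np.argmax(counts)]  # ties resolve to the smallest label
        syllables.append(winner)
        clean[start:stop] = winner
    return np.array(syllables), clean


noisy = np.array([0, 0, 1, 1, 2, 1, 0, 0, 2, 2, 2, 1, 0])
syllables, clean = majority_labels_sketch(
    noisy, onsets_seconds=[0.2, 0.8], offsets_seconds=[0.6, 1.2], samplerate=10.0
)
print(syllables)  # one label per syllable
print(clean)      # per-sample labels, constant within each syllable
```

In this toy input the first syllable contains one stray `2` and the second one stray `1`; both are outvoted, so each syllable ends up with a single consistent label.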
- das.segment_utils.levenshtein(seq1: str, seq2: str) -> float
Compute the Levenshtein edit distance between two strings.
Corresponds to the minimal number of insertions, deletions, and substitutions required to transform seq1 into seq2.
- Parameters
seq1 (str) – String to transform.
seq2 (str) – Target string.
- Returns
The Levenshtein distance between seq1 and seq2.
- Return type
float
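For reference, the edit distance can be computed with the standard dynamic-programming recurrence. The sketch below is a generic textbook version, not necessarily how das implements it; the name `levenshtein_sketch` is hypothetical, and it returns an `int` rather than the `float` documented above.

```python
def levenshtein_sketch(seq1: str, seq2: str) -> int:
    """Minimal number of insertions, deletions, and substitutions turning seq1 into seq2."""
    # previous[j] holds the distance between the processed prefix of seq1 and seq2[:j].
    previous = list(range(len(seq2) + 1))
    for i, char1 in enumerate(seq1, start=1):
        current = [i]  # turning seq1[:i] into the empty string takes i deletions
        for j, char2 in enumerate(seq2, start=1):
            substitution_cost = 0 if char1 == char2 else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete char1
                    current[j - 1] + 1,                   # insert char2
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein_sketch("kitten", "sitting"))  # 3
```

Keeping only two rows of the table keeps memory proportional to the length of `seq2` rather than to the product of both lengths.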
- das.segment_utils.remove_short(labels: numpy.ndarray, min_len: int = 100) -> numpy.ndarray
Remove short syllables from sequence of labelled samples.
---1111-1---1--
->---1111--------
- Parameters
labels (np.ndarray) – Sequence of labelled samples.
min_len (int, optional) – Minimal segment (syllable) duration, in samples. Defaults to 100.
- Returns
Labelled samples with short syllables removed.
- Return type
np.ndarray
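Removing short segments follows the same run-detection pattern as gap filling. The sketch below is an illustration under assumptions, not the das code: `remove_short_sketch`, `to_labels`, and `to_text` are hypothetical names, labels are assumed binary (0/1), and segments strictly shorter than `min_len` are dropped.

```python
import numpy as np


def remove_short_sketch(labels, min_len=100):
    """Zero out runs of 1 that are shorter than min_len (binary labels assumed)."""
    cleaned = np.asarray(labels).astype(int).copy()
    # Pad with zeros so runs touching either end are detected as well.
    change = np.diff(np.concatenate(([0], cleaned, [0])))
    onsets = np.where(change == 1)[0]    # index of the first 1 in each run
    offsets = np.where(change == -1)[0]  # index one past the last 1 in each run
    for onset, offset in zip(onsets, offsets):
        if offset - onset < min_len:
            cleaned[onset:offset] = 0
    return cleaned


def to_labels(text):
    """Hypothetical helper: '--11-' -> array([0, 0, 1, 1, 0])."""
    return np.array([int(char == "1") for char in text])


def to_text(labels):
    """Hypothetical helper: inverse of to_labels."""
    return "".join("1" if value else "-" for value in labels)


print(to_text(remove_short_sketch(to_labels("---1111-1---1--"), min_len=2)))
```

With `min_len=2` the two single-sample segments from the diagram above are removed and the four-sample segment is kept.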
- das.segment_utils.syllable_error_rate(true: str, pred: str) -> float
Compute the Levenshtein edit distance normalized by length of true.
- Parameters
true (str) – Ground truth labels for a sequence of syllables. For instance, ‘ABCDAAE’.
pred (str) – Predicted labels for a sequence of syllables.
- Raises
TypeError – If either input is not a str.
- Returns
Levenshtein distance normalized by length of true.
- Return type
float
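As a worked illustration of the normalization, the sketch below divides an edit distance by the length of the ground-truth string. It is self-contained and does not call das: `syllable_error_rate_sketch` and `_edit_distance` are hypothetical names, and how the library treats an empty `true` string is not stated here, so the sketch leaves that case unhandled.

```python
def _edit_distance(seq1: str, seq2: str) -> int:
    """Textbook Levenshtein distance, included only to keep the example self-contained."""
    previous = list(range(len(seq2) + 1))
    for i, char1 in enumerate(seq1, start=1):
        current = [i]
        for j, char2 in enumerate(seq2, start=1):
            cost = 0 if char1 == char2 else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def syllable_error_rate_sketch(true: str, pred: str) -> float:
    """Edit distance between true and pred, divided by len(true)."""
    if not isinstance(true, str) or not isinstance(pred, str):
        raise TypeError("true and pred must both be str")
    # Assumption: true is non-empty; an empty string would divide by zero.
    return _edit_distance(true, pred) / len(true)


# One of the seven ground-truth syllables is missing from the prediction.
rate = syllable_error_rate_sketch("ABCDAAE", "ABCDAE")
print(rate)
```

Because the distance is divided by the length of `true` only, the rate can exceed 1 when the prediction contains many spurious syllables.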