das.segment_utils
Segment (syllable) utilities.
- das.segment_utils.fill_gaps(labels: numpy.array, gap_dur: int = 100) → numpy.array
Fill short gaps in a sequence of labelled samples.
---111111-1111---
->---11111111111---
- Parameters
labels (np.array) – Sequence of labelled samples.
gap_dur (int, optional) – Minimal gap duration, in samples; shorter gaps are filled. Defaults to 100.
- Returns
Labelled samples with short gaps filled.
- Return type
np.array
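The gap-filling idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the library's implementation: the helper name `fill_short_gaps` is hypothetical, and it assumes that 0 marks silence, that only gaps strictly shorter than `gap_dur` are filled, that gaps touching either end of the array are left alone, and that a filled gap takes the label to its left.

```python
import numpy as np


def fill_short_gaps(labels: np.ndarray, gap_dur: int = 100) -> np.ndarray:
    """Hypothetical helper: fill interior runs of 0 shorter than gap_dur."""
    filled = labels.copy()
    if labels.size == 0:
        return filled

    is_gap = labels == 0
    # Indices at which the gap mask switches between True and False.
    edges = np.flatnonzero(np.diff(is_gap.astype(int))) + 1
    starts = np.concatenate(([0], edges))
    stops = np.concatenate((edges, [len(labels)]))

    for start, stop in zip(starts, stops):
        interior = start > 0 and stop < len(labels)
        if is_gap[start] and interior and (stop - start) < gap_dur:
            # Assumption: the gap inherits the label of the preceding sample.
            filled[start:stop] = labels[start - 1]
    return filled


# The diagram above: ---111111-1111--- becomes ---11111111111---
labels = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0])
print(fill_short_gaps(labels, gap_dur=2))
```

Whether the actual function treats the threshold as inclusive, or how it handles a gap between two different labels, is not stated in this reference and should be checked against the source.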
- das.segment_utils.label_syllables_by_majority(labels: numpy.array, onsets_seconds: List[float], offsets_seconds: List[float], samplerate: float) → Tuple[numpy.array, numpy.array]
Label syllables by a majority vote.
- Parameters
labels (np.array) – Sequence of noisy per-sample labels.
onsets_seconds (List[float]) – Onset of each syllable in labels, in seconds.
offsets_seconds (List[float]) – Offset of each syllable in labels, in seconds.
samplerate (float) – Sample rate of labels, in Hz.
- Returns
The label assigned to each syllable, and the cleaned sequence of per-sample labels.
- Return type
Tuple[np.array, np.array]
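A majority vote over each syllable could look roughly like the sketch below. The helper name `majority_labels` is hypothetical, and several details are assumptions rather than documented behaviour: onsets and offsets are rounded to the nearest sample, every value inside a syllable (including 0) takes part in the vote, samples outside any syllable are set to 0, and an empty syllable gets label 0.

```python
from typing import List, Tuple

import numpy as np


def majority_labels(
    labels: np.ndarray,
    onsets_seconds: List[float],
    offsets_seconds: List[float],
    samplerate: float,
) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: one label per syllable by majority vote."""
    clean = np.zeros_like(labels)
    syllables = []
    for onset, offset in zip(onsets_seconds, offsets_seconds):
        # Convert seconds to sample indices.
        start = int(round(onset * samplerate))
        stop = int(round(offset * samplerate))
        segment = labels[start:stop]
        if segment.size == 0:
            syllables.append(0)  # assumption: empty syllables map to 0
            continue
        values, counts = np.unique(segment, return_counts=True)
        winner = values[np.argmax(counts)]  # most frequent label wins
        syllables.append(winner)
        clean[start:stop] = winner
    return np.array(syllables), clean


noisy = np.array([0, 0, 1, 1, 2, 1, 0, 0, 2, 2, 1, 2, 0, 0])
syllables, clean = majority_labels(noisy, [0.2, 0.8], [0.6, 1.2], samplerate=10.0)
print(syllables)
print(clean)
```

In this toy input, the stray `2` inside the first syllable and the stray `1` inside the second are outvoted, so each syllable ends up with a single consistent label.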
- das.segment_utils.levenshtein(seq1: str, seq2: str) → float
Compute the Levenshtein edit distance between two strings.
Corresponds to the minimal number of insertions, deletions, and substitutions required to transform seq1 into seq2.
- Parameters
seq1 (str) – String to transform.
seq2 (str) – Target string.
- Returns
The Levenshtein distance between seq1 and seq2.
- Return type
float
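For reference, the textbook dynamic-programming formulation of the Levenshtein distance is sketched below. The function name `edit_distance` is hypothetical and this is not necessarily how the library computes it; note also that this sketch returns an `int`, whereas the signature above declares `float`.

```python
def edit_distance(seq1: str, seq2: str) -> int:
    """Hypothetical helper: standard Levenshtein distance via a full DP table."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    # dist[i][j] = edits needed to turn seq1[:i] into seq2[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters
    for j in range(cols):
        dist[0][j] = j  # insert all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if seq1[i - 1] == seq2[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
    return dist[-1][-1]


print(edit_distance("kitten", "sitting"))  # 3
```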
- das.segment_utils.remove_short(labels: numpy.array, min_len: int = 100) → numpy.array
Remove short syllables from sequence of labelled samples.
---1111-1---1--
->---1111--------
- Parameters
labels (np.array) – Sequence of labelled samples.
min_len (int, optional) – Minimal segment (syllable) duration, in samples; shorter segments are removed. Defaults to 100.
- Returns
Labelled samples with short syllables removed.
- Return type
np.array
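Removing short syllables can be sketched as a run-length scan. The helper name `drop_short_segments` is hypothetical, and the sketch assumes that 0 marks silence, that a segment is a run of identical non-zero labels, and that only runs strictly shorter than `min_len` are reset to 0; the library's exact threshold handling is not documented here.

```python
import numpy as np


def drop_short_segments(labels: np.ndarray, min_len: int = 100) -> np.ndarray:
    """Hypothetical helper: zero out labelled runs shorter than min_len."""
    cleaned = labels.copy()
    if labels.size == 0:
        return cleaned

    # Indices at which the label value changes, i.e. run boundaries.
    edges = np.flatnonzero(np.diff(labels)) + 1
    starts = np.concatenate(([0], edges))
    stops = np.concatenate((edges, [len(labels)]))

    for start, stop in zip(starts, stops):
        if labels[start] != 0 and (stop - start) < min_len:
            cleaned[start:stop] = 0
    return cleaned


# The diagram above: ---1111-1---1-- becomes ---1111--------
labels = np.array([0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0])
print(drop_short_segments(labels, min_len=2))
```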
- das.segment_utils.syllable_error_rate(true: str, pred: str) → float
Compute the Levenshtein edit distance normalized by length of true.
- Parameters
true (str) – Ground truth labels for a sequence of syllables.
pred (str) – Predicted labels for a sequence of syllables.
- Raises
TypeError – if either input is not a str.
Levenshtein distance normalized by length of true.
- Return type
float
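As a worked illustration of the normalization, the sketch below divides an edit distance by `len(true)`. The helper name `error_rate` is hypothetical; it assumes each character stands for one syllable label and that `true` is non-empty (an empty `true` would divide by zero here). It uses a compact two-row variant of the standard dynamic program so the example stands on its own.

```python
def error_rate(true: str, pred: str) -> float:
    """Hypothetical helper: edit distance between pred and true, over len(true)."""
    if not isinstance(true, str) or not isinstance(pred, str):
        raise TypeError("true and pred must both be str")

    # previous[j] = distance between the processed prefix of true and pred[:j]
    previous = list(range(len(pred) + 1))
    for i, t in enumerate(true, start=1):
        current = [i]
        for j, p in enumerate(pred, start=1):
            current.append(
                min(
                    previous[j] + 1,             # drop a symbol of true
                    current[j - 1] + 1,          # extra symbol in pred
                    previous[j - 1] + (t != p),  # substitution or match
                )
            )
        previous = current
    return previous[-1] / len(true)


print(error_rate("ABCD", "ABD"))  # one missed syllable out of four: 0.25
print(error_rate("A", "ABC"))     # two spurious syllables for one true: 2.0
```

Because the distance is divided by the length of `true` rather than by the longer of the two sequences, the rate can exceed 1 when the prediction contains many spurious syllables.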