Embedder

class music_embedding.embedder.embedder(pianoroll=None, intervals=None, default_velocity=100, origin=60, pixels_per_bar=96)

Bases: object

A class for embedding musical data, providing functionalities to convert between pianoroll representations and interval-based representations.

This class handles various operations related to musical data manipulation, including extracting notes from pianorolls, converting pianoroll data to melodic, harmonic, and barwise intervals, and vice versa. Additionally, it supports Run-Length Encoding (RLE) compression for intervals.

The constant NOTES_IN_MIDI is set to 128, reflecting the total number of MIDI notes in the standard MIDI range. This constant is used throughout the class to standardize the size of the second dimension in pianoroll arrays, ensuring they conform to MIDI standards. The pianoroll arrays are therefore structured with a shape of (?, 128), where each column represents a possible MIDI note, allowing for a consistent representation of musical data.
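The (?, 128) layout described above can be illustrated with plain NumPy. This is a minimal sketch rather than code from the library: `NOTES_IN_MIDI` is redefined locally, and the chord and number of timesteps are made up for illustration.

```python
import numpy as np

NOTES_IN_MIDI = 128  # local stand-in for the class constant described above

# Four timesteps of a C major triad (MIDI notes 60, 64, 67) at velocity 100.
n_timesteps = 4
pianoroll = np.zeros((n_timesteps, NOTES_IN_MIDI), dtype=np.uint8)
pianoroll[:, [60, 64, 67]] = 100

# Each row is one timestep; each column is one MIDI note number.
active = np.flatnonzero(pianoroll[0])
print(pianoroll.shape, active)
```

A nonzero cell means the note in that column is sounding at that timestep, with the cell value acting as its velocity.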

See also

interval: Class used for interval-related calculations.

Attributes:

pianoroll (ndarray, dtype=uint8, shape=(?, 128), optional): Pianoroll representation of musical data. The first dimension represents timesteps, and the second dimension has a fixed size of 128, corresponding to MIDI standards.
intervals (ndarray, dtype=int8, shape=(?, interval.feature_dimensions), optional): Interval representation of musical data. The first dimension represents timesteps, and the second dimension corresponds to interval features.
default_velocity (int): Default velocity used for notes in pianoroll representation. Defaults to 100.
origin (int): Reference note for melody, used as the starting note when decoding a melody. Defaults to 60 (Middle C in MIDI).
pixels_per_bar (int): Number of pixels representing each bar in a pianoroll, calculated as the time signature’s numerator multiplied by the resolution per pixel. Defaults to 96.

Methods

`chunk_sequence_of_intervals`([intervals, ...])	Breaks a long sequence of intervals into chunks of a specified length.
`extract_highest_pitch_notes_from_pianoroll`([...])	Extracts the highest pitch note at each timestep from the pianoroll attribute.
`get_RLE_from_intervals`([intervals])	Compresses a sequence of intervals using Run-Length Encoding (RLE).
`get_RLE_from_intervals_bulk`(bulk_intervals)	Bulk compresses a sequence of intervals using Run-Length Encoding (RLE).
`get_barwise_intervals_from_pianoroll`([...])	Creates a sequence of barwise intervals from a pianoroll, calculating intervals with respect to the first note of each bar.
`get_harmonic_intervals_from_pianoroll`(...[, ...])	Creates a sequence of harmonic intervals from the pianoroll relative to a reference pianoroll.
`get_intervals_from_RLE`(RLE_data)	Uncompresses a Run-Length Encoded sequence of intervals.
`get_intervals_from_RLE_bulk`(bulk_RLE_data)	Bulk uncompresses a sequence of Run-Length Encoded intervals.
`get_melodic_intervals_from_pianoroll`([pianoroll])	Creates a sequence of melodic intervals from a pianoroll.
`get_pianoroll_from_barwise_intervals`([...])	Creates a pianoroll from a sequence of barwise intervals.
`get_pianoroll_from_harmonic_intervals`([...])	Creates a pianoroll from a sequence of harmonic intervals.
`get_pianoroll_from_melodic_intervals`([...])	Creates a pianoroll from a sequence of melodic intervals.
`merge_chunked_intervals`(chunked_intervals)	Merges chunks of interval sequences into a single sequence.

chunk_sequence_of_intervals(intervals: ndarray | None = None, pixels_per_chunk: int | None = None) → ndarray

Breaks a long sequence of intervals into chunks of a specified length.

This method divides a sequence of intervals into smaller, equally-sized chunks, which can be useful for processing or analyzing data in segments. If intervals is None, the method works on self.intervals.

Parameters:

intervals (ndarray, dtype=int8, shape=(?, interval.feature_dimensions)): Sequence of intervals to be chunked. If None, uses self.intervals.
pixels_per_chunk (int): Number of pixels in each chunk. Defaults to self.pixels_per_bar if None.

Returns:

ndarray, dtype=int8: Array of chunked intervals. Shape is (?, pixels_per_chunk, interval.feature_dimensions), where ? is the number of chunks.

Raises:

TypeError: If both the intervals argument and self.intervals are None.
IndexError: If intervals shape’s second dimension is not equal to interval.feature_dimensions.
ValueError: If pixels_per_chunk is less than 1 or if it’s None and self.pixels_per_bar is less than 1.
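The chunking described above amounts to a reshape from (?, features) to (n_chunks, pixels_per_chunk, features). The sketch below shows that idea in plain NumPy and is not the library's implementation. The function name `chunk_intervals` and the value of `FEATURE_DIMS` are hypothetical. How the real method treats a trailing partial chunk is not stated here, so dropping it is an assumption.

```python
import numpy as np

def chunk_intervals(intervals, pixels_per_chunk):
    """Hypothetical helper: split (T, F) intervals into (n_chunks, pixels_per_chunk, F)."""
    if pixels_per_chunk < 1:
        raise ValueError("pixels_per_chunk must be at least 1")
    n_chunks = intervals.shape[0] // pixels_per_chunk
    # Assumption: timesteps that do not fill a whole chunk are dropped.
    usable = intervals[: n_chunks * pixels_per_chunk]
    return usable.reshape(n_chunks, pixels_per_chunk, intervals.shape[1])

FEATURE_DIMS = 4  # stand-in for interval.feature_dimensions; the real value may differ
intervals = np.zeros((200, FEATURE_DIMS), dtype=np.int8)

chunks = chunk_intervals(intervals, pixels_per_chunk=96)
print(chunks.shape)  # (2, 96, 4): 200 // 96 = 2 full chunks, 8 timesteps dropped
```

Because the reshape only regroups rows, concatenating the chunks along the first axis restores the retained part of the sequence.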

extract_highest_pitch_notes_from_pianoroll(preserve_pianoroll: bool = True) → ndarray

Extracts the highest pitch note at each timestep from the pianoroll attribute.

This method processes the self.pianoroll array to find the highest pitch note for each timestep. It can operate in either a non-destructive mode, preserving the original pianoroll, or a faster, destructive mode that alters the original data.

Example: Given the pianoroll of an SATB choir, returns Soprano notes.

Parameters:

preserve_pianoroll (bool, optional): Determines if self.pianoroll should be preserved. Setting it to False increases performance by avoiding data copying. Default is True.

Returns:

ndarray, dtype=int64, shape=(?,): An array containing the highest pitch note at each timestep. Silence is indicated by a value of 0.

Raises:

TypeError: If self.pianoroll is None.
IndexError: If self.pianoroll does not have the second dimension size of 128.

Notes

The method operates on self.pianoroll and requires it to be filled before calling. The pianoroll format is expected to conform to MIDI standards with 128 pitches.
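To make the "highest pitch per timestep" idea concrete, here is a minimal NumPy sketch. It assumes that a note counts as active when its cell is greater than zero, and the function name `highest_pitches` is hypothetical. It does not reproduce the library's copy-versus-in-place behaviour.

```python
import numpy as np

def highest_pitches(pianoroll):
    """Hypothetical helper: highest active MIDI note per timestep, 0 for silence."""
    active = pianoroll > 0
    # argmax on the reversed pitch axis finds the first active note from the top.
    highest = pianoroll.shape[1] - 1 - np.argmax(active[:, ::-1], axis=1)
    # Rows with no active note would otherwise report 127; mark them as silence.
    highest[~active.any(axis=1)] = 0
    return highest

pianoroll = np.zeros((3, 128), dtype=np.uint8)
pianoroll[0, [48, 55, 64, 72]] = 100  # four-voice chord, top voice at 72
pianoroll[2, 60] = 100                # single note; timestep 1 stays silent

print(highest_pitches(pianoroll))  # [72  0 60]
```

Since 0 doubles as the silence marker, a genuine note at MIDI pitch 0 cannot be distinguished from silence in this representation.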

get_RLE_from_intervals(intervals: ndarray | None = None) → ndarray

Compresses a sequence of intervals using Run-Length Encoding (RLE).

This method takes a sequence of intervals and compresses it using RLE, which is useful for reducing the size of repetitive data. Each row of the output represents one run: the feature values of an interval followed, in the last column, by the number of consecutive timesteps for which that interval repeats.

Parameters:

intervals (ndarray, dtype=int8, shape=(?, interval.feature_dimensions) | None, optional): The sequence of intervals to be compressed. If None, uses self.intervals.

Returns:

ndarray, dtype=int32, shape=(?, interval.feature_dimensions + 1): The RLE compressed intervals. Each row contains the compressed interval data with the last element indicating the count of repetitions.

Raises:

TypeError: If both the intervals argument and self.intervals are None.
IndexError: If intervals.shape[1] != interval.feature_dimensions (if intervals is None, then self.intervals.shape[1] is checked).
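The row-wise RLE layout described above (feature values followed by a repeat count) can be sketched in NumPy as follows. This is an illustrative sketch, not the library's code: `rle_encode` and `rle_decode` are hypothetical names, and the example uses two feature columns purely for brevity.

```python
import numpy as np

def rle_encode(intervals):
    """Hypothetical helper: collapse runs of identical rows into (row, count)."""
    if len(intervals) == 0:
        return np.empty((0, intervals.shape[1] + 1), dtype=np.int32)
    # True wherever a row differs from the row before it.
    changed = np.any(intervals[1:] != intervals[:-1], axis=1)
    starts = np.concatenate(([0], np.flatnonzero(changed) + 1))
    lengths = np.diff(np.concatenate((starts, [len(intervals)])))
    return np.column_stack((intervals[starts], lengths)).astype(np.int32)

def rle_decode(rle):
    """Hypothetical helper: expand (row, count) pairs back into repeated rows."""
    return np.repeat(rle[:, :-1], rle[:, -1], axis=0).astype(np.int8)

intervals = np.array(
    [[1, 2], [1, 2], [1, 2], [0, 0], [3, -1], [3, -1]], dtype=np.int8
)
rle = rle_encode(intervals)
print(rle)
# [[ 1  2  3]
#  [ 0  0  1]
#  [ 3 -1  2]]
```

Decoding with `rle_decode(rle)` returns the original six rows, which is the round trip that `get_intervals_from_RLE` is documented to perform.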

get_RLE_from_intervals_bulk(bulk_intervals: ndarray) → List[ndarray]

Bulk compresses a sequence of intervals using Run-Length Encoding (RLE).

This method processes multiple sequences of intervals simultaneously, applying RLE compression to each sequence in the bulk data. It is useful for handling large datasets where individual processing would be inefficient.

Parameters:

bulk_intervals (ndarray): An array containing multiple sequences of intervals, with shape (n_chunks, chunk_size, interval.feature_dimensions). Each sequence (chunk) in the first dimension will be compressed using RLE.

Returns:

List[ndarray]: A list of RLE-compressed interval sequences, where each element in the list corresponds to the RLE representation of a chunk in bulk_intervals. Each ndarray in the list has shape (?, interval.feature_dimensions + 1), where the last dimension includes the run lengths.
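One likely reason the bulk variant returns a list rather than a single array is that chunks compress to different numbers of runs, so the results are ragged. The sketch below illustrates this with a hypothetical per-chunk helper, `rle_rows`, and made-up data with two feature columns. It is not the library's implementation.

```python
import numpy as np

def rle_rows(chunk):
    """Hypothetical helper: run-length encode the rows of a non-empty 2-D array."""
    changed = np.any(chunk[1:] != chunk[:-1], axis=1)
    starts = np.concatenate(([0], np.flatnonzero(changed) + 1))
    lengths = np.diff(np.concatenate((starts, [len(chunk)])))
    return np.column_stack((chunk[starts], lengths)).astype(np.int32)

# Two chunks of four timesteps each, two feature columns.
bulk = np.zeros((2, 4, 2), dtype=np.int8)
bulk[1, 2:] = [5, -1]  # the second chunk changes value halfway through

bulk_rle = [rle_rows(chunk) for chunk in bulk]
shapes = [r.shape for r in bulk_rle]
print(shapes)  # [(1, 3), (2, 3)]
```

The first chunk collapses to a single run while the second needs two, so the encoded chunks cannot be stacked into one rectangular array.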