Data Functions

This module implements the functions that read the data for arrhythmia classification.

TSFEDL.data.get_mit_bih_segments(data, ...)

Generates segments of uninterrupted sequences of arrhythmia beats, assigned to the corresponding arrhythmia groups in labels.

TSFEDL.data.read_mit_bih(path[, labels, ...])

Reads the MIT-BIH Arrhythmia data (X) using the default configuration of the work presented in: Oh, Shu Lih, et al. "Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats." Computers in biology and medicine 102 (2018): 278-287.

TSFEDL.data.MIT_BIH(path[, labels, length, ...])

Reads the MIT-BIH dataset and returns a data loader with shape (N, C, L), where N is the batch size, C is the number of channels (1 in this dataset) and L is the length of the time series (1000 by default).

TSFEDL.data.get_mit_bih_segments(data: Record, annotations: Annotation, labels: ndarray, left_offset: int = 99, right_offset: int = 160, fixed_length: Optional[int] = None) → Tuple[ndarray, ndarray]

Generates segments of uninterrupted sequences of arrhythmia beats, assigned to the corresponding arrhythmia groups in labels.

Parameters:
  • data (wfdb.Record) – The arrhythmia signal as a wfdb Record object

  • annotations (wfdb.Annotation) – The set of annotations as a wfdb Annotation class

  • labels (array-like) – The set of valid labels for the different segments. Segments with different labels are discarded

  • left_offset (int) – The number of instances to the left of the first R peak of the segment. Defaults to 99

  • right_offset (int) – The number of instances to the right of the last R peak of the segment. Defaults to 160

  • fixed_length (int, optional) – If not None, every segment is forced to the specified length: segments longer than fixed_length are truncated, and shorter ones are padded with zeros. Defaults to None.

Returns:

  • A tuple that contains the data and the associated labels. Data has a shape of (N, T, V), where N is the number of segments (or instances), T is the number of timesteps of each segment and V is the number of variables (1 in this case). Labels are numerically encoded according to the value passed in the labels parameter.
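The role of left_offset and right_offset can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper `extract_window`, the synthetic signal and the peak positions are hypothetical, and the sketch assumes that exactly left_offset samples precede the first R peak and exactly right_offset samples follow the last one, clipped to the signal bounds.

```python
import numpy as np

def extract_window(signal, first_peak, last_peak, left_offset=99, right_offset=160):
    """Hypothetical helper: slice a run of beats plus context on both sides.

    Assumption: the window keeps `left_offset` samples before the first
    R peak and `right_offset` samples after the last R peak.
    """
    start = max(first_peak - left_offset, 0)
    end = min(last_peak + right_offset + 1, len(signal))
    return signal[start:end]

# Synthetic stand-in for one ECG channel; values equal their sample index.
signal = np.arange(2000, dtype=float)

# A run of beats whose first R peak is at sample 500 and last at sample 900.
segment = extract_window(signal, first_peak=500, last_peak=900)
print(segment.shape)
```

With the default offsets, the window covers 99 samples of context on the left, the span between the two peaks, and 160 samples on the right.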

TSFEDL.data.read_mit_bih(path: str, labels: ndarray = array(['N', 'L', 'R', 'A', 'V'], dtype='<U1'), left_offset: int = 99, right_offset: int = 160, fixed_length: Optional[int] = 1000) → Tuple[ndarray, ndarray]

Reads the MIT-BIH Arrhythmia data (X) using the default configuration of the work presented in: Oh, Shu Lih, et al. “Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats.” Computers in biology and medicine 102 (2018): 278-287.

Parameters:
  • labels (array-like) – The labels of the different types of arrhythmia to be used

  • path (str) – The path of the directory where the data (X) files are stored. Note: the data and annotation files must share the same name but have different extensions (annotation files must use the .atr extension)

  • left_offset (int) – The number of instances at the left of the first R peak of the segment. Defaults to 99

  • right_offset (int) – The number of instances at the right of the last R peak of the segment. Defaults to 160

  • fixed_length (int, optional) – If not None, every segment will have the specified number of instances: segments longer than fixed_length are truncated, and shorter ones are padded with zeros. Defaults to 1000.

Returns:

  • A tuple that contains the data and the associated labels as an ndarray. Data has a shape of (N, T, V), where N is the number of segments (or instances), T is the number of timesteps of each segment and V is the number of variables (1 in this case). Labels are numerically encoded according to the value passed in the labels parameter.
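The effect of fixed_length and of the numeric label encoding can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's code: the helper `fit_length` is hypothetical, zero padding is assumed to be appended at the end, truncation is assumed to keep the start of the segment, and each label is assumed to be encoded as its index in the labels array.

```python
import numpy as np

def fit_length(segment, fixed_length=1000):
    """Hypothetical helper: truncate or zero-pad a 1-D segment to fixed_length."""
    if len(segment) >= fixed_length:
        # Assumption: truncation keeps the first `fixed_length` samples.
        return segment[:fixed_length]
    padded = np.zeros(fixed_length, dtype=segment.dtype)
    # Assumption: zeros are appended after the original samples.
    padded[:len(segment)] = segment
    return padded

labels = np.array(['N', 'L', 'R', 'A', 'V'])

short_seg = fit_length(np.ones(660))    # shorter than 1000: zero-padded
long_seg = fit_length(np.ones(1500))    # longer than 1000: truncated

# Stack into the documented (N, T, V) layout with a single variable.
X = np.stack([short_seg, long_seg])[..., np.newaxis]

# Assumed encoding: position of the symbol inside `labels`.
encoded = int(np.where(labels == 'R')[0][0])
print(X.shape, encoded)
```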

TSFEDL.data.MIT_BIH(*args, **kwds)

Reads the MIT-BIH dataset and returns a data loader with shape (N, C, L), where N is the batch size, C is the number of channels (1 in this dataset) and L is the length of the time series (1000 by default).

Parameters:
  • labels (array-like) – The labels of the different types of arrhythmia to be used

  • path (str) – The path of the directory where the data (X) files are stored. Note: the data and annotation files must share the same name but have different extensions (annotation files must use the .atr extension)

  • left_offset (int) – The number of instances at the left of the first R peak of the segment. Defaults to 99

  • right_offset (int) – The number of instances at the right of the last R peak of the segment. Defaults to 160

  • return_hot_coded (bool) – Whether to return the raw labels or one-hot encoded ones.

Returns:

  • A tuple that contains the data and the associated labels as an ndarray. Data has a shape of (N, T, V), where N is the number of segments (or instances), T is the number of timesteps of each segment and V is the number of variables (1 in this case). Labels are numerically encoded according to the value passed in the labels parameter.
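The two layouts mentioned for this dataset, (N, T, V) and (N, C, L), differ only in the order of the last two axes, and one-hot labels can be derived from numeric ones. The NumPy sketch below illustrates both conversions on placeholder arrays. It is an assumption about how the shapes relate, not a description of what MIT_BIH does internally, and the array contents are synthetic.

```python
import numpy as np

n_classes = 5  # matches the five default labels: N, L, R, A, V

# Placeholder batch in the (N, T, V) layout: 8 segments, 1000 timesteps, 1 variable.
x_ntv = np.zeros((8, 1000, 1), dtype=np.float32)
y = np.array([0, 1, 2, 3, 4, 0, 1, 2])  # numerically encoded labels

# Swap the last two axes to obtain the channels-first (N, C, L) layout.
x_ncl = np.transpose(x_ntv, (0, 2, 1))

# One-hot encoding: pick row y[i] of the identity matrix for each sample.
y_one_hot = np.eye(n_classes, dtype=np.float32)[y]

print(x_ncl.shape, y_one_hot.shape)
```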