nnsa.edfreadpy.io package

Submodules

nnsa.edfreadpy.io.config module

This module contains constants and defaults related to reading data files.

Functions:

`default_edf_file_header`()	Return the default file header for EDF file.
`default_edf_signal_header`()	Return the default signal header for EDF file.

nnsa.edfreadpy.io.config.default_edf_file_header()[source]

Return the default file header for EDF file.

Returns: (dict) – dictionary populated with the default file header keys, each with value None.

nnsa.edfreadpy.io.config.default_edf_signal_header()[source]

Return the default signal header for EDF file.

Returns: (dict) – dictionary populated with the default signal header keys, each with value None.
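To illustrate the shape of the returned value, here is a minimal sketch. It assumes the keys mirror the fixed fields of the EDF file header; the key names and the function name below are hypothetical and may differ from those the package actually uses.

```python
# Hypothetical key names, loosely following the fixed fields of the EDF file header.
ASSUMED_FILE_HEADER_KEYS = (
    "version", "patient_id", "recording_id", "startdate", "starttime",
    "num_header_bytes", "reserved", "num_data_records",
    "data_record_duration", "num_signals",
)


def sketch_default_file_header():
    """Return a fresh dict with every assumed header key set to None."""
    return dict.fromkeys(ASSUMED_FILE_HEADER_KEYS)


header = sketch_default_file_header()
header["patient_id"] = "X X X X"  # fields are filled in as they become known
```

Returning a new dictionary on every call keeps callers from accidentally sharing (and mutating) one set of defaults.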

nnsa.edfreadpy.io.reader module

Module for reading EDF(+) files.

Classes:

`BaseReader`()	Abstract base class for readers of time series data.
`EdfReader`(filepath)	High-level interface for reading EDF(+) files.

class nnsa.edfreadpy.io.reader.BaseReader[source]

Bases: ABC

Abstract base class for readers of time series data.

Methods:

`close_file`()	Close the data file (if opened).

abstract close_file()[source]: Close the data file (if opened).
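The abstract-method contract can be sketched as follows. The class names are stand-ins invented for this illustration, not the package's own classes; the sketch only shows that a subclass must implement close_file() before it can be instantiated.

```python
from abc import ABC, abstractmethod


class SketchBaseReader(ABC):
    """Stand-in for an abstract reader (not the package's actual class)."""

    @abstractmethod
    def close_file(self):
        """Close the data file (if opened)."""


class InMemoryReader(SketchBaseReader):
    """Hypothetical concrete reader that only tracks an open/closed flag."""

    def __init__(self):
        self.is_open = True

    def close_file(self):
        self.is_open = False


reader = InMemoryReader()
reader.close_file()

try:
    SketchBaseReader()  # abstract classes cannot be instantiated
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False
```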

class nnsa.edfreadpy.io.reader.EdfReader(filepath)[source]

Bases: BaseReader

High-level interface for reading EDF(+) files.

Parameters: filepath (str) – path to the EDF(+) file to read.

Attributes:

`additional_info`	Return additional file and signal info.
`encoding`	Encoding is ascii by EDF convention.
`file_header`	Return the file header (as a dictionary).
`is_anonymized`	Check if EDF is anonymized.
`is_discontinuous`	Check if EDF is discontinuous.
`signal_headers`	Return the signals headers (as a dictionary of lists).
`size`	Return the filesize in bytes.
`total_duration`	Return the total duration of the recording (in seconds).

Methods:

`anonymize`(seed_offset[, extract_patient_id_fun])	Anonymize the header information in place (does not save to a new EDF file, but adapts the info in memory).
`anonymize_and_save`(filepath_out, seed_offset)	Anonymize the header information in place and save to a new EDF file.
`append_and_save`(filepath_out, *args[, ...])	Append signals to the EDF and save.
`close`()	Close the EDF file (if opened).
`close_file`()	Close the EDF file (if opened).
`extract_epoch_and_save`(filepath_out[, ...])	Read the data, extract an epoch (e.g. the first 4 hours), and save the epoch to an EDF file.
`flush_all_digital_data`()	Stop storing the raw digital data values in memory.
`insert_annotations_and_save`(filepath_out, ...)	Insert annotations in the EDF and save.
`read_annotations`([efficiency, offset, ...])	Read annotations in an EDF+ file.
`read_signal`(channel[, start, stop, ...])	Read a (part of a) signal from the EDF file.
`reset_annotations_and_save`(filepath_out[, ...])	Reset annotations in the EDF and save.

property additional_info

Return additional file and signal info.

Returns: (dict) – Dictionary with additional file and signal info.

anonymize(seed_offset, extract_patient_id_fun=<function extract_patient_file_id>, **kwargs)[source]

Anonymize the header information in place (does not save to a new EDF file, but adapts the info in memory).

Changes the startdate, patient_id and in case of EDF+ also the startdate in recording_id.

Parameters:

seed_offset (int) – offset added to the seed of the random generator when changing the dates. The original date can therefore only be traced back if the seed_offset used in this call is known; it cannot be recovered from the code alone.
extract_patient_id_fun (function) – optional function that takes the (absolute) filepath of the EDF and **kwargs, and returns the ID of the patient. This ID is used to seed the random generator when changing the date to a random date, so that the date change is the same for files from the same patient. If set to False or None, the date randomization is completely random.
**kwargs – for extract_patient_id_fun() (if specified).
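The seeding idea described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the seed is taken to be patient_id + seed_offset, the patient ID is assumed to be an integer, and the function name and max_days parameter are hypothetical.

```python
import random
from datetime import date, timedelta


def sketch_shift_date(original, patient_id, seed_offset, max_days=3650):
    """Shift a date back by a pseudo-random number of days.

    Assumption: the generator is seeded with patient_id + seed_offset, so the
    shift is identical for all files of one patient and cannot be reproduced
    without knowing seed_offset.
    """
    rng = random.Random(patient_id + seed_offset)
    return original - timedelta(days=rng.randint(1, max_days))


# Two recordings of the same patient, three days apart.
first = sketch_shift_date(date(2020, 5, 17), patient_id=42, seed_offset=1234)
again = sketch_shift_date(date(2020, 5, 17), patient_id=42, seed_offset=1234)
later = sketch_shift_date(date(2020, 5, 20), patient_id=42, seed_offset=1234)
```

Because both recordings get the same shift, the interval between them is preserved after anonymization.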

anonymize_and_save(filepath_out, seed_offset, check_is_anonymized=True, skip_anonymized=False, **kwargs)[source]

Anonymize the header information in place and save to a new EDF file.

Changes the startdate, patient_id and in case of EDF+ also the startdate in recording_id.

Parameters:

filepath_out (str) – file path for the new anonymized EDF file.
seed_offset (int) – see self.anonymize().
check_is_anonymized (bool) – if True, only anonymizes the header and date if self.is_anonymized is False.
skip_anonymized (bool) – if True, checks if anonymized and only saves new files for EDFs that were not yet anonymized. If False, a new file is always saved to filepath_out (also when the file already was anonymized, i.e., a copy is made).
**kwargs – for self.anonymize().

append_and_save(filepath_out, *args, allow_duplicates=False, overwrite=False, hdr_updates_bytes=None, sig_hdr_updates=None, verbose=1)[source]

Append signals to the EDF and save.

Parameters:

filepath_out (str) – filepath to save to.
*args (dict) – tuple of dicts with data for the signals to append (one dict per signal). Each dict must contain the required fields “signal”, “fs” and “label”. Optional fields are “transducer”, “physical_dimension”, “physical_min”, “physical_max”, “digital_min”, “digital_max”, “prefilter” and “reserved”. See the EDF specification for the meaning of these fields.
allow_duplicates (bool) – if False, raises an error if signals are appended with labels that already exist in the EDF file.
overwrite (bool) – if False, raises an error if the output file already exists. If True, overwrites any existing EDF file with the same name.
hdr_updates_bytes (dict) – optional updates for the file header (in bytes).
verbose (int) – verbosity level.

Examples

    import numpy as np
    from nnsa.edfreadpy.io.reader import EdfReader

    filepath = '<filepath>.EDF'
    filepath_out = 'test.EDF'
    signals = (
        dict(signal=np.random.rand(15000), fs=10, label='test'),
        dict(signal=np.random.randint(0, 4, 1500), fs=1, label='test2'),
    )
    with EdfReader(filepath) as r:
        r.append_and_save(filepath_out, *signals)
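Before calling append_and_save, it can help to check the signal dicts against the required fields listed above. The helper below is a hypothetical sketch of such a check; the package's own validation may behave differently.

```python
REQUIRED_FIELDS = ("signal", "fs", "label")


def sketch_check_signal_dict(sig, existing_labels=(), allow_duplicates=False):
    """Raise ValueError if required fields are missing or the label is a duplicate."""
    missing = [key for key in REQUIRED_FIELDS if key not in sig]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not allow_duplicates and sig["label"] in existing_labels:
        raise ValueError(f"label already exists: {sig['label']!r}")


# Labels assumed to be present in the EDF file already.
existing = ["EEG Fp1", "EEG Fp2"]
new_signal = {"signal": [0.1, 0.2, 0.3], "fs": 10, "label": "test"}
sketch_check_signal_dict(new_signal, existing)  # passes silently
```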

close()[source]: Close the EDF file (if opened).

close_file()[source]: Close the EDF file (if opened).

property encoding: Encoding is ascii by EDF convention.

extract_epoch_and_save(filepath_out, begin=0, end=None, overwrite=False, verbose=1)[source]

Read the data, extract an epoch (e.g. the first 4 hours), and save the epoch to an EDF file.

Parameters:

filepath_out (str) – filepath to save to.
begin (float) – starttime of epoch in seconds, relative to the start of the recording.
end (float, None) – end time of the epoch in seconds, relative to the start of the recording. If None, takes the end of the recording.
overwrite (bool) – if True, overwrites any existing EDF file with the same name. If False, raises an error if filepath_out already exists.
verbose (int) – verbosity level.
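The relation between begin/end (in seconds) and sample positions can be sketched as below. This is a simplified illustration for a single signal with sampling frequency fs; the helper name is hypothetical, and the package itself may cut epochs differently (for example on data-record boundaries).

```python
def sketch_epoch_to_samples(begin, end, fs, n_samples):
    """Convert an epoch in seconds to a half-open sample range [start, stop)."""
    start = max(0, int(round(begin * fs)))
    stop = n_samples if end is None else min(n_samples, int(round(end * fs)))
    if stop <= start:
        raise ValueError("epoch is empty")
    return start, stop


# First 4 hours of a 250 Hz signal that lasts 6 hours.
n_total = 6 * 3600 * 250
start, stop = sketch_epoch_to_samples(0, 4 * 3600, fs=250, n_samples=n_total)

# end=None runs to the end of the recording.
tail_start, tail_stop = sketch_epoch_to_samples(5 * 3600, None, fs=250, n_samples=n_total)
```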

property file_header

Return the file header (as a dictionary).

Returns: (dict) – file header.

flush_all_digital_data()[source]: Stop storing the raw digital data values in memory.

insert_annotations_and_save(filepath_out, annotations, reset_annotations=False, overwrite=False, verbose=1)[source]

Insert annotations in the EDF and save.

If an EDF Annotations signal already exists, the new annotations are added to it (unless reset_annotations is set to True, see below). If no EDF Annotations signal exists, one is created.

Parameters:

filepath_out (str) – filepath to save to.
annotations (AnnotationSet, pd.DataFrame) – AnnotationSet or pandas DataFrame with the following columns: ‘onset’: the start time of the annotation (in seconds with respect to the start of the recording). ‘duration’: the duration of the annotation (specify -1 if not applicable). ‘text’: annotation text.
reset_annotations (bool) – if True, any existing annotations are removed. If False, the new annotations are appended to the existing ones.
overwrite (bool) – if True, overwrites any existing EDF file with the same name. If False, raises an error if filepath_out already exists.
verbose (int) – verbosity level.

Examples

    import pandas as pd
    from nnsa.edfreadpy.io.reader import EdfReader

    filepath = '<filepath>.EDF'
    filepath_out = 'test.EDF'
    annotations = pd.DataFrame({'onset': [43.9], 'duration': [20], 'text': ['Hello there']})
    with EdfReader(filepath) as r:
        r.insert_annotations_and_save(filepath_out=filepath_out, annotations=annotations)
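For orientation, the EDF+ specification stores each annotation as a time-stamped annotation list (TAL): the onset with a leading sign, an optional duration preceded by byte 21, and the annotation text enclosed by bytes 20, terminated by a zero byte. The sketch below encodes one onset/duration/text row in that layout. The function names are hypothetical and this is not the package's own encoder; treating a negative duration as "no duration" follows the -1 convention described above.

```python
def _format_number(value):
    """Format a number without exponent or trailing zeros (e.g. 20 -> '20')."""
    text = f"{value:f}".rstrip("0").rstrip(".")
    return text or "0"


def sketch_encode_tal(onset, duration=-1, text=""):
    """Encode one annotation as an EDF+ TAL byte string."""
    sign = "+" if onset >= 0 else "-"
    tal = sign + _format_number(abs(onset))
    if duration >= 0:  # a negative duration means "not applicable"
        tal += "\x15" + _format_number(duration)
    tal += "\x14" + text + "\x14\x00"
    return tal.encode("utf-8")


annotation = sketch_encode_tal(43.9, 20, "Hello there")
time_keeping = sketch_encode_tal(0)  # empty text: a time-keeping TAL at 0 s
```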

property is_anonymized

Check if EDF is anonymized.

Returns: (bool) – True if anonymization of the EDF file is detected, False if not.

property is_discontinuous

Check if EDF is discontinuous.

Returns: (bool) – True if discontinuous, False if not.
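As background, the EDF+ specification marks a discontinuous recording by letting the reserved field of the file header start with "EDF+D" (and "EDF+C" for a continuous one). The helper below is a hypothetical sketch of a check based on that field; whether the property is implemented this way is an assumption.

```python
def sketch_is_discontinuous(reserved):
    """Return True if the header's reserved field marks the file as EDF+D."""
    return reserved.strip().startswith("EDF+D")


continuous = sketch_is_discontinuous("EDF+C".ljust(44))
discontinuous = sketch_is_discontinuous("EDF+D".ljust(44))
plain_edf = sketch_is_discontinuous(" " * 44)  # plain EDF leaves the field blank
```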

read_annotations(efficiency='speed', offset=0, annotations_label=None)[source]

Read annotations in an EDF+ file.

Note: EDF+ only. Will raise an error if no EDF Annotations channel is present in the file.

Note: by default the fractional offset of the start of the recording is subtracted from the annotation onset times, assuming the time array of the loaded signals will start at zero exactly (whereas in the file, the signals might start a fraction of a second later than reported by the starttime in the file header).

Parameters:

efficiency (str, optional) – Specify which algorithm to use: ‘speed’ uses an algorithm optimized for speed when reading annotations from a large file (see _read_annotations_max_speed), ‘memory’ uses an algorithm that requires the least amount of memory (see _read_annotations_min_memory).
offset (float, optional) – This offset value is subtracted from the onset time of each annotation. If None, the offset is inferred from the EDF file, such that the start of the recording corresponds to time 0 s (the offset is the start offset of the recording, read from the annotations). By default, the offset is 0, such that the times are used as in the annotations. This means that the start of the recording is not exactly at 0 s, but may lie between 0 and 1 s. If synchronization between annotations and signals is important, use one of the following approaches.
With the nnsa package:
1. Read the signal as a TimeSeries object (use the extension from the nnsa package) and read the annotations with offset = 0. The TimeSeries object contains the time_offset in its time array (it does not start at exactly 0), and this time array corresponds to the onset times in the annotation set.
Without the nnsa package:
2. For continuous signals: read the signal as an array with self.read_signal and read the annotations with offset = None. The time array of the signal starts at 0 seconds. Using the sampling frequency and starttime = 0, you can create the time array that is compatible with the onset times in the annotation set. NOTE: this approach does not work for discontinuous signals.
3. For discontinuous signals: read the signals with self.read_signal with discontinuous_mode set to ‘all’, and read the annotations with offset = 0. Use self._get_discontinuous_timestamps() to get the start time of each signal in the returned list. Using the sampling frequency and these start times, you can create the time arrays that are compatible with the onset times in the annotation set.
annotations_label (float, optional) – Specify a label for the AnnotationSet that will be created. By default the name of the investigator as saved in the EDF+ header is used.

Returns:

annotation_set (edfreadpy.AnnotationSet) – Collection of annotations, which are stored as edfreadpy.Annotation objects.
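The approach for continuous signals without the nnsa package can be sketched with plain Python lists. The numbers below are made up for illustration, and the helper names are hypothetical; the sketch only shows how subtracting the start offset aligns onset times with a time array that starts at 0 s.

```python
def sketch_time_array(n_samples, fs, starttime=0.0):
    """Time stamps (in seconds) for n_samples sampled at fs Hz."""
    return [starttime + i / fs for i in range(n_samples)]


def sketch_shift_onsets(onsets, offset):
    """Subtract the recording start offset from annotation onset times."""
    return [onset - offset for onset in onsets]


# Assumed values: the recording starts 0.25 s after the header start time.
start_offset = 0.25
onsets_in_file = [0.25, 10.25]  # onset times as stored in the annotations
onsets_zero_based = sketch_shift_onsets(onsets_in_file, start_offset)
time = sketch_time_array(n_samples=5, fs=4, starttime=0.0)
```

After the shift, an annotation stored at 0.25 s lines up with the first sample of the zero-based time array.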

read_signal(channel, start=0, stop=None, discontinuous_mode='longest', efficiency='speed', verbose=0)[source]

Read a (part of a) signal from the EDF file.

Parameters:

channel (int or string) – Specify which signal to read, by specifying its channel index (int) or channel label (str).
start (int, optional) – Specify the sample to start reading from (counting from 0).
stop (int, optional) – Specify the sample to stop reading (the specified sample will not be read, but note counting is from 0).
discontinuous_mode (str, optional) – see self._handle_discontinuous_signal()
efficiency (str, optional) – the algorithm to use for reading: ‘speed’ uses an algorithm optimized for speed when loading a large portion of the signal (see _read_digital_data_max_speed), ‘memory’ uses an algorithm that requires the least amount of memory (see _read_digital_data_min_memory). Note that the ‘memory’ option may be faster than the ‘speed’ option when reading only a small part of the signal. However, when reading multiple times from the same file (e.g. read multiple signals), ‘speed’ is probably fastest, even when reading only small parts, since this algorithm stores the raw data of the entire file in memory the first time it’s called.
verbose (int) – verbosity level (when efficiency is ‘memory’). If 1, shows a progress bar.

Returns:

signal_data (np.ndarray) – Array holding the physical values of the specified signal.
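EDF stores samples as 16-bit integers; physical values follow from a linear scaling between the digital and physical minimum and maximum in the signal header. The sketch below shows that scaling with plain lists, together with the half-open start/stop convention described above. It illustrates the standard conversion, not the package's code, and the function name is hypothetical.

```python
def sketch_digital_to_physical(digital, digital_min, digital_max,
                               physical_min, physical_max):
    """Scale raw digital samples linearly to physical units."""
    gain = (physical_max - physical_min) / (digital_max - digital_min)
    return [(d - digital_min) * gain + physical_min for d in digital]


raw = [-32768, 0, 32767]
physical = sketch_digital_to_physical(raw, -32768, 32767, -500.0, 500.0)

# start=0, stop=2 selects samples 0 and 1; sample 2 is not read.
selected = physical[0:2]
```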

reset_annotations_and_save(filepath_out, overwrite=False, verbose=1)[source]

Reset annotations in the EDF and save.

If EDF Annotations exist, only the time-keeping TAL is retained and all other annotations are removed. If no EDF Annotations field exists, an EDF Annotations field is created with a time-keeping TAL with offset 0 s.

Parameters:

filepath_out (str) – filepath to save to.
overwrite (bool) – if False, raises an error if the output file already exists. If True, overwrites any existing EDF file with the same name.
verbose (int) – verbosity level.

property signal_headers

Return the signal headers (as a dictionary of lists). E.g. the label of signal 3 is in self.signal_headers[‘label’][3].

Returns: (dict) – signal headers. Each value in the dict is a list with one entry per signal.
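The dictionary-of-lists layout can be illustrated with a small stand-in. The values are made up, and apart from 'label' the key name is an assumption borrowed from the EDF field names; the helper shows how a channel given as index or label could be resolved against such a structure.

```python
# Stand-in for a signal_headers dict: one list entry per signal.
signal_headers = {
    "label": ["EEG Fp1", "EEG Fp2", "ECG II", "EDF Annotations"],
    "physical_dimension": ["uV", "uV", "mV", ""],
}


def sketch_channel_index(headers, channel):
    """Resolve a channel given as index (int) or label (str) to an index."""
    if isinstance(channel, int):
        return channel
    return headers["label"].index(channel)  # raises ValueError if absent


label_of_signal_3 = signal_headers["label"][3]
ecg_index = sketch_channel_index(signal_headers, "ECG II")
ecg_unit = signal_headers["physical_dimension"][ecg_index]
```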

property size: Return the filesize in bytes.

property total_duration: Return the total duration of the recording (in seconds).

nnsa.edfreadpy.io.utils module

Functions:

`check_unit`(unit)	Check if the unit is a valid standardized unit.
`standardize_and_check_ecg_label`(label)	Check if the label is a valid signal label for ECG signals and convert a non-standardized, but valid ECG label to a standardized ECG label.
`standardize_and_check_eeg_label`(label)	Check if the label is a valid signal label for EEG signals and convert a non-standardized, but valid EEG label to a standardized EEG label.
`standardize_and_check_label`(label)	Check if the label is a valid standardized signal label according to EDF+ standards.

nnsa.edfreadpy.io.utils.check_unit(unit)[source]

Check if the unit is a valid standardized unit.

Parameters: unit (str) – a unit (physical dimension).
Returns: (bool) – True if the unit is a valid standardized unit, False if not.

nnsa.edfreadpy.io.utils.standardize_and_check_ecg_label(label)[source]

Check if the label is a valid signal label for ECG signals and convert a non-standardized, but valid ECG label to a standardized ECG label.

Parameters:

label (str) – a label for an ECG signal.

Returns:

label (str) – the (standardized) ECG signal label. If valid is True, this is the valid standardized label. If valid is False, this is the unchanged, original (invalid) label.
valid (bool) – True if the specified label is a valid label, False if not.

nnsa.edfreadpy.io.utils.standardize_and_check_eeg_label(label)[source]

Check if the label is a valid signal label for EEG signals and convert a non-standardized, but valid EEG label to a standardized EEG label.

Parameters:

label (str) – a label for an EEG signal.

Returns:

label (str) – the (standardized) EEG signal label. If valid is True, this is the valid standardized label. If valid is False, this is the unchanged, original (invalid) label.
valid (bool) – True if the specified label is a valid label, False if not.

nnsa.edfreadpy.io.utils.standardize_and_check_label(label)[source]

Check if the label is a valid standardized signal label according to EDF+ standards.

If not, tries to convert a non-standardized, but valid label to a standardized label.

Parameters:

label (str) – a label for an arbitrary signal.

Returns:

label (str) – the (standardized) signal label. If valid is True, this is the valid standardized label. If valid is False, this is the unchanged, original (invalid) label.
valid (bool) – True if the specified label is a valid label, False if not.
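The (label, valid) return convention can be sketched with a small lookup table. The alias table and function name below are hypothetical and far smaller than any real set of standard labels; the sketch only demonstrates that a recognized label is returned in standardized form with valid set to True, and an unrecognized label is returned unchanged with valid set to False.

```python
# Hypothetical alias table mapping normalized spellings to standardized labels.
ASSUMED_STANDARD_LABELS = {
    "eeg fp1": "EEG Fp1",
    "fp1": "EEG Fp1",
    "ecg ii": "ECG II",
    "ekg ii": "ECG II",
}


def sketch_standardize_label(label):
    """Return (label, valid) following the convention described above."""
    key = " ".join(label.split()).lower()  # collapse whitespace, ignore case
    if key in ASSUMED_STANDARD_LABELS:
        return ASSUMED_STANDARD_LABELS[key], True
    return label, False


standardized, valid = sketch_standardize_label("  fp1 ")
unchanged, invalid = sketch_standardize_label("my signal")
```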

Module contents

Package for reading EDF(+) files.