Quickstart#
Prerequisites#
This is an introduction to YASA, geared mainly for new users. However, you’ll need to know a bit of Python, and especially scientific libraries such as NumPy and Pandas. This tutorial also assumes that you are familiar with basic sleep research and methods for analyzing polysomnography (PSG) data.
Make sure to install the latest version of YASA, by opening a terminal or Anaconda command prompt and entering: pip install --upgrade yasa
To follow this tutorial, you’ll need to download the two following files on your computer. The files should be saved in the same folder as this notebook. The PSG data is saved in the European Data Format (.edf).
Full-night
polysomnography data (.edf format, 266MB)
from a 22-years-old healthy female. The PSG recording includes 19 EEGs, 2 EOGs, 1 EMG and 1 EKG. The recording spans an entire night of sleep from 11:59 PM to 08:01 AM.Sleep stages (aka hypnogram, .csv format, 3KB)
, in 30-seconds epoch. The sleep stages are encoded as follow: 0 = Wake, 1 = N1, 2 = N2, 3 = N3, and 4 = REM.
Data loading and preprocessing#
Polysomnography data#
We will use the MNE package to load and preprocess the data in Python:
>>> import mne
>>> raw = mne.io.read_raw_edf("yasa_example_night_young.edf", preload=True)
>>> raw
<RawEDF | yasa_example_night_young.edf, 23 x 5784000 (28920.0 s), ~1015.0 MB, data loaded>
Note
YASA does not allow to visualize or scroll through the PSG data. However, this can be done using the free and excellent EDFBrowser software.
Selecting channels
The channel names can be shown with:
>>> print(raw.ch_names)
['ROC-A1', 'LOC-A2', 'C3-A2', 'O2-A1', 'C4-A1', 'O1-A2', 'EMG1-EMG2', 'Fp1-A2', 'Fp2-A1', 'F7-A2',
'F3-A2', 'FZ-A2', 'F4-A1', 'F8-A1', 'T3-A2', 'CZ-A2', 'T4-A1', 'T5-A2', 'P3-A2', 'PZ-A2', 'P4-A1',
'T6-A1', 'EKG-R-EKG-L']
Let’s remove the EOG, EMG and EKG channels:
>>> raw.drop_channels(["ROC-A1", "LOC-A2", "EMG1-EMG2", "EKG-R-EKG-L"])
>>> chan = raw.ch_names
>>> print(chan)
['C3-A2', 'O2-A1', 'C4-A1', 'O1-A2', 'Fp1-A2', 'Fp2-A1', 'F7-A2', 'F3-A2', 'FZ-A2',
'F4-A1', 'F8-A1', 'T3-A2', 'CZ-A2', 'T4-A1', 'T5-A2', 'P3-A2', 'PZ-A2', 'P4-A1', 'T6-A1']
Downsampling and filtering
The sampling frequency of the data in Hertz (Hz) is given by:
>>> print(raw.info["sfreq"])
200.00
The current sampling frequency of the data is therefore 200 Hz. To speed up computation, let’s downsample the data to 100 Hz:
>>> raw.resample(100)
>>> sf = raw.info["sfreq"]
>>> sf
100.0
Optionally, we can apply a 0.3-45 Hz bandpass-filter:
# We use "verbose" and ";" to disable the text output
>>> raw.filter(0.3, 45)
Finally, the underlying data can be accessed with:
>>> data = raw.get_data(units="uV")
>>> print(data.shape)
(19, 2892000)
In this example, data
is a two-dimensional NumPy array where the rows represent the channels (19 EEG channels) and the columns represent the data samples (~3 million samples per channel).
Hypnogram#
Sleep staging (aka hypnogram) for this example night was performed by a trained technician following the standard rules of the American Academy of Sleep Medicine (AASM).
The output is saved in a .csv file, where each row represents 30 seconds of data. The stages are mapped to integers such that 0 = Wake, 1 = N1 sleep, 2 = N2 sleep, 3 = N3 sleep and 4 = REM sleep.
We can load this file using the pandas.read_csv
function:
>>> import pandas as pd
>>> hypno = pd.read_csv("yasa_example_night_young_hypno.csv", squeeze=True)
>>> hypno
0 0
1 0
2 0
3 0
4 0
..
959 2
960 2
961 2
962 2
963 0
Name: Stage, Length: 964, dtype: int64
Note
If you do not have sleep staging, you can use YASA to automatically detect the sleep stages for you. We’ll come back to this later on in this tutorial.
Using the plot_hypnogram
function, we can plot the hypnogram:
>>> import yasa
>>> yasa.plot_hypnogram(hypno);
Sleep statistics and stage-transition matrix#
Using the hypnogram, we can calculate standard sleep statistics using the sleep_statistics
function.
Importantly, this function has an sf_hyp
argument, which is the sampling frequency of the hypnogram. Since we have one value every 30-seconds, the sampling frequency is 0.3333 Hz, or 1 / 30 Hz.
>>> yasa.sleep_statistics(hypno, sf_hyp=1/30)
{'TIB': 482.0,
'SPT': 468.5,
'WASO': 9.0,
'TST': 459.5,
'N1': 17.5,
'N2': 214.0,
'N3': 85.5,
'REM': 142.5,
'NREM': 317.0,
'SOL': 13.0,
'Lat_N1': 13.0,
'Lat_N2': 16.5,
'Lat_N3': 31.5,
'Lat_REM': 77.0,
'%N1': 3.808487486398259,
'%N2': 46.572361262241564,
'%N3': 18.607181719260065,
'%REM': 31.01196953210011,
'%NREM': 68.98803046789989,
'SE': 95.33195020746888,
'SME': 98.07897545357524}
Furthermore, we can also calculate the sleep stages transition matrix using the transition_matrix
function:
>>> counts, probs = yasa.transition_matrix(hypno)
>>> probs.round(3)
probs
is the probability transition matrix, i.e. given that the current sleep stage is A, what is the probability that the next sleep stage is B.
Several metrics of sleep fragmentation can be calculated from probs
. For example, the stability of sleep can be calculated by taking the average of the diagonal values of N2, N3 and REM sleep:
>>> import numpy as np
>>> np.diag(probs.loc[2:, 2:]).mean().round(3)
Spectral analyses#
Full-night spectrogram plot#
The current sampling frequency of the hypnogram is one value every 30-seconds, i.e. ~0.3333 Hz. However, most YASA functions requires the sampling frequency of the hypnogram to be the same as the sampling frequency of the PSG data. In this example, we therefore need to upsample our hypnogram from 0.333 Hz to 100 Hz.
This can be done with the hypno_upsample_to_data
function:
>>> hypno_up = yasa.hypno_upsample_to_data(hypno, sf_hypno=1/30, data=raw)
>>> print(len(hypno_up))
Now that the hypnogram and data have the same shape, we can plot our hypnogram on top of a multitaper spectrogram using the plot_spectrogram
function, which shows the time-frequency representation of a single EEG channel across the entire night. The x-axis of the spectrogram is time in hours, and the y-axis is the frequency range (from 0 to 25 Hz).
Warmer colors indicate higher spectral power in this specific frequency band at this specific time for this channel. This kind of plot is very useful to quickly identify periods of NREM sleep (high power in frequencies below 5 Hz and spindle-related activity around ~14 Hz) and REM sleep (almost no power in frequencies below 5 Hz).
# We select only the C4-A1 EEG channel.
>>> yasa.plot_spectrogram(data[chan.index("C4-A1")], sf, hypno_up);
Note
Whenever you start a new analysis in YASA, we always recommend that you use the plot_spectrogram
function to check your data. This can help you easily identify artefact in the data or misalignement between the PSG data and hypnogram.
EEG power in specific frequency bands#
For a primer on EEG spectral bandpower please refer to https://raphaelvallat.com/bandpower.html.
Spectral analysis quantifies the power (or amplitude) of the EEG signal in different frequency bands. In neuroscience, the most common frequency bands are delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–30 Hz), and gamma (30–~100 Hz). There are numerous studies that have reported significant relationship between the EEG power spectrum and human behavior, cognitive state, or mental illnesses, and EEG spectral analysis is now one of the principal analysis methods in the field of neuroscience and sleep research. It is especially relevant for sleep analysis, as it is well-known that the different stages of sleep vary drastically in their spectral content. For example, deep slow-wave sleep (N3) is associated with increased power in the low frequencies, especially the delta band (0.5-4Hz), and decreased power in the beta and gamma bands.
Calculating the average spectral power in different frequency bands is straightforward with the bandpower
function:
>>> yasa.bandpower(raw)
This calculates, for each channel separately, the average power in the main frequency bands across the entire recording. Importantly, the values are relative power, i.e. they are expressed as a proportion of the total power between the lowest frequency (default 0.5 Hz) and the highest frequency (default 40 Hz). We can disable this behavior and get the absolute spectral power values in \(μV^2 / Hz\) by using the relative=False
argument. Similarly, we can define custom frequency bands with the bands
parameter. In the example below, we calculate the absolute power in the 1-9 Hz frequency range (named “Slow”) and the 9-30 Hz range (named “Fast”):
>>> yasa.bandpower(raw, relative=False, bands=[(1, 9, "Slow"), (9, 30, "Fast")])
We can also pass an hypnogram to calculate the spectral powers separately for each sleep stage. In the example below, we use the upsampled hypnogram to calculate the spectral power separately for N2, N3 and REM. We save the results in a new variable named bandpower
.
>>> bandpower = yasa.bandpower(raw, hypno=hypno_up, include=(2, 3, 4))
If desired, we can then export the bandpower
dataframe to a CSV file using pandas.DataFrame.to_csv
:
>>> bandpower.to_csv("bandpower.csv")
Finally, we can use the topoplot
function to visualize the spectral powers across all electrodes. In the example below, we only plot the spectral values of stage N3, using the pandas.DataFrame.xs
function. As expected, the relative delta power is higher in frontal channels.
>>> fig = yasa.topoplot(bandpower.xs(3)["Delta"])
Events detection#
Spindles#
Automatic spindles detection can be performed with the yasa.spindles_detect
function. The detection is based on the algorithm described in Lacourse et al 2018, and a step-by-step explanation is provided in this notebook. For the sake of this tutorial, we’ll use the default detection thresholds, but these can (and should) be adjusted based on your own data. In the example below, we’ll specify the hypnogram and limit the detection to stage N2 and 3 (include=(2, 3)
).
>>> sp = yasa.spindles_detect(raw, hypno=hypno_up, include=(2, 3))
Here, the sp
variable is a SpindlesResults
, which is simply a bundle of functions (called methods) and data (attributes). For example, we can see a dataframe with all the detected events with:
>>> sp.summary()
The documentation of the spindles_detect
function explains what each of these columns represent and how they’re calculated. Furthermore, by specifying the grp_chan
and grp_stage
parameters, we tell YASA to first average across channels and sleep stages, respectively:
>>> sp.summary(grp_chan=True, grp_stage=True)
Finally, we can plot the average spindle, calculated for each channel separately and time-synced to the most prominent peak of the spindles:
>>> # Because of the large number of channels, we disable the 95%CI and legend
>>> sp.plot_average(ci=None, legend=False, palette="Blues");
Slow-waves#
The exact same steps can be applied with the sw_detect
function to automatically detect slow-waves:
>>> sw = yasa.sw_detect(raw, hypno=hypno_up, include=(2, 3))
>>> sw.summary()
>>> sw.plot_average(ci=None, legend=False, palette="Blues");
For more details on the output of the slow-waves detection, be sure to read the documentation and try the Jupyter notebooks.
Automatic sleep staging#
In this final section, we’ll see how to perform automatic sleep staging in YASA. As shown below, this takes no more than a few lines of code! Here, we’ll use a single EEG channel to predict a full-night hypnogram. For more details on the algorithm, check out the eLife publication or the documentation of the function.
>>> sls = yasa.SleepStaging(raw, eeg_name="C3-A2")
>>> hypno_pred = sls.predict() # Predict the sleep stages
>>> hypno_pred = yasa.hypno_str_to_int(hypno_pred) # Convert "W" to 0, "N1" to 1, etc
>>> yasa.plot_hypnogram(hypno_pred); # Plot
Let’s calculate the agreement against the ground-truth expert scoring:
>>> from sklearn.metrics import accuracy_score
>>> print(f"The accuracy is {100 * accuracy_score(hypno, hypno_pred):.3f}%")
The accuracy is 82.676%