yasa.EpochByEpochAgreement
- class yasa.EpochByEpochAgreement(ref_hyps, obs_hyps)
Evaluate agreement between two hypnograms or two collections of hypnograms.
Evaluation includes averaged agreement scores, one-vs-rest agreement scores, agreement scores summarized across all sleep and summarized by sleep stage, and various plotting options to visualize the two hypnograms simultaneously. See examples for more detail.
New in version 0.7.0.
- Parameters
- ref_hyps iterable of yasa.Hypnogram
A collection of reference hypnograms (i.e., those considered ground-truth). Each yasa.Hypnogram in ref_hyps must have the same scorer. If a dict, the keys are used to generate unique sleep session IDs. If any other iterable (e.g., list or tuple), unique sleep session IDs are generated automatically.
- obs_hyps iterable of yasa.Hypnogram
A collection of observed hypnograms (i.e., those to be evaluated). Each yasa.Hypnogram in obs_hyps must have the same scorer, and this scorer must differ from the scorer of the hypnograms in ref_hyps. If a dict, the keys must match those of ref_hyps.
Important: The order of hypnograms is assumed to be the same in ref_hyps and obs_hyps. For example, the third hypnogram in ref_hyps and the third hypnogram in obs_hyps must come from the same sleep session, and they must differ only in their scorers.
See also: For comparing just two hypnograms, use yasa.Hypnogram.evaluate().
Notes
Many steps here are influenced by guidelines proposed in Menghini et al., 2021 [Menghini2021]. See https://sri-human-sleep.github.io/sleep-trackers-performance/AnalyticalPipeline_v1.0.0.html
References
- Menghini2021
Menghini, L., Cellini, N., Goldstone, A., Baker, F. C., & de Zambotti, M. (2021). A standardized framework for testing the performance of sleep-tracking technology: step-by-step guidelines and open-source code. SLEEP, 44(2), zsaa170. https://doi.org/10.1093/sleep/zsaa170
Examples
>>> import yasa
>>> ref_hyps = [yasa.simulate_hypnogram(tib=600, scorer="Human", seed=i) for i in range(10)]
>>> obs_hyps = [h.simulate_similar(scorer="YASA", seed=i) for i, h in enumerate(ref_hyps)]
>>> ebe = yasa.EpochByEpochAgreement(ref_hyps, obs_hyps)
>>> agr = ebe.get_agreement()
>>> agr.head(5).round(2)
          accuracy  balanced_acc  kappa   mcc  precision  recall    f1
sleep_id
1             0.31          0.26   0.07  0.07       0.31    0.31  0.31
2             0.33          0.33   0.14  0.14       0.35    0.33  0.34
3             0.35          0.24   0.06  0.06       0.35    0.35  0.35
4             0.22          0.21   0.01  0.01       0.21    0.22  0.21
5             0.21          0.17  -0.06 -0.06       0.20    0.21  0.21

>>> ebe.get_agreement_bystage().head(12).round(3)
                fbeta  precision  recall  support
stage sleep_id
WAKE  1         0.391      0.371   0.413    189.0
      2         0.299      0.276   0.326    184.0
      3         0.234      0.204   0.275    255.0
      4         0.268      0.285   0.252    321.0
      5         0.228      0.230   0.227    181.0
      6         0.407      0.384   0.433    284.0
      7         0.362      0.296   0.467    287.0
      8         0.298      0.519   0.209    263.0
      9         0.210      0.191   0.233    313.0
      10        0.369      0.420   0.329    362.0
N1    1         0.185      0.185   0.185    124.0
      2         0.121      0.131   0.112    160.0

>>> ebe.get_confusion_matrix(sleep_id=1)
YASA   WAKE  N1   N2  N3  REM
Human
WAKE     78  24   50   3   34
N1       23  23   43  15   20
N2       60  58  183  43  139
N3       30  10   50   5   32
REM      19   9  121  50   78

>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots(figsize=(6, 3), constrained_layout=True)
>>> ebe.plot_hypnograms(sleep_id=10)

>>> fig, ax = plt.subplots(figsize=(6, 3))
>>> ebe.plot_hypnograms(
...     sleep_id=8, ax=ax, obs_kwargs={"color": "red", "lw": 2, "ls": "dotted"}
... )
>>> plt.tight_layout()

>>> session = 8
>>> fig, ax = plt.subplots(figsize=(6.5, 2.5), constrained_layout=True)
>>> style_a = dict(alpha=1, lw=2.5, ls="solid", color="gainsboro", label="Michel")
>>> style_b = dict(alpha=1, lw=2.5, ls="solid", color="cornflowerblue", label="Jouvet")
>>> legend_style = dict(
...     title="Scorer", frameon=False, ncol=2, loc="lower center", bbox_to_anchor=(0.5, 0.9)
... )
>>> ax = ebe.plot_hypnograms(
...     sleep_id=session, ref_kwargs=style_a, obs_kwargs=style_b, legend=legend_style, ax=ax
... )
>>> acc = ebe.get_agreement().multiply(100).at[session, "accuracy"]
>>> ax.text(
...     0.01, 1, f"Accuracy = {acc:.0f}%", ha="left", va="bottom", transform=ax.transAxes
... )
When comparing only 2 hypnograms, use the evaluate() method:

>>> hypno_a = yasa.simulate_hypnogram(tib=90, scorer="RaterA", seed=8)
>>> hypno_b = hypno_a.simulate_similar(scorer="RaterB", seed=9)
>>> ebe = hypno_a.evaluate(hypno_b)
>>> ebe.get_confusion_matrix()
RaterB  WAKE  N1  N2  N3
RaterA
WAKE      71   2  20   8
N1         1   0   9   0
N2        12   4  25   0
N3        24   0   1   3
Methods
- __init__(ref_hyps, obs_hyps)
- get_agreement([sample_weight, scorers]): Return a pandas.DataFrame of weighted (i.e., averaged) agreement scores.
- get_agreement_bystage([beta]): Return a pandas.DataFrame of unweighted (i.e., one-vs-rest) agreement scores.
- get_confusion_matrix([sleep_id, agg_func]): Return a ref_hyp/obs_hyp confusion matrix from either a single session or all sessions concatenated together.
- get_sleep_stats(): Return a pandas.DataFrame of sleep statistics for each hypnogram derived from both reference and observed scorers.
- multi_scorer(df, scorers): Compute multiple agreement scores from a 2-column dataframe (an optional 3rd column may contain sample weights).
- plot_hypnograms([sleep_id, legend, ax, ...]): Plot the two hypnograms of one session overlapping on the same axis.
- summary([by_stage]): Return group-level agreement scores.
Attributes
- data: A pandas.DataFrame including all hypnograms.
- n_sleeps: The number of unique sleep sessions.
- obs_scorer: The name of the observed scorer.
- ref_scorer: The name of the reference scorer.
- get_agreement(sample_weight=None, scorers=None)
Return a pandas.DataFrame of weighted (i.e., averaged) agreement scores.
- Parameters
- self EpochByEpochAgreement
An EpochByEpochAgreement instance.
- sample_weight None or pandas.Series
Sample weights passed to the underlying sklearn.metrics functions where possible. If a pandas.Series, its index must exactly match that of data.
- scorers None, list, or dictionary
The scorers used to evaluate agreement. If None (default), default scorers are used. If a list, it must contain strings naming metrics from the sklearn.metrics module (e.g., accuracy, precision). For more customization, pass a dictionary with scorer names (str) as keys and custom functions as values. Each custom function should take 3 positional arguments (true values, predicted values, and sample weights).
- Returns
- agreement pandas.DataFrame
A DataFrame with agreement metrics as columns and sessions as rows.
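As an illustration of the dictionary form of scorers described above, here is a minimal pure-Python sketch of a custom scorer that accepts the three positional arguments (true values, predicted values, sample weights). The function name weighted_accuracy, the key "weighted_acc", and the toy stage lists are hypothetical and chosen only for this example; the sketch calls the function directly and does not run YASA.

```python
def weighted_accuracy(y_true, y_pred, sample_weight):
    """Share of (optionally weighted) epochs on which both scorers agree."""
    if sample_weight is None:
        # Unweighted case: every epoch counts equally.
        sample_weight = [1.0] * len(y_true)
    agree = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return agree / sum(sample_weight)


# Hypothetical dictionary in the shape described for `scorers`:
# scorer names (str) as keys, callables as values.
custom_scorers = {"weighted_acc": weighted_accuracy}

# Toy reference and observed stage sequences (assumed data, not YASA output).
ref = ["WAKE", "N1", "N2", "N2", "REM"]
obs = ["WAKE", "N2", "N2", "N2", "REM"]

print(custom_scorers["weighted_acc"](ref, obs, None))             # 4 of 5 epochs agree
print(custom_scorers["weighted_acc"](ref, obs, [1, 1, 1, 1, 2]))  # last epoch counts double
```

Under the description above, a dictionary like custom_scorers would be passed as scorers=custom_scorers to get_agreement().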
- get_agreement_bystage(beta=1.0)
Return a pandas.DataFrame of unweighted (i.e., one-vs-rest) agreement scores.
- Parameters
- self EpochByEpochAgreement
An EpochByEpochAgreement instance.
- beta float
- Returns
- agreement pandas.DataFrame
A DataFrame with agreement metrics as columns and a MultiIndex with session and sleep stage as rows.
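The one-vs-rest columns shown in the class examples (fbeta, precision, recall, support) can be illustrated with the standard textbook definitions. The sketch below is pure Python and assumes the usual F-beta formula, F = (1 + beta^2) * P * R / (beta^2 * P + R); the helper name one_vs_rest_scores and the toy data are hypothetical, and this is not presented as YASA's implementation.

```python
def one_vs_rest_scores(y_true, y_pred, stage, beta=1.0):
    """Treat `stage` as the positive class and every other stage as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == stage and p == stage for t, p in pairs)  # both scorers say `stage`
    fp = sum(t != stage and p == stage for t, p in pairs)  # only observed says `stage`
    fn = sum(t == stage and p != stage for t, p in pairs)  # only reference says `stage`
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    fbeta = (1 + b2) * precision * recall / denom if denom else 0.0
    # Support is the number of reference epochs scored as `stage`.
    return {"fbeta": fbeta, "precision": precision, "recall": recall, "support": tp + fn}


# Toy data (assumed): the observed scorer misses one WAKE epoch.
ref = ["WAKE", "WAKE", "N2", "N2", "N2", "REM"]
obs = ["WAKE", "N2", "N2", "N2", "N2", "REM"]

print(one_vs_rest_scores(ref, obs, "WAKE"))          # beta=1 weighs P and R equally
print(one_vs_rest_scores(ref, obs, "WAKE", beta=2))  # beta=2 weighs recall more heavily
```

With beta greater than 1 the missed WAKE epoch (lower recall) is penalized more, so the score drops relative to beta=1.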
- get_confusion_matrix(sleep_id=None, agg_func=None, **kwargs)
Return a ref_hyp/obs_hyp confusion matrix from either a single session or all sessions concatenated together.
- Parameters
- self yasa.EpochByEpochAgreement
A yasa.EpochByEpochAgreement instance.
- sleep_id None or a valid sleep ID
If None (default), the cross-tabulation is derived from the entire group dataset. If a valid sleep ID, the cross-tabulation is derived using only the reference and observed hypnograms from that sleep session.
- agg_func None or str
If None (default), group results are returned as a DataFrame containing all individual session results. If not None, group results are returned as a DataFrame aggregated across sessions, where agg_func is passed as the func parameter of pandas.DataFrame.groupby.agg(). For example, set agg_func="sum" to get a single confusion matrix across all epochs that does not take session into account.
- **kwargs key, value pairs
Additional keyword arguments are passed to sklearn.metrics.confusion_matrix().
- Returns
- conf_matr pandas.DataFrame
A confusion matrix with stages from the reference scorer as indices and stages from the observed scorer as columns.
Examples
>>> import yasa
>>> ref_hyps = [yasa.simulate_hypnogram(tib=90, scorer="Rater1", seed=i) for i in range(3)]
>>> obs_hyps = [h.simulate_similar(scorer="Rater2", seed=i) for i, h in enumerate(ref_hyps)]
>>> ebe = yasa.EpochByEpochAgreement(ref_hyps, obs_hyps)
>>> ebe.get_confusion_matrix(sleep_id=2)
Rater2  WAKE  N1  N2  N3  REM
Rater1
WAKE       1   2  23   0    0
N1         0   9  13   0    0
N2         0   6  71   0    0
N3         0  13  42   0    0
REM        0   0   0   0    0

>>> ebe.get_confusion_matrix()
Rater2           WAKE  N1  N2  N3  REM
sleep_id Rater1
1        WAKE      30   0   3   0   35
         N1         3   2   7   0    0
         N2        21  12   7   0    4
         N3         0   0   0   0    0
         REM        2   8  29   0   17
2        WAKE       1   2  23   0    0
         N1         0   9  13   0    0
         N2         0   6  71   0    0
         N3         0  13  42   0    0
         REM        0   0   0   0    0
3        WAKE      16   0   7  19   19
         N1         0   7   2   0    5
         N2         0  10  12   7    5
         N3         0   0  16  11    0
         REM        0  15  11  18    0

>>> ebe.get_confusion_matrix(agg_func="sum")
Rater2  WAKE  N1  N2  N3  REM
Rater1
WAKE      47   2  33  19   54
N1         3  18  22   0    5
N2        21  28  90   7    9
N3         0  13  58  11    0
REM        2  23  40  18   17
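To make the relationship between the per-session matrices and the agg_func="sum" result concrete, the following pure-Python sketch tallies (reference, observed) stage pairs per session and then adds the tallies together. It uses collections.Counter rather than pandas or scikit-learn; the helper names and the two toy sessions are hypothetical and are not YASA output.

```python
from collections import Counter

STAGES = ["WAKE", "N1", "N2", "N3", "REM"]


def confusion_counts(ref, obs):
    """Count how often each (reference stage, observed stage) pair occurs."""
    return Counter(zip(ref, obs))


def as_rows(counts):
    """Lay the counts out with reference stages as rows and observed stages as columns."""
    return [[counts[(r, c)] for c in STAGES] for r in STAGES]


# Two toy sessions (assumed data), keyed by a sleep session ID.
sessions = {
    1: (["WAKE", "N1", "N2", "N2", "REM"], ["WAKE", "N2", "N2", "N2", "REM"]),
    2: (["WAKE", "N2", "N2", "N3", "N3"], ["N1", "N2", "N2", "N3", "N2"]),
}

per_session = {sid: confusion_counts(ref, obs) for sid, (ref, obs) in sessions.items()}

# Element-wise sum across sessions, analogous to aggregating with "sum".
pooled = sum(per_session.values(), Counter())

for stage, row in zip(STAGES, as_rows(pooled)):
    print(f"{stage:<5}", row)
```

Each cell of the pooled matrix equals the sum of the matching cells across sessions, which is the same relationship visible between the per-session and summed outputs in the examples above.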
- get_sleep_stats()
Return a pandas.DataFrame of sleep statistics for each hypnogram derived from both reference and observed scorers.
- Parameters
- self yasa.EpochByEpochAgreement
A yasa.EpochByEpochAgreement instance.
- Returns
- sstats pandas.DataFrame
A DataFrame with sleep statistics as columns and two rows for each individual (one for the reference scorer and another for the observed scorer).
- static multi_scorer(df, scorers)
Compute multiple agreement scores from a 2-column dataframe (an optional 3rd column may contain sample weights).
This function offers convenience when calculating multiple agreement scores using pandas.DataFrame.groupby.apply(). Scikit-learn doesn't include a function that returns multiple scores, and the GroupBy implementation of apply in pandas does not accept multiple functions.
- Parameters
- df pandas.DataFrame
A DataFrame with 2 columns and length of n_samples. The first column contains reference values and the second column contains observed values. If a third column is present, it must contain sample weights to be passed to the underlying sklearn.metrics functions as sample_weight where applicable.
- scorers dictionary
The scorers to be used for evaluating agreement. A dictionary with scorer names (str) as keys and functions as values.
- Returns
- scores dict
A dictionary with scorer names (str) as keys and scores (float) as values.
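The pattern described above, one call that returns several named scores and can be applied per group, can be sketched without pandas. In this hypothetical stand-in, rows are (reference, observed) or (reference, observed, weight) tuples instead of DataFrame columns, and the function name multi_scorer_sketch and both scorers are made up for illustration; it is not YASA's implementation.

```python
def multi_scorer_sketch(rows, scorers):
    """Apply every scorer to the same rows and return {scorer name: score}."""
    y_true = [row[0] for row in rows]
    y_pred = [row[1] for row in rows]
    # An optional third element per row is treated as a sample weight.
    weights = [row[2] for row in rows] if rows and len(rows[0]) == 3 else None
    return {name: func(y_true, y_pred, weights) for name, func in scorers.items()}


def accuracy(y_true, y_pred, sample_weight):
    w = sample_weight or [1.0] * len(y_true)
    return sum(wi for t, p, wi in zip(y_true, y_pred, w) if t == p) / sum(w)


def n_disagreements(y_true, y_pred, sample_weight):
    # Ignores weights: a plain count of epochs where the scorers differ.
    return float(sum(t != p for t, p in zip(y_true, y_pred)))


scorers = {"accuracy": accuracy, "n_disagreements": n_disagreements}

# Rows grouped by a hypothetical session ID, mimicking a groupby/apply.
rows_by_session = {
    "a": [("WAKE", "WAKE"), ("N2", "N1"), ("N2", "N2"), ("REM", "REM")],
    "b": [("N3", "N3"), ("N3", "N2")],
}
scores = {sid: multi_scorer_sketch(rows, scorers) for sid, rows in rows_by_session.items()}
print(scores)
```

Returning a dictionary per group is what lets a single apply-style call yield one row per session with one column per metric.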
- plot_hypnograms(sleep_id=None, legend=True, ax=None, ref_kwargs={}, obs_kwargs={})
Plot the two hypnograms of one session overlapping on the same axis.
- Parameters
- self yasa.EpochByEpochAgreement
A yasa.EpochByEpochAgreement instance.
- sleep_id a valid sleep ID or None
The sleep session to plot. If multiple sessions are included in the EpochByEpochAgreement instance, a sleep_id must be provided. If only one session is present, None (default) will plot the two hypnograms of the only session.
- legend bool or dict
If True (default) or a dictionary, a legend is added. If a dictionary, all key/value pairs are passed as keyword arguments to the matplotlib.pyplot.legend() call.
- ax matplotlib.axes.Axes or None
Axis on which to draw the plot, optional.
- ref_kwargs dict
Keyword arguments passed to yasa.plot_hypnogram() when plotting the reference hypnogram.
- obs_kwargs dict
Keyword arguments passed to yasa.plot_hypnogram() when plotting the observed hypnogram.
- Returns
- ax matplotlib.axes.Axes
Matplotlib Axes
Examples
>>> from yasa import simulate_hypnogram
>>> hyp = simulate_hypnogram(scorer="Anthony", seed=19)
>>> ax = hyp.evaluate(hyp.simulate_similar(scorer="Alan", seed=68)).plot_hypnograms()
- summary(by_stage=False, **kwargs)
Return group-level agreement scores.
Default aggregated measures are
- Parameters
- self EpochByEpochAgreement
An EpochByEpochAgreement instance.
- by_stage bool
If False (default), summary will include agreement scores derived from average-based metrics. If True, the returned summary DataFrame will include agreement scores for each sleep stage, derived from one-vs-rest metrics.
- **kwargs key, value pairs
Additional keyword arguments are passed to pandas.DataFrame.groupby.agg(). This can be used to customize the descriptive statistics returned.
- Returns
- summary pandas.DataFrame
A pandas.DataFrame summarizing agreement scores across the entire dataset with descriptive statistics.

>>> ebe = yasa.EpochByEpochAgreement(...)
>>> agreement = ebe.get_agreement()
>>> ebe.summary()

This will give a DataFrame where each row is an agreement metric and each column is a descriptive statistic (e.g., mean, standard deviation). To control the descriptive statistics included as columns:

>>> ebe.summary(func=["count", "mean", "sem"])
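The shape of the summary described above (one row per agreement metric, one column per descriptive statistic) can be sketched with the standard library alone. The per-session values below are the accuracy and kappa figures from the first three rows of the class-level example output; the helper names summarize and sem are hypothetical, and using the sample standard deviation for the standard error is an assumption of this sketch rather than a statement about YASA's defaults.

```python
import math
import statistics

# Per-session agreement scores: session ID -> {metric: value}.
agreement = {
    1: {"accuracy": 0.31, "kappa": 0.07},
    2: {"accuracy": 0.33, "kappa": 0.14},
    3: {"accuracy": 0.35, "kappa": 0.06},
}


def sem(values):
    """Standard error of the mean, based on the sample standard deviation."""
    return statistics.stdev(values) / math.sqrt(len(values))


def summarize(agreement, stats):
    """Return {metric: {statistic name: value}} computed across sessions."""
    metrics = sorted({m for scores in agreement.values() for m in scores})
    summary = {}
    for metric in metrics:
        values = [scores[metric] for scores in agreement.values()]
        summary[metric] = {name: func(values) for name, func in stats.items()}
    return summary


# Statistic names mirror the func=["count", "mean", "sem"] example.
stats = {"count": len, "mean": statistics.mean, "sem": sem}
summary = summarize(agreement, stats)

for metric, row in summary.items():
    print(metric, {name: round(value, 3) for name, value in row.items()})
```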
- property data
A pandas.DataFrame including all hypnograms.
- property n_sleeps
The number of unique sleep sessions.
- property obs_scorer
The name of the observed scorer.
- property ref_scorer
The name of the reference scorer.