yasa.SleepStatsAgreement¶
- class yasa.SleepStatsAgreement(ref_data, obs_data, *, ref_scorer='Reference', obs_scorer='Observed', agreement=1.96, confidence=0.95, alpha=0.05, verbose=True, bootstrap_kwargs={})[source]¶
Evaluate agreement between sleep statistics reported by two different scorers. Evaluation includes bias and limits of agreement (as well as both their confidence intervals), various plotting options, and calibration functions for correcting biased values from the observed scorer.
Features include:
- Get summary calculations of bias, limits of agreement, and their confidence intervals.
- Test statistical assumptions of bias, limits of agreement, and their confidence intervals, and apply corrective procedures when the assumptions are not met.
- Get bias and limits of agreement in a string-formatted table.
- Calibrate new data to correct for biases in observed data.
- Return individual calibration functions.
- Visualize discrepancies for outlier inspection.
- Visualize Bland-Altman plots.
See also
yasa.EpochByEpochAgreement
New in version 0.7.0.
- Parameters
  - ref_data : pandas.DataFrame
    A pandas.DataFrame with sleep statistics from the reference scorer. Rows are unique observations and columns are unique sleep statistics.
  - obs_data : pandas.DataFrame
    A pandas.DataFrame with sleep statistics from the observed scorer. Rows are unique observations and columns are unique sleep statistics. Shape, index, and columns must be identical to ref_data.
  - ref_scorer : str
    Name of the reference scorer.
  - obs_scorer : str
    Name of the observed scorer.
  - agreement : float
    Multiple of the standard deviation used to set the limits of agreement. The default is 1.96, which yields limits expected to contain 95% of the score differences if they are normally distributed. See the sketch after this parameter list for how agreement and confidence map onto the classic Bland-Altman quantities.
    Note
    agreement is adjusted for regression-modeled limits of agreement.
  - confidence : float
    The confidence level of the confidence intervals applied to bias and limits of agreement. The same level is used for both standard and bootstrapped confidence intervals.
  - alpha : float
    Alpha cutoff used for all assumption tests.
  - verbose : bool or str
    Verbose level. If False (or 'warning'), only warning and error messages are printed. The available logging levels are 'debug', 'info', 'warning', 'error', and 'critical'. For most users the choice is between 'info' (verbose=True) and 'warning' (verbose=False).
  - bootstrap_kwargs : dict
    Optional keyword arguments passed to the bootstrap procedure used for bootstrapped confidence intervals.
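For orientation, here is a minimal, illustrative sketch (not the YASA implementation) of how the agreement and confidence parameters relate to the classic Bland-Altman quantities for a single sleep statistic. The function name and the standard-error approximation for the limits of agreement (SE(LoA) ≈ sd·sqrt(3/n), after Bland & Altman) are assumptions made for this sketch.

# Illustrative only: parametric bias and limits of agreement (LoA) with
# t-distribution confidence intervals, for one sleep statistic.
import numpy as np
from scipy import stats

def parametric_agreement(ref, obs, agreement=1.96, confidence=0.95):
    """Bland-Altman style bias and LoA for paired scores (obs minus ref)."""
    diff = np.asarray(obs, dtype=float) - np.asarray(ref, dtype=float)
    n = diff.size
    bias = diff.mean()
    sd = diff.std(ddof=1)
    lloa, uloa = bias - agreement * sd, bias + agreement * sd
    # Approximate standard errors: SE(bias) = sd/sqrt(n), SE(LoA) ~= sd*sqrt(3/n);
    # the confidence-interval half-width uses a t critical value with n-1 df.
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    ci_bias = t_crit * sd / np.sqrt(n)
    ci_loa = t_crit * sd * np.sqrt(3 / n)
    return {
        "bias": (bias, bias - ci_bias, bias + ci_bias),
        "lloa": (lloa, lloa - ci_loa, lloa + ci_loa),
        "uloa": (uloa, uloa - ci_loa, uloa + ci_loa),
    }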
Notes
Sleep statistics that are identical between scorers are removed from analysis.
Many steps here are influenced by guidelines proposed in Menghini et al., 2021 [Menghini2021]. See https://sri-human-sleep.github.io/sleep-trackers-performance/AnalyticalPipeline_v1.0.0.html
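As a rough sketch of the kind of assumption check described in the framework cited above (illustrative, not the exact YASA code), proportional bias can be probed by regressing the score differences on the reference scores; a slope significantly different from zero suggests that a regression equation describes the bias better than a single mean difference. The function name and the choice of regressor are assumptions for this sketch.

# Illustrative check for proportional (non-constant) bias.
import numpy as np
from scipy import stats

def proportional_bias_check(ref, obs, alpha=0.05):
    """Regress differences (obs - ref) on the reference scores."""
    ref = np.asarray(ref, dtype=float)
    diff = np.asarray(obs, dtype=float) - ref
    res = stats.linregress(ref, diff)
    constant_bias = res.pvalue >= alpha  # slope not significantly different from 0
    return {"intercept": res.intercept, "slope": res.slope,
            "pvalue": res.pvalue, "constant_bias": constant_bias}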
References
- Menghini2021
Menghini, L., Cellini, N., Goldstone, A., Baker, F. C., & de Zambotti, M. (2021). A standardized framework for testing the performance of sleep-tracking technology: step-by-step guidelines and open-source code. SLEEP, 44(2), zsaa170. https://doi.org/10.1093/sleep/zsaa170
Examples
>>> import pandas as pd
>>> import yasa
>>>
>>> # Generate fake reference and observed datasets with similar sleep statistics
>>> ref_scorer = "Henri"
>>> obs_scorer = "Piéron"
>>> ref_hyps = [yasa.simulate_hypnogram(tib=600, scorer=ref_scorer, seed=i) for i in range(20)]
>>> obs_hyps = [h.simulate_similar(scorer=obs_scorer, seed=i) for i, h in enumerate(ref_hyps)]
>>> # Generate sleep statistics from hypnograms using EpochByEpochAgreement
>>> eea = yasa.EpochByEpochAgreement(ref_hyps, obs_hyps)
>>> sstats = eea.get_sleep_stats()
>>> ref_sstats = sstats.loc[ref_scorer]
>>> obs_sstats = sstats.loc[obs_scorer]
>>> # Create SleepStatsAgreement instance
>>> ssa = yasa.SleepStatsAgreement(ref_sstats, obs_sstats)
>>> ssa.summary().round(1).head(3)
variable   bias_intercept              ... uloa_parm
interval           center lower upper  ...    center lower upper
sleep_stat                             ...
%N1                  -5.4 -13.9   3.2  ...       6.1   3.7   8.5
%N2                 -27.3 -49.1  -5.6  ...      12.4   7.2  17.6
%N3                  -9.1 -23.8   5.5  ...      20.4  12.6  28.3

>>> ssa.get_table().head(3)[["bias", "loa"]]
                      bias                            loa
sleep_stat
%N1                   0.25  Bias ± 2.46 * (-0.00 + 1.00x)
%N2         -27.34 + 0.55x   Bias ± 2.46 * (0.00 + 1.00x)
%N3                   1.38   Bias ± 2.46 * (0.00 + 1.00x)

>>> ssa.assumptions.head(3)
            unbiased  normal  constant_bias  homoscedastic
sleep_stat
%N1             True    True           True          False
%N2             True    True          False          False
%N3             True    True           True          False

>>> ssa.auto_methods.head(3)
            bias   loa    ci
sleep_stat
%N1         parm  regr  parm
%N2         regr  regr  parm
%N3         parm  regr  parm

>>> ssa.get_table(bias_method="parm", loa_method="parm").head(3)[["bias", "loa"]]
             bias            loa
sleep_stat
%N1          0.25    -5.55, 6.06
%N2         -0.23  -12.87, 12.40
%N3          1.38  -17.67, 20.44

>>> new_hyps = [h.simulate_similar(scorer="Kelly", seed=i) for i, h in enumerate(obs_hyps)]
>>> new_sstats = pd.Series(new_hyps).map(lambda h: h.sleep_statistics()).apply(pd.Series)
>>> new_sstats = new_sstats[["N1", "TST", "WASO"]]
>>> new_sstats.round(1).head(5)
     N1    TST   WASO
0  42.5  439.5  147.5
1  84.0  550.0   38.5
2  53.5  489.0  103.0
3  57.0  469.5  120.0
4  71.0  531.0   69.0

>>> new_stats_calibrated = ssa.calibrate_stats(new_sstats, bias_method="auto")
>>> new_stats_calibrated.round(1).head(5)
     N1    TST   WASO
0  42.9  433.8  150.0
1  84.4  544.2   41.0
2  53.9  483.2  105.5
3  57.4  463.8  122.5
4  71.4  525.2   71.5

>>> import matplotlib.pyplot as plt
>>> ax = ssa.plot_discrepancies_heatmap()
>>> ax.set_title("Sleep statistic discrepancies")
>>> plt.tight_layout()
>>> ssa.plot_blandaltman()
- __init__(ref_data, obs_data, *, ref_scorer='Reference', obs_scorer='Observed', agreement=1.96, confidence=0.95, alpha=0.05, verbose=True, bootstrap_kwargs={})[source]¶
Methods

- __init__(ref_data, obs_data, *[, ...])
- calibrate(data[, bias_method, adjust_all])
  Calibrate a DataFrame of sleep statistics from a new scorer based on observed biases in obs_data/obs_scorer.
- get_calibration_func(sleep_stat)
  Return a function for calibrating a specific sleep statistic, based on observed biases in obs_data/obs_scorer.
- get_table([bias_method, loa_method, ...])
  Return a DataFrame with bias, loa, bias_ci, and loa_ci as string equations.
- summary([ci_method])
  Return a DataFrame that includes all calculated metrics: parametric bias, parametric lower and upper limits of agreement, regression intercept and slope for modeled bias, regression intercept and slope for modeled limits of agreement, and lower and upper confidence intervals for all metrics.

Attributes

- assumptions
  A pandas.DataFrame containing boolean values indicating the pass/fail status of all statistical tests performed to test assumptions.
- auto_methods
  A pandas.DataFrame containing the methods applied when 'auto' is selected.
- data
  A long-format pandas.DataFrame containing all raw sleep statistics from ref_data and obs_data.
- n_sessions
  The number of sessions.
- obs_scorer
  The name of the observed scorer.
- ref_scorer
  The name of the reference scorer.
- sleep_statistics
  Return a list of all sleep statistics included in the agreement analyses.
- calibrate(data, bias_method='auto', adjust_all=False)[source]¶
  Calibrate a DataFrame of sleep statistics from a new scorer based on observed biases in obs_data/obs_scorer.
  - Parameters
    - data : pandas.DataFrame
      A pandas.DataFrame with sleep statistics from an observed scorer. Rows are unique observations and columns are unique sleep statistics.
    - bias_method : str
      If 'parm', sleep statistics are always adjusted based on parametric bias. If 'regr', sleep statistics are always adjusted based on regression-modeled bias. If 'auto' (default), sleep statistics are adjusted by either 'parm' or 'regr', depending on assumption violations.
    - adjust_all : bool
      If False (default), only adjust values for sleep statistics that showed a statistically significant bias in obs_data. If True, adjust values for all sleep statistics.
  - Returns
    - calibrated_data : pandas.DataFrame
      A DataFrame with calibrated sleep statistics.
  See also
  get_calibration_func()
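As a conceptual sketch of the two adjustment modes described above (illustrative only; the exact correction applied by YASA, in particular the form of the regression correction, is an assumption here), parametric calibration removes a constant mean bias, while regression calibration removes a bias modeled as a function of the score:

# Illustrative calibration helpers; `bias`, `intercept`, and `slope` would come
# from the agreement analysis between ref_data and obs_data.
import numpy as np

def calibrate_parm(values, bias):
    """Remove a constant (parametric) bias from new observed values."""
    return np.asarray(values, dtype=float) - bias

def calibrate_regr(values, intercept, slope):
    """Remove a bias modeled as intercept + slope * value (one plausible form)."""
    values = np.asarray(values, dtype=float)
    return values - (intercept + slope * values)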
- get_calibration_func(sleep_stat)[source]¶
  Return a function for calibrating a specific sleep statistic, based on observed biases in obs_data/obs_scorer.
  See also
  calibrate()
Examples
>>> import numpy as np
>>> ssa = yasa.SleepStatsAgreement(...)
>>> calibrate_rem = ssa.get_calibration_func("REM")
>>> new_obs_rem_vals = np.array([50, 40, 30, 20])
>>> calibrate_rem(new_obs_rem_vals)
array([50, 40, 30, 20])
>>> calibrate_rem(new_obs_rem_vals, bias_test=False)
array([42.825, 32.825, 22.825, 12.825])
>>> calibrate_rem(new_obs_rem_vals, bias_test=False, method="regr")
array([ -9.33878878,  -9.86815607, -10.39752335, -10.92689064])
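Because the returned function operates on array-like input, it can also be applied to a column of new sleep statistics. A small hedged follow-up using standard pandas (the "REM" column and values below are hypothetical):

>>> import pandas as pd
>>> new_obs = pd.DataFrame({"REM": [50, 40, 30, 20]})  # hypothetical new scores
>>> new_obs["REM_calibrated"] = calibrate_rem(new_obs["REM"].to_numpy())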
- get_table(bias_method='auto', loa_method='auto', ci_method='auto', fstrings={})[source]¶
  Return a DataFrame with bias, loa, bias_ci, and loa_ci as string equations.
  - Parameters
    - bias_method : str
      If 'parm' (i.e., parametric), bias is always represented as the mean difference (observed minus reference). If 'regr' (i.e., regression), bias is always represented as a regression equation. If 'auto' (default), bias is represented as a regression equation for sleep statistics where the score differences are proportionally biased, and as the mean difference otherwise.
    - loa_method : str
      If 'parm' (i.e., parametric), limits of agreement are always represented as bias +/- 1.96 standard deviations (where 1.96 can be adjusted through the agreement parameter). If 'regr' (i.e., regression), limits of agreement are always represented as a regression equation. If 'auto' (default), limits of agreement are represented as a regression equation for sleep statistics where the score differences are heteroscedastic, and as bias +/- 1.96 standard deviations otherwise.
    - ci_method : str
      If 'parm' (i.e., parametric), confidence intervals are always represented using a standard t-distribution. If 'boot' (i.e., bootstrap), confidence intervals are always represented using a bootstrap resampling procedure. If 'auto' (default), confidence intervals are represented using a bootstrap resampling procedure for sleep statistics where the distribution of score differences is non-normal, and using a standard t-distribution otherwise.
    - fstrings : dict
      Optional custom strings for formatting cells.
  - Returns
    - table : pandas.DataFrame
      A DataFrame of string representations of bias, limits of agreement, and their confidence intervals for all sleep statistics.
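A short usage sketch, assuming the ssa instance from the class-level example above; it simply exercises the documented keyword values (output not shown here):

>>> # Force regression-modeled bias, parametric limits of agreement, and
>>> # bootstrapped confidence intervals
>>> ssa.get_table(bias_method="regr", loa_method="parm", ci_method="boot").head(3)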
- summary(ci_method='auto')[source]¶
  Return a DataFrame that includes all calculated metrics:
  * Parametric bias
  * Parametric lower and upper limits of agreement
  * Regression intercept and slope for modeled bias
  * Regression intercept and slope for modeled limits of agreement
  * Lower and upper confidence intervals for all metrics
  - Parameters
    - ci_method : str
      If 'parm' (i.e., parametric), confidence intervals are always represented using a standard t-distribution. If 'boot' (i.e., bootstrap), confidence intervals are always represented using a bootstrap resampling procedure. If 'auto' (default), confidence intervals are represented using a bootstrap resampling procedure for sleep statistics where the distribution of score differences is non-normal, and using a standard t-distribution otherwise.
  - Returns
    - summary : pandas.DataFrame
      A DataFrame containing all calculated metrics and their confidence intervals for each sleep statistic.
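The columns of the returned DataFrame form a MultiIndex (variable, interval), as shown in the class-level example. A hedged snippet using standard pandas indexing (level names follow that example output) to pull out just the point estimates:

>>> summ = ssa.summary()
>>> summ.xs("center", axis=1, level="interval").head(3)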
- property assumptions¶
  A pandas.DataFrame containing boolean values indicating the pass/fail status of all statistical tests performed to test assumptions.
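A small usage sketch (standard pandas; column names follow the class-level example above) for listing the sleep statistics that violate a given assumption:

>>> # Sleep statistics whose score differences are not normally distributed
>>> ssa.assumptions.loc[~ssa.assumptions["normal"]].index.tolist()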
- property auto_methods¶
  A pandas.DataFrame containing the methods applied when 'auto' is selected.
- property data¶
  A long-format pandas.DataFrame containing all raw sleep statistics from ref_data and obs_data.
- property n_sessions¶
The number of sessions.
- property obs_scorer¶
The name of the observed scorer.
- property ref_scorer¶
The name of the reference scorer.
- property sleep_statistics¶
Return a list of all sleep statistics included in the agreement analyses.