Quantify the spacing effect: how recall improves with wider repetition spacing.
The spacing effect is a well-established memory phenomenon: repeated items benefit from wider spacing between presentations. Recall probability by lag (RPL) quantifies this by computing recall rate as a function of the number of intervening items between repeated presentations.
Two display modes are available: "binned" groups lags into coarse intervals (0, 1–2, 3–5, 6–8) for visualization, while "full" shows every lag individually.
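To make the measure concrete, here is a minimal sketch in plain Python. It is not the jaxcmr implementation: the helper names (`repetition_lags`, `bin_label`, `recall_probability_by_lag`) and the toy lists are hypothetical, and it assumes each item is presented at most twice per list and that recalls are given as item identifiers.

```python
from collections import defaultdict

# Coarse lag bins from the text: 0, 1-2, 3-5, 6-8 (inclusive bounds).
BINS = {"0": (0, 0), "1-2": (1, 2), "3-5": (3, 5), "6-8": (6, 8)}


def repetition_lags(presentation):
    """Map each repeated item to the number of items between its two presentations."""
    first_seen = {}
    lags = {}
    for position, item in enumerate(presentation):
        if item in first_seen:
            lags[item] = position - first_seen[item] - 1
        else:
            first_seen[item] = position
    return lags


def bin_label(lag):
    """Return the label of the coarse bin that contains this lag."""
    for label, (low, high) in BINS.items():
        if low <= lag <= high:
            return label
    raise ValueError(f"lag {lag} falls outside the defined bins")


def recall_probability_by_lag(presentations, recalls, key=lambda lag: lag):
    """Recall rate of repeated items, grouped by key(lag)."""
    presented = defaultdict(int)
    recalled = defaultdict(int)
    for presentation, recall in zip(presentations, recalls):
        recalled_items = set(recall)
        for item, lag in repetition_lags(presentation).items():
            bucket = key(lag)
            presented[bucket] += 1
            recalled[bucket] += int(item in recalled_items)
    return {bucket: recalled[bucket] / presented[bucket] for bucket in presented}


# Toy data (hypothetical): two study lists and the items recalled from each.
presentations = [[1, 1, 2, 3, 2, 4], [5, 6, 7, 5, 8, 6]]
recalls = [[2, 4], [5, 6, 8]]

full = recall_probability_by_lag(presentations, recalls)
binned = recall_probability_by_lag(presentations, recalls, key=bin_label)
```

Passing `key=bin_label` pools the counts within each bin before dividing, so a bin's rate is weighted by how many repeated items fall into it rather than being an average of per-lag rates.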
Workflow
Code
import matplotlib.pyplot as plt
import os
from jaxcmr.analyses.rpl import (
    infer_max_lag,
    plot_full_rpl,
    plot_rpl,
    subject_binned_rpl,
    subject_full_rpl,
    run_rpl_slope_analysis,
)
from jaxcmr.helpers import find_project_root, generate_trial_mask, load_data, save_figure
from jaxcmr.repetition import make_control_dataset
Code
data_path = "data/LohnasKahana2014.h5"
figure_dir = "results/figures"
figure_str = ""
ylim = None
mixed_trial_query = "data['list_type'] > 2"
control_trial_query = "data['list_type'] == 1"
control_shuffles = 10
mode = "full"
confidence_level = 0.95
Code
project_root = find_project_root()
figure_dir = os.path.join(project_root, figure_dir)
data_path = os.path.join(project_root, data_path)
data = load_data(data_path)
trial_mask = generate_trial_mask(data, mixed_trial_query)
control_dataset = make_control_dataset(data, mixed_trial_query, control_trial_query, control_shuffles)
control_mask = generate_trial_mask(control_dataset, mixed_trial_query)
max_lag = infer_max_lag(data['pres_itemnos'], data['pres_itemnos'].shape[1])
plotting_function = plot_rpl if mode == "binned" else plot_full_rpl
subject_function = subject_binned_rpl if mode == "binned" else subject_full_rpl
Code
plotting_function(
    [data, control_dataset],
    [trial_mask, control_mask],
    labels=["Mixed", "Control"],
    contrast_name="source",
    confidence_level=confidence_level,
)
if ylim is not None:
    for ax in plt.gcf().axes:
        ax.set_ylim(ylim)
save_figure(figure_dir, figure_str)
Code
observed_result, control_result, comparison_result = run_rpl_slope_analysis(
    data, trial_mask, control_dataset, control_mask, mode=mode, max_lag=max_lag,
)
print("=" * 60)
print("Spacing Effect Slope: Observed")
print("=" * 60)
print(observed_result)
print("\n" + "=" * 60)
print("Spacing Effect Slope: Control")
print("=" * 60)
print(control_result)
print("\n" + "=" * 70)
print("Observed vs Control: Spacing Effect Slope")
print("=" * 70)
print(comparison_result)
============================================================
Spacing Effect Slope: Observed
============================================================
N=35
Mean slope: 0.0186
t-stat: 5.644 p=0.0000
W-stat: 65.0 p=0.0000
============================================================
Spacing Effect Slope: Control
============================================================
N=35
Mean slope: 0.0036
t-stat: 2.800 p=0.0084
W-stat: 148.0 p=0.0053
======================================================================
Observed vs Control: Spacing Effect Slope
======================================================================
N=35
Mean slope (observed): 0.0186
Mean slope (comparison): 0.0036
Mean difference: 0.0150
t-stat: 4.058 p=0.0003
W-stat: 105.0 p=0.0003
Interpretation
The plot shows recall probability as a function of repetition spacing (number of intervening items) for both observed and control data. Key patterns:
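Positive slope: recall probability increases with wider spacing, demonstrating the spacing effect (mean observed slope 0.0186, t = 5.644, N = 35).
Observed > Control: the spacing benefit exceeds what shuffled position assignments would produce (mean slope 0.0186 vs. 0.0036; mean difference 0.0150, t = 4.058, p = 0.0003).
Slope tests: the t and W statistics quantify whether each spacing slope is reliably positive and whether it differs between observed and control. Here the control slope is also reliably positive (0.0036, t = 2.800, p = 0.0084), so the observed-versus-control comparison is what isolates the spacing benefit.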
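The logic behind a slope test can be sketched as follows. This is an illustration under stated assumptions, not the code inside run_rpl_slope_analysis: it assumes one least-squares slope per subject and a one-sample t statistic against zero, and the function names and recall values are hypothetical.

```python
import math
import statistics


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def one_sample_t(values):
    """t statistic for the null hypothesis that the mean of values is zero."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return statistics.fmean(values) / standard_error


# Hypothetical per-subject recall rates at lags 0..3 (not from the dataset).
lags = [0, 1, 2, 3]
subject_recall = {
    "s1": [0.40, 0.45, 0.50, 0.55],
    "s2": [0.30, 0.30, 0.40, 0.40],
    "s3": [0.50, 0.52, 0.51, 0.56],
}

# One slope per subject, then a test of whether the mean slope exceeds zero.
slopes = {subject: ols_slope(lags, rates) for subject, rates in subject_recall.items()}
mean_slope = statistics.fmean(slopes.values())
t_stat = one_sample_t(list(slopes.values()))
print(f"Mean slope: {mean_slope:.4f}  t-stat: {t_stat:.3f}")
```

A paired comparison between observed and control slopes can reuse `one_sample_t` on the per-subject differences. Turning the statistic into a p-value requires the t distribution, for example via `scipy.stats.ttest_1samp`.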
API Details
Notebook parameters
data_path — path to an HDF5 file containing a RecallDataset.
figure_dir — directory for saving figures.
figure_str — base filename for the saved figure. Leave empty to display without saving.
ylim — y-axis limits as a list, or None for automatic scaling.
mixed_trial_query — query selecting trials with repeated items.
control_trial_query — query selecting trials for the control.
control_shuffles — number of shuffle iterations for building the control dataset.
mode — "binned" for coarse lag groups or "full" for individual lag buckets.
confidence_level — confidence level for subject-wise error bars.
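For intuition about what control_shuffles controls, the sketch below shows one plausible reading of a shuffled control: each control trial, which contains no repeated items, is scored as if it had the repetition structure of a randomly chosen mixed list. This is an assumption for illustration, not the behavior of make_control_dataset; the function names, the 0-based positions, and the rule that a pseudo-repeated item counts as recalled when either paired position is recalled are all hypothetical.

```python
import random


def repetition_pairs(presentation):
    """Return (first_position, second_position) for every item shown twice."""
    first_seen = {}
    pairs = []
    for position, item in enumerate(presentation):
        if item in first_seen:
            pairs.append((first_seen[item], position))
        else:
            first_seen[item] = position
    return pairs


def control_recall_by_lag(mixed_presentation, control_recalled_positions):
    """Score a control trial using a mixed list's repetition structure.

    Returns (lag, recalled) tuples, where recalled is True if either of the
    paired study positions was recalled on the control trial.
    """
    recalled = set(control_recalled_positions)
    return [
        (second - first - 1, first in recalled or second in recalled)
        for first, second in repetition_pairs(mixed_presentation)
    ]


def shuffled_control(mixed_presentations, control_trials, n_shuffles, seed=0):
    """Repeat the pseudo-repetition scoring n_shuffles times with random templates."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_shuffles):
        for recalled_positions in control_trials:
            template = rng.choice(mixed_presentations)
            scored.extend(control_recall_by_lag(template, recalled_positions))
    return scored


# Hypothetical data: one mixed list, two control trials given as recalled positions.
mixed_presentations = [[1, 2, 1, 3, 2, 4]]
control_trials = [[0], [5]]
scored = shuffled_control(mixed_presentations, control_trials, n_shuffles=3)
```

Under this reading, more shuffles average over more template assignments and so reduce the noise in the control curve, at the cost of a larger control dataset.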