Brief Overview
Summary
- Subjects watch different video stimuli under attentive and distracted (or intervention) conditions.
- They answer questions about the videos to assess memory and engagement.
- Multimodal physiological data is collected throughout.
- Both raw data and preprocessed data, aligned across all modalities, are available.

Experiment Setup
All experiments were carried out at the City College of New York with the required approval from the Institutional Review Boards of the City University of New York.
Informed consent was obtained from all participants prior to the experiment.
Subjects were seated comfortably in a sound-attenuated booth with white fabric walls and ambient LED lighting, and all data acquisition devices were securely and safely attached to them.
Subjects watched the videos on a 27" monitor placed approximately 60 cm away, while audio was delivered through stereo speakers positioned next to the monitor, separated by 60° relative to the subject, as shown in the figure below.

Dataset at a Glance
- Number of Experiments: 5
- Total Subjects: 178
- Total Raw Modalities: 7
- EEG | ECG | EOG | Respiration | Eyetracking (Pupil Size, Gaze and Head Coordinates)
- Total Hours of Raw Data: see the per-modality table below
- Total Derived Modalities: 12
- Continuous - Heart Rate | Breath Rate | Saccade Rate | Blink Rate | Gaze-Fixation Rate
- Discrete - Heart Beats | Breath Peaks | Saccades | Blinks | Gaze-Fixations (see the conversion sketch below the table)
- Preprocessed - Filtered ECG | Filtered EEG
- Phenotypes Available: 4
- Participant Info (Sex | Age | Occupation | Tired | Study Time | GPA | Last Caffeine Intake | Occupation Field)
- ASRS (Adult ADHD Self Report) Responses | Digit Span Responses | Stimuli Questionnaire Responses
| Modality | EEG | ECG | EOG | Head | Gaze | Pupil | Respiration |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Quantity (hours) | 91.2 | 93.5 | 94.4 | 92.5 | 110.6 | 110.6 | 44.2 |
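Each continuous derived modality corresponds to a discrete counterpart. As an illustration of that relationship, here is a minimal sketch that converts discrete heart-beat timestamps into a continuous heart-rate signal; the function, column format, and resampling rate are illustrative assumptions, not the dataset's actual derivation pipeline:

```python
import numpy as np

def heart_rate_from_beats(beat_times_s, fs_out=10.0):
    """Convert discrete heart-beat timestamps (in seconds) into a
    uniformly sampled heart-rate signal in beats per minute."""
    beats = np.asarray(beat_times_s, dtype=float)
    ibi = np.diff(beats)                     # inter-beat intervals (s)
    inst_bpm = 60.0 / ibi                    # instantaneous rate per interval
    mid_t = beats[:-1] + ibi / 2.0           # place each rate at the interval midpoint
    t = np.arange(mid_t[0], mid_t[-1], 1.0 / fs_out)
    return t, np.interp(t, mid_t, inst_bpm)  # linear resampling onto a uniform grid

# Beats roughly every 0.8 s correspond to a rate of ~75 bpm.
t, hr = heart_rate_from_beats([0.0, 0.8, 1.6, 2.41, 3.2, 4.0])
```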
Data Visualisation
[Figure: Experiment 5 data, showing raw signals and derived signals]
Potential Dataset Use Cases
We encourage researchers to explore and utilize this dataset to advance knowledge in the cognitive and physiological sciences. This multimodal physiological signal dataset can support a wide range of research areas, such as:
- Brain-Body Interaction: Exploring the interactions between neural activity and physiological dynamics, including which fluctuations precede one another (see the lag-analysis sketch after this list).
- Multimodal Predictive Modeling: Leveraging synchronized data to predict physiological responses, such as estimating heart rate from gaze patterns or deriving breath rate from EEG signals, enabling cross-modal insights and applications.
- Engagement Response Analysis: Measuring how interactive or engaging educational videos are.
- Memory Retention Studies: Investigating how well information is retained after viewing.
- Cognitive Load Assessment: Understanding how different stimuli affect mental effort.
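As one concrete form of the lag analysis mentioned under Brain-Body Interaction, the sketch below computes a normalized lagged cross-correlation between two derived signals. It assumes both signals have already been resampled to a common rate; the signal pairing, sampling rate, and synthetic data are hypothetical:

```python
import numpy as np

def lagged_xcorr(x, y, fs, max_lag_s=10.0):
    """Normalized cross-correlation r(k) = mean(x[t] * y[t + k]).
    A peak at a positive lag means x leads y."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    max_lag = int(max_lag_s * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.array([
        np.mean(x[max(0, -k):len(x) - max(0, k)] *
                y[max(0, k):len(y) - max(0, -k)])
        for k in lags
    ])
    return lags / fs, r

# Hypothetical check: a pupil-size signal that trails heart rate by 2 s.
fs = 10.0
t = np.arange(0.0, 600.0, 1.0 / fs)
hr = np.sin(2 * np.pi * 0.1 * t) + 0.1 * np.random.default_rng(0).standard_normal(t.size)
pupil = np.roll(hr, int(2 * fs))            # shift so pupil[t] = hr[t - 2 s]
lags, r = lagged_xcorr(hr, pupil, fs)
print(f"peak correlation at lag {lags[np.argmax(r)]:+.1f} s")  # ~ +2.0 s: hr leads
```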
Usage Notes
- Acknowledgment: Please cite this dataset in your research and acknowledge its contribution to your studies.
- File Formats and Tools: EEG data files are stored in `.bdf` format and can be accessed using the `pyEDFlib` or `mne-bids` libraries in Python, or `EEGLAB` in MATLAB. Physiological and behavioral signals are stored as compressed `.tsv` (tab-separated values) files and can be accessed using the `pandas` library in Python. All metadata files are stored in `.json` format, and the scores for digit-span, ASRS, and stimuli quizzes are stored as `.tsv` files in the `/phenotype` directory of each experiment.
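A minimal loading sketch covering each storage format; the file paths below are hypothetical placeholders (and gzip compression of the `.tsv` files is an assumption), so substitute the actual paths from the experiment you are working with:

```python
import pandas as pd
import pyedflib

# --- EEG (.bdf) via pyEDFlib; hypothetical file name ---
bdf = pyedflib.EdfReader("sub-01_task-video_eeg.bdf")
labels = bdf.getSignalLabels()        # channel names
fs = bdf.getSampleFrequency(0)        # sampling rate of the first channel
eeg_ch0 = bdf.readSignal(0)           # samples of the first channel
bdf.close()

# --- Compressed .tsv signals via pandas (compression inferred from extension) ---
resp = pd.read_csv("sub-01_task-video_respiration.tsv.gz", sep="\t")

# --- Phenotype scores from the /phenotype directory ---
asrs = pd.read_csv("phenotype/asrs.tsv", sep="\t")
```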