Download Complete Package
High-fidelity synthetic dataset with privacy guarantees
Download ZIP (9.3 MB)
Dataset Overview
A high-fidelity synthetic longitudinal dataset modeling bipolar disorder with mixed features (ICD-10: F31.6x). It was generated using CTGAN with differential privacy guarantees and is intended for ML/AI research, clinical decision support development, and education.
- Patients: 800
- Observations: 5,550
- Variables: 35
Note: This is fully synthetic data; no real patient records were used. Suitable for research and education only.
Reports & Documentation
Professional PDF reports demonstrating data quality and methodology:
- Executive Summary (~100 KB)
- Statistical Analysis Report (~200 KB)
- Privacy Guarantees Report (~150 KB)
- Clinical Validation Report (~150 KB)
Sample Data Preview
Evaluate data structure before downloading the full package:
- sample_preview.csv (100 rows, ~15 KB)
- README.md (~3 KB)
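A quick structural check of the preview file might look like the sketch below. It uses only the Python standard library and runs on a small in-memory stand-in; the column names `patient_id`, `visit`, and `ymrs_total` are assumptions for illustration, since this page does not list the actual column names (the README should).

```python
import csv
import io
from collections import Counter


def summarize_preview(handle, id_column="patient_id"):
    """Return (row_count, column_names, rows_per_id) for an open CSV handle.

    `id_column` is a hypothetical name for the patient identifier column.
    """
    reader = csv.DictReader(handle)
    rows_per_id = Counter()
    row_count = 0
    for row in reader:
        row_count += 1
        rows_per_id[row.get(id_column, "")] += 1
    return row_count, list(reader.fieldnames or []), dict(rows_per_id)


# In-memory stand-in for sample_preview.csv (made-up values, assumed columns).
demo = io.StringIO(
    "patient_id,visit,ymrs_total\n"
    "P001,1,30\n"
    "P001,2,27\n"
    "P002,1,31\n"
)
row_count, columns, rows_per_id = summarize_preview(demo)
print(row_count, columns, rows_per_id)
```

To run it against the real preview, open the file with `open("sample_preview.csv", newline="")` and pass the handle to `summarize_preview`, adjusting `id_column` to whatever the file actually uses.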
Privacy Guarantees
- k-anonymity (k = 12)
- Differential privacy (ε < 0.8)
- Membership inference attack (MIA) resistance (0.52)
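As a rough illustration of what a k-anonymity figure means: a table is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The sketch below computes the smallest such group on toy records. The quasi-identifier columns `age_band` and `sex` are hypothetical; this page does not state which columns were treated as quasi-identifiers or how the reported k was computed.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing all quasi-identifier values.

    A dataset is k-anonymous over these columns if this value is >= k.
    Returns 0 for an empty dataset.
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


# Toy records with assumed column names, not taken from the dataset.
records = [
    {"age_band": "30-39", "sex": "F"},
    {"age_band": "30-39", "sex": "F"},
    {"age_band": "30-39", "sex": "F"},
    {"age_band": "40-49", "sex": "M"},
    {"age_band": "40-49", "sex": "M"},
]
k = smallest_group_size(records, ["age_band", "sex"])
print(k)
```

On the full dataset, a result of 12 or more over the documented quasi-identifiers would be consistent with the stated k = 12.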
Key Features
Clinical Scales
- YMRS (Mania): Mean 29.6
- HAM-D (Depression): Mean 18.1
- GAF (Function): Mean 44.6
Unique Features
- Identity Crisis (38.8%)
- Sleep Aversion (65.1%)
- Stimulant Misuse (39.2%)
- Polypharmacy (27.9%)
Data Formats
| Format | Use Case |
|---|---|
| CSV | Universal |
| Parquet | Python/R |
| SQLite | SQL queries |
| FHIR R4 | Healthcare |
| CDISC ODM | Clinical trials |
| Stata DTA | Statistical |
| REDCap | Research |
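For the SQLite format, a per-patient observation count can be pulled with the standard-library `sqlite3` module. The sketch below builds a tiny in-memory database to stay self-contained; the table name `observations` and column name `patient_id` are assumptions, as the actual schema is not described on this page.

```python
import sqlite3


def observations_per_patient(conn, table="observations", id_column="patient_id"):
    """Return {patient_id: row_count} for the given table.

    `table` and `id_column` are hypothetical names. They are interpolated
    into the SQL text, so pass only trusted identifiers.
    """
    query = (
        f'SELECT "{id_column}", COUNT(*) FROM "{table}" '
        f'GROUP BY "{id_column}" ORDER BY "{id_column}"'
    )
    return dict(conn.execute(query).fetchall())


# In-memory stand-in for the packaged SQLite file (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (patient_id TEXT, visit INTEGER)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?)",
    [("P001", 1), ("P001", 2), ("P002", 1)],
)
counts = observations_per_patient(conn)
mean_per_patient = sum(counts.values()) / len(counts)
print(counts, mean_per_patient)
```

With the figures quoted in the overview (5,550 observations across 800 patients), the mean on the full dataset would come out near 6.9 observations per patient.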
License
CC BY-NC 4.0 (Attribution-NonCommercial)
For commercial licensing, contact us at contact@mentaldata.io