Accelerate Rare Disease Research with Citizen Health
Access comprehensive, research-ready real-world data to power critical rare disease studies
and advance treatment development.
Real-World Data That Drives
Research Forward
Citizen Health is a real-world data (RWD) platform that creates FDA-grade, research-ready datasets from unstructured medical data.
We work with leading researchers and academic medical centers to generate and provide rich RWD to power critical rare disease research.
120+
Unique rare conditions
with data
100+
Advocacy, Industry
& research partners
30+
Abstracts &
manuscripts
Access Citizen Data
Citizen provides academic institutions with no-cost core natural history dataset modules and upgraded longitudinal datasets to support your rare disease research efforts.
Accessing Citizen Data requires a short application, agreement to our research terms, and submission of our security questionnaire. To get started, click a data module below or contact us for more information.
Our Data Offerings

Core NHS Dataset
Data Contents
- Year of birth
- Zip-3
- Genetics
- Symptoms (unique)
- Gender
- Self-reported diagnosis
- Conditions (unique)
- Medications (unique)
Data Access:
No-cost access
Learn More
Extended NHS Dataset
Data Contents
- Conditions (longitudinal)
- Medications (longitudinal)
- Hospitalizations
- Development
- Seizure history
- Laboratory studies
- Symptoms (longitudinal)
- Procedures
- Growth parameters
- Standardized tests
- Physical exam findings
Data Access:
Fee-based access
Learn More
Data Additions
Data Contents
- EEG tracings
- MRIs
- Data refreshes
- Custom variables
Data Access:
Fee-based access
Learn More
Academic Research Partnership
Refer Patients, Power Research
Join our patient referral program to support essential digital natural history studies and unlock opportunities for low- or no-cost access to our otherwise fee-based longitudinal datasets.

Refer Eligible Patients
Providers and Genetic Counselors can identify patients in eligible cohorts to join Citizen Health. Patients can create a profile in five minutes.
Optimize Your Workflows
Patients can share their comprehensive medical history with you prior to their visits so you can focus your time with them on the topics that matter most.

Access Data and Drive Research
Referring patients helps build essential digital natural history studies. These studies support research that could lead to real treatments, help produce datasets for your own research, and open up low- or no-cost access options.
Learn More About Our Partnership
Academic Research Partnership FAQs
- What is included in the data?
- Our core data product is a comprehensive longitudinal dataset that describes how patients present and how their symptoms evolve over time. Data is derived from patients’ medical records, genetic reports, imaging studies, and diagnostic data such as EEG and ECG tracings. We can also supplement that data with patient-reported data (e.g., surveys) and with data collected through other research studies.
- Who is on the platform today?
- More than 20,000 patients are on the Citizen platform today. We continue to grow cohorts and can provide more specific cohort counts upon request.
- How do you recruit patients?
- Our primary recruitment channels today are partnerships with Patient Advocacy Groups (PAGs), Academic Medical Centers, and Genetic Counselors. We also explore other channels, including partnerships with genetic testing companies and direct-to-consumer marketing.
- How is the data structured?
- Citizen converts medical record documents into normalized longitudinal data, structuring the data in line with disease-specific ontologies and standardizing it against recognized terminologies (SNOMED CT, LOINC, RxNorm).
- We review those medical records for completeness and backfill any missing records. Citizen specializes in translating the rich content and context within unstructured notes (e.g. Physical Exams, Physician Assessment and Plan) into a structured, research-ready output.
- Data are normalized across institutions and patients using two mechanisms: (1) we use a data model that defines the types of data variables we extract, and (2) unstructured data are mapped to a set of codes, defined by an internationally recognized terminology.
- How can academics get access?
- Complete a brief application indicating the data type(s) of interest and describing your research project. You will also need to provide an IRB approval or exemption, sign a research agreement with Citizen Health, and complete Citizen’s security questionnaire. We will work with you on application approval, the timeline for data sharing, and any other requirements.
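As a rough illustration of the two normalization mechanisms described under "How is the data structured?" (a data model that defines the extracted variables, plus a mapping from unstructured text to terminology codes), here is a minimal Python sketch. It is not Citizen's implementation: the `Observation` fields, the `TERMINOLOGY_MAP` lookup, and the `normalize` and `timeline` helpers are all hypothetical, and the codes are placeholders rather than real SNOMED CT or RxNorm identifiers.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical lookup table. The codes are placeholders for illustration only,
# not real SNOMED CT or RxNorm identifiers.
TERMINOLOGY_MAP = {
    "seizure": ("SNOMED CT", "PLACEHOLDER-0001"),
    "seizures": ("SNOMED CT", "PLACEHOLDER-0001"),
    "levetiracetam": ("RxNorm", "PLACEHOLDER-0002"),
}


@dataclass(frozen=True)
class Observation:
    """Mechanism 1 (assumed shape): one row of a shared data model."""
    patient_id: str
    category: str            # e.g. "symptom" or "medication"
    observed_on: date
    source_text: str         # wording as it appeared in the record
    system: Optional[str]    # terminology name, or None if unmapped
    code: Optional[str]      # terminology code, or None if unmapped


def normalize(patient_id: str, category: str, observed_on: date,
              source_text: str) -> Observation:
    """Mechanism 2 (assumed): map free text to a terminology code."""
    key = source_text.strip().lower()
    system, code = TERMINOLOGY_MAP.get(key, (None, None))
    return Observation(patient_id, category, observed_on,
                       source_text, system, code)


def timeline(observations: list) -> list:
    """Order observations per patient by date to form a longitudinal view."""
    return sorted(observations, key=lambda o: (o.patient_id, o.observed_on))


if __name__ == "__main__":
    rows = [
        normalize("p1", "medication", date(2022, 5, 1), "Levetiracetam"),
        normalize("p1", "symptom", date(2021, 3, 1), " Seizures "),
        normalize("p1", "symptom", date(2023, 1, 9), "unlisted finding"),
    ]
    for row in timeline(rows):
        print(row.observed_on, row.category, row.system, row.code)
```

Keeping the original `source_text` next to the mapped code means that records from different institutions share one code for the same concept while unmapped terms remain visible for later review.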
Publications
Lessons from Henrietta Lacks inform a transparency framework to catalyze generative artificial intelligence in medicine
Deciphering the Natural History of SCN8A-Related Disorders
Clinical signatures of SYNGAP1-related disorders through data integration
Autism gene variants disrupt enteric neuron migration and cause gastrointestinal dysmotility
Increasing Clinical Trial Participation of Black Women Diagnosed with Breast Cancer
Patient characteristics, treatment patterns, and outcomes in patients with cholangiocarcinoma
Comprehensive phenotypes of patients with SYNGAP1-related disorder reveals high rates of epilepsy and autism
Autism gene variants disrupt enteric neuron migration and cause gastrointestinal dysmotility
Validation and clinical discovery demonstration of breast cancer data from a real-world data extraction platform
SYNGAP1-related disorder genotype-phenotype analysis through a digital registry
A novel patient-centric longitudinal data registry platform generates insights into real-world cholangiocarcinoma (CCA) clinical practice
A Novel Relationship between Interictal Epileptiform Discharge Burden and Gross Motor Developmental Delay in SCN2A Developmental and Epileptic Encephalopathy
Characterizing a rare neurogenetic disease, SLC13A5 citrate transporter disorder, utilizing clinical data in a cloud-based medical record collection system
Delineating clinical and developmental outcomes in STXBP1-related disorders
Expanding the phenotype of CHD2-related disorders through 97,010 phenotypic annotations
Comparing phenotypes across five developmental and epileptic encephalopathies (DEEs) through evaluation of 2490 patient data years.
Advocacy Group Partnership
Growing Partner Ecosystem