6. Data available for sharing

For full details of available data please review the online MCPS Data Showcase. The Showcase displays all the data types currently available, in a grouped format (i.e. not at the individual participant level), along with further information about each data field (for example, background information about how measures were taken). Genetic variation can be viewed using the MCPS online Variant Browser.

Details of data available to researchers worldwide

Baseline data (1998 - 2004) available for 159,517 participants

Month and year of recruitment

Socio-demographic

age
sex
area of residence
marital status
educational achievement
occupation
income
health service provider

Lifestyle characteristics

smoking
passive smoking
alcohol consumption
physical activity
sleep duration
fruit/vegetable intake
fried food intake
type of cooking oil used

Prior diseases and medication

Reproductive history (women)

menopausal status
hysterectomy
oophorectomy
hormone replacement therapy
contraceptive use
age at first sexual relationship
age at first pregnancy
number of pregnancies

Physical measurements

height
weight
waist circumference
hip circumference
systolic blood pressure
diastolic blood pressure

Blood samples

time of blood sampling
time since last meal
glycosylated haemoglobin

Resurvey data (2015 - 2019) available for 10,143 participants. Similar data to that collected at baseline, plus:

Additional questionnaire data

diabetes control questions
diabetes consequences (e.g. eyes, amputations, dialysis)
fractures/falls
treatment for breast cancer
additional dietary questions (e.g. sugary drinks, added salt, meat, fish, desserts, diets)
cognitive function (MMSE)

Additional measurements

bioimpedance (fat mass, fat free mass, muscle mass, muscle score, bone mass, body water, degree of obesity, visceral fat rating, basal metabolic rate, metabolic age, Rohrer's index)

Additional samples

time of urine sampling
urinary creatinine
urinary albumin

Baseline NMR metabolomic data using the Nightingale Health platform. First release: 40,297 participants

14 Lipoprotein subclasses

XXL VLDL
XL VLDL
L VLDL
M VLDL
S VLDL
XS VLDL
IDL
L LDL
M LDL
S LDL
XL HDL
L HDL
M HDL
S HDL

Lipoprotein mean particle sizes and apolipoproteins

VLDL-D
LDL-D
HDL-D
Apo A1
Apo B

Cholines and glycolysis-related

total cholines
phosphatidylcholine
sphingomyelin
lactate
citrate
glucose

7 Lipid measures for each subclass

particle number
cholesterol
free cholesterol
esterified cholesterol
triglycerides
phospholipids
total lipids

Fatty acids

polyunsaturated fatty acids
monounsaturated fatty acids
saturated fatty acids
docosahexaenoic acid
linoleic acid
omega-3
omega-6
total fatty acids

Amino acids

alanine
glutamine
histidine
isoleucine
leucine
valine
phenylalanine
tyrosine

Ketone bodies, inflammation and kidney function

acetate
acetone
β-hydroxy-butyrate
albumin
creatinine
glycoprotein acetyls

Genomic Data

Described fully in Ziyatdinov et al. 2023

Genome-wide genotyping with the Illumina Global Screening Array (GSA) version 2

Non-filtered dataset (140,831 participants)
650,381 variants: 619,501 autosomal variants; 30,101 sex chromosome variants; 779 mitochondrial variants.

Quality controlled dataset (138,511 participants)
559,923 variants: 539,315 autosomal variants; 19,954 sex chromosome variants; 654 mitochondrial variants.

Whole Exome Sequencing (WES)

Non-filtered dataset (141,046 participants)
13,331,228 variants: 12,957,291 autosomal variants; 368,300 chromosome X variants; 5,637 chromosome Y variants.

Whole Genome Sequencing (WGS)

Non-filtered dataset (9,950 participants)
158,464,363 variants: 151,639,445 autosomal variants; 6,342,270 chromosome X variants; 482,648 chromosome Y variants.

Phased WGS Imputation Reference Panel (MCPS10k): 9,948 whole genome sequenced phased samples

Total of 134,337,444 variants distributed across 22 autosomes and chromosome X

Data available in four file formats.

TOPMed Imputed
Non-filtered dataset (140,831 participants)
307,624,124 variants: 292,293,083 autosomal variants; 15,331,041 chromosome X variants.

Mortality data (up to 30 September 2022)

date of death
ICD-10 underlying cause
ICD-10 contributory causes
timing/duration of diseases
location of death
seen by doctor before death

Apo A1=apolipoprotein A1; Apo B=apolipoprotein B; HDL=high density lipoproteins; HDL-D=high density lipoprotein particle diameter; IDL=intermediate density lipoproteins; L=large; LDL=low density lipoproteins; LDL-D=low density lipoprotein particle diameter; M=medium; S=small; VLDL=very low density lipoproteins; VLDL-D=very low density lipoprotein particle diameter; XL=very large; XS=very small; XXL=extremely large.

Additional data currently available only to researchers in Mexico

NMR metabolomic data using the Nightingale Health platform. Second release: 152,833 participants at baseline and 9,657 participants at resurvey

All metabolites as listed for the first release, plus phosphoglycerides, pyruvate, acetoacetate and clinical LDL-C.
