Study reveals biological drivers of plasma proteins and provides new insights for disease biomarkers and drug development

A new study maps the biological influences on thousands of plasma proteins, revealing potential disease biomarkers and drug targets and offering hope for more precise, personalized treatments.

Study: Mapping biological influences on the human plasma proteome beyond the genome. Image source: Kateryna Kon / Shutterstock

In a study recently published in the journal Nature Metabolism, researchers used an integrated genomic and deep phenomic approach to map the modifiable and non-modifiable biological factors that influence the levels of 4,775 plasma proteins. The study was conducted on more than 8,000 participants from the Fenland Study, with a subset of experiments and analyses (particularly those evaluating proteins as biomarkers of disease) conducted in the European Prospective Investigation into Cancer (EPIC)-Norfolk cohort.

The study results showed that the variance in a large portion of the human plasma proteome (n = 3,242 proteins) is best explained by non-modifiable factors (age, sex, and genetics), while a further substantial portion (n = 1,737 proteins) is best explained by modifiable factors. Remarkably, each protein was explained by between four and 56 features. Some proteins showed strong associations with specific non-modifiable factors such as genetics, which explained up to 74.27% of their variance, while others were strongly influenced by modifiable factors such as inflammation, as indexed by C-reactive protein (up to 68.34%). Proteins associated with one or a few risk factors are ideal candidates for disease screening, while those associated with many risk factors are potential biomarkers of holistic health. In addition, the study's use of Mendelian randomization suggested several causal links between plasma protein levels and disease, such as a link between reduced kidney function and cardiovascular disease through the COL6A3 protein. Almost 600 proteins were also identified as drug targets.

Background

“Proteins” is an umbrella term for a group of large, complex biomolecules that are critical to most life functions. They can serve as structural supports, biochemical catalysts, hormones, enzymes, building blocks for more complex macromolecules, and even triggers of cell death. Although they represent the most comprehensive class of biomolecules for drug development, systematic, comprehensive proteomic profiling at the population level remains limited.

Advances in biomedical engineering have recently enabled the identification and characterization of thousands of blood proteins. Unfortunately, the relative novelty of the field, compounded by the low protein content in blood (estimated at about 10%), has meant that the origins and purposes of most of the human proteome have remained unknown. Previous studies of the human plasma proteome have been limited primarily to a single protein or, at most, to a class of similar proteins.

Given the increasing frequency of protein-associated clinical trials (disease screening and drug development), a fundamental understanding of the modifiable and non-modifiable factors influencing the human proteome and the biological consequences of these influences is essential. The current study fills this gap by systematically integrating genomic data with phenomic data to map influences on plasma protein levels, thereby providing a comprehensive framework for future research.

About the study

The present study used an aptamer-based assay to identify and measure human plasma proteins. It then assessed the relative contributions of modifiable risk factors (diet, lifestyle), non-modifiable traits (age, sex, genetics) and technical factors (such as sample handling and measurement methods) to variation in these proteins (expression, post-translational modifications).

Study data came from the Fenland Study, a longitudinal study of more than 12,000 United Kingdom (UK) adults born between 1950 and 1975. Data collection included blood samples (for metabolic assessment), participant-provided information on dietary habits, general health and lifestyle, objective baseline measures of clinical well-being (cardiorespiratory fitness, body mass index [BMI], physical activity and body composition) and anthropometry. In addition, fat mass (visceral, subcutaneous) was estimated using dual-energy X-ray absorptiometry (DEXA) and liver health (hepatic steatosis) was assessed using abdominal ultrasound.

Experimental procedures included genotyping (using the Affymetrix UK Biobank Axiom array), proteomic profiling (using the SomaScan v4 aptamer platform), calculation of weighted genetic risk scores (GRS), and Uniform Manifold Approximation and Projection (UMAP) to visualize underlying structure in the patterns of explained proteome variance.
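As a rough illustration of what a weighted genetic risk score is, the sketch below sums each participant's risk-allele count multiplied by a per-allele effect size. This is the general textbook form of a weighted GRS, not necessarily the exact procedure the study used; the function name, the three variants and their weights are hypothetical values chosen only for the example.

```python
def weighted_grs(dosages, weights):
    """Weighted genetic risk score: sum of (allele dosage x effect size).

    dosages: risk-allele counts per variant (0, 1 or 2) for one participant
    weights: per-allele effect sizes for the same variants, in the same order
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical effect sizes for three variants (illustrative only)
weights = [0.10, -0.05, 0.20]
# Hypothetical participant carrying 2, 1 and 0 copies of the risk alleles
participant = [2, 1, 0]

score = weighted_grs(participant, weights)
print(round(score, 2))
```

A higher score means the participant carries more of the alleles that the weights associate with higher levels of the trait.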

Genetic/hereditary factors were calculated using single nucleotide polymorphism (SNP)-based genetic relationship matrices. To account for the influence of technical factors on plasma protein levels, these were systematically removed from the analysis to allow more precise biological interpretations of proteome variation. Proteins with potential for drug development were annotated using the Human Protein Atlas (HPA) tissue expression dataset. Finally, causal associations between proteins and their major biological contribution were estimated using Mendelian randomization (MR) analysis and disease risk associations were estimated using survival analysis.
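The article does not say which Mendelian randomization estimator was applied. As a minimal sketch of the underlying idea, the simplest single-variant version (the Wald ratio) divides a genetic variant's effect on the outcome by its effect on the exposure. The function name and the numbers below are hypothetical, and the standard error uses a common first-order approximation.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Mendelian randomization estimate (Wald ratio).

    beta_exposure: variant's effect on the exposure (e.g. a risk factor)
    beta_outcome:  variant's effect on the outcome (e.g. a protein level)
    se_outcome:    standard error of beta_outcome
    Returns (causal_estimate, approximate_standard_error).
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: ignores uncertainty in beta_exposure
    std_error = se_outcome / abs(beta_exposure)
    return estimate, std_error


# Hypothetical summary statistics for one genetic variant (illustrative only)
estimate, std_error = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(estimate, std_error)
```

Because the variant is inherited at random with respect to most confounders, the ratio is read as the effect of the exposure on the outcome, provided the variant affects the outcome only through the exposure.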

Study results

Of the 12,435 adults who took part in the Fenland Study, 8,350 met the inclusion criteria (no pregnancy, terminal illness or physical disability) and were included in the analysis. The study used 4,979 aptamers to identify and measure 4,775 plasma proteins. Notably, each protein could be explained by 4–56 (median 25) features across the modifiable, non-modifiable, and technical domains. Because technical factors do not reflect biology, they were excluded from downstream analysis.

UMAP analysis revealed that non-modifiable factors best explained biologically mediated variation for most proteins (n = 3,242), while modifiable factors best explained 1,737. For example, genetic factors explained up to 77.3% of the variance for certain proteins such as neurexin 1. Modifiable factors such as chronic inflammation and smoking explained much of the variance of certain proteins, although on average they accounted for a smaller proportion of total proteome variation (0.10%–0.29%). Sex (0.55% to 60.22%) and genetic factors (3.10% to 74.27%) showed the strongest associations. Notably, some proteins were explained by only one or a few factors, highlighting their importance as biomarkers for disease screening. These corresponded to significant protein-disease associations, including type 2 diabetes (T2D), peripheral artery disease (PAD), chronic obstructive pulmonary disease (COPD), liver disease and all-cause mortality.

“In contrast, putative modifiable factors, including chronic low-grade inflammation (CRP, explaining up to 68.34% of the variation), liver function (alanine transaminase (ALT), explaining up to 56.66% of the variation), kidney function (estimated glomerular filtration rate (eGFR), explaining up to 12.79% of the variation) and current smoking status (explaining up to 39.98% of the variation), explained the variation in plasma levels of most proteins, but on average explained a relatively small proportion (mean explained variance between 0.10% and 0.29%).”
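To make "percentage of variance explained" concrete, the sketch below computes it for a single factor as the squared correlation between that factor and a protein's level. This is a deliberate simplification: the study modeled many factors per protein jointly, and the function name and measurements here are hypothetical.

```python
def variance_explained(factor, protein):
    """Share of variance in `protein` explained by `factor` (R squared)
    in a one-predictor linear model, i.e. the squared Pearson correlation."""
    n = len(factor)
    if n != len(protein) or n < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    mean_f = sum(factor) / n
    mean_p = sum(protein) / n
    s_fp = sum((f - mean_f) * (p - mean_p) for f, p in zip(factor, protein))
    s_ff = sum((f - mean_f) ** 2 for f in factor)
    s_pp = sum((p - mean_p) ** 2 for p in protein)
    if s_ff == 0 or s_pp == 0:
        raise ValueError("both series must vary")
    return s_fp ** 2 / (s_ff * s_pp)


# Hypothetical measurements for five participants (illustrative only)
crp = [0.5, 1.0, 2.0, 4.0, 8.0]          # inflammation marker
protein_level = [1.1, 1.3, 1.9, 3.2, 5.8]  # some plasma protein

r2 = variance_explained(crp, protein_level)
print(f"{100 * r2:.1f}% of variance explained")
```

A value near 100% means the protein tracks the factor almost perfectly, while a value near 0% means the factor tells us little about that protein's level.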

Overall, the modifiable proteome was found to make up about 14% of the human plasma proteome. These results suggest that lifestyle choices and health behaviors such as smoking, diet and physical activity have a significant impact on plasma protein levels and may provide insights into the biological mechanisms that modulate disease risk.

Conclusions

The present study used a deep proteomics approach to decipher the sources of proteome variation in human plasma and identify its associations with disease risk. The study characterized 4,775 proteins whose levels vary with modifiable (e.g. diet, lifestyle), non-modifiable (e.g. age, sex) and technical (methodology) factors.

Some proteins have been found to have only a few critical factors, highlighting their importance as biomarkers for general health and for screening specific diseases. Others have been found to have multiple determinants, highlighting their potential in drug discovery for a range of diseases. In addition, causal analysis using Mendelian randomization provided clues to potential disease-causing pathways, helping to refine the biological interpretation of these proteins and providing opportunities for targeted interventions.

These results provide unprecedented clarity on the biological drivers underlying proteome variation and give clinicians and scientists a framework for future studies of the human proteome. By controlling for technical variation and mapping the multifactorial influences on the proteome, the study lays the foundation for integrating proteomics into clinical practice for disease screening and drug development.