Content of review 1, reviewed on June 01, 2021

Overall, the researchers successfully justify the use of deep sequencing to analyse causal relationships between genomic variants of LPA, circulating Lp(a) and sub-clinical atherosclerotic disease. The finding that genetic variants of LPA affect atherosclerotic risk in African Americans is novel. The study demonstrates the utility of the analytic genetic tools applied to impute relationships between rare genomic variants in KIV-CN and circulating Lp(a) and, together with the large cohort studied, represents an effort to add another important dimension to cardiovascular disease risk calculation.

The title of the article, "Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestry", is succinct, although it only summarises the main method of the study and not any of the key findings. The introduction of the abstract better reflects the key aim of the study ('delineating the inherited basis of plasma Lp(a)' concentration) and hence may suggest an alternative title; however, the current title suffices.

The abstract is structured well: it briefly introduces the topic, the aim of the study, the methods (quantification of KIV-2 copy number using whole genome variants and direct genome mapping), and the results (discovery of a novel SORT1 locus associated with Lp(a) cholesterol and genetic modifiers). The abstract also highlights the novelty of the study (distinguishing itself from previous GWAS studies) and elaborates specifically on the outcomes (heritability and inter-ethnic heterogeneity of Lp(a) levels). Overall, the references used in this article appear to be in line with the specific area of inquiry in this paper, including recent papers (Noureen A, et al, 2015 PLoSOne).

Sound and clear background information was provided outlining what is already known on the topic and why Lp(a) plasma levels are a causal factor in atherosclerotic cardiovascular disease (apo(a) KIV-2 polymorphisms) and thus a therapeutic target. This information nicely framed the associated gap in knowledge relevant to their research question, i.e., incomplete understanding of Lp(a) heritability. The research question is clearly outlined in specific terms against the current gaps in knowledge. The paper made clear what the technical hurdles are in terms of the gaps in Lp(a) genomic variability (different genotyping strategies), thereby justifying their method of choice: deep whole genome sequencing. Three different populations were tested (Estonian, Finnish, African American). While other cohorts should be included in future studies, the study provided a good proof of principle for the chosen methodologies.

The research question (framed over three aims) was broadly justified given what is already known on the topic (as stated by the authors). This appears to be consistent with the recent literature which they have referenced. The three broad aims they state are: understanding and characterising the full spectrum of Lp(a) genetic variations, comparing these between two core populations and an additional population, and then exploring the phenotypic consequences of these. This appeared to be a sound and comprehensive strategy. The process of subject selection (i.e., the study participants) was outlined in the Supplementary text. Participants were obtained from three cohorts representing the two main demographics (African, European) and an additional cohort (Estonian).

The experimental rationale was clear: deep coverage whole genome sequencing was purported to determine the full range of genomic variation, aside from genotyping SNPs. The overall goals of the study were clearly broken down into a three-fold investigation: examining genetic variation in Lp(a)-C, comparing Europeans and African Americans, and then looking at phenotypic consequences and clinical events. Moreover, they did this using four different types of variant analyses. A sampling flow chart is provided in Figure 1, which outlines the general schema of the overall study design. Phenotype stratification was indicated; however, it wasn't the simplest schema to understand. The variables were broadly defined in Figure 1; however, it was difficult to follow the pipeline of experiments in a logical and structured manner. A simpler flow chart, omitting graphics and keeping only summarised text with clear directionality of the experimental plan, would suffice. Details were provided in the body of the article; however, a clearer flowchart in Figure 1 would help the reader to grasp these details more readily. Measurement of variables was appropriately described, as were the associated statistical analyses performed. Enough detail was provided to replicate this study.

Overall, the data have been presented in an appropriate manner. Figure legends are descriptive. Figure 1 is a schematic of the overall study approach. As mentioned previously, the schematic could be clearer and less 'noisy'. Sufficient supplementary figures and tables were provided to detail baseline characteristics of the cohorts and outline the whole genome sequencing approach. The authors note that the study is well powered to detect genetic differences between cohorts; however, the power calculation is missing (please provide). Differences between groups for specific variables tested (Figure 2 onwards) are accompanied by mean values and associated standard deviations. For all correlation values, the authors need to state the cut-off used for the correlation coefficient and how this was determined. Figure 3 is presented as four graphs and, while the graphs have relevant headings, the text refers to 'Figure 3a, b, and c', which does not match anything that can be clearly identified in the figure itself. These labels need to be added. In relation to the results shown in Figure 4, LPA and non-LPA loci associated with Lp(a) phenotypes were examined between ethnicities: it wasn't clear to me what the 'conditioning' data referred to. A prefacing sentence or two to explain what the conditioning was would be helpful to provide some context as to why this was required.

Overall, the discussion was clear and concisely restated the main aim of the study: analysing genetic determinants influencing Lp(a) concentrations by utilising genetic instruments that impute apo(a) isoforms correlated to cardiovascular disease risk prediction. The conclusions drawn by the study were:

  1. Whole genome sequencing and imputation demonstrate genetic heritability in the African American and European populations studied. The researchers then performed single variant analysis and found a novel locus (SORT1) variant in Lp(a)-C across both ethnicities that is correlated with decreased plasma Lp(a) levels independent of LDL. Genetic modification analysis was used to confirm previous rare coding variants of LPA and the novelty in this study was that this was linked largely to structural variation in KIV-CN.

  2. The second conclusion was stated in a fairly vague and unclear manner. From what I could gauge, LPA locus variants specific to African Americans affect standardised Lp(a) levels. One suggestion for improvement is that the data be concisely summarised by providing more explicit statements of the specific results of this part of the study. From what was stated at the beginning of the paper, the overall aim of the study was to use deep coverage sequencing to characterise the full spectrum of genetic variation affecting Lp(a) concentration among diverse individuals and then relate any differences to 'any incident or subclinical measures'. What exactly were these incident/subclinical measures? Please clarify. The lack of specificity in circumscribing what the clinical measures were makes it difficult to properly assess the novelty of the findings discussed later. Nevertheless, the study does demonstrate, in my opinion, the utility of the deep coverage approach, as highlighted in the third conclusion of the study:

  3. Whole genome sequencing can detect relevant genomic variants for Lp(a) as opposed to whole exome sequencing or genotyping arrays, contributing to CVD risk calculation in light of differential Lp(a) levels.

The authors acknowledge the limitations of the study, with which I agree. Using an aggregate depth of coverage for copy number variation analysis limits the determination of allelic KIV-2 copy number. Nevertheless, the computational sensitivity analysis does attempt to account for this limitation. The imputation modelling is carried out with a well-described SNP in KIV-CN and then shown to be strongly associated with known Lp(a) phenotypes. Associations with lesser-known Lp(a) phenotypes would have been insightful. Perhaps this is for future investigation.

Source

    © 2021 the Reviewer.