INTRODUCTION
RESULTS
Whole genome DNA methylation level-based outcome prediction by a linear transformation
Fig. 1 Linear relationship between embryo whole-genome DNA methylation level and clinical outcome. A ML-score distribution of embryos with different clinical outcomes. Embryos that resulted in pregnancy failure or pregnancy loss were combined as “Failed”. B Expected live birth rate and standard error relative to ML-score. The relationship is fitted with a binomial generalized linear model and shown in discrete steps of 0.01 along the x-axis. The P value for the significance of the linear relationship is indicated above.
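As a rough illustration of the binomial generalized linear model mentioned in panel B, the sketch below fits a logit-link GLM with a single predictor by Newton-Raphson and evaluates the expected rate on a grid with a 0.01 step. The logit link, the function name `fit_binomial_glm`, and the toy scores and outcomes are all assumptions made for illustration; they are not the study's data, and the direction of the toy trend implies nothing about the real relationship.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_binomial_glm(x, y, n_iter=25, tol=1e-10):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson (hypothetical helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of X^T W X (negative Hessian)
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X^T W X) d = g
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Toy data (invented): a score per embryo and 1 = live birth, 0 = failed.
x = [0.20, 0.20, 0.22, 0.22, 0.24, 0.24, 0.26, 0.26, 0.28, 0.28, 0.30, 0.30]
y = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_binomial_glm(x, y)

# Expected rate on a discrete grid with an interval of 0.01.
grid = [round(0.20 + 0.01 * i, 2) for i in range(11)]
rates = [sigmoid(b0 + b1 * g) for g in grid]
fitted = [sigmoid(b0 + b1 * xi) for xi in x]
```

At the maximum-likelihood solution the fitted probabilities sum to the number of observed successes, which is a convenient sanity check on the fit.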
Linear model leads to significantly improved clinical outcome
Fig. 2 Clinical efficacy evaluation of the linear model. A H-index distribution in patients of different maternal ages. The dashed horizontal line indicates the anticipated threshold of 0.03. B–D Clinical outcomes of embryos from patients who had embryos of similar quality (B), had the superior embryo selected (C) or had the inferior embryo selected (D). E–G Summarized clinical outcomes of patients who had embryos of similar quality (E), had the superior embryo selected (F) or had the inferior embryo selected (G). The inclusion criteria for each group are indicated in the corresponding panel.
Insights into epimutations related to pregnancy failure
Fig. 3 Insights into epimutations related to pregnancy failure. A Epimutation frequencies in embryos with different clinical outcomes. Embryos that resulted in pregnancy failure or pregnancy loss were combined as “Failed”. P value indicates the significance of the two-sided Wilcoxon test. B Epimutation frequency differences in the analyzed promoters. Differences are calculated as the frequency in failed embryos minus the frequency in live-birth embryos. The P value on the y-axis indicates the significance of the two-sided Fisher’s exact test. C Gene ontology analysis of 446 epimutation-related genes. Fold enrichment indicates the ratio of observed to expected gene numbers in the corresponding pathway. P value indicates the binomial statistical significance of gene enrichment.
Enhanced classification performance by a mixed model
Fig. 4 Data cleaning and embryo classification performance of PIMS-AI. A An example of noisy-label identification: the failed embryo with the highest TSS-score is annotated as noise and excluded in the next iteration. B Learning curve of TSS-score AUC relative to noise filtering. Error bars indicate 1 standard deviation. C AUC of ML-score, TSS-score and the combined PIMS-AI model in cross-validation or independent testing. Results are based on the training set after removal of 5 noisy labels.
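The noise-filtering step described in panel A can be sketched roughly as follows: the failed embryo with the highest score is flagged as a suspected noisy label and dropped, and the AUC is recomputed. This is a simplified sketch under stated assumptions. The scores and labels are invented, the function names are hypothetical, and the scores are held fixed between rounds, whereas the caption indicates the exclusion feeds into a next iteration (presumably with the model refitted).

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def drop_top_scoring_failure(scores, labels):
    """Remove the failed sample (label 0) with the highest score.

    Returns the filtered scores, filtered labels, and the removed index.
    """
    failed = [i for i, l in enumerate(labels) if l == 0]
    noisy = max(failed, key=lambda i: scores[i])
    keep = [i for i in range(len(scores)) if i != noisy]
    return [scores[i] for i in keep], [labels[i] for i in keep], noisy

# Toy data (invented): 1 = live birth, 0 = failed.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 1, 0, 1, 0]

auc_before = auc(scores, labels)
scores_f, labels_f, removed = drop_top_scoring_failure(scores, labels)
auc_after = auc(scores_f, labels_f)
```

In this toy case the removed sample is the failed embryo scored 0.8, and the AUC rises from 8/12 to 7/8 once it is excluded.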
Accurate embryo discrimination and selection by PIMS-AI
Fig. 5 PIMS-AI performance in simulated embryo selection. A PIMS-AI prediction results for the embryos from each patient. Embryos from the same patient are plotted in the same column, with shape indicating transfer order and color indicating clinical outcome. B DI of ML-score, TSS-score and PIMS-AI in simulated selection for each patient. Patients are sorted in the same order as in panel A.
DISCUSSION
MATERIALS AND METHODS
Data processing and annotation
Methylation level score and linear model
Estimate PIMS clinical efficiency with a linear model
Detection of methylation mutation in promoters
Machine-learning based TSS-score prediction
Simulate clinical application by simulated selection
Abbreviations
AI | Artificial intelligence |
ART | Assisted reproductive technology |
AUC | Area under the curve |
DI | Discriminability index |
GBDT | Gradient boosting decision tree |
GLM | Generalized linear model |
PGS | Preimplantation genetic screening |
PGT-A | Preimplantation genetic testing for aneuploidy |
PIMS | Preimplantation DNA methylation screening |
TSS | Transcription start site |