Calibrating epigenetic clocks with training data error

Benjamin Mayne, Oliver Berry, Simon Jarman

Evolutionary Applications (2023)

Abstract

Animal age data are valuable for management of wildlife populations. Yet, for most species, there is no practical method for determining the age of unknown individuals. However, epigenetic clocks, a molecular-based method, are capable of age prediction by sampling specific tissue types and measuring DNA methylation levels at specific loci. Developing an epigenetic clock requires a large number of samples from animals of known ages. For most species, there are no individuals whose exact ages are known, making epigenetic clock calibration inaccurate or impossible. For many epigenetic clocks, calibration samples with inaccurate age estimates introduce a degree of error to epigenetic clock calibration. In this study, we investigated how much error in the training data set of an epigenetic clock can be tolerated before it resulted in an unacceptable increase in error for age prediction. Using four publicly available data sets, we artificially increased the training data age error by iterations of 1% and then tested the model against an independent set of known ages. A small effect size increase (Cohen's d >0.2) was detected when the error in age was higher than 22%. The effect size increased linearly with age error. This threshold was independent of sample size. Downstream applications for age data may have a more important role in deciding how much error can be tolerated for age prediction. If highly precise age estimates are required, then it may be futile to embark on the development of an epigenetic clock when there is no accurately aged calibration population to work with. However, for other problems, such as determining the relative age order of pairs of individuals, a lower-quality calibration data set may be adequate.
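The error-injection procedure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' pipeline: the methylation data are simulated (the helper `simulate` is hypothetical), the clock is an ordinary least-squares model rather than whatever model the study used, and the noise model in `perturb_ages` (each training age shifted by a uniform random fraction within plus or minus the given percentage) is only one plausible reading of "increasing the training data age error". The effect size is Cohen's d computed on absolute prediction errors, relative to the clock trained on unperturbed ages.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d (mean of b minus mean of a) using the pooled standard deviation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (b.mean() - a.mean()) / np.sqrt(pooled_var)

def perturb_ages(ages, pct, rng):
    """Assumed noise model: shift each age by a uniform fraction within +/- pct percent."""
    frac = rng.uniform(-pct / 100.0, pct / 100.0, size=len(ages))
    return ages * (1.0 + frac)

def fit_linear(X, y):
    """Ordinary least squares with an intercept (a stand-in for the real clock model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def simulate(n_train=200, n_test=100, n_sites=10, seed=0):
    """Hypothetical data: methylation at each site drifts linearly with age, plus noise."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    ages = rng.uniform(1.0, 30.0, n)
    slopes = rng.normal(0.0, 0.01, n_sites)
    X = 0.5 + np.outer(ages, slopes) + rng.normal(0.0, 0.02, (n, n_sites))
    return X[:n_train], ages[:n_train], X[n_train:], ages[n_train:]

def error_sweep(X_tr, y_tr, X_te, y_te, max_pct=50, step=1, seed=1):
    """Retrain at each training-age error level and compare test errors with the baseline."""
    rng = np.random.default_rng(seed)
    base_err = np.abs(predict(fit_linear(X_tr, y_tr), X_te) - y_te)
    results = []
    for pct in range(0, max_pct + 1, step):
        noisy_ages = perturb_ages(y_tr, pct, rng)
        err = np.abs(predict(fit_linear(X_tr, noisy_ages), X_te) - y_te)
        results.append((pct, cohens_d(base_err, err)))
    return results

if __name__ == "__main__":
    X_tr, y_tr, X_te, y_te = simulate()
    for pct, d in error_sweep(X_tr, y_tr, X_te, y_te, max_pct=50, step=10):
        print(f"training age error {pct:2d}%  Cohen's d = {d:.3f}")
```

Because the data and noise model are simulated, this sketch is not expected to reproduce the 22% threshold reported in the abstract; it only shows the shape of the procedure: perturb the training ages, refit, evaluate on independently aged samples, and track the effect size against the unperturbed baseline.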

Keywords

bioinformatics/phyloinformatics, molecular evolution, wildlife management