Improving the estimation of genetic distances from Next-Generation Sequencing data

Authors: Filipe G. Vieira, Florent Lassalle, Thorfinn S. Korneliussen, Matteo Fumagalli
Journal/Conference Name: Biological Journal of the Linnean Society
Paper Abstract: Next-Generation Sequencing (NGS) technologies have revolutionized research in evolutionary biology by increasing sequencing speed and reducing experimental costs. However, sequencing error rates are higher than in traditional technologies and, furthermore, many studies rely on low-depth sequencing. Under these circumstances, the use of standard methods for inferring genotypes leads to biased estimates of nucleotide variation, which can bias all downstream analyses. Through simulations, we assessed the bias in estimating genetic distances under several different scenarios. The results indicate that naive methods for assigning individual genotypes greatly overestimate genetic distances. We propose a novel method to estimate genetic distances that is suitable for low-depth NGS data and takes the statistical uncertainty of genotype calls into account. We applied this method to investigate the genetic structure of domesticated and wild strains of rice. We implemented this approach in open-source software and discuss further directions for phylogenetic analyses within this novel probabilistic framework.
Date of publication: 2015
Code Programming Language: C++
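
The contrast drawn in the abstract, between assigning each individual a single called genotype and taking genotype uncertainty into account, can be illustrated with a minimal C++ sketch. It is not the authors' implementation: the type `GenoProbs`, the functions `call_genotype`, `genotype_diff`, `naive_distance` and `expected_distance`, and the per-site score (the absolute difference in alternate-allele count divided by two) are all hypothetical choices made for illustration. The sketch assumes biallelic sites and that genotype posterior probabilities for each individual are already available.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type: posterior probabilities of carrying 0, 1 or 2 copies
// of the alternate allele at one biallelic site.
using GenoProbs = std::array<double, 3>;

// Naive genotype call: pick the most probable genotype (ties go to the lowest index).
int call_genotype(const GenoProbs& p) {
    int best = 0;
    for (int g = 1; g < 3; ++g) {
        if (p[g] > p[best]) best = g;
    }
    return best;
}

// Assumed per-site score: difference in alternate-allele count, scaled to [0, 1].
double genotype_diff(int g1, int g2) {
    return std::abs(static_cast<double>(g1 - g2)) / 2.0;
}

// Distance computed from called genotypes only; uncertainty is discarded.
double naive_distance(const std::vector<GenoProbs>& a,
                      const std::vector<GenoProbs>& b) {
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t s = 0; s < a.size(); ++s) {
        sum += genotype_diff(call_genotype(a[s]), call_genotype(b[s]));
    }
    return sum / static_cast<double>(a.size());
}

// Expected distance: every genotype pair contributes its score, weighted by
// the product of the two individuals' posterior probabilities at that site.
double expected_distance(const std::vector<GenoProbs>& a,
                         const std::vector<GenoProbs>& b) {
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t s = 0; s < a.size(); ++s) {
        for (int g1 = 0; g1 < 3; ++g1) {
            for (int g2 = 0; g2 < 3; ++g2) {
                sum += a[s][g1] * b[s][g2] * genotype_diff(g1, g2);
            }
        }
    }
    return sum / static_cast<double>(a.size());
}
```

When every genotype is known with certainty (one probability equal to 1 per site), the two functions return the same value. They diverge as the posteriors flatten, which is the situation at low sequencing depth.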
