Joint Research Conference
June 24-26, 2014
A critical component in planning Next Generation Sequencing (NGS) experiments is determining the appropriate read depth. This is particularly important for reliably detecting what differences, if any, exist between two biological samples, for example normal versus tumor tissue. Determining the appropriate read depth can be treated as a sample size estimation problem with two important considerations. The first is the effect of systematic sequencing errors (SSE) on read accuracy, which can be assessed through quality scores and other metrics. The second is controlling for multiple comparisons, which is especially important in the context of NGS because millions or billions of locations are compared between the two samples. In this work we develop a method to help determine read depth and the associated power estimates, taking into account both sequencing error and false discovery in studies that examine the relationship between two or more samples.
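To make the sample size framing concrete, the following is a minimal sketch and not the method developed in this work. It assumes a single site where non-reference reads follow a binomial distribution, a uniform per-base error rate `error_rate`, a variant present at fraction `variant_fraction`, and a Bonferroni correction over `n_sites` as a simple, conservative stand-in for false discovery control. All function and parameter names are hypothetical.

```python
import math


def binom_pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p), with 0 < p < 1, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
               + i * math.log(p) + (n - i) * math.log1p(-p))
    return math.exp(log_pmf)


def binom_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return min(1.0, sum(binom_pmf(i, n, p) for i in range(max(k, 0), n + 1)))


def critical_count(depth, error_rate, alpha_per_site):
    """Smallest count k with P(X >= k | errors only) <= alpha_per_site.

    Returns depth + 1 when no count at this depth is significant.
    """
    crit, tail = depth + 1, 0.0
    for k in range(depth, -1, -1):  # accumulate the tail from the top down
        tail += binom_pmf(k, depth, error_rate)
        if tail > alpha_per_site:
            break
        crit = k
    return crit


def power_at_depth(depth, variant_fraction, error_rate, alpha=0.05, n_sites=1):
    """Power to call a variant at one site under the assumptions above."""
    alpha_per_site = alpha / n_sites  # Bonferroni (assumption, not FDR)
    k = critical_count(depth, error_rate, alpha_per_site)
    if k > depth:
        return 0.0
    # Assumed chance that a read shows the variant: true variant reads read
    # correctly, plus reference reads misread as the variant.
    p_alt = (variant_fraction * (1 - error_rate)
             + (1 - variant_fraction) * error_rate)
    return binom_sf(k, depth, p_alt)


def min_depth(target_power, variant_fraction, error_rate,
              alpha=0.05, n_sites=1, max_depth=2000):
    """First depth whose power reaches target_power, or None if none does."""
    for depth in range(1, max_depth + 1):
        if power_at_depth(depth, variant_fraction, error_rate,
                          alpha, n_sites) >= target_power:
            return depth
    return None


if __name__ == "__main__":
    d = min_depth(0.8, variant_fraction=0.2, error_rate=0.01, n_sites=10**6)
    print("first depth reaching 80% power:", d)
```

Because read counts are discrete, power in this sketch is not strictly increasing in depth, so `min_depth` returns only the first depth that reaches the target. Raising `n_sites` tightens the per-site threshold and therefore increases the depth required, which is the interaction between multiple comparisons and read depth described above.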