QPRC 2016

Statistical Approaches for Characterization and Analysis of Large, Heterogeneous Data

Lawrence Ticknor and Emily Casleton

Los Alamos National Laboratory


Big data has become a buzzword that schools, students, and businesses use loosely to describe the amount of data now available for analysis.  From a statistical perspective, the appropriate method for a given dataset is independent of the size of the data.  The most important step in any analysis should be gaining an understanding of the data.  Although this step can become more difficult as the amount of data increases, it should not be ignored.  In this talk, we present examples showing the importance of understanding the data.  We also discuss complications that arise with large amounts of data, such as the difficulty of knowing whether the results make sense and of dealing with more data than can be saved, as in large simulations, and we present some current research ideas that begin to address these problems.