2015 QPRC

Analytical Heuristic Decision-Making Methods for Complex High-Dimensional Failure Data Selection and Classification

Keivan Sadeghzadeh, Nasser Fard
Department of Mechanical and Industrial Engineering
Northeastern University
Boston, MA

Abstract

Quality increases system reliability, productivity, and efficiency; decreases variability, error, and failure; and facilitates consistent outcomes. The quality and reliability of complex large-scale systems are functions of many variables. With the advent of modern data collection technologies, massive amounts of data are increasingly accessible from various sources for evaluating these variables and their impact on system quality and reliability. Determining efficient explanatory variables, specifically in complex, large data sets with high-dimensional covariates, provides an excellent opportunity in system reliability and productivity analysis and in quality improvement. This paper presents a class of heuristic variable selection and classification methods for analyzing complex large-scale failure data with many covariates, in order to reduce redundant information and facilitate practical decision-making. Appropriate procedures are presented to derive a subset of variables that are significantly more valuable in the quality, reliability, and productivity analysis of systems. Considering the complexity of the data and the presence of censored observations, the proposed methods utilize the concept of random subset selection. To validate the proposed methods and their performance, several numerical simulation experiments are developed to compare them and demonstrate their advantages. The experimental results are compared comprehensively with those of similar efficient techniques. It is shown that, using these approaches, data analysis and decision-making for variable selection and classification in large-scale failure data can be done quickly and with high accuracy.

Abstract Summary

This paper presents a class of heuristic variable selection and classification methods for analyzing complex high-dimensional failure data, in order to derive a subset of efficient variables that are significantly more valuable in the quality, reliability, and productivity analysis of complex large-scale systems.