Although the use of machine learning for disease detection has increased sharply over the past several years, diagnostic methods for mental illnesses such as schizophrenia remain largely qualitative. This project aims to introduce data-driven diagnosis by using genome-wide array data to predict schizophrenia. Several machine learning models, implemented in Python and TensorFlow, were run on a dataset provided by NorthShore University HealthSystem that covers 17262 loci in the genomes of 5334 subjects. A linear dimensional analysis run on the raw data revealed that the variables were collinear. Several support vector machine configurations were also tested; the radial basis function kernel gave an average accuracy of 72.97%. A convolutional neural network, structured as a five-layer sequential model for binary image classification and trained with the adaptive moment estimation (Adam) optimizer, is being modified to improve accuracy further. A recurrent neural network is also being built to understand the efficiency and applicability of neural networks in general. Because the target accuracy is above 95%, future steps include trying different parameters and data formats to improve the machine learning pipeline. The future of quantitative mental illness detection remains promising, but more data and a more intricate pipeline are needed to achieve better results.
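To make the radial basis function (RBF) kernel and the notion of an "average accuracy" concrete, the following is a minimal sketch in plain Python. It is not the project's pipeline: the data are synthetic allele counts (0, 1, or 2 per locus) generated here for illustration, the classifier is a kernel perceptron used as a simple stand-in for a support vector machine solver, and the function names, the value `gamma=0.05`, and the choice of five folds are all assumptions.

```python
import math
import random


def rbf_kernel(x, z, gamma):
    """RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def train_kernel_perceptron(X, y, gamma, epochs=10):
    """Kernel perceptron (stand-in for an SVM solver); labels must be +1 or -1.

    Returns one mistake count (dual coefficient) per training sample.
    """
    n = len(X)
    # Precompute the Gram matrix so each kernel value is evaluated once.
    gram = [[rbf_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0] * n
    for _ in range(epochs):
        for i in range(n):
            score = sum(alpha[j] * y[j] * gram[j][i] for j in range(n))
            if y[i] * score <= 0:  # misclassified (or undecided): update
                alpha[i] += 1
    return alpha


def predict(X_train, y_train, alpha, gamma, x):
    """Predict +1 or -1 for a single sample x."""
    score = sum(a * yt * rbf_kernel(xt, x, gamma)
                for a, yt, xt in zip(alpha, y_train, X_train) if a)
    return 1 if score > 0 else -1


def cross_val_accuracy(X, y, gamma, k=5):
    """Return (mean accuracy, per-fold accuracies) over k interleaved folds."""
    n = len(X)
    fold_accuracies = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        alpha = train_kernel_perceptron(X_tr, y_tr, gamma)
        correct = sum(predict(X_tr, y_tr, alpha, gamma, X[i]) == y[i]
                      for i in test_idx)
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / k, fold_accuracies


def make_toy_genotypes(n_subjects=60, n_loci=20, seed=0):
    """Synthetic data only: allele counts in {0, 1, 2}, labels in {-1, +1}."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_subjects):
        label = rng.choice([-1, 1])
        row = []
        for locus in range(n_loci):
            # Hypothetical signal: "cases" carry more alleles at the first 5 loci.
            p = 0.6 if (label == 1 and locus < 5) else 0.3
            row.append(sum(rng.random() < p for _ in range(2)))
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_genotypes()
mean_acc, fold_accs = cross_val_accuracy(X, y, gamma=0.05, k=5)
print(f"mean accuracy over {len(fold_accs)} folds: {mean_acc:.2%}")
```

The printed figure is the mean of the per-fold accuracies, which is one common way an "average accuracy rate" is obtained. On real genotype data a library SVM implementation would normally replace the perceptron, and `gamma` would be tuned rather than fixed.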
Available at: http://works.bepress.com/chandra-gangavarapu/11/