Felix J. Mühlberg, Mei-Li Zhang, Priya N. Iyer
Machine learning offers significant potential for computational biology, particularly for genomic prediction. This study integrates sparse genomic datasets using neural network architectures to improve predictive accuracy. We employed convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to handle high-dimensional, inherently sparse genomic data, and validated the approach on multiple datasets, including maize and human genome samples. The model achieved a 15% improvement in predictive accuracy over traditional linear models (p < 0.01). These results indicate that combining CNN and RNN architectures handles sparsity effectively and captures complex patterns that linear models often miss. Neural networks can therefore significantly improve genomic prediction accuracy, providing a robust framework for future applications in personalized medicine and crop improvement programs. More broadly, machine learning techniques offer new ways to handle complex biological data efficiently.