Advances in phenomic-assisted breeding methodologies have lagged those in genomic-assisted techniques, which are now a critical component of mainstream cultivar development pipelines. However, advances in phenotyping technologies have given plant scientists affordable high-dimensional datasets with which to optimize the operational efficiency of breeding programs. Phenomic and seed yield data were collected across six environments for a panel of 292 soybean accessions with varying levels of genetic improvement. Random forest, a machine learning (ML) algorithm, was used to map complex relationships between phenomic traits and seed yield, and prediction performance was assessed using two cross-validation (CV) scenarios consistent with breeding challenges. To develop a prescriptive sensor package for future high-throughput phenotyping deployment that meets breeding objectives, feature importance was combined with a genetic algorithm (GA) technique to select a subset of phenotypic traits, specifically optimal wavebands. The results demonstrate that fusing ML and optimization techniques can identify a suite of in-season phenomic traits that will allow breeding programs to reduce their dependence on resource-intensive end-season phenotyping (e.g., seed yield harvest). Although illustrated with soybean, this study establishes a template for deploying multitrait phenomic prediction that is readily adaptable to any crop species and any breeding objective.
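The waveband-selection step described above can be illustrated with a minimal sketch of a genetic algorithm that searches for a fixed-size subset of bands. This is not the study's implementation: the data are synthetic, the names `make_synthetic`, `fitness`, and `ga_select` are hypothetical, and the fitness function is a cheap stand-in (squared correlation between yield and the sum of the selected bands) for what would, in a setup like the one described, more plausibly be the cross-validated accuracy of a random forest trained on those bands.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def make_synthetic(rng, n_plots=80, n_bands=20, informative=(3, 7, 12)):
    """Synthetic plots: yield depends on a few 'informative' bands plus noise.
    All values here are assumptions made purely for illustration."""
    X = [[rng.gauss(0, 1) for _ in range(n_bands)] for _ in range(n_plots)]
    y = [sum(row[j] for j in informative) + rng.gauss(0, 0.5) for row in X]
    return X, y


def fitness(subset, X, y):
    """Stand-in score for a band subset (assumption, not the paper's metric)."""
    composite = [sum(row[j] for j in subset) for row in X]
    return pearson(composite, y) ** 2


def ga_select(X, y, k, rng, pop_size=30, generations=40, mutation_rate=0.2):
    """Search for k band indices with an elitist genetic algorithm."""
    n_bands = len(X[0])
    pop = [tuple(sorted(rng.sample(range(n_bands), k))) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=lambda s: fitness(s, X, y), reverse=True)
        history.append(fitness(scored[0], X, y))
        elite = scored[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Crossover: draw k bands from the union of both parents.
            pool = sorted(set(a) | set(b))
            child = set(rng.sample(pool, k))
            # Mutation: swap one selected band for one currently unselected.
            if rng.random() < mutation_rate:
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([j for j in range(n_bands) if j not in child]))
            children.append(tuple(sorted(child)))
        pop = elite + children
    best = max(pop, key=lambda s: fitness(s, X, y))
    history.append(fitness(best, X, y))
    return best, history


rng = random.Random(0)
X, y = make_synthetic(rng)
best, history = ga_select(X, y, k=3, rng=rng)
print("selected bands:", best)
print("best stand-in fitness:", round(history[-1], 3))
```

Because the better half of each generation is carried over unchanged, the best score recorded in `history` never decreases. Swapping the stand-in `fitness` for a cross-validated model score would change only that one function, at the cost of a much more expensive search.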
Available at: http://works.bepress.com/baskar-ganapathysubramanian/88/