Data Mining and Machine Learning to Improve Northern Florida’s Foster Care System
Beyond: Undergraduate Research Journal
  • Daniel Oldham, Embry-Riddle Aeronautical University, Daytona Beach
  • Nathan Foster, Embry-Riddle Aeronautical University, Daytona Beach
  • Mihhail Berezovski, Embry-Riddle Aeronautical University
Faculty Mentor
Mihhail Berezovski

The purpose of this research project is to use statistical analysis, data mining, and machine learning techniques to identify factors in child welfare service records that could lead to a child entering the foster care system multiple times. Identifying these factors would allow us to accurately predict a case’s outcome. We were provided with eight years of data, in the form of multiple spreadsheets, from Partnership for Strong Families (PSF), a child welfare services organization based in Gainesville, Florida, that is contracted by the Florida Department of Children and Families (DCF). The data described many aspects of the clients (“participants”) who were entered into the system as part of PSF’s record keeping, including dates, ages, removal types, disabilities, demographics, case details, and more for the parents, children, relatives, and caregivers involved. We analyzed and mined this data using statistical analysis software (mostly RStudio), searching for correlations that could help us predict whether a child will be removed from their home and re-enter the foster care system. Overall, the research was a success: we found significant insights into the cases that allowed us to predict their success or failure, and we built multiple machine learning models and prediction schemes that furthered our understanding of statistically significant patterns in the cases.

Citation Information
Daniel Oldham, Nathan Foster and Mihhail Berezovski. "Data Mining and Machine Learning to Improve Northern Florida’s Foster Care System"
Available at: