ML Predicts School Dropout Risk & Boosts Graduation Rates

This is the next post in our series on how customers are gaining actionable insights from their data through the power of Microsoft advanced analytics – at scale and in the cloud.

Can software predict which students are at risk of dropping out? Until recently, the reputation of public education in Tacoma, WA, wasn’t stellar. A 2007 study, for instance, referred to the five high schools in the 30,000-student district as “dropout factories”. Even as recently as 2010, only 55 percent of the district’s high school students earned their diplomas on time, well below the national average of 81 percent.

Administrators in the beleaguered district, however, did not give up hope. Through an intense effort, they boosted graduation rates from 55 to 78 percent by 2014. Today, the district is recognized nationally for its educational achievements.

So how did the Tacoma Public School district achieve such a dramatic turnaround?

Learn more in this post.  

Measuring the Whole Child

The change in fortune at Tacoma can be traced back several years, to when new leaders joined the school district’s board. The new leaders expressed a desire to be more transparent with data and to use the data to address any shortcomings. They resoundingly embraced the value of data-driven analytics for the benefit of the district and its students.

The board also asked themselves a radical question: What if they could process all their data to predict whether or not a student was likely to disengage and ultimately drop out? Such a tool would surely help them intervene in time and help such students succeed, reversing their district’s graduation trends.

The district started by exploring business intelligence (BI) options. Although these options, by themselves, were viewed as insufficient for all the district’s needs, the exploration nevertheless brought about a conversation with Microsoft, which then started to help the district forge its path forward. Microsoft started by developing a data warehouse from the district’s student information system, with student grades, attendance, health records and other data.

At a meeting, the board President, Scott Heinze, stated, “We now have this world-class data system for teachers to use. They want to know what is going on in their classrooms.”

Historical data was made available to teachers and administrators via familiar tools like SharePoint and Excel. Board meetings started to routinely discuss benchmarks and – importantly – what actions were being taken as a result of observed metrics.

This solution was at the heart of an initiative called “Measuring the Whole Child”, the idea behind which was to use comprehensive data to collectively measure a child and determine how well the district was helping kids move forward.

Getting to Predictive Analytics

Next, the district wanted to develop student success indicators that could help it predict future dropout risks.

As Shaun Taylor, the district’s CIO, said, “By using predictive analytics, we thought we would be able to intervene earlier and work closely with those at-risk students. Then we would be able to reach our ultimate goal: getting that graduation number close to 100 percent.”

But taking that next step had always been viewed as a big hurdle. Things changed, however, when Taylor and his team were introduced to Azure ML. “When we saw Azure Machine Learning, we started to see how it could be possible for us to realize our vision,” Taylor says.

The district worked with Microsoft to create a proof-of-concept (POC). The POC used Azure ML to create a model to analyze data uploaded into Azure from multiple on-premises information systems. The data spanned five years and included demographic, academic and student performance information. Azure Data Factory (ADF) was used to set up a predictive pipeline that used the ML model to predict whether a student was at risk of dropping out during the following semester.
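
To make the idea of a dropout-risk model concrete, here is a minimal sketch in plain Python. It is not the district's Azure ML model: the post does not name the algorithm or the exact features, so the logistic regression, the two features (attendance rate and GPA), the function names and every training row below are assumptions made up purely for illustration.

```python
import math

# Hypothetical, hand-made rows: (attendance_rate, gpa, dropped_out).
# These are NOT Tacoma data; they exist only so the sketch can run.
TRAIN = [
    (0.98, 3.6, 0), (0.95, 3.1, 0), (0.92, 2.9, 0), (0.90, 3.4, 0),
    (0.70, 1.8, 1), (0.65, 2.0, 1), (0.75, 1.5, 1), (0.60, 1.2, 1),
]


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def features(attendance_rate, gpa):
    """Scale GPA (assumed 0-4 scale) so both features lie in 0..1."""
    return (attendance_rate, gpa / 4.0)


def train(rows, epochs=2000, lr=0.5):
    """Fit a logistic regression with stochastic gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for attendance_rate, gpa, label in rows:
            x = features(attendance_rate, gpa)
            p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            err = p - label  # gradient of the log loss w.r.t. z
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b


def dropout_risk(model, attendance_rate, gpa):
    """Return an estimated probability (0..1) of dropping out."""
    w, b = model
    x = features(attendance_rate, gpa)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


model = train(TRAIN)
# Flag a (hypothetical) student for follow-up if the score exceeds 0.5.
needs_follow_up = dropout_risk(model, attendance_rate=0.62, gpa=1.4) > 0.5
```

In a pipeline like the one described, a scoring step of this kind would run over each student's record and write the resulting risk scores to a database for reporting. The 0.5 cut-off here is likewise an illustrative choice, not something the post specifies.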

Predictive results were output to an Azure SQL Database from where district board members and IT staff were able to view them using a Power BI dashboard. “When we started this POC, we didn’t know if any predictive analytics would be attainable,” says Christopher Baidoo-Essien, BI Analyst at Tacoma Public Schools. “As we progressed and used more historical data, the model proved to be almost 90 percent accurate.”
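
The post does not say how the "almost 90 percent" figure was measured. If it is read as plain classification accuracy, meaning the share of students whose predicted outcome matched what actually happened, the calculation could look like the sketch below. The function name and the ten example labels are hypothetical and are not district results.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the observed outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical labels for ten students: 1 = dropped out, 0 = stayed enrolled.
predicted = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
actual = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

print(accuracy(predicted, actual))  # 0.9 (nine of ten predictions match)
```

Measuring accuracy on records the model did not see during training gives a fairer picture of how it would perform on new students.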

These early results gave the district the confidence that they had the right tool to achieve their goal.

Future Work

The district continues to refine the data model and is striving to make things even more agile. “One of the challenges is that the data we looked at is historical in nature, and the data sets are semester-based,” says Taylor. “So if the data about an at-risk student is a few weeks old, that student has already lost two weeks of additional intervention and support.”

For teachers, intervening as early as possible is critical. To address this issue, the district will eventually give teachers and administrators weekly reports, factoring in attendance, time spent by teachers with individual students, disciplinary actions and so forth – with the goal being to have a near-real-time indicator of a student’s risk of disengagement.

Taylor and his team hope to have a more agile Azure ML model in place by the start of the 2015-16 school year. “We’d love to have administrators be able to look into the data and see that 30 percent of a group of incoming fifth-graders need reading or math intervention,” says Dorothy Kippie, Director, Technical Operations.

The district believes it can also help change traditional but incorrect perceptions about the reasons for students’ struggles, by identifying the true indicators that contribute to students’ tendencies to drop out. “Often, students are seen as fitting certain profiles that indicate a potential lack of success, but none of those profiles are supported by analytical data. We wanted to use data to change that perception,” says Baidoo-Essien. “And eventually, we want to predict what the key indicators are for kids disengaging.”

Additional data, such as information about student nutrition and even social media signals, could be important factors that the district wishes to bring into this equation.


All the hard work at the Tacoma Public School district is paying off. The efforts of their board, along with the efforts by teachers in every classroom, have contributed to the huge bump in graduation rates in 2014. “More than anything, the combination of new visionary leadership with sophisticated data analytics is helping drive improvement in graduation rates,” says Taylor.

The district also wants to help other school districts around the country make similar solutions work for them. “Our district is getting recognition nationally for our Whole Child initiative,” Taylor adds. “Data is at the heart of making that happen.”

ML Blog Team