By Charis Loveland, Senior Program Manager for Cortana Intelligence Competitions in the Data Group at Microsoft
According to a World Health Organization (WHO) report released in 2011, about 820,000 women and men aged 15-24 were newly infected with HIV in developing countries. Over 60% of these were women. Among so many other challenges, developing countries are plagued with serious reproductive health illnesses such as sexually transmitted infections (STIs), unintended pregnancies, and complications from childbirth. A key priority for policymakers, researchers, and health care providers working in developing nations is to emphasize prevention and distribution of information about STIs and other reproductive tract infections (RTIs). This report on Improving Reproductive Health in Developing Countries from the U.S. National Academy of Sciences contains additional information on the topic.
To help improve women’s reproductive health outcomes in underdeveloped regions, Microsoft has created a competition calling for optimized machine learning solutions that accurately categorize a patient into health risk segments and subgroups. Based on the categories a patient falls into, healthcare providers can offer an appropriate education and training program. Such customized programs have a better chance of reducing patients’ reproductive health risk.
The objective of this machine learning competition is to build ML models to assign a young woman subject (15-30 years old) from one of the 9 underdeveloped regions into a risk segment, and a subgroup within the segment. Microsoft is awarding a total of $5,000 in cash prizes. The Grand Prize winner will get $3,000 cash, followed by a 2nd prize of $1,500, and a 3rd prize of $500. Enter now, before the competition closes on September 30, 2016.
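One way to think about the two-level target described above (a risk segment, plus a subgroup within that segment) is to flatten each pair into a single multiclass label and score predictions on the combined label. The following is a minimal sketch under stated assumptions: the `combine` and `accuracy` helpers, the label values, and the use of plain accuracy are all illustrative and are not taken from the competition documentation, which remains the authority on the actual labels and metric.

```python
from typing import List, Tuple

# (risk segment, subgroup within that segment); values below are hypothetical.
Label = Tuple[int, int]


def combine(segment: int, subgroup: int) -> str:
    """Flatten a (segment, subgroup) pair into one multiclass label."""
    return f"{segment}_{subgroup}"


def accuracy(y_true: List[Label], y_pred: List[Label]) -> float:
    """Fraction of subjects whose segment AND subgroup are both correct."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one label")
    hits = sum(combine(*t) == combine(*p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Invented labels, for illustration only.
truth = [(1, 2), (2, 1), (3, 4), (1, 1)]
preds = [(1, 2), (2, 3), (3, 4), (1, 1)]
print(accuracy(truth, preds))
```

Treating the pair as one label means a prediction with the right segment but the wrong subgroup counts as a miss. A two-stage model (segment first, then subgroup) is an equally reasonable design if partial credit matters.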
The dataset used in this competition was collected via survey in 2015 as part of a project funded by the Bill & Melinda Gates Foundation exploring the wants, needs, and behaviors of women and girls with regard to their sexual and reproductive health in nine geographies. The data are made available here in accordance with the Bill & Melinda Gates Foundation open data access policy. The data may be used and shared for non-commercial purposes.
Cortana Intelligence Competitions Platform
In March 2016, Microsoft launched Cortana Intelligence Competitions, a gamification feature of Cortana Intelligence Suite, to encourage new ML applications and foster a vibrant online community. We are thrilled to launch our second competition on this empowering dataset.
The Cortana Intelligence Competitions platform provides an intuitive and fun environment in which users can hone their data science and analytics expertise. Our new competition gives you the chance to contribute to improving global health outcomes and to win prizes and recognition.
Competitions allow you to:
- Explore unique data sets by participating in varying levels of competitions. These datasets are being released publicly for the first time, so don’t miss out.
- Compete with data science experts at a global level. What better way to learn than some healthy competition? Can you build predictive models with a higher accuracy than the experts? We allow you to submit multiple solutions, so keep learning and trying.
- Advance the data science field, and in many cases science in general, by contributing your ideas and creativity toward meaningful challenges and sharing results with other experts.
- Win prizes worth thousands of dollars or find yourself on the coveted Top 10 public leaderboard. Here’s your opportunity for fame and glory!
You do not need to be an expert to compete. In fact, many aspiring data scientists with minimal background in data science have already participated in our competitions. Our tutorials, videos, and data set descriptions make it easy to get started for anyone interested in analytics and data science.
You’ll be able to submit your first competition entry in four easy steps:
- Find the competition you’d like to enter in the Cortana Intelligence Gallery. Then click on the ‘Enter Competition’ button to copy the Starter Experiment into your existing Azure ML workspace. You can create a free workspace without a credit card by simply logging in with a valid Microsoft account or Office 365 account. Add your special sauce here, using either built-in modules or your own R/Python scripts, to improve on the target performance metric.
- Create a Predictive Experiment with the trained model out of your Starter Experiment, then adjust the input and output schema of the web service to ensure they conform to the specification from the Competition documentation.
- Deploy a web service out of your Predictive Experiment. Test your web service using the ‘Test’ button or the Excel template automatically created for you, to ensure it is working properly.
- Submit your web service as the competition entry, and see your public score on the Cortana Intelligence Gallery competition page. And celebrate if you make it onto the leaderboard!
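Besides the ‘Test’ button and the Excel template mentioned in the steps above, a deployed web service can also be exercised from a script. The sketch below only builds the HTTP request and does not send it. It assumes the JSON request-response shape that Azure ML Studio web services have typically used (`Inputs` / `input1` / `ColumnNames` / `Values`, with a bearer API key). The URL, the key, the input name, and the column names are all placeholders, and the API help page of your own service is the authority on the real schema.

```python
import json
import urllib.request


def build_request(url, api_key, columns, rows):
    """Build (but do not send) a scoring request for a deployed web service.

    Assumed payload shape; check your service's API help page before relying on it.
    """
    payload = {
        "Inputs": {
            "input1": {  # assumed default input name
                "ColumnNames": list(columns),
                # Values are sent as strings, one inner list per row.
                "Values": [[str(v) for v in row] for row in rows],
            }
        },
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint, key, and columns; replace with values from your own service.
req = build_request(
    "https://example.invalid/execute?api-version=2.0",
    "YOUR_API_KEY",
    ["age", "region"],
    [[22, 3]],
)
print(req.get_method(), req.full_url)
```

To actually call the service, pass `req` to `urllib.request.urlopen` and decode the JSON response. Checking that the returned columns match the competition’s output specification before submitting can save a wasted entry.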
After you successfully submit an entry, you can go back to the copied Starter Experiment, iterate, update your Predictive Experiment and the web service, and submit a new entry.
Azure ML Studio provides a rich set of ML modules as well as data processing modules in a friendly GUI for constructing experiments. It also allows experienced data scientists to bring custom R and/or Python scripts for native execution. The R and Python runtimes in the Studio come with a rich set of open source R/Python packages, and additional packages can be imported as script bundles and referenced in the scripts as well.
The Competition supplies a Starter Experiment that leverages many built-in modules in Azure ML, but you can choose to replace them with R or Python scripts constructed externally and imported into the Azure ML environment to create a valid entry for submission.
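As a rough illustration of replacing a built-in module with a Python script, the sketch below follows the entry-point convention used by the Execute Python Script module in Azure ML Studio: a function named `azureml_main` that receives up to two pandas DataFrames and returns a sequence of DataFrames. The median-imputation step and the column names are assumptions chosen for illustration. They are not part of the Starter Experiment or the competition data.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    """Fill missing numeric values with each column's median.

    dataframe1 is the data connected to the module's first input port;
    dataframe2 is unused in this sketch.
    """
    df = dataframe1.copy()  # leave the caller's frame untouched
    numeric = df.select_dtypes(include="number").columns
    df[numeric] = df[numeric].fillna(df[numeric].median())
    # The module expects a sequence of DataFrames, hence the trailing comma.
    return df,


# Local smoke test with invented data (hypothetical column names).
sample = pd.DataFrame({"age": [20.0, None, 24.0], "region": ["a", "b", "c"]})
(cleaned,) = azureml_main(sample)
print(cleaned)
```

Developing and testing the function locally like this, then pasting it into the module, keeps the iteration loop short. Only the `azureml_main` definition needs to be carried over.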
Azure ML Studio also has a built-in Jupyter Notebook service for you to do freestyle data exploration. Of course, you can always download the datasets used in the Competition and explore them locally in your favorite tool.
For a complete set of rules and features, please check out our FAQ.
We encourage you to enter today and look forward to seeing you contribute to women’s health outcomes and hone your skills with this compelling new dataset.
If you want to get some practice first to learn the end-to-end workflow of constructing and submitting a competition entry, please visit our Iris multiclass classification practice contest.
Questions? Contact us at firstname.lastname@example.org.
Best of luck!