The recent Practice of Machine Learning Conference at Microsoft concluded with a lively panel discussion moderated by principal researcher Misha Bilenko on the topic "Are We at Peak ML, or at the Start of AI Takeover? Hype vs. Reality of Machine Learning." Our panelists were:
Greg Buehrer, Partner Development Manager, Bing Ads
John Platt, Distinguished Scientist, Microsoft Research
Joseph Sirosh, Corporate Vice President of Machine Learning
This post recaps their conversation and the kinds of issues raised by our audience.
The first question played off the title of the panel: "As humans get replaced by machines, how will data scientists be useful and will they be replaced as well?"
Greg commented that he didn't see that happening soon. Machines are good at mechanical and physical processes, but still not that great at automating decision-making, especially in very complex environments. John and Joseph took the conversation further by discussing the challenges of employment and wealth distribution in an environment of seemingly ever-increasing automation, but the panelists admitted they couldn't predict exactly how this trend might play out.
After discussing the Azure ML marketplace as a place where data scientists can publish their innovative ideas as web services, all panelists issued a call to action to the audience to be ever more data-driven in their work and build their ML skills, as there are many possibilities ahead of us to make products and services more intelligent.
Next, in response to an audience question, the panelists gave their opinions on whether there is a danger to using ML systems as "black boxes" inside systems such as drones, cars, etc. without fully understanding what's inside.
Joseph responded, "I don't think of ML as being any different than any software algorithm collection, and there is a lot of software you can ask the same question about". He felt that there must be systems in place to ensure reliability. Greg agreed that mistakes can be made, and that experience and decision-making responsibilities have to be assigned appropriately. John then argued that "People are teaching ML incorrectly. They teach what's inside the black box, but before that you need to learn statistical hygiene: You need to have a test set, you need to not cheat, you have to do confidence intervals, and you need to worry about outliers. Learn statistical hygiene to avoid disasters". All three panelists agreed on this.
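To make John's checklist concrete, here is a minimal sketch of the three habits he names: hold out a test set, report a confidence interval, and check for outliers. It is an illustration only, not something shown at the panel. The data is synthetic, the "model" is a toy threshold, and every function name (`train_test_split`, `fit_threshold`, `bootstrap_ci`, `flag_outliers`) is hypothetical and written just for this example using the Python standard library.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed and hold out rows that are never used for fitting."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Toy 'model': midpoint between the two class means, fit on training rows only."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def bootstrap_ci(correct_flags, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for accuracy on the test set."""
    rng = random.Random(seed)
    n = len(correct_flags)
    means = sorted(
        sum(rng.choices(correct_flags, k=n)) / n for _ in range(n_resamples)
    )
    low = means[int((alpha / 2) * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median absolute deviation based) exceeds threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Synthetic data (an assumption for illustration): two noisy, overlapping classes.
data_rng = random.Random(42)
rows = []
for _ in range(500):
    y = data_rng.randint(0, 1)
    x = data_rng.gauss(1.0 if y else -1.0, 1.0)
    rows.append((x, y))

# "Have a test set" and "do not cheat": fit on train, score on test only.
train, test = train_test_split(rows)
threshold = fit_threshold(train)
correct = [int((x > threshold) == bool(y)) for x, y in test]
accuracy = sum(correct) / len(correct)

# "Do confidence intervals": report a range, not just a point estimate.
ci_low, ci_high = bootstrap_ci(correct)

# "Worry about outliers": a corrupted reading of 25.0 should stand out.
outliers = flag_outliers([x for x, _ in rows] + [25.0])

print(f"test accuracy {accuracy:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
print(f"flagged outliers: {outliers}")
```

The design choice worth noting is that the threshold is fit only on `train` while accuracy and its interval come only from `test`, which is the "not cheating" part of the advice.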
An audience member suggested that many ML use cases focus on mitigating negatives (e.g. intrusion detection, fraud detection) and asked about positive implementations. Greg suggested that avoiding these negatives was in itself a positive for customers and businesses 🙂 but also mentioned recommendation systems, which help consumers discover things that they may like. John brought up new experiences such as Office Delve, which are helping people become more productive. Joseph mentioned a conversation that he had with the founder of eHarmony about how they use ML to help compatible people discover each other. He added that there are a whole host of other scenarios where ML is making some extremely positive contributions, including speech, visual, and gesture recognition. John closed with the tongue-in-cheek comment that if eHarmony is using ML, then "the future evolution of human DNA is being driven by machine learning."
Next, the panelists debated how privacy and ML intersect. Some of this discussion focused on the experience that Julia Angwin describes in her book Dragnet Nation, about how difficult it can be to gain true privacy in the Internet age. Joseph’s takeaway from hearing Julia speak recently was that the discussion might be changing from whether one has privacy to "whether people who have your data are using it responsibly… it's about justice, people who have your data are accountable for it, are responsible for using it in a just way."
The discussion also touched upon the enormous possibilities that the cloud opens up for data scientists and ML practitioners, as well as the potential of the massive amounts of data that are becoming available. One example focused on airplanes collecting vast amounts of weather data as they fly, which can be used to predict weather patterns much more accurately.
John noted that, if you can solve the privacy problem, then "One of the powers of data is that if you pool data together, you get more out than you put in, because you get more information by correlation". Greg added, "Not only do you want to encourage people to put data in a certain spot, but you want to ensure that the applications collecting the data collect it in the most structured way possible" so that you can make the most use of it.
Joseph chimed in with the comment that "When you put enormous compute against enormous data, and you bring machine learning to bear along with it, and the Internet of Things feeding data into the cloud…and streaming analytics running on live data…I think that in a very short time, you will see a completely different picture of analytics".
The discussion lasted an hour and the panelists addressed well over a dozen questions, so this recap is necessarily incomplete. However, you can stay tuned for more ML happenings around Microsoft by subscribing to our ML blog feed and by following us on Twitter @MLatMSFT.
ML Blog Team