In ML, What They Know Can Hurt You

This blog post is authored by Juan M. Lavista, Principal Data Scientist at Microsoft.

Imagine you own a restaurant in a city (any city but not Seattle*). From inside the restaurant there’s no way to see the outside. You come up with a marketing plan that if it rains, your restaurant will provide free food. Given that there are no windows, you use the following model to determine if it is raining – if there’s at least one customer that comes into the restaurant with a wet umbrella you conclude that it’s raining. Initially, this model will be very accurate. After all, in what other circumstances would someone walk in with a wet umbrella? However, if you decide to disclose the umbrella rule to all your customers, do you believe that this model would continue to be accurate?

Information on how the model works and how features behave within a given model can be very useful if used wisely. However, if this information is used incorrectly, it can actually hurt the accuracy of the model.

Models require that the set of signals/features have information that are predictive. However, the relationship between these signals/features and the outcome do not necessarily need to be causal. This means that the feature might be indicative/correlated but not necessarily the cause of what we are trying to predict.

For example, let's say we need to predict [C], but we can only measure [B] as feature, and [B] does not affect [C]. The real cause that affects [C] is [A], but we cannot measure [A]. [A] on the other hand also affects [B], so we can use [B] as way to predict [C].

So let’s say we estimate this model and it gives very good predictions of [C]. However, we later release the model to the public who take advantage of this information and start targeting changes in [B] directly. At this point, the information being provided by [B] to the model is no longer only associated to [A] so it loses prediction power for predicting [C]. By releasing the model to the public, we actually end up hurting the accuracy of the model.



A real world example is Pagerank[1]. Pagerank provides a way to rank the importance of a webpage based on which other websites have links to this particular website. The basis of the model is that if a document is important or relevant, other websites will reference it and include links to it, and when the links naturally happens the ranking works. However, once you make this model open to the general public, there are clear incentives to rank higher in a search result. What ends up happening is that users will try to game the system, for example, by buying backlinks to try to increase the ranking of their website. In reality, if we artificially increase the links to a website, this will not mean the website is more relevant, this is just gaming the system. By disclosing how the algorithm works we are actually hurting the accuracy of the model.

Credit Scores

The design objective of the FICO credit risk score is to predict the likelihood that a consumer will go 90 days past due (or worse) in the subsequent 24 months after the score has been calculated. Credit scores are another place where disclosing rules ends up hurting the model. For example myfico states: “Research shows that opening several credit accounts in a short period of time represents a greater risk – especially for people who don't have a long credit history.” [2] This type of study shows correlation but not causation. These findings can help predict credit risk, however, by disclosing these rules, the credit agency incurs the risk that users will learn how to game the system and hurt their credit risk prediction.


Not All Information Hurts the Models

If the relationship between the feature and the outcome is causal, the outcome is different. For example, if I’m building a model for predicting blood pressure using the amount of exercise as feature, by providing this information to the public I will actually help the users. Given that the relationship between exercise and blood pressure is causal [3], it will not hurt the accuracy of the model.


Before publicly sharing information about how your model works, it is important to understand if the relationship between the features and the outcome are causal. The rule of thumb is the following:

If the relationship between the feature and the outcome is not causal, especially if the signal/feature is easy to change – for example, by buying links, or by walking in with an umbrella in the examples above – and there are reasons why people have incentives to affect the actual outcome, then we might be at risk of users gaming the system.

Even if we do not disclose how the model works, we still might be at risk because users may find out. It’s important to understand and evaluate the risk and monitor your systems periodically.

Follow me on Twitter


*This model will not work in Seattle because Seattleites do not carry umbrellas 🙂



[1] The PageRank Citation Ranking: Bringing Order to the Web. Page, Lawrence and Brin, Sergey and Motwani, Rajeev and Winograd, Terry (1999) The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab.

[2] How my FICO Score is calculated

[3] Exercise: A drug-free approach to lowering high blood pressure by Mayo Clinic