This post is authored by Remko de Lange, Data Solution Architect at Microsoft.
How often do you have a clear picture in mind of what you want to buy next, but no idea how or where to find it? We usually start by consulting our favorite web search engine, but putting a mental picture into exact words can be difficult. For example, I saw a great camping stove last time I was at my favorite campsite, not knowing that such a thing can carry an unpronounceable abbreviation as a name. I did take a picture of the burner, though, and that puts us on the right track, because new technology can help us out.
Wouldn’t it be great if you could search for items with your image as input? The technology presented here makes exactly that possible. While the technology names sound as exotic as the camping stove, using them is simple. First, the technologies: Microsoft R Server (version 9.1.0) ships with the MicrosoftML R package, which includes a machine learning transform named featurizeImage. This image featurization transform does most of the work, as it uses a deep neural network model that has been pre-trained on millions of images. That might sound intimidating, but a simple solution comes down to a short R script. A developer can start small with Microsoft R Client locally and still scale up to Microsoft R Server to use parallel computing and machine learning in the Azure cloud.
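To give an idea of what that short script looks like, here is a minimal sketch of featurizing a couple of images with the pre-trained model. The file names are illustrative assumptions, and the choice of the ResNet-18 model (and its 224x224 input size) is one of several options described in the package documentation:

```r
# Sketch: turn images into numeric feature vectors with MicrosoftML's
# pre-trained deep neural network (requires Microsoft R Client/Server 9.1+).
library(MicrosoftML)

# Illustrative file names -- replace with your own images.
images <- data.frame(
  Image = c("stove1.jpg", "stove2.jpg"),
  stringsAsFactors = FALSE
)

# Chain of ML transforms: load each image, resize it to the input size
# the pre-trained ResNet-18 model expects, extract the raw pixels, and
# finally run the deep net to produce a feature vector per image.
features <- rxFeaturize(
  data = images,
  mlTransforms = list(
    loadImage(vars = list(Features = "Image")),
    resizeImage(vars = "Features", width = 224, height = 224),
    extractPixels(vars = "Features"),
    featurizeImage(var = "Features", dnnModel = "resnet18")
  )
)
# 'features' now holds one numeric column per feature dimension.
```

The pre-trained model is what keeps this short: no training data, GPUs, or model tuning are needed on our side.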
Similarity of Images
Finding similar images is based on how alike the images look, and that goes far deeper than “a cat has two eyes” types of characteristics; it also captures properties that we humans would never consciously consider. Many of these characteristics, or features, can be derived with a deep learning approach and brought together in a vector, meaning that each image is represented by an array of numbers. The nice thing about vectors is that we can do math with them: similar images end up having a short distance between their feature vectors. Thus, the image whose feature vector has the smallest distance to that of my stove picture will bear the closest resemblance to it. If the image database is large and rich enough, we can search the collection for images similar to my stove picture and get clues about what type of burner I am really looking for.
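To make the distance idea concrete, here is a small base-R sketch. The feature vectors and item names are made up for illustration (real featurized images have hundreds of dimensions), but the math is exactly the nearest-neighbor search described above:

```r
# Toy feature vectors: in practice these come from image featurization
# and have hundreds of dimensions; three are used here for clarity.
catalogue <- rbind(
  gas_burner  = c(0.9, 0.1, 0.4),
  wood_stove  = c(0.2, 0.8, 0.7),
  camping_pot = c(0.5, 0.5, 0.1)
)
query <- c(0.85, 0.15, 0.35)  # feature vector of my stove photo

# Euclidean distance between the query and every catalogue image.
distances <- apply(catalogue, 1, function(v) sqrt(sum((v - query)^2)))

# The smallest distance identifies the most similar image.
best_match <- names(which.min(distances))
best_match  # "gas_burner"
```

Other distance measures, such as cosine distance, work just as well; what matters is that "close in feature space" means "looks alike".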
Putting It Into Practice – Finding the Right Chair
Let us explore this technology with an example. We start small, so a local machine is enough, and we can use the free Microsoft R Client. It is based on open-source R but adds enhancements, both in performance and in extra libraries. One of these is the Microsoft machine learning library (MicrosoftML), which includes the pre-trained deep learning model we will use here. The example provided on GitHub searches for the next chair model to replace our good old friend, and shows the R code needed to find a matching chair.
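In the spirit of that chair example, the two pieces come together like this. Note this is a sketch, not the repository's actual code: the paths, column names, and model choice are illustrative assumptions.

```r
# Sketch: find the catalogue chair most similar to a reference photo.
library(MicrosoftML)

# Featurize the reference photo (row 1) together with the candidates.
chairs <- data.frame(
  Image = c("my_old_chair.jpg", list.files("catalogue", full.names = TRUE)),
  stringsAsFactors = FALSE
)
feats <- rxFeaturize(
  data = chairs,
  mlTransforms = list(
    loadImage(vars = list(Features = "Image")),
    resizeImage(vars = "Features", width = 224, height = 224),
    extractPixels(vars = "Features"),
    featurizeImage(var = "Features", dnnModel = "resnet18")
  )
)

# Euclidean distance from the reference (row 1) to every candidate.
mat   <- as.matrix(feats[, names(feats) != "Image"])
dists <- apply(mat[-1, , drop = FALSE], 1,
               function(v) sqrt(sum((v - mat[1, ])^2)))
chairs$Image[-1][which.min(dists)]  # the closest-matching chair
```

The same script scales up unchanged: point it at a larger catalogue, or run it on Microsoft R Server in Azure when the collection outgrows the laptop.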
Use Cases for Image-to-Image Search
The example we have seen succeeds because it has a rich image collection to compare against. We can compare our own photo, or an image found on the web, to many types of image databases. What if we used our photo to search a (web) shop catalogue for the right item? Or searched an internet marketplace for similar items? This is a genuinely different approach from a recommendation mechanism, which would need to incorporate other users’ behavior before a search is even possible. With this technology in hand, everyone can implement an image-to-image search for their own purpose.
Developers can start today with the example code on GitHub.