How to build an accurate Visual Search – Hacker Noon

One of the most popular phrases today is “Google it”. Searching the web is a spontaneous, unplanned activity for all of us. We instinctively turn to search engines knowing that at the click of a button we’ll get an overwhelming response in milliseconds.

But how many of us have ever stopped to think how Google gives us exactly what we’re looking for? Google has perfected the search and ranking process by making continuous iterations and tweaks to its algorithm. It has major updates once or twice a year which significantly impact the search outcome.

“Possum” is one such major update by Google. In 2016, Google ensured that results were dependent on the searcher’s location i.e. your results will be close to your location. So, if you search for a pharmacist, the one closest to you will pop up as the first result. Prior to this update the search result would probably be a list of pharmacists in your area with the 1st ranked store being that which was most searched for. This localisation of search results made it much easier to find places close by without having to specify the area, etc.

Besides these major updates, we all know that the more you search, the better the overall quality of results becomes. Search quality is a spectrum, i.e. there are no simply good or bad search results. The more you search on one topic, the more the results move from “good” to “better” to “best”. In other words, search results improve over time.

Visual search is no different from normal text search in this respect. The results get better with each upgrade of the visual search engine, although the things that hinder visual search results are not the same as those that hinder text search.

Let’s look at some of the challenges we have overcome so far:

Depth of Subcategories

Let’s say we search for jeans as seen in the picture below. There are a number of different kinds, like flared, boyfriend, distressed etc., that a visual search engine may pick up. Our search engine picks up only jeans like the ones searched for. Handling the subcategories of jeans is a challenge which our visual search engine has overcome: we train the neural network powering the visual search engine with multiple subcategories of articles. Currently, we support more than 1000 subcategories of articles across apparel, accessories, furniture and kitchen items sold online.

Ability of a Visual Search engine to recognise various sub-categories of jeans

Model Posture

This is by far the most common challenge faced by visual search engines. As seen below, the query is of a suit worn by a model posing in a certain way. An image search for this can bring back less relevant results: images of models posed in the same way, instead of the clothes they are wearing. However, we train our neural network with hundreds of images, each with a different pose, to make sure it recognises clothes regardless of the pose. Therefore, our results, as seen in the picture below, are of other suits and not of the model’s pose. Older search algorithms, such as those based on locality-sensitive hashing, cannot achieve this.

Ability of an image search engine to recognise a blazer in various pictures where the model poses differently

Background noise

As the name suggests, if the picture being searched has a vivid background, it is quite possible that the search will only bring back images with matching backgrounds instead of the desired object. Below we can see search results that match the background rather than the patio furniture being searched for. We have trained our neural network to remove backgrounds from the images. This way the search is only performed on the main object.

An example where the search engine provided results with a strong bias on background instead of the object
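To see why removing the background matters, here is a minimal sketch, not our production pipeline. It assumes a foreground mask is already available (in practice it would come from a trained segmentation step; here it is written by hand) and uses a deliberately crude descriptor, the mean colour. The function name `mean_color` and the toy image are made up for illustration.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)


def mean_color(image: List[List[Pixel]],
               mask: Optional[List[List[bool]]] = None) -> Tuple[float, float, float]:
    """Average RGB over the pixels the mask keeps (all pixels if mask is None)."""
    totals = [0, 0, 0]
    count = 0
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if mask is not None and not mask[r][c]:
                continue  # background pixel: leave it out of the descriptor
            for channel in range(3):
                totals[channel] += pixel[channel]
            count += 1
    if count == 0:
        raise ValueError("mask keeps no pixels")
    return (totals[0] / count, totals[1] / count, totals[2] / count)


# Toy 2x2 "photo": one brown chair pixel on a green lawn.
GRASS = (0, 200, 0)
CHAIR = (120, 60, 20)
image = [[GRASS, GRASS],
         [GRASS, CHAIR]]

# Hand-written mask for illustration; assumed to come from a segmentation step.
mask = [[False, False],
        [False, True]]

whole_image = mean_color(image)        # dominated by the lawn
object_only = mean_color(image, mask)  # describes the chair alone
```

Without the mask the descriptor is mostly green, so the nearest matches would be other lawns. With the mask it describes only the chair.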

Multiple item query

Your search could be for furniture for an entire room, as shown in the picture. When searching for multiple articles, our search engine lets us select which pieces we want to search for by highlighting them with boxes, or by using the auto crop feature, i.e. it lets us select the object of interest easily by automatically cropping out the parts of the photograph that are not needed. Most other existing solutions would return entire combinations, as depicted in the picture, which are often not the most relevant results.

Example where Auto-crop helps Visual Search engine serve better results
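The cropping step itself is simple once a box around the object is known. The sketch below assumes the boxes are already given, whether drawn by the user or proposed by a detector, and does not show how auto crop finds them. The `crop` helper, the box format (top, left, bottom, right) and the labels are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Return the part of the image that lies inside the box."""
    top, left, bottom, right = box
    if not (0 <= top < bottom <= len(image)):
        raise ValueError("box rows fall outside the image")
    if not (0 <= left < right <= len(image[0])):
        raise ValueError("box columns fall outside the image")
    return [row[left:right] for row in image[top:bottom]]


# Toy 4x4 greyscale "room photo"; each number stands in for a pixel.
room = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]

# Hypothetical boxes for two items in the room.
boxes: Dict[str, Box] = {
    "sofa": (1, 1, 3, 3),
    "lamp": (0, 3, 2, 4),
}

# Only the selected item is sent on as the search query.
sofa_query = crop(room, boxes["sofa"])
lamp_query = crop(room, boxes["lamp"])
```

Searching with `sofa_query` instead of `room` keeps the other furniture from influencing the results.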

Color weightage challenge

Suppose we are searching with an image of a floral patterned, yellow shift dress. Our actual expected result might be a red dress with similar floral patterns, but all the visual search engine gets back to us with is more yellow dresses. This challenge is overcome when the visual search engine has color detection decoupled from other parameters. Our results are not based only on the color; they are also based on pattern, sleeve length and other attributes of the dress. Therefore, we ensure that color is not the only parameter used while searching for visually similar products.
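One way to picture decoupled color is as a separate score with its own weight. The sketch below is an illustration only and not a description of our actual model: it assumes each product already has a color vector and a pattern vector, and combines their cosine similarities with a tunable `color_weight`. The vectors and the weight values are invented for the example.

```python
import math
from typing import Dict, List

Features = Dict[str, List[float]]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity(query: Features, item: Features, color_weight: float = 0.2) -> float:
    """Weighted mix of color similarity and pattern similarity."""
    color_sim = cosine(query["color"], item["color"])
    pattern_sim = cosine(query["pattern"], item["pattern"])
    return color_weight * color_sim + (1 - color_weight) * pattern_sim


# Made-up features: color is (R, G, B), pattern is (floral, plain).
query = {"color": [1.0, 1.0, 0.0], "pattern": [1.0, 0.0]}         # yellow floral
red_floral = {"color": [1.0, 0.0, 0.0], "pattern": [1.0, 0.0]}
yellow_plain = {"color": [1.0, 1.0, 0.0], "pattern": [0.0, 1.0]}

# With a low color weight the red floral dress outranks the plain yellow one.
low_weight_prefers_floral = (
    similarity(query, red_floral, 0.2) > similarity(query, yellow_plain, 0.2)
)
# With a high color weight the ranking flips and color dominates.
high_weight_prefers_yellow = (
    similarity(query, yellow_plain, 0.9) > similarity(query, red_floral, 0.9)
)
```

Because color has its own term, its influence can be turned down without touching how pattern is compared.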

Conclusion

For us, overcoming these obstacles has become a regular and planned activity. We take 10 diverse images of each subcategory, pass them through our visual search engine and check what the results are. Referring to our previous example, we will pass 10 different types of jeans and test what the results will be. This ensures our search results do not have the biases explained above in any of the subcategories which we support.
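A check like this can be automated along the following lines. This is a rough sketch under stated assumptions: `search` is a stand-in for a call to the search engine and is assumed to return the subcategory label of each result, and the score used here (the share of results in the query's own subcategory) is one possible measure, not necessarily the one we use. The example runs on three fake queries rather than 10 per subcategory to keep it short.

```python
from typing import Callable, Dict, List, Tuple


def subcategory_precision(search: Callable[[str], List[str]],
                          queries: List[Tuple[str, str]],
                          top_k: int = 5) -> Dict[str, float]:
    """For each subcategory, the share of top-k results in that same subcategory."""
    hits: Dict[str, int] = {}
    totals: Dict[str, int] = {}
    for image_id, subcategory in queries:
        results = search(image_id)[:top_k]
        totals[subcategory] = totals.get(subcategory, 0) + len(results)
        matching = sum(1 for label in results if label == subcategory)
        hits[subcategory] = hits.get(subcategory, 0) + matching
    # Skip subcategories whose queries returned nothing, to avoid dividing by zero.
    return {s: hits[s] / totals[s] for s in totals if totals[s] > 0}


# Fake search results standing in for the real engine (hypothetical data).
fake_results = {
    "jeans_01.jpg": ["flared", "flared"],
    "jeans_02.jpg": ["flared", "boyfriend"],
    "jeans_03.jpg": ["boyfriend", "boyfriend"],
}


def fake_search(image_id: str) -> List[str]:
    return fake_results[image_id]


queries = [("jeans_01.jpg", "flared"),
           ("jeans_02.jpg", "flared"),
           ("jeans_03.jpg", "boyfriend")]

report = subcategory_precision(fake_search, queries, top_k=2)
```

A subcategory whose score drops after an update is the first place to look for one of the biases described above.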

Similar to Google, Turing Analytics also has scheduled updates every three months for major changes. In our last one, in May 2018, we doubled the number of subcategories we support. Prior to this, in Feb 2018, we released our auto crop feature, allowing us to overcome the challenge of multiple items in one image. Another important update was in December last year, when we reduced the search response time by approximately 80%. Turing Analytics’ search engine is revolutionary software that enhances customer satisfaction.
 
Now that you know so much about how accurate Visual Search results are at Turing Analytics, why not try our demo at VisualSearch.App/Demo?
