The big picture: what’s next for image and voice search?


About 10 years ago, it would have been hard to believe that you could ask a Bluetooth speaker for a classic cheese soufflé recipe or take a picture of an object using your phone and find out exactly where to purchase it. Yet, here we are.

These interactions have been made possible primarily by advances in machine learning. One of the biggest developments in AI over the past three years has been in voice recognition and natural language processing, and we’re starting to see advances in more complex human-machine interaction in the form of image and video search.

Forward-thinking businesses are already using image recognition based on machine learning to let users search with a picture and find the same or similar looks and outfits among the products they stock. But does this mean intelligent image search is the next big thing?

Early days for image search and AI

Major search engines have supported a form of ‘image search’ for some time. Google introduced ‘Google Images’ back in 2001 because of a demand for pictures of Jennifer Lopez in a green Versace dress, which regular Google search couldn’t handle. The functionality eventually evolved to let users drag and drop a picture to find the same image or others similar to it.

The image search referred to today shouldn’t be confused with what now seems like relatively basic functionality (much like how ‘voice search’ today is more than basic text search entered by voice). The advancements in ‘image search’ allow for highly accurate image classification by recognizing elements and objects based on specific attributes identified within a given image.
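To make the idea of matching on attributes concrete, here is a minimal sketch, assuming each image has already been reduced to a short list of numeric attribute scores by some recognition model. The attribute names, catalogue entries and scores are all invented for illustration, and cosine similarity is just one common way to compare such lists; it is not a description of how any particular retailer or search engine works.

```python
import math

# Hypothetical attribute scores per catalogue image, in the order:
# [green, long_sleeve, floral, dress]. Values are made up for illustration.
CATALOGUE = {
    "green_wrap_dress": [0.9, 0.1, 0.2, 1.0],
    "floral_shirt": [0.2, 0.8, 0.9, 0.0],
    "green_jumper": [0.8, 0.9, 0.0, 0.0],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length score lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero list has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(query, catalogue, top_k=2):
    """Rank catalogue items by similarity to the query, best match first."""
    ranked = sorted(
        catalogue,
        key=lambda name: cosine_similarity(query, catalogue[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical scores for a shopper's photo of a green dress.
photo = [1.0, 0.0, 0.1, 0.9]
print(most_similar(photo, CATALOGUE))
```

In this toy catalogue the green dress ranks first and the green jumper second, which is the "same or similar" behavior described above: the closest match leads, followed by items that share only some attributes.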

At Forward3D, we have noticed clients increasingly asking what they can do with ever more accessible AI and machine learning APIs, yet many remain unclear about how developments in this field could enhance aspects of their business. ASOS’ use of image search is a great example of how such tech can enhance the shopping experience, and it shows that AI can benefit a business when it is applied to the right problem and implemented correctly.

The future of image search

Google (as well as others) has recently started integrating features to enable users to shop for products captured with a smartphone, like the example of ASOS mentioned above. While image search will have its practical uses, the current siloed implementations in smart devices are likely to have a very limited impact on general search behavior for now.

While basic applications of image search can be used to identify what products are available, the technology will continue to evolve, becoming faster and more accurate, leading to more sophisticated, varied applications and use cases.

We are seeing use cases previously only seen in sci-fi, such as real-time recognition of faces in AR setups, or applications within medicine where a machine can automatically identify early signs of medical conditions from X-ray or MRI scans without requiring a specialist doctor to interpret the results.

The main challenge with this technology is that purpose-built image recognition models are only as good as their training data, meaning that any application can be significantly impaired if the model is trained poorly or without enough correctly labeled data.

Image search vs voice search

Despite the advances in image search and its current implementation, voice is likely to remain the preferred way to request information on demand in the future. We are likely to see image-based search complement voice, allowing for a richer way of interacting with AI assistants.

The Amazon Echo Look aims to integrate both voice and image search to create a unique value proposition: a virtual assistant that can scan and suggest clothing outfits for its user. An algorithm combined with input from fashion experts is then able to make recommendations on styling, providing various ‘looks’ via its app on your smartphone. This is a great example of how voice and image search can work together to enhance interaction between users and virtual assistants.

Outside the home, if integrated correctly on smartphones or wearable tech, image search may enable a new dimension of window shopping that hasn’t been seen before. Consumers could ‘bookmark’ something of casual interest for later viewing, similar to ‘Shazaming’ a song.

Who will come out on top?

Comparing voice search to image search is like comparing a computer keyboard to a mouse. They each have their own place in a world where search and discovery will be predominantly underpinned by interactions with virtual assistants.

While Amazon may already be combining advanced uses for voice and image search, taking meaningful search market share away from Google will be difficult. Amazon’s early penetration of standalone smart assistant devices like the Echo and Echo Look, along with its ability to supply products directly to the consumer (supporting its retail business), poses a potential long-term threat to Google. The search engine has long been the de facto entry point for satisfying consumer demand for both information and products.

However, Google’s success to date as a traditional search engine has come from the ability to retrieve the most relevant and concise results with both speed and precision. Its core strength in organizing information and its recent shift to being an AI-first business will pose a serious challenge to Amazon’s early lead.

Google does not have the same level of vertical supply chain integration as Amazon to deliver the most seamless customer journey for purchasing physical products. But its overall accuracy and speed of information retrieval (backed by the vastness of its knowledge graph), along with integration with its other online consumer services, will see it come out on top, be it for voice, image or any other digital search medium in the future.



