Welcome back to Digital Innovation Week, a five-part exploration of remarkable research that is quietly laying the groundwork for tomorrow’s technologies. Yesterday, we looked at how brainwaves are being decoded into digital intent, enabling direct brain-to-brain communication. If you missed that, you can find it here.
Today, we are venturing into something just as quietly revolutionary: a world where search engines understand not through keywords or images but through texture, temperature, sound and even scent.
Teaching machines to sense the unsaid
What if, instead of describing the smell of freshly baked bread, you could just smell it and your phone would understand? What if you could ask an AI to find “that fuzzy material like the inside of a hoodie” or “the ambient hum of an old tube station” and it actually could?
This is the ambition behind a growing field of multimodal AI: giving computers sensory intelligence.
At the University of Edinburgh, researchers have developed a stretchable, 1 mm-thick electronic skin (or “e-skin”) made from flexible silicone and embedded with capacitive sensors. When paired with machine learning, this skin gives soft robots an uncanny ability to detect and map bending, twisting and stretching in real time. The breakthrough, published in Nature Machine Intelligence, allowed AI to reconstruct the robot's full shape with sub-3 mm precision, even under complex deformations (Yang et al., 2023a).
Lead researcher Dr Yunjie Yang described it as “a step change in the sensing capabilities of soft robots.” This kind of tactile sensing could allow machines to navigate delicate tasks, from robotic surgery to remote handling of archaeological artefacts. It could also create entirely new ways to index and search physical textures.
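To make the reconstruction step concrete, here is a minimal, purely illustrative sketch: a small neural network learns to map capacitance readings to the 3D positions of surface markers. The sensor count, synthetic data and model below are assumptions made for the example, not the published Edinburgh pipeline.

```python
# Hypothetical sketch: learning a mapping from capacitive e-skin readings
# to 3D surface-marker positions. Synthetic data stands in for real sensors.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
N_SENSORS, N_MARKERS, N_SAMPLES = 32, 20, 5000

# Pretend each deformation produces 32 capacitance values and a ground-truth
# shape described by 20 markers (x, y, z), both driven by a hidden deformation state.
latent = rng.normal(size=(N_SAMPLES, 4))
capacitance = np.tanh(latent @ rng.normal(size=(4, N_SENSORS)))
markers = latent @ rng.normal(size=(4, N_MARKERS * 3))  # flattened (x, y, z) targets

X_train, X_test, y_train, y_test = train_test_split(capacitance, markers, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=500, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
rmse = np.sqrt(np.mean((pred - y_test) ** 2))
print(f"Reconstruction RMSE on synthetic data: {rmse:.2f}")
```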
But this is only one sense. What about smell?
Encoding scent into data
A team at the National University of Singapore recently built a miniature “photonic nose,” capable of identifying complex mixtures of volatile organic compounds (VOCs) using mid-infrared light and AI. Their chip-scale sensor can distinguish between closely related smells, such as ethanol, isopropanol and acetone, with 93.6% accuracy. It can even predict concentration levels within 2.4 vol % (Chen et al., 2024).
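As a rough sketch of how such a classifier works in principle, the snippet below trains a model on synthetic “spectra” for the three solvents. The peak positions, noise levels and resulting accuracy are invented purely for illustration and have nothing to do with the real device.

```python
# Illustrative sketch only: classifying VOCs from synthetic mid-infrared
# absorption spectra, loosely mirroring what a "photonic nose" pipeline does.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
wavenumbers = np.linspace(2800, 3600, 200)  # arbitrary mid-IR window, cm^-1

# Give each compound a made-up absorption peak and generate noisy spectra.
peaks = {"ethanol": 3350.0, "isopropanol": 3200.0, "acetone": 2950.0}

def spectrum(center):
    peak = np.exp(-((wavenumbers - center) ** 2) / (2 * 40.0 ** 2))
    return peak + rng.normal(0, 0.05, wavenumbers.size)

X = np.array([spectrum(c) for c in peaks.values() for _ in range(200)])
y = np.repeat(list(peaks.keys()), 200)

clf = LogisticRegression(max_iter=1000)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```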
This technology is already being used for breath diagnostics and food quality control, but its potential goes much further. Imagine being able to search for a perfume by its scent fingerprint or trigger safety systems when the air smells wrong. In future applications, a sensory web browser could understand “a smoky scent with citrus high notes” or recommend a wine based on the smell of your favourite aftershave.
Behind both smell and touch is a shared insight: that human understanding is inherently multisensory. We think in combinations — taste and temperature, sound and setting — and digital systems are beginning to do the same.
The age of sensory AI
Large models are already blending modalities: OpenAI’s CLIP links text and images, while Google Gemini adds audio and video to the mix. They work by embedding different kinds of data into a common “concept space,” a kind of shared understanding where a photo of a violin, the sound of it and the word “violin” all sit side by side (Radford et al., 2021). The technical groundwork for multimodal sensory search is already in place, even if senses like smell and texture have not yet joined the party.
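The snippet below is a small sketch of that shared space using the openly released CLIP weights on Hugging Face (the “openai/clip-vit-base-patch32” checkpoint). The placeholder image and captions are made up; the point is simply that image and text land in the same space, where similarity can be scored directly.

```python
# Minimal sketch of a shared "concept space" using CLIP via Hugging Face
# Transformers. The example image is a stand-in; in practice you would embed
# real photos, audio clips or, one day, texture and scent signatures.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224), color="saddlebrown")  # placeholder photo
texts = ["a violin", "freshly baked bread", "an old tube station"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image scores each caption against the image in the shared space.
probs = outputs.logits_per_image.softmax(dim=-1)
for text, p in zip(texts, probs[0].tolist()):
    print(f"{p:.2f}  {text}")
```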
The difficulty, of course, lies in capturing and standardising sensory data. But progress is accelerating. Edinburgh’s e-skin has already been tested for use in prosthetics, VR gloves and shape-aware medical devices (Yang et al., 2023b). The photonic nose, meanwhile, is finding early uses in environmental monitoring and disease detection, including the potential to spot early-stage COVID-19 or Parkinson’s from breath samples (Chen et al., 2024).
Once these signals are captured and labelled, AI can begin to draw connections. Search engines of the future might allow users to ask, “What fabric feels like this?” or “Play a sound that matches the mood of this space.” We are not there yet, but we now know what the path looks like.
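If that future arrives, the retrieval step itself is already familiar territory: a nearest-neighbour search in a shared embedding space. The sketch below uses random vectors as stand-ins for texture embeddings; every name and number in it is hypothetical.

```python
# Hypothetical sketch of sensory search as nearest-neighbour lookup: once
# touch or smell signals are embedded alongside text, a query is just a
# cosine-similarity search. The embeddings here are random stand-ins.
import numpy as np

rng = np.random.default_rng(2)
DIM = 64

catalogue = {
    "brushed fleece": rng.normal(size=DIM),
    "raw linen": rng.normal(size=DIM),
    "worn leather": rng.normal(size=DIM),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pretend this came from embedding a touch reading of "the inside of a hoodie".
query = catalogue["brushed fleece"] + rng.normal(scale=0.3, size=DIM)

best = max(catalogue, key=lambda name: cosine(query, catalogue[name]))
print("closest match:", best)
```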
What it means for the real world
This is about more than convenience. Sensory search could offer new ways for people with impairments to interact with the digital world. Those with limited vision could search by touch or sound. Museums might digitally archive the texture of ancient papyrus or the musky scent of century-old books. Musicians could search samples by vibe instead of keywords.
Crucially, it moves us away from the tyranny of language. There are many things we do not have words for — feelings, smells, experiences that are more sensed than spoken. Sensory AI has the potential to make these shareable, searchable and storable for the first time.
Tomorrow, we will explore another kind of invisibility: how scientists are working on an internet without towers, wires or satellites. It is a story about radio waves, raindrops and the future of digital access in the most remote places on Earth. Don’t miss it.
Chen, Z., Zhang, Z., Gai, X. et al. (2024). AI-empowered infrared spectroscopy for complex volatile organic compound mixture identification and quantification. Small, 20(29). https://doi.org/10.1002/smll.202401679
Radford, A., Kim, J. W., Hallacy, C. et al. (2021). Learning transferable visual models from natural language supervision. Proceedings of the International Conference on Machine Learning (ICML). https://arxiv.org/abs/2103.00020
Yang, Y., Ahmed, M. A., & Stathopoulos, V. N. (2023a). Stretchable capacitive soft sensors for shape reconstruction with deep learning. Nature Machine Intelligence, 5, 227–239. https://doi.org/10.1038/s42256-023-00641-5
Yang, Y., Stathopoulos, V. N., & Wang, W. (2023b). Tactile shape sensing for biomedical soft robotics using electronic skin. Advanced Intelligent Systems, 5(1). https://doi.org/10.1002/aisy.202200258