We have the hardware and software capabilities to develop devices that are approaching the same level as human vision. The devices can recognize objects, scenarios, and context.
According to Erez Na'aman, OrCam Technologies vice president of engineering and business development, OrCam took this technology in hand and set out to build an algorithmic system to see how closely they could mimic human vision – and they’re almost there.
Three years into development, OrCam has produced an eyeglass-mounted device that helps the visually impaired better understand the world around them. Its primary use is to help users process text.
“The OrCam can read any kind of printed text in English, be it a book, newspaper, signs on the street, or a menu at a restaurant,” explains Na'aman.
Running on the i.MX 6Quad applications processor from Freescale, OrCam can process advanced computer vision algorithms in real time. “The device has high-end graphics capabilities, as well as 1080p video, in addition to multiple interfaces and security,” says Rajeev Kumar from Freescale.
“At a high level, the i.MX can take in camera or video input from multiple sources, running algorithms to improve picture quality, or to identify objects, such as in facial recognition,” he adds. The processor is also able to support the peripherals on the head unit and base unit, working continuously over a long period of time without getting too hot.
Although the sketch phase went in all directions, the current device consists of a head unit and a base unit. The head unit fits on the eyeglasses using a small mount that remains on the glasses, making sure that the camera is always in the right position. A cable connects the camera to the base unit, which houses the i.MX processor. A large battery provides four and a half hours of continuous video processing, plus a few days in suspend mode.
According to Na’aman, one of the biggest software challenges was teaching the device to understand context. “It was a question of creating a system that could understand what it needs to do on its own,” he says.
“Try to imagine if your phone only had one button, you press it and it knows what you want. This is the kind of experience we’re trying to give our users. You don’t have to tell it what to do.”
OrCam can recognize generic objects, but it can also be taught to recognize specific items that are important to the user. “Products that you tend to buy at the supermarket, your credit card, any kind of small item that you can hold in front of the camera,” adds Na’aman. “It could even [recognize] your favorite cup in the office.”
The user triggers the device by pointing in the general direction of an object. “We see [the user] pointing, and we understand that he is interested in knowing what is there. We look at what is there, understand what it is, and describe it to him,” explains Na’aman.
The description is then relayed to the user via a discreet earphone that uses bone conduction technology and sits on the user’s cheek, keeping peripheral hearing intact.
The company is now working to develop a function that will allow the device to recognize places and faces, moving closer to the level of human vision they hope to one day achieve.
For more information, visit www.orcam.com.