Today's digital cameras record not only the images themselves but also metadata behind the scenes: camera settings, location, date, time, and so on. But there's a lot more to say about a photograph. What's the subject? Is it night or day? Outside or inside? Person, place, or thing?
"As we amass an incredible amount of photos, it becomes increasingly difficult to manage our collections," Matt said on his blog. "Imagine if descriptive metadata about each photo could be appended to the image on the fly—information about who is in each photo, what they're doing, and their environment could become incredibly useful in being able to search, filter, and cross-reference our photo collections."
I personally would love this feature in a camera. The technology simply isn't here yet, but the Descriptive Camera explores the possibilities.
Driven by Python and a few command-line utilities, the camera is built around a BeagleBone connected to a USB webcam, a thermal printer, status LEDs, and a shutter button. Instead of relying on some sort of artificial intelligence to accurately describe a photo's content, the camera connects to the Internet and taps a human workforce through Amazon Mechanical Turk.
After the shutter button is pressed, the image is sent to Mechanical Turk as a HIT (Human Intelligence Task). The camera then waits for someone to accept the task and send back a description. At a decent price of $1.25 per HIT, results usually come back within 3 to 6 minutes and are printed on the thermal printer, much like a Polaroid print, only in text.
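That capture-to-print loop can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function and class names (`capture_image`, `submit_hit`, `poll_for_description`, the `turk` and `printer` objects) are hypothetical stand-ins for the real Mechanical Turk API calls and thermal-printer driver.

```python
import time

def capture_image():
    # The real camera would shell out to a webcam utility to grab a
    # JPEG; here we just return placeholder bytes.
    return b"<jpeg bytes>"

def submit_hit(image, turk):
    # Post the photo to Mechanical Turk as a Human Intelligence Task
    # priced at $1.25, and get back the task's ID.
    return turk.create_hit(image, reward_usd=1.25)

def poll_for_description(hit_id, turk, interval_s=15, timeout_s=600):
    # Workers typically answer within 3 to 6 minutes, so poll until a
    # description arrives or the timeout expires.
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        result = turk.get_result(hit_id)
        if result is not None:
            return result
        time.sleep(interval_s)
    return None

def shutter_pressed(turk, printer):
    # The whole pipeline: capture, submit, wait, print.
    image = capture_image()
    hit_id = submit_hit(image, turk)
    description = poll_for_description(hit_id, turk)
    if description:
        printer.print_text(description)  # the text-only "Polaroid"
    return description
```

In the real device the `turk` object would wrap Amazon's Mechanical Turk web API and `printer` would drive the serial thermal printer; swapping in stubs for both makes the control flow easy to follow.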
But the reliability of the descriptions depends entirely on the worker's interpretation. And if a worker misspells something or is completely off base, the image would be hard to find later on your computer. Then again, the same thing would probably happen with some future AI.
This is an amazingly well-thought-out project.