Microsoft Research applying artificial intelligence to photo-based storytelling

Photo albums are full of precious moments that we tend to relive over and over again. As visual representations of our lives become the norm, the experience of living through images and photographs is lost on those who cannot see. Live Science says that this is just one reason that Microsoft Research and its colleagues have begun developing photo-based storytelling.

The sentences and descriptions were written through Amazon’s Mechanical Turk. Researchers engaged the crowdsourcing marketplace to write descriptions of over 65,000 images for the project, according to Live Science. After all the sentences were gathered, scientists put together approximately 8,100 images to create a storyline for the artificial intelligence project. As Live Science explains:

Rather than producing flat captions such as “This is a picture of a family; this is a picture of a cake; this is a picture of a dog; this is a picture of a beach,” the storytelling program might take those same images and say, “The family got together for a cookout; they had a lot of delicious food; the dog was happy to be there; they had a great time on the beach; they even had a swim in the water.”

During their tests, the scientists found that the artificial intelligence produced stories faster than human storytellers and that its sentence structures matched human judgment. The photo-based storyteller still needs more work, however. The project ran into hiccups, such as the artificial intelligence writing in objects or scenes that weren’t actually in the images. These ‘hallucinations’ arose because the program could not always clearly explain what it was seeing.

Nevertheless, Margaret Mitchell from Microsoft Research hopes computerized storytelling will be useful for social media in the future.

“You’d help people share their experiences while reducing nitty-gritty work that some people find quite tedious,” she said. Computerized storytelling “can also help people who are visually impaired, to open up images for people who can’t see them.”

Eventually, she sees the project developing into a stepping stone for video. “For instance, for security cameras, you might just want a summary of anything noteworthy, or you could automatically live tweet events,” Mitchell said.

Later this month, Microsoft Research will present the results of the storytelling program at the conference of the North American Chapter of the Association for Computational Linguistics in San Diego. Let us know in the comments if you think computer-generated storytelling will have an impact.