Microsoft Research has announced its latest big project for changing how computers operate, or in this case, how they can think more like humans by understanding concepts through experience and background knowledge.
The recent announcement of the Microsoft Concept Graph reveals that a small team of researchers in Beijing, China has been working on associating a string of characters with an idea. The project uses "Jaguar" as an example, explaining that humans often think of a big cat, while a computer sees only the data associated with the word.
“We want to provide machines some common sense, high-level concepts,” senior researcher Jun Yan of Microsoft Research Asia said in the blog post.
The six-year-long Microsoft Concept Graph project is just that: a graph of more than 5.4 million concepts pulled from the internet. It started with training an algorithm to search through indexed web pages and search queries. The sites and queries in the database were kept anonymous to protect privacy. The team focused in particular on common phrases like "such as" and "is a" to improve the Concept Graph's understanding of what people would be searching for.
More impressive is the algorithm's ability to remove false concepts from the results. As the post explains, a phrase like "domestic animals other than dogs such as cats" should yield "cat is a domestic animal" rather than "cat is a dog". This means that the Microsoft Concept Graph can read between the lines and realize that a cat is indeed not a dog, which is a big step for computer conceptualization.
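To make the "such as" idea concrete, here is a minimal sketch in Python of how a single phrase could be turned into "is a" pairs while skipping an "other than" clause. It is an illustration only, not Microsoft's algorithm: the function name `extract_is_a` and the regular expressions are assumptions made for this example, and a real system would weigh evidence across many web pages rather than parse one phrase.

```python
import re

# Hypothetical pattern: "<concept> [other than <excluded>] such as <items>".
# The optional "other than ..." clause is matched but deliberately discarded,
# so the excluded term is never mistaken for the concept.
PATTERN = re.compile(
    r"^(?P<concept>.+?)(?:\s+other than\s+.+?)?\s+such as\s+(?P<items>.+)$",
    re.IGNORECASE,
)

# Split a list like "A, B and C" or "A, B, and C" into its items.
ITEM_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_is_a(phrase):
    """Return (instance, concept) pairs found in a 'such as' phrase."""
    match = PATTERN.match(phrase.strip())
    if match is None:
        return []
    concept = match.group("concept").strip()
    items = ITEM_SEPARATOR.split(match.group("items").strip())
    return [(item.strip(), concept) for item in items if item.strip()]


print(extract_is_a("domestic animals other than dogs such as cats"))
# -> [('cats', 'domestic animals')]
```

The pair that comes out links "cats" to "domestic animals", not to "dogs", which is the behavior the post describes.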
Of course, it wouldn't be innovative computer technology without an application in advertising. The Microsoft Research blog post says that computer conceptualization can not only match advertisements to what you are already searching for but also suggest related keywords.
Furthermore, it will improve search queries by treating a phrase as a whole instead of recognizing each word as a separate entity.
For example, the graph recognizes certain phrases as single entities. When “Microsoft Research Asia” is queried, Bing ranks documents with the phrase “Microsoft Research Asia” higher than documents where “Microsoft,” “Research” and “Asia” are separated by additional words or punctuation.
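The ranking behavior described here can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Bing's ranking function: the names `phrase_score` and `rank_documents` and the 2/1/0 scoring scheme are invented for illustration. A document scores highest when the query words appear side by side, separated only by whitespace.

```python
import re


def phrase_score(query, document):
    """Score 2 for an intact phrase, 1 if all words appear apart, else 0."""
    words = query.split()
    if not words:
        return 0
    # The phrase counts as intact only when whitespace alone separates the
    # words; extra words or punctuation between them break the match.
    phrase = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
    if re.search(phrase, document, re.IGNORECASE):
        return 2
    if all(re.search(r"\b" + re.escape(w) + r"\b", document, re.IGNORECASE)
           for w in words):
        return 1
    return 0


def rank_documents(query, documents):
    """Return documents ordered from best to worst match (ties keep order)."""
    return sorted(documents, key=lambda d: phrase_score(query, d), reverse=True)


docs = [
    "Microsoft opened a research lab in Asia",
    "Microsoft Research Asia is based in Beijing",
    "An unrelated page",
]
for doc in rank_documents("Microsoft Research Asia", docs):
    print(doc)
```

With these sample documents, the page containing the intact phrase is listed first, followed by the page where the three words are scattered.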
Alongside the Microsoft Concept Graph, the team also released the Microsoft Concept Tagging Model, which attaches probability scores to the concepts associated with a term. You can read more about both technologies and download them for research use on the Microsoft Research website.
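As a rough idea of what a probability score for a concept could look like, the Python sketch below turns counts of observed "is a" pairs into relative frequencies. Everything in it is an assumption for illustration: the function `concept_probabilities` is hypothetical, the counts are made up, and the actual Concept Tagging Model may compute its scores quite differently.

```python
from collections import Counter


def concept_probabilities(observations, instance):
    """Estimate P(concept | instance) as a relative frequency of sightings."""
    counts = Counter(
        concept for seen_instance, concept in observations
        if seen_instance == instance
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {concept: n / total for concept, n in counts.items()}


# Made-up observations: "jaguar" seen three times as an animal, once as a car.
observations = [("jaguar", "animal")] * 3 + [("jaguar", "car")]
print(concept_probabilities(observations, "jaguar"))
# -> {'animal': 0.75, 'car': 0.25}
```

A score like this lets a program prefer the more likely reading of "Jaguar" while still keeping the alternative available.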