Software that recognizes objects and what is happening in an image

Stanford University researchers have developed a computer program called NeuralTalk that can analyze a photo and describe what is happening in it with fairly high accuracy.

NeuralTalk works much like the artificial neural networks that Google developed and introduced not long ago, but its level of "smartness" is said to be far superior.

The NeuralTalk project and its accompanying studies were published earlier this year by graduate student Andrej Karpathy and Fei-Fei Li, director of the artificial intelligence laboratory at Stanford University. Basically, the system can look at a photo of a complex scene and determine what is going on. For example, in one photo the system identified objects such as a man, a cat, and a laptop and, more interestingly, worked out that "A man is using his laptop while his cat is looking at the screen."

A photo described completely accurately, which is very impressive!

As mentioned above, NeuralTalk works in a similar way to the artificial neural network system developed by Google. It uses neural networks to analyze photographs, compare what it "sees" with images it has "previously seen," and describe the images in meaningful sentences. Once NeuralTalk has learned the basics of the world (what a window looks like, what a table looks like, what a cat about to eat looks like, and so on), it can apply that understanding to specific images and videos.
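The "compare with previously seen images" idea can be illustrated with a deliberately simplified sketch. This is not NeuralTalk's actual method: it swaps in a plain nearest-neighbor lookup, and the feature vectors, the `SEEN` list, and the `describe` function are all hypothetical names and made-up values chosen for illustration.

```python
import math

# Hypothetical feature vectors for "previously seen" images, each paired
# with a caption. A real system would learn such features with a neural
# network; these numbers are invented for illustration only.
SEEN = [
    ([0.9, 0.1, 0.0], "a man is using a laptop"),
    ([0.1, 0.8, 0.2], "a cat is sitting on a table"),
    ([0.0, 0.2, 0.9], "a person is walking on the beach"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def describe(features):
    """Return the caption of the most similar previously seen image."""
    best = max(SEEN, key=lambda item: cosine(features, item[0]))
    return best[1]


print(describe([0.8, 0.2, 0.1]))
```

A lookup like this can only repeat captions it has already stored, whereas the article describes a system that expresses new images in sentences of its own, so treat the sketch as an intuition aid only.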

An image the system could not identify accurately: a couple with a birthday cake in the garden was described as "a woman combing the hair of a girl outside."

However, the system does not always produce perfect results, and sometimes the description is completely different from what is in the picture. For a picture of two men holding skateboards on the beach, the system said "a person walking on the beach carrying a camera bag," and a couple with a birthday cake in the garden was described as "a woman combing the hair of a girl outside." Still, for most photos, in addition to the list of recognized objects, the system returns several descriptive sentences, some of which describe the image correctly. The team created a website that demonstrates the system's current capabilities, both right and wrong, where you can see more photos if you like.

A loaf of bread was identified quite accurately

Until recently, the huge amount of information on the internet had to be labeled manually by humans to be searchable. Even when Google first developed Google Maps, a team of employees had to check each item by hand to make sure the symbols on the map were correct. After Google Brain was created, work that had taken that team a week took the system only an hour. More recently, attention has turned to neural network technology: "teaching" the networks and then using them to analyze the composition of an image instead of focusing on simple objects.

This time, the Stanford research team's approach goes further: after the image is identified, the system can also return the result as a meaningful sentence. This approach could improve the accuracy and user experience of image search. The user would just type a natural sentence, and instead of sifting through billions of images, the system would rely on the nouns, verbs, and other words in the query to give better results. In addition, the technology can also be used for real-time image scanning, in-vehicle equipment, virtual reality glasses, and more, and perhaps glasses like those in Terminator or RoboCop are not such a distant future.
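As a rough sketch of the search idea, assume every image already has a machine-generated caption and that a query is matched by simple word overlap. The file names, the `CAPTIONS` table, the stop-word list, and the `search` function are hypothetical. Two captions are taken from the examples quoted in this article, and the third is made up.

```python
# Common words that carry little meaning and are ignored in the query.
STOP_WORDS = {"a", "an", "the", "is", "on", "in", "of", "his", "while", "at", "with"}

# Hypothetical image files paired with captions a captioning system might return.
CAPTIONS = {
    "img_001.jpg": "A man is using his laptop while his cat is looking at the screen",
    "img_002.jpg": "A person walking on the beach carrying a camera bag",
    "img_003.jpg": "A loaf of bread on a table",
}


def tokenize(text):
    """Lowercase the text and split it into a set of words without punctuation."""
    return {w.strip(".,!?\"'").lower() for w in text.split()} - {""}


def search(query):
    """Return image names ranked by how many query words appear in their caption."""
    words = tokenize(query) - STOP_WORDS
    scored = []
    for name, caption in CAPTIONS.items():
        overlap = len(words & tokenize(caption))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties are broken alphabetically by file name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


print(search("man with a laptop and a cat"))
```

Word overlap ignores grammar and word order, so a production search engine would need something much richer. The sketch only shows why sentence-level captions make images searchable by natural-language queries.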

Update 13 December 2018