Virtual assistants often say silly things and we know why

Recently, American researchers developed a new benchmark that can challenge even the smartest artificial-intelligence systems available today.

According to MIT Technology Review, virtual assistants from tech giants, such as Apple's Siri and Amazon's Alexa, are clearly still far from perfect. We keep hoping that steady progress in machine learning will soon turn them into capable language partners. However, a new test recently developed by US researchers shows that, to become truly fluent in language, artificial-intelligence (AI) systems will need a completely different approach.

Clearly, the virtual assistant tools of technology giants are still far from perfect.

The researchers, from the Allen Institute for AI (AI2, a non-profit organization in Seattle, USA), built a dataset called the AI2 Reasoning Challenge (ARC): a collection of multiple-choice questions drawn from elementary-school science. Each question requires some understanding of how the world works.

One example: Which of the following is not made from a material grown in nature? (A) a cotton shirt, (B) a wooden chair, (C) a plastic spoon, (D) a grass basket.

The question is easy once you know that plastic is not something that grows naturally. Answering it draws on the stock of common-sense experience of the world that most people, including young children, share.
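To make the task concrete, here is a minimal sketch of how such a multiple-choice question might be represented and attacked by a shallow system. The `MultipleChoiceQuestion` class and the keyword baseline are illustrative assumptions for this article, not the actual ARC data format or an AI2 tool; the point is that a shortcut based on surface words is brittle precisely because it has no common-sense knowledge behind it.

```python
from dataclasses import dataclass


@dataclass
class MultipleChoiceQuestion:
    stem: str          # the question text
    choices: dict      # label -> answer text
    answer_key: str    # label of the correct choice


q = MultipleChoiceQuestion(
    stem="Which of the following is not made from a material grown in nature?",
    choices={
        "A": "a cotton shirt",
        "B": "a wooden chair",
        "C": "a plastic spoon",
        "D": "a grass basket",
    },
    answer_key="C",
)

# A naive baseline: pick the first choice mentioning a word from a tiny
# hand-written list of synthetic materials. It happens to work here, but
# it has no understanding of *why* plastic is the odd one out, and it
# fails on any question outside its word list.
SYNTHETIC = {"plastic", "nylon", "polyester"}


def keyword_baseline(question: MultipleChoiceQuestion) -> str:
    for label, text in question.choices.items():
        if any(word in text.split() for word in SYNTHETIC):
            return label
    return "A"  # arbitrary fallback when no keyword matches


print(keyword_baseline(q))  # prints "C"
```

The gap the article describes is exactly the gap between this kind of pattern matching and a genuine model of how the world works.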

For an AI, though, this is a hard question, because common-sense experience is precisely what AI systems lack; the same gap lies behind the shortcomings of voice assistants, chatbots, and translation software. That is one reason such questions so easily trip up AI.

The new test is one of AI2's initiatives to help AI systems acquire this kind of understanding of the world.

So far, machine-learning language systems have given convincing answers only to questions resembling the many examples they were trained on. A system trained on thousands of IT-support chat logs, for instance, can pass as a technical-support agent in narrow situations, but it breaks down as soon as questions range more broadly.

"We use common-sense experience to fill the gaps in the language we see, to form a coherent picture of what is being said. Machines don't have this shared experience, so they can understand only what is explicitly written and miss the many implications and assumptions beneath the text," said ARC project leader Peter Clark.

The new test is one of AI2's initiatives to help AI systems acquire this kind of understanding of the world. The research matters because it makes clear that getting a language system to genuinely understand what it is saying will require human-like sophistication and subtlety.

Contrast this with results announced in January by Microsoft and a team at Alibaba. On a comparatively simple benchmark, the Stanford Question Answering Dataset (SQuAD), the researchers built question-answering programs that scored above human performance. Those advances made headlines suggesting that AI can now read better than humans.

This is a very good antidote to the superficial benchmarks that are so common in machine learning.

In fact, those programs cannot answer more complex questions or draw on other sources of knowledge. Technology companies will therefore keep pushing AI capabilities in this new direction. Microsoft recently announced software that translates news between English and Chinese, with results that independent volunteers judged equivalent to those of professional translators. To reach that level of accuracy, Microsoft's researchers used advanced deep-learning techniques. The result is useful and promising, but such AI systems will still struggle with free-form conversation and with text from unfamiliar domains such as medical notes.

Against this backdrop, AI2's new test of AI capabilities has won the backing of a New York University professor. In his words, "this is a very good antidote to the superficial benchmarks that are so common in the field of machine learning. It will really force AI researchers to raise their game."