Elon Musk's AI is next in line to see, hear, and speak like humans

Grok, the AI chatbot developed by Elon Musk's company xAI, is gaining multimodal processing capabilities that will let users interact with it through both images and text.

Grok, an artificial intelligence (AI) product from xAI, the company founded by Elon Musk, is expected to be upgraded soon to accept multimodal input. The information comes from developer documentation published by xAI.

Picture 1: Grok is considered a 'rookie' in the field of AI.

In March 2024, Grok took a significant step forward with Grok-1.5, which has markedly improved reasoning capabilities. In a blog post last month, xAI hinted that Grok-1.5V would provide "multi-modal models in certain fields".

A recent update to the developer documentation suggests that xAI is preparing to launch a new AI model that accepts images, so that users could upload photos to Grok and receive text responses. Specifically, the document shows how developers can use xAI's software development kit (SDK) to generate responses based on both text and images. The sample Python script demonstrates how to read an image file, set up a text prompt, and use the xAI SDK to generate a response.
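The article does not reproduce xAI's sample script, so the snippet below is only a rough sketch of the first two steps it describes: reading an image file and pairing it with a text prompt. It deliberately does not call the xAI SDK. The function name `build_multimodal_request`, the model name, and the payload field names are all hypothetical and are not taken from xAI's documentation.

```python
import base64
from pathlib import Path


def build_multimodal_request(image_path: str, prompt: str,
                             model: str = "hypothetical-vision-model") -> dict:
    """Read an image file and bundle it with a text prompt.

    The field names below are illustrative assumptions, not xAI's schema.
    """
    # Step 1: read the raw bytes of the image file.
    image_bytes = Path(image_path).read_bytes()

    # Step 2: pair the image (base64-encoded so it is plain text) with the prompt.
    return {
        "model": model,
        "prompt": prompt,
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }


def send_request(payload: dict) -> str:
    """Placeholder for step 3; the real SDK call would replace this stub."""
    raise NotImplementedError("replace with the actual SDK call")
```

In a real integration, the stub `send_request` would be swapped for whatever call the published SDK provides to submit the image and prompt and return the model's text response.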

Launched in November 2023 and available only to paying X Premium Plus subscribers, Grok is considered a 'rookie' in the AI field compared with established competitors such as OpenAI's ChatGPT. One distinctive feature of Grok is its access to real-time information, including posts on the X platform. According to xAI, the Grok model was trained on many sources of public text data available on the Internet as of the third quarter of 2023, and the data set was reviewed and curated by human reviewers.

X's blog post also claims Grok-1 was not trained on X data (including public X posts). xAI also acknowledges that benchmarks for large language models are often criticized, because a model can score well on a benchmark if that benchmark was included in its training data. That is like memorizing the answers to a test instead of actually understanding the material.

Still, according to an xAI blog post, Grok-1.5 is gradually closing the gap with GPT-4 on many benchmarks, ranging from elementary-level to high school competition problems. Multimodal chatbots are seen as the next stage of the AI race. Industry giants are already moving: Google announced new developments at its Google I/O event, while OpenAI introduced GPT-4o. The lack of multimodal capabilities has left Grok behind for now. With these upgrades, can Grok spring a surprise in this demanding race?

Update 26 May 2024