Vietnamese PhD at Google uses AI to convert text into images

Dr. Luong Minh Thang and 10 experts at Google Brain built the Parti model, which teaches artificial intelligence to draw pictures from text descriptions.

Dr. Thang, 34, is the only Vietnamese member of the core research group that began work at Google Brain in early 2021 on Parti (Pathways Autoregressive Text-to-Image), a model that automatically converts text into images. Language is commonplace in human communication, but "if the technology is applied to create creative photos and paintings, it can be considered a new step forward for AI," said Dr. Thang.

Dr. Luong Minh Thang currently works for Google Brain, specializing in developing AI products.

He explained that current AI models are applied to language through chatbots that can interact with humans in writing. In the field of images, AI can recognize objects in pictures. "Combining these two capabilities to convert text into images will create a very modern AI model that effectively supports people who create images," Dr. Thang said, explaining the reason for building the Parti model.

The Parti model generates images that match what the user describes and wants. This technology can help people who specialize in image creation, such as artists, photographers, fashion designers, and graphic designers. When they have an idea for an image, they only need to write out the desired details, and the AI analyzes the text and produces a suggested picture for that idea to help boost their creativity. Changing a single sentence, word, or detail in the text can produce a different picture.

AI-generated images based on the text descriptions shown below them.

To create the Parti model, Dr. Thang and the Google experts used hundreds of millions of text-image pairs to train the AI model. The data was collected from websites and processed by an artificial neural network with about 20 billion parameters. "Based on text and image data, the AI combines them to create a new image, helping people come up with new ideas," said Dr. Thang.

The subjects the Parti model depicts most often are nature, animals, and objects. Many of the AI-generated images on the Google Research website look like real photographs.

According to the research team, images involving people are handled carefully, on the principle of not negatively affecting the community with respect to gender, ethnicity, or religion.

AI-generated oil paintings in the style of the famous painter Van Gogh.

The current limitation is that when the text is too long, describes too many details, or describes conflicting scenes (such as a sea next to a desert), the AI can misinterpret the request or produce no result.

Dr. Thang said the team will work to overcome this limitation and build a more complete AI model. The team is also considering training AI that can edit images on demand based on users' text, as well as researching how to create videos from multiple photos with similar content.

Luong Minh Thang was a mathematics student at the High School for the Gifted, Vietnam National University, Ho Chi Minh City. After graduating from high school, he studied computer science at the National University of Singapore. In 2011, he received a PhD scholarship to Stanford University (USA). In September 2016, he officially joined Google Brain, specializing in machine learning and natural language processing.