Google makes visible progress in developing text-to-image technologies based on artificial intelligence


As part of its ongoing work on new AI technologies, Google’s research team says Imagen, its text-to-image model, offers an “unprecedented degree of photorealism” and a deep level of language understanding. Text-to-image AI models learn the relationship between an image and the words used to describe it: they take a text input such as “a dog on a bike” and produce a corresponding image. Imagen starts by generating a small (64×64 pixels) image and then runs two “super resolution” passes on it to bring it up to 1024×1024. By comparison, OpenAI says DALL-E 2, the improved version of its earlier DALL-E software, “generates more realistic and accurate images with four times greater resolution”. Google’s research team said it has also created a benchmark tool to assess and compare different text-to-image models. Google Research added that its preliminary analysis of Imagen suggests the model encodes a range of “social and cultural biases” when generating images of activities, events and objects.
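To make the 64×64 to 1024×1024 step concrete, here is a minimal Python sketch of a two-stage upscaling cascade. It is not Google's code, which has not been released. The function names are hypothetical, the 4× factor per pass is an assumption chosen so that two equal passes reach 1024×1024, and plain pixel repetition stands in for the learned super-resolution models.

```python
def upsample_nearest(image, factor):
    """Stand-in for one super-resolution pass (assumption: the real passes
    are learned models; here each pixel is simply repeated `factor` times
    along both axes)."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def cascade(base_image, factors=(4, 4)):
    """Apply the passes in order and record the resolution after each one."""
    image = base_image
    sizes = [(len(image), len(image[0]))]
    for factor in factors:
        image = upsample_nearest(image, factor)
        sizes.append((len(image), len(image[0])))
    return image, sizes


# A blank 64x64 "image" stands in for the model's initial output.
base = [[0] * 64 for _ in range(64)]
final, sizes = cascade(base)
print(sizes)  # [(64, 64), (256, 256), (1024, 1024)]
```

The sketch only tracks how the resolution grows from stage to stage. In the real system each pass would add detail rather than just enlarge pixels.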


The new technique has raised concerns about possible misuse. “The potential risks of misuse raise concerns regarding responsible open-sourcing of code and demos. At this time we have decided not to release code or a public demo,” the researchers said. Although no immediate application of this research has been clearly established, it undoubtedly pushes AI capabilities forward in ways that are likely to matter for future developments.