OpenAI’s DALL-E 2 generates all kinds of images from text input, faster and better

In short: Imagine describing an image to an AI and having it turn that description into a photorealistic picture. That’s one of the claims made for an updated version of a program we first saw last year, and the results look exciting.

DALL-E 2 comes from OpenAI, the San Francisco-based research lab behind artificial intelligence models such as GPT-2 and GPT-3, which can write fake news, and behind systems that have beaten top human opponents in games like Dota 2.

DALL-E 2, whose name is a portmanteau of artist Salvador Dalí and the Disney robot WALL-E, is the second iteration of the neural network we first saw in January last year, but this one offers higher resolution and lower latency than the original. The images it generates are now 1024 x 1024 pixels, a noticeable increase from the original’s 256 x 256 and 16 times as many pixels.

Thanks to an approach OpenAI calls unCLIP, which builds on its CLIP image recognition system, DALL-E 2 can transform the user’s text into vivid images, even ones surreal enough to rival Dalí himself. Asking for a koala playing basketball or a monkey paying taxes, for example, will see the AI create frighteningly realistic images of those descriptions.

The new system has switched to a process called diffusion, which starts with a pattern of random dots and gradually alters that pattern toward an image as it recognizes specific aspects of that image.
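
To make that idea concrete, here is a minimal Python sketch of such a loop. It is a toy under stated assumptions, not OpenAI’s implementation: the names `toy_denoiser` and `toy_diffusion` are hypothetical, the "image" is just a short list of numbers, and the stand-in denoiser is simply handed the answer, whereas a real model would be a trained neural network guided by the text prompt. The linear step schedule is also an assumption made for illustration.

```python
import random


def toy_denoiser(noisy, target):
    """Stand-in for a trained network (assumption for illustration).

    A real model would estimate the clean image from the noisy input and
    the text prompt; this toy version is simply given the answer.
    """
    return target


def toy_diffusion(target, steps=50, seed=0):
    """Start from random noise and move step by step toward the target.

    Returns the final image and the mean absolute error after each step.
    """
    rng = random.Random(seed)
    # Begin with pure random noise, one value per "pixel".
    image = [rng.gauss(0.0, 1.0) for _ in target]
    errors = []
    for step in range(steps):
        guess = toy_denoiser(image, target)
        # Close a growing fraction of the remaining gap; the last step closes it.
        alpha = 1.0 / (steps - step)
        image = [x + alpha * (g - x) for x, g in zip(image, guess)]
        errors.append(sum(abs(x - t) for x, t in zip(image, target)) / len(target))
    return image, errors


if __name__ == "__main__":
    final, errs = toy_diffusion([0.0, 0.5, 1.0, 0.25], steps=10)
    print("error after first step:", errs[0])
    print("error after last step:", errs[-1])
```

Running the script prints the error after the first and last steps, showing the noise being pulled toward the target pattern a little at a time rather than in one jump.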

DALL-E 2 can do more than create new images from text. It can also modify sections of an image; you can, for example, highlight someone’s head and tell it to add a fun hat. There’s even an option to create variations of a single image, each with a different style, content, or angle.

“This is another example of what I think is a new trend in computer interface: you say what you want in natural language or with contextual cues, and the computer does it,” said Sam Altman, CEO of OpenAI. “We can imagine an ‘AI office worker’ taking natural language requests like a human does.”

These types of image-generating AIs come with an inherent risk of being misused. OpenAI has certain safeguards in place: the system cannot generate faces based on a name, and users may not upload or generate objectionable material, only family-friendly content. Prohibited topics include hate, harassment, violence, self-harm, graphic or shocking images, illegal activities, deception such as fake news, political actors or situations, medical or disease-related images, and general spam.

Users must also disclose that an AI generated the images, and there will be a watermark indicating this fact on each of them.

The Verge writes that researchers can register to preview the system online. It has not been released to the general public, although OpenAI hopes to make it available for use in third-party applications at some point in the future.
