A new generation of applications is exploiting artificial intelligence like never before: Dall.e 2, Stable Diffusion, MidJourney… The principle: you type a text and an image is generated. And imagination is given free rein. We can ask for a picture of a koala riding a motorcycle, of a microcomputer from the Renaissance period, of Mozart trying out new stereo headphones under the intrigued gaze of Marilyn Monroe… We can even venture into the worlds of science fiction, video games or street art, cheerfully mixing styles and eras and inventing baroque, totally surreal situations. Each time, the AI obliges.
These “AI image generation” or “text-to-image” applications are one of the first demonstrations of the potential of artificial intelligence that is accessible to everyone.
The OpenAI initiative
Founded as a non-profit organization, OpenAI appeared in December 2015 in San Francisco. Its objective is to push the limits of artificial intelligence while upholding an avowed ethic: AI must be safe and beneficial for humanity, whose security it wants to help “preserve”.
It is led by:
- programmer-entrepreneur Sam Altman, former president of the startup accelerator Y Combinator;
- Greg Brockman, former chief technology officer of Stripe, an online payment company.
OpenAI was initially funded by pledges from backers including Altman, Brockman and Elon Musk, and later received a $1 billion investment from Microsoft. It has also been financially supported by partners such as Amazon Web Services and Infosys.
OpenAI first distinguished itself by creating tools for developers. In January 2021, it presented a concrete application of its research accessible to all: Dall.e (a name combining the surrealist painter Salvador Dalí and Pixar’s cartoon robot Wall-E).
The principle: you type a sentence and Dall.e transforms it into an image. In this first version, the results left something to be desired.
In April 2022, Dall.e 2 was presented to the public and this time the results were deemed stunning: the creations are both original and of evident artistic quality. The images are of good quality and, what’s more, take only about ten seconds to produce.
Dall.e 2 was one of the first demonstrations, visible to everyone, of the prowess of artificial intelligence, because it is possible to type highly surreal sentences and obtain a result that holds up. In fact, the more specific the request, the more impressive the result. The images are worthy of what a talented graphic artist could achieve: they are both creative and aesthetic. Moreover, a user who is not fully satisfied can generate variations.
One limitation as of the fall of 2022: Dall.e 2 and the other tools presented below only understand prompts written in English.
It is an understatement to say that Dall.e 2 has won over a very wide audience. In September 2022, the application already had 1.5 million users and was used to create more than 2 million images per day.
To achieve such performance, OpenAI has developed two advanced technologies:
- GPT-3: an AI capable of understanding human texts.
- CLIP: a computer “vision” system, integrating an automated evaluation of what we consider to be aesthetic.
The GPT project (Generative Pre-trained Transformer) began in 2018. It is a machine learning system built on the Transformer architecture, whose attention mechanism weights each piece of the input text according to its estimated relevance.
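To make the idea of weighting by relevance more concrete, here is a minimal sketch in Python. It is not OpenAI's code: the words and their relevance scores are invented for illustration, and the `softmax` function is only the standard normalization step commonly used in Transformer attention, shown here in isolation.

```python
import math

def softmax(scores):
    """Turn raw relevance scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical relevance scores for three words of a prompt (made-up values).
words = ["koala", "driving", "motorcycle"]
scores = [2.0, 0.5, 1.0]

weights = softmax(scores)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

The word with the highest score receives the largest share of the weight, which is the sense in which such a model "pays more attention" to what it judges most relevant.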
The other element, CLIP, was trained on hundreds of millions of images with their captions, collected from the web. This material includes the particular style of many artists. Dall.e relies on what CLIP has learned to offer images with an aesthetic close to that of a great painter or a renowned photographer.
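The way a system of this kind relates an image to a caption can be sketched as a comparison between two lists of numbers (embeddings). The example below is a toy illustration under stated assumptions: the three-dimensional vectors are invented, whereas a real model computes much longer vectors from the actual image and text, and `cosine_similarity` is a generic similarity measure rather than a function taken from OpenAI's code.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up embedding of an image (hypothetical values).
image_vec = [0.9, 0.1, 0.3]

# Made-up embeddings of two candidate captions (hypothetical values).
captions = {
    "a koala driving a motorcycle": [0.8, 0.2, 0.4],
    "a bowl of soup": [0.1, 0.9, 0.2],
}

# The caption whose vector points in the direction closest to the image's wins.
best = max(captions, key=lambda c: cosine_similarity(image_vec, captions[c]))
print(best)
```

An image generator can use this kind of score in reverse, steering the picture it produces toward the one that best matches the sentence typed by the user.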
Stable Diffusion is another very successful attempt to generate highly realistic, photographic-looking images from text. A British initiative, it became available through the Dream Studio tool in August 2022 and produces particularly impressive renderings on an artistic level. The tool is available online at: https://beta.dreamstudio.ai/dream.
The third major application, MidJourney, is the work of an AI research laboratory founded by David Holz, a talented researcher from California who holds a very large number of patents and defines MidJourney’s mission as “expanding the imaginative powers of the human species”. Notable for its creativity, MidJourney can be tried out on the MidJourney Discord server, in the “newbies” channels. The server had already attracted more than 2 million subscribers in the fall of 2022.
Other projects are underway, including Imagen, which is led by Google.
A threat to artists?
It is an understatement to say that Dall.e 2 and its counterparts have taken the artistic world aback. Frederic Boisdron, a specialist in AI and robotics, nevertheless believes that these tools will gradually join the creative toolkit: “It is certain that a minority of artists will feel in danger with the emergence of these AIs. But others will take them for what they are: perfect tools to unleash their inspiration and their creativity. Similarly, professional chess and go players are now working with AIs to discover new strategies that no one had thought of.”
Definition | AI Image Generator – Text-to-image | Futura Tech