Google takes on OpenAI with flashy text-to-image generator

The AI imagery competition is getting personal.

Google this week unveiled a new challenger to OpenAI’s vaunted DALL-E 2 text-to-image generator and took shots at its rival’s efforts.

Both models convert text prompts into pictures.

But Google’s researchers claim their system provides unprecedented photorealism and deep language understanding.

Human raters preferred Imagen over DALL-E 2 for both sample quality and image-text alignment.

Credit: Saharia et al.

The cringingly-named Imagen system uses a large pre-trained language model as a text encoder.

Example qualitative comparisons between Imagen and DALL-E 2 [54] on DrawBench prompts from Conflicting category. We observe that both DALL-E 2 and Imagen struggle generating well aligned images for this category. However, Imagen often generates some well aligned samples, e.g. “A panda making latte art.”

A cascade of diffusion models then turns the user’s words into pictures.

In tests, the Google team said Imagen significantly outperformed DALL-E 2.

Imagen vs. DALL-E 2 on DrawBench: a) image-text alignment, and b) image fidelity.

Dubbed DrawBench, the benchmark compares human judgments on the outputs of different text-to-image generators.

Unsurprisingly, Google’s metric gave strong scores to Google’s system.

DALL-E 2 can struggle to correctly assign colors to objects, especially for prompts with more than one object.

Example qualitative comparisons between Imagen and DALL-E 2 [54] on DrawBench prompts from Colors category. We observe that DALL-E 2 generally struggles with correctly assigning the colors to the objects especially for prompts with more than one object.

Until the model and code get a public release, cynics will suspect that Google is cherry-picking the results.

Imagen was significantly better than DALL-E 2 in prompts with quoted text.

The researchers warn that generative methods can spread misinformation, stir harassment, and exacerbate marginalization.

Example qualitative comparisons between Imagen and DALL-E 2 [54] on DrawBench prompts from Text category. Imagen is significantly better than DALL-E 2 in prompts with quoted text.

Imagen significantly outperformed DALL-E 2 in the positional, text, and descriptions categories.

I await their update with caution.

On the other hand, I don’t want our robot overlords to replace artists with algorithms.

Example qualitative comparisons between Imagen and DALL-E 2 [54] on DrawBench prompts from Reddit category.

Story by Thomas Macaulay

Thomas is the managing editor of TNW.

He leads our coverage of European tech and oversees our talented team of writers.

Away from work, he enjoys playing chess (badly) and the guitar (even worse).
