What Does A Step-Change In AI Image Generation Quality Mean? – Thomas Lancaster’s Blog

admin April 1, 2025


The quality of output from generative AI systems, when directed well by a skilled user, continues to improve at a rapid rate. This post is motivated by the improved image generation capabilities of ChatGPT within the 4o model as released in March 2025, but the general principles will likely apply to many future systems.

Prefer to watch rather than read?

Watch a video version of this blog post below. The content of this post is largely based on the video.

What Makes ChatGPT-4o’s Image Generation Different?

The image generation pipeline used by ChatGPT 4o is rather different from that of previous versions. Three advances stand out to me as the most useful in an educational setting:

Text Generation: One of the most significant improvements is the model’s ability to spell and write coherent words within images. This works whether you supply the words or let the model determine the best text to display. There can still be errors, but these are far fewer than there were previously.
Character Consistency: It is now much easier to achieve consistency in characters from one image to the next. This opens up possibilities like producing storybooks or comic strips featuring the same characters across multiple panels or pages.
Overall Quality: In my experience, the quality, design, and realism of the images are all improved compared with many previous models.

What are some practical uses of this technology?

I’ve shared a lot of examples of the type of images that can be generated in the accompanying video and on social media. But think about useful areas like translating text in images, creating photos of product prototypes, designing newsletter templates, producing educational card games, or producing infographics.

On the more fun side of things, how about making fresh memes or comic strips? Or maybe just some cats doing their laundry?

Even the slides I used in the video were created as images in ChatGPT. I’ve used the sketchnote format, which I think is different and visually appealing, if nothing else (and I say that as someone with limited ability to draw).

I’ve also found that ChatGPT 4o is very good at taking a single image of a person, then converting it to different artistic styles, changing the location, or changing the person’s outfit.

There are a few examples in the picture below.

There are downsides…

While this all sounds incredibly positive, we must also consider the dangers. Any technology can be intentionally misused or accidentally used in ways not originally intended.

Images can be misused. Imagine fake news stories spreading rapidly through social media, made more convincing by realistic visuals involving celebrities or figures of authority. You could even fake screenshots, like fabricated Wikipedia pages or phone conversations, to lend credibility to false narratives. It’s amazing how quickly viral stories spread, and realistic visuals exacerbate this risk.

Here’s a humorous example of a fake conversation on an iPhone. The scenario and content both came from ChatGPT. This example is not particularly dangerous, but I’m sure you can imagine how other conversations could be used to mislead or to provide fake evidence.

We do need to think about responsible use

This technology seems incredibly useful, and I believe it’s worth exploring further. But we also have to raise awareness of the risks that come with relying on it, and of how easily people can be tricked into believing that the information in an image is real.

Please do ensure that support networks are in place for people you know who might be susceptible to misinformation or scams amplified by this technology. A convincing fake newspaper story or celebrity endorsement is now easier to create than ever before. Convincing visuals can make scams more potent.

Even within an academic setting, think how easily students (or other people involved in the educational process) can be misled.

This technology, like all types of Generative AI, will only get faster and better. Let’s embrace its potential while remaining vigilant about its challenges.