GPT-4o image generation is finally here
OpenAI recently released an interesting new feature. The tech giant has developed a revolutionary image generation tool that not only creates lifelike images, but also uses a model that interprets the prompt very accurately.
From now on, ChatGPT generates images with the GPT-4o model by default instead of DALL-E. The new model has also been integrated into the Sora video generator. The descriptions and demonstration images are quite promising, so I wanted to try what this tool can do. Update: it is no longer available in the free version, so I have not tested the service yet. I will only highlight a few official images.
According to OpenAI, a good image generator should not only be able to create spectacular photorealistic, fantasy, and surreal images, but also be useful and functional.
When creating the new image generation model, attention was paid to ensuring that the text appearing in the generated images is accurate. Besides rendering the text correctly, the model lets us specify its exact arrangement, so we can lay out poems or create beautifully illustrated recipes and infographics.
The model also supports iterative editing: after an image is generated, we can easily modify it by entering a short follow-up prompt.
Furthermore, this model can easily create comics and even user interfaces (UI) for games with full visuals and consistent icons.
This latter feature could be a game changer, as it will now be possible to create UIs and infographics with precise text much faster and more efficiently.
Sample images
Unfortunately, access to the new image generator has now been disabled in the Free version, so I’ll only show some of the official images:
Example #1 – Infographic
Image credit: OpenAI
Correct captions and convincing depiction.
Example #2 – Reflections and text
Image credit: OpenAI

The reflections look quite accurate. Many AI image generators have trouble representing reflections properly.
Example #3 – UI design
Image credit: OpenAI


We can even create visual designs for computer games with consistent icons.
Example #4 – Transparent background
Image credit: OpenAI
Example #5 – Illustration for a science experiment
Image credit: OpenAI

Availability
The GPT-4o image generator is now available with Plus, Pro, and Team subscriptions. Due to overwhelming demand, image generation with the new model was disabled in the free version to prevent server overload.
The service will soon be available in the Enterprise and Edu versions.
Performance
This model is a bit slower than DALL-E because it interprets and processes the prompt more precisely. Rendering a single image can take more than a minute.
Advanced prompting
There is no UI element for setting parameters (such as aspect ratio, colors used, or image style) during image generation, so all of this has to be specified in the text of the prompt. You can even give the hexadecimal code of specific colors. If you want to create a photorealistic image, you can specify the type of camera or lens, or even the style of the photo (for example, a polaroid photo).
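Since every setting has to travel in the prompt text, a small helper can keep prompts consistent. The following is only a minimal sketch in Python: the function name build_image_prompt, its parameters, and the exact wording of the generated sentences are my own assumptions, not anything defined by OpenAI. It only assembles a string that you could paste into ChatGPT; it does not call any service.

```python
import re
from typing import Optional, Sequence

# Accepts colors written as "#RRGGBB" (six hexadecimal digits).
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def build_image_prompt(
    subject: str,
    aspect_ratio: Optional[str] = None,
    hex_colors: Sequence[str] = (),
    style: Optional[str] = None,
    camera: Optional[str] = None,
) -> str:
    """Assemble a plain-text prompt (hypothetical helper, wording is an assumption)."""
    for color in hex_colors:
        if not HEX_COLOR.fullmatch(color):
            raise ValueError(f"not a #RRGGBB color: {color!r}")

    # Start with the subject, then append one short sentence per setting.
    parts = [subject.strip().rstrip(".") + "."]
    if style:
        parts.append(f"Style: {style}.")
    if camera:
        parts.append(f"Shot with {camera}.")
    if aspect_ratio:
        parts.append(f"Aspect ratio: {aspect_ratio}.")
    if hex_colors:
        parts.append("Color palette: " + ", ".join(hex_colors) + ".")
    return " ".join(parts)


if __name__ == "__main__":
    print(
        build_image_prompt(
            "A lighthouse at dusk",
            aspect_ratio="16:9",
            hex_colors=["#1A2B3C", "#FFD166"],
            style="polaroid photo",
        )
    )
```

How closely the model follows such instructions is something to check by experiment; the sketch only shows one way to keep the aspect ratio, palette, and style in the same place in every prompt.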
Limitations
The model does not always work flawlessly. There may be problems with rendering multilingual text, and hallucinations may occur. Long passages of small text may be rendered incorrectly. The model may also struggle to render everything accurately if we want to display more than 10-20 small icons in a single image. We can see examples of this in the official article.
Summary
Based on the sample images on the official website, I can say that the GPT-4o image generator can be a very useful tool for creating illustrations, infographics, UI elements, and icons.
Of course, we have to be careful if we want to use the images for commercial purposes. Having an image made in the style of a specific artist can lead to legal problems. This applies to any AI image generator.