
Ideogram 4.0 is the most significant open-weight image model release of 2026 so far. The company, founded by former Google Brain researchers, has shipped a 9.3B parameter text-to-image model that not only opens its weights for the first time but also introduces a fundamentally different way of controlling image generation: structured JSON prompts with pixel-precise bounding boxes. The result is a model that ranks #1 among all open-weight image models on design benchmarks and #2 overall in designer-preference evaluations, behind only GPT Image 2 and ahead of every other closed-source competitor.
A new kind of prompt
Every previous image model has accepted natural language. Ideogram 4.0 was trained exclusively on structured JSON captions, where each caption exhaustively describes every element in the image, with a style block and optional bounding boxes and color palettes. This is not just a prompt format change; it is a training-time architectural decision that changes what the model can actually do.
The JSON interface gives you three controls that plain-text prompting cannot reliably deliver:
- Bounding-box layout: Specify element placement as [y_min, x_min, y_max, x_max] in 0-1000 normalized coordinates. Objects, text blocks, and background regions all go exactly where you put them.
- Color palette conditioning: Up to 16 hex colors per image (5 per element) steer the dominant color scheme directly, not through descriptive language.
- Typed text elements: A text element carries the literal string to render and a separate visual description for its styling, enabling multi-line, multi-font in-image typography in a single generation pass.
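To make the coordinate convention concrete, here is a small sketch of how a normalized box could be mapped onto a pixel canvas. Only the [y_min, x_min, y_max, x_max] ordering and the 0-1000 range come from the description above; the helper name bbox_to_pixels and the 1024x1024 canvas are illustrative assumptions, not part of any Ideogram tooling.

```python
def bbox_to_pixels(bbox, width, height):
    """Map [y_min, x_min, y_max, x_max] in 0-1000 units to pixel
    coordinates, returned as (left, top, right, bottom).

    Hypothetical helper for illustration only.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four values")
    y_min, x_min, y_max, x_max = bbox
    if not all(0 <= v <= 1000 for v in bbox):
        raise ValueError("bbox values must lie in the range 0-1000")
    if y_min >= y_max or x_min >= x_max:
        raise ValueError("bbox min values must be smaller than max values")

    # Note the order: y comes first in the normalized box,
    # while the pixel tuple is returned x-first.
    left = round(x_min * width / 1000)
    top = round(y_min * height / 1000)
    right = round(x_max * width / 1000)
    bottom = round(y_max * height / 1000)
    return left, top, right, bottom


# A wide box across the upper part of an assumed 1024x1024 canvas.
print(bbox_to_pixels([70, 100, 300, 900], 1024, 1024))  # (102, 72, 922, 307)
```

Because the y values come first, swapping the axes by accident is the easiest mistake to make with this format; the ordering check above catches inverted boxes but not transposed ones.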
Here is a minimal example of what a JSON prompt looks like in practice:
{
  "high_level_description": "A gig poster for Sound & Color Festival 2026",
  "style_description": {
    "color_palette": ["#1B3A5C", "#E6B422"]
  },
  "compositional_deconstruction": {
    "background": "Electric magenta to cobalt gradient with halftone texture",
    "elements": [
      {
        "type": "text",
        "bbox": [70, 100, 300, 900],
        "text": "SOUND & COLOR\nFESTIVAL 2026",
        "desc": "Heavy condensed sans-serif in cream-yellow, centered"
      },
      {
        "type": "obj",
        "bbox": [350, 200, 700, 800],
        "desc": "Psychedelic sun-ray graphic radiating from center"
      }
    ]
  }
}
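Since the format has hard limits (16 palette colors per image, 5 per element, coordinates in 0-1000), it can help to check a prompt before sending it. The following is a minimal sketch of such a check, not an official validator. The field names mirror the example above; the function names, the six-digit #RRGGBB requirement, and the use of a "color_palette" key on individual elements are assumptions made for illustration.

```python
import re

# ASSUMPTION: colors are six-digit hex strings such as "#1B3A5C".
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def check_palette(colors, limit, where):
    """Return a list of problems found in one palette."""
    problems = []
    if len(colors) > limit:
        problems.append(f"{where}: {len(colors)} colors exceeds limit of {limit}")
    for color in colors:
        if not HEX_COLOR.fullmatch(color):
            problems.append(f"{where}: {color!r} is not a #RRGGBB hex color")
    return problems


def validate_prompt(prompt):
    """Return a list of human-readable problems; empty means none found."""
    palette = prompt.get("style_description", {}).get("color_palette", [])
    problems = check_palette(palette, 16, "image palette")

    elements = prompt.get("compositional_deconstruction", {}).get("elements", [])
    for i, element in enumerate(elements):
        where = f"element {i}"
        bbox = element.get("bbox")  # bounding boxes are optional
        if bbox is not None:
            if len(bbox) != 4 or not all(0 <= v <= 1000 for v in bbox):
                problems.append(f"{where}: bbox needs four values in 0-1000")
            elif bbox[0] >= bbox[2] or bbox[1] >= bbox[3]:
                problems.append(f"{where}: bbox min must be below max")
        if element.get("type") == "text" and not element.get("text"):
            problems.append(f"{where}: text element has no literal string")
        # ASSUMPTION: per-element palettes reuse the "color_palette" key.
        problems += check_palette(
            element.get("color_palette", []), 5, f"{where} palette"
        )
    return problems


prompt = {
    "high_level_description": "A gig poster for Sound & Color Festival 2026",
    "style_description": {"color_palette": ["#1B3A5C", "#E6B422"]},
    "compositional_deconstruction": {
        "elements": [
            {"type": "text", "bbox": [70, 100, 300, 900],
             "text": "SOUND & COLOR\nFESTIVAL 2026"},
            {"type": "obj", "bbox": [350, 200, 700, 800]},
        ]
    },
}
print(validate_prompt(prompt))  # []
```

Returning a list of problems rather than raising on the first one makes it easier to fix a hand-written prompt in a single pass.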

