news 2026-04-18 · HuggingFace Daily Papers

CoReDi: The AI Framework That Evolves Its Own 'Vision' to Generate Images 13x Faster

What if an AI image generator could evolve its own way of understanding images while learning to create them?

That's exactly what CoReDi (Coevolving Representation Diffusion) achieves. Developed by researchers from Archimedes Research Center in Greece and valeo.ai in France, this new framework tackles a fundamental limitation in current diffusion models: they rely on fixed, pre-computed semantic representations that never adapt during training.

Think of it like an artist forced to see through glasses that can't be adjusted. No matter how skilled they become, their perception stays locked in place.

CoReDi introduces a learnable projection layer that evolves alongside the generative model, allowing the semantic representation space to progressively specialize for image synthesis. Making this work required solving a stability problem: when the projection and the generator are naively optimized together, training falls into degenerate solutions in which the system "cheats" by driving the loss down without learning useful features.

The team identified three critical ingredients for stable coevolution: stop-gradient targets to prevent trivial loss minimization, batch normalization to maintain feature scale stability, and explicit regularization to prevent feature collapse where channels become redundant copies of each other.
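To make the feature-collapse ingredient concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the article does not give the exact form of the regularizer, so the decorrelation penalty below (mean squared off-diagonal entry of the channel correlation matrix) is an assumed stand-in, and the function names `batch_normalize` and `redundancy_penalty` are hypothetical. The stop-gradient target and the learnable projection are left out because they need an autodiff framework.

```python
import math


def batch_normalize(features, eps=1e-5):
    """Standardize each channel to zero mean and unit variance over the batch.

    `features` is a list of rows: one row per sample, one column per channel.
    """
    n = len(features)
    normalized_columns = []
    for column in zip(*features):
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        scale = math.sqrt(var + eps)
        normalized_columns.append([(x - mean) / scale for x in column])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*normalized_columns)]


def redundancy_penalty(features):
    """Assumed regularizer: mean squared off-diagonal channel correlation.

    Near 1 when channels are copies of each other, near 0 when they are
    uncorrelated. Requires at least two channels.
    """
    z = batch_normalize(features)
    n = len(z)
    channels = len(z[0])
    total = 0.0
    for i in range(channels):
        for j in range(channels):
            if i == j:
                continue
            corr = sum(row[i] * row[j] for row in z) / n
            total += corr ** 2
    return total / (channels * (channels - 1))


# Toy batches: 4 samples, 2 channels each.
collapsed = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]    # channel 2 copies channel 1
diverse = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]  # uncorrelated channels

print(redundancy_penalty(collapsed))  # close to 1
print(redundancy_penalty(diverse))    # close to 0
```

Adding a term like this to the training loss would push the projected channels away from being duplicates, and the normalization step keeps the feature scale fixed so the penalty cannot be reduced simply by shrinking the features.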

The results are striking. CoReDi converges roughly 13x faster than REPA and 2x faster than DeCo in pixel space, while achieving equal or better image quality. On ImageNet 256×256, it matches the state-of-the-art FID score of 3.3 using only half the training iterations of its predecessor ReDi.

Perhaps most fascinating: visualizations show the coevolving representations developing increasingly structured spatial organization over the course of training. In effect, the model reorganizes how it "sees" in order to become a better generator.

The framework works in both VAE latent space and pixel space. Operating directly on pixels removes the reconstruction bottleneck that has limited image fidelity in previous approaches.

