Description
Feature Summary
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing.
Detailed Description
https://huggingface.co/deepgenteam/DeepGen-1.0
https://github.com/deepgenteam/deepgen
DeepGen 1.0 is a lightweight unified multimodal model with only 5B parameters (3B VLM + 2B DiT). It integrates five core capabilities (general image generation, general image editing, reasoning image generation, reasoning image editing, and text rendering) within a single model. Across multiple authoritative benchmarks, DeepGen 1.0 is competitive with or surpasses state-of-the-art unified multimodal models that are 3× to 16× larger, demonstrating that massive scaling is not the sole path to high-performance multimodal generation.
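To put the size figures above in concrete terms, here is a rough back-of-the-envelope sketch. It assumes that "3× to 16× larger" refers to total parameter count, and the bytes-per-parameter values (2 for 16-bit weights, 1 for 8-bit weights) are illustrative assumptions, not figures from the DeepGen authors. The estimate covers weights only and ignores activations and runtime overhead.

```python
# Parameter counts stated in the description, in billions.
VLM_PARAMS_B = 3
DIT_PARAMS_B = 2


def total_params_b(vlm_b, dit_b):
    """Total parameter count in billions."""
    return vlm_b + dit_b


def comparison_range_b(total_b, low_factor=3, high_factor=16):
    """Implied size range (in billions) of the larger models being compared,
    assuming the 3x-16x factors apply to total parameter count."""
    return total_b * low_factor, total_b * high_factor


def weight_memory_gb(params_b, bytes_per_param):
    """Approximate weight storage in GB (1 GB = 1e9 bytes).
    bytes_per_param is an assumption: 2 for fp16/bf16, 1 for 8-bit."""
    return params_b * bytes_per_param


total = total_params_b(VLM_PARAMS_B, DIT_PARAMS_B)
low, high = comparison_range_b(total)

print(f"Total parameters: {total}B")
print(f"Implied comparison models: {low}B to {high}B")
print(f"Weights at 16-bit: ~{weight_memory_gb(total, 2)} GB")
print(f"Weights at 8-bit: ~{weight_memory_gb(total, 1)} GB")
```

Under these assumptions the weights alone would fit in roughly 10 GB at 16-bit precision, which is the practical sense in which the model is "lightweight" compared with models in the 15B to 80B range.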
Alternatives you considered
No response
Additional context
No response