Stability AI has released Stable Diffusion 3.5

The developers of Stable Diffusion chose to spend four months on thorough improvements rather than ship quick fixes. At the core of Stable Diffusion 3.5 is the MMDiT (Multimodal Diffusion Transformer) architecture. Unlike the models that preceded the Stable Diffusion 3 series, it uses three pre-trained text encoders simultaneously, a technical decision that required extensive changes to the entire architecture.

Stable Diffusion 3.5 models are now available on Hugging Face: Stable Diffusion 3.5 Large, with 8 billion parameters, and a faster version, Stable Diffusion 3.5 Large Turbo. The first is designed for high performance and the second for speed. The source code for these models can be found on GitHub. Stable Diffusion 3.5 Medium, with 2.5 billion parameters, is scheduled for release on October 29, 2024. All three models are designed to handle a range of styles and genres, including drawings, 3D renders, and photographs. The company claims that these are its most powerful models to date and that they run efficiently on user devices without high system requirements. The Medium version is specifically optimized for smartphones, TechCrunch reports.
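For readers who want to try the checkpoints, the following is a minimal sketch of loading one of them with the Hugging Face `diffusers` library and its `StableDiffusion3Pipeline` class. The repository IDs and the sampling settings in the two tables are assumptions made for illustration and are not stated in the announcement above; the `generate` helper is hypothetical. Downloading the weights may also require accepting the model license on Hugging Face and logging in first.

```python
"""Sketch: generate an image with a Stable Diffusion 3.5 checkpoint via diffusers."""

# Assumed Hugging Face repository IDs for the three variants named above.
MODEL_IDS = {
    "large": "stabilityai/stable-diffusion-3.5-large",
    "large-turbo": "stabilityai/stable-diffusion-3.5-large-turbo",
    "medium": "stabilityai/stable-diffusion-3.5-medium",
}

# Assumed starting points for sampling; tune them for your own prompts.
# The Turbo variant is distilled for few-step sampling without guidance.
SAMPLING = {
    "large": {"num_inference_steps": 28, "guidance_scale": 3.5},
    "large-turbo": {"num_inference_steps": 4, "guidance_scale": 0.0},
    "medium": {"num_inference_steps": 40, "guidance_scale": 4.5},
}


def generate(prompt: str, variant: str = "large", device: str = "cuda"):
    """Return a PIL image for `prompt` using the chosen variant (hypothetical helper)."""
    # Heavy dependencies are imported lazily so the tables above
    # can be inspected without torch or diffusers installed.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        MODEL_IDS[variant],
        torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    )
    pipe = pipe.to(device)

    result = pipe(prompt, **SAMPLING[variant])
    return result.images[0]
```

Calling `generate("a watercolor drawing of a lighthouse at dawn", variant="large-turbo").save("lighthouse.png")` would then write the image to disk, assuming a GPU with enough memory for the chosen variant.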

Images generated with Stable Diffusion 3.5 can be used free of charge for non-commercial purposes. Commercial use is also free as long as the user's annual revenue does not exceed $1 million. Above that threshold, a separate enterprise license must be obtained from Stability AI.