Stable Diffusion x4 upscaler model


Implementation of stabilityai/stable-diffusion-x4-upscaler


This model card covers the model associated with the Stable Diffusion Upscaler. The model was trained for 1.25M steps on a 10M subset of LAION containing images larger than 2048x2048. It was trained on 512x512 crops and is a text-guided latent upscaling diffusion model. In addition to the text prompt, it takes a noise_level input parameter, which can be used to add noise to the low-resolution input according to a predefined diffusion schedule.
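To make the role of noise_level more concrete, here is a minimal sketch of how a diffusion schedule can turn a step index into an amount of noise. It assumes a linear beta schedule with 1000 steps and illustrative endpoint values; the card does not state which schedule the model actually uses, and the helper names alpha_bar and add_noise are hypothetical.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 0..t.

    Assumes a linear beta schedule; the endpoints are illustrative,
    not values taken from the model card.
    """
    prod = 1.0
    for s in range(t + 1):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def add_noise(pixels, noise_level, rng=None):
    """Apply the forward diffusion step to pixel values scaled to [-1, 1].

    Each value x becomes sqrt(a) * x + sqrt(1 - a) * eps, where
    a = alpha_bar(noise_level) and eps is standard Gaussian noise.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    a = alpha_bar(noise_level)
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in pixels
    ]
```

Under this assumed schedule, a larger noise_level yields a smaller alpha_bar, so less of the original low-resolution signal survives and more of the result is noise.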

Model Details

  • Developed by: Robin Rombach, Patrick Esser

  • Model type: Diffusion-based text-to-image generation model

  • Language(s): English

  • License: CreativeML Open RAIL++-M License

  • Model Description: This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder (OpenCLIP-ViT/H).

  • Resources for more information: GitHub Repository.

    author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
    title = {High-Resolution Image Synthesis With Latent Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2022},
    pages = {10684-10695}
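
As a usage sketch for the model described above, the following shows one way to run it through the Hugging Face diffusers library's StableDiffusionUpscalePipeline. The helper names upscale and upscaled_size, the choice of half precision on a CUDA device, and the noise_level value of 20 are assumptions made for illustration, not recommendations from this card.

```python
def upscaled_size(width, height, factor=4):
    """Output size expected from an x4 upscaler for a given input size."""
    return width * factor, height * factor


def upscale(low_res_image, prompt, noise_level=20, device="cuda"):
    """Upscale a PIL image 4x, guided by a text prompt.

    Heavy imports are deferred so the module can be imported on
    machines without torch or diffusers installed.
    """
    import torch
    from diffusers import StableDiffusionUpscalePipeline

    pipe = StableDiffusionUpscalePipeline.from_pretrained(
        "stabilityai/stable-diffusion-x4-upscaler",
        torch_dtype=torch.float16,  # assumption: GPU with fp16 support
    ).to(device)

    # noise_level sets how much noise is added to the low-resolution input.
    result = pipe(prompt=prompt, image=low_res_image, noise_level=noise_level)
    return result.images[0]
```

For example, a 128x128 input passed to upscale would be expected to come back at the size reported by upscaled_size(128, 128). Loading the pipeline downloads the model weights on first use, so a real application would typically create it once and reuse it across calls.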