Subject-Diffusion: No Fine-Tuning Needed For Personalized Image Generation
Introduction
Have you ever wished you could create realistic images from any text
description without having to fine-tune a model for each domain or
style? If so, you might be interested in a new model developed by
researchers at OPPO Research Institute. The model was developed to
address the slow progress in open-domain personalized image
generation that needs no fine-tuning. The goal was a model that
requires no test-time fine-tuning and needs only a single reference
image to support personalized generation of one or more subjects in
any domain. The model is called 'Subject-Diffusion'.
What is Subject-Diffusion?
Subject-Diffusion has several key features that make it stand out from
other text-to-image generation models.
source - https://oppo-mente-lab.github.io/subject_diffusion/
As shown in the figure above, for the image latent part, the image mask
is concatenated to the image latent feature. For multiple subjects, the
per-subject image masks are overlaid. The combined latent feature is
then input to the UNet. For the text condition part, a special prompt
template is constructed. Then, at the embedding layer of the text
encoder, the “CLS” embedding of each segmented image replaces the
corresponding token embedding. In addition, a regularization constraint
is applied between the cross-attention maps of these embeddings and the
shape of the actual image segmentation map. In the fusion part, patch
embeddings of the segmented images and bounding box coordinate
information are fused and trained as a separate layer of the UNet.
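To make the first two steps more concrete, here is a minimal NumPy sketch of the mask concatenation and the token-embedding replacement. It is an illustration under stated assumptions, not the authors' implementation: the function names, the tensor shapes (4 latent channels, 77 tokens, 768-dimensional embeddings), the placeholder positions, and the choice to overlay masks as a pixel-wise union are all hypothetical.

```python
import numpy as np

def overlay_masks(masks):
    """Overlay per-subject binary masks of shape (N, H, W) into one (H, W) mask.
    Assumption: 'overlaid' is modelled here as a pixel-wise union."""
    return np.clip(np.sum(masks, axis=0), 0, 1)

def concat_mask_to_latent(latent, masks):
    """Append the combined mask as one extra channel.
    latent: (C, H, W); masks: list of (H, W) arrays. Returns (C + 1, H, W)."""
    combined = overlay_masks(np.stack(masks)).astype(latent.dtype)
    return np.concatenate([latent, combined[None]], axis=0)

def inject_subject_embeddings(token_embeds, placeholder_positions, cls_embeds):
    """Replace the token embeddings at the placeholder positions with the
    image "CLS" embeddings of the segmented subjects (one per subject)."""
    out = token_embeds.copy()  # leave the original text embeddings untouched
    for pos, cls in zip(placeholder_positions, cls_embeds):
        out[pos] = cls
    return out

# Toy example with two subjects on an 8x8 latent grid (hypothetical sizes).
latent = np.zeros((4, 8, 8), dtype=np.float32)
mask_a = np.zeros((8, 8)); mask_a[:4, :4] = 1
mask_b = np.zeros((8, 8)); mask_b[2:6, 2:6] = 1
unet_input = concat_mask_to_latent(latent, [mask_a, mask_b])

# Toy text embeddings; positions 5 and 9 stand in for the two subject tokens.
token_embeds = np.zeros((77, 768), dtype=np.float32)
cls_embeds = np.ones((2, 768), dtype=np.float32)
conditioned = inject_subject_embeddings(token_embeds, [5, 9], cls_embeds)
```

In this sketch the UNet input gains one channel for the mask, and only the placeholder token rows of the text embedding are changed. The attention regularization and the fusion layer are left out because their exact form is not described here.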
Performance Evaluation
source - https://arxiv.org/pdf/2307.11410.pdf
If you are interested in learning more about the Subject-Diffusion
model, all relevant links are provided in the 'Source' section at the
end of this article.
Limitations
Conclusion
Source
research paper - https://arxiv.org/abs/2307.11410
research document - https://arxiv.org/pdf/2307.11410.pdf
project details - https://oppo-mente-lab.github.io/subject_diffusion/
GitHub repo - https://github.com/OPPO-Mente-Lab/Subject-Diffusion