
Crafting Unlearnable Samples for Latent Diffusion Models

Naresh Kumar Devulapally

Target: ACM MM 2025

Spring 2025

Refresher on Diffusion Models: Forward Diffusion

q(x_t | x_{t-1}) = \mathcal{N}(x_t; \sqrt{1 - \beta_t} x_{t-1}, \beta_t I)

The forward process gradually adds Gaussian noise:

\( x_t \): noisy version of \( x_0 \) at timestep \( t \)
\( \beta_t \): noise schedule at timestep \( t \)
\( I \): identity matrix

Direct Sampling relation:  

q(x_t | x_0) = \mathcal{N}(x_t; \sqrt{\bar{\alpha}_t} x_0, (1 - \bar{\alpha}_t) I)
\text{where } \bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)

x_t = \sqrt{\bar{\alpha}_t} x_0 + \sqrt{1 - \bar{\alpha}_t} \epsilon
\text{where } \epsilon \sim \mathcal{N}(0, I)

\( \sqrt{\bar{\alpha}_t} \): scales down \( x_0 \)
\( \sqrt{1 - \bar{\alpha}_t} \): sets the noise magnitude
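As a quick illustration, a minimal PyTorch sketch of this direct-sampling relation; the linear \( \beta \) schedule and image shapes are illustrative assumptions, not part of the slides.

```python
import torch

# Illustrative linear beta schedule (assumption; any schedule works)
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)   # \bar{\alpha}_t = prod_s (1 - beta_s)

def q_sample(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    eps = torch.randn_like(x0)                   # eps ~ N(0, I)
    a_bar = alpha_bars[t]
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps

# Example: noise a small batch of 64x64 RGB images at t = 500
x0 = torch.rand(4, 3, 64, 64) * 2 - 1            # images scaled to [-1, 1]
x_t = q_sample(x0, t=500)
```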

Refresher on Diffusion Models: Reverse Diffusion

p_\theta(x_{t-1} | x_t) = \mathcal{N}(x_{t-1}; \mu_\theta(x_t, t), \Sigma_\theta(x_t, t))

The goal is to learn to reconstruct \( x_0 \) from \( x_t \) by denoising one step at a time:

\( p_\theta(x_{t-1} | x_t) \): denoises one step
\( \mu_\theta(x_t, t) \): predicted mean of the denoised image at step \( t \)
\( \Sigma_\theta(x_t, t) \): learned variance at step \( t \)

Since learning \( \Sigma_\theta(x_t, t) \) is difficult, diffusion models approximate it with a fixed schedule and predict only the mean \( \mu_\theta(x_t, t) \).

Instead of directly predicting \( x_{t-1} \), modern diffusion models predict the added noise \( \epsilon \) and derive \( x_{t-1} \) from it:

\mu_\theta(x_t, t) = \frac{1}{\sqrt{1 - \beta_t}} \left( x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}} \epsilon_\theta(x_t, t) \right)
\mathcal{L}_{LDM} = \mathbb{E}_{x_0, \epsilon, t} \left[ \| \epsilon - \epsilon_\theta(x_t, t) \|^2 \right]
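A minimal sketch of one reverse step under this \( \epsilon \)-prediction parameterization, with the variance fixed to \( \beta_t I \) (one common DDPM choice); `eps_model` stands in for any trained noise predictor and is an assumption here.

```python
import torch

@torch.no_grad()
def p_sample_step(eps_model, x_t, t, betas, alpha_bars):
    """One reverse step x_t -> x_{t-1} derived from the predicted noise."""
    beta_t = betas[t]
    a_bar_t = alpha_bars[t]
    eps_hat = eps_model(x_t, t)                              # epsilon_theta(x_t, t)
    # mu_theta(x_t, t) = (x_t - beta_t / sqrt(1 - a_bar_t) * eps_hat) / sqrt(1 - beta_t)
    mean = (x_t - beta_t / (1.0 - a_bar_t).sqrt() * eps_hat) / (1.0 - beta_t).sqrt()
    if t == 0:
        return mean                                          # no noise added at the final step
    noise = torch.randn_like(x_t)
    return mean + beta_t.sqrt() * noise                      # Sigma_theta fixed to beta_t * I
```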

Refresher on Diffusion Models: CFG

Classifier-Free Guidance (CFG) removes the dependence on an external classifier by using the model itself:

\( \epsilon_\theta(x_t, t, c) \) is the model's noise prediction with text conditioning \( c \).
\( \epsilon_\theta(x_t, t, \varnothing) \) is the model's noise prediction without text conditioning.
\( c \) is the textual conditioning; \( w \) is the guidance scale.

\tilde{\epsilon}_\theta(x_t, t, c) = \epsilon_\theta(x_t, t, \varnothing) + w \cdot (\epsilon_\theta(x_t, t, c) - \epsilon_\theta(x_t, t, \varnothing))

Using the new noise estimate \( \tilde{\epsilon}_\theta(x_t, t, c) \), the revised mean estimate for denoising is:

\tilde{\mu}_\theta(x_t, t, c) = \frac{1}{\sqrt{1 - \beta_t}} \left( x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}} \tilde{\epsilon}_\theta(x_t, t, c) \right)
\mathcal{L}_{LDM-T2I} = \mathbb{E}_{x_0, c, \epsilon, t} \left[ \| \epsilon - \tilde{\epsilon}_\theta(x_t, t, c) \|^2 \right]

CFG enhances text alignment while preserving image quality.
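In code, the guided estimate is just a weighted combination of two forward passes; a short sketch, where `eps_model` and the embedding arguments are illustrative.

```python
import torch

def cfg_noise(eps_model, x_t, t, cond_emb, uncond_emb, w: float):
    """Classifier-free guidance: combine conditional and unconditional predictions."""
    eps_uncond = eps_model(x_t, t, uncond_emb)   # epsilon_theta(x_t, t, null)
    eps_cond = eps_model(x_t, t, cond_emb)       # epsilon_theta(x_t, t, c)
    # eps~ = eps_uncond + w * (eps_cond - eps_uncond)
    return eps_uncond + w * (eps_cond - eps_uncond)
```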

Personalized T2I generation: DreamBooth

DreamBooth fine-tunes a pretrained text-to-image diffusion model to personalize it for a specific subject.

The challenge is to learn a subject-specific representation without overfitting or losing the model's general knowledge.

The training objective consists of two key loss terms:

\mathbb{E}_{x, c, \epsilon, t} \textcolor{red}{\left[ w_t \| x_\theta( \alpha_t x + \sigma_t \epsilon, c) - x \|^2 \right]} + \textcolor{green}{\lambda w_{t'} \| x_\theta( \alpha_{t'} x_{\text{pr}} + \sigma_{t'} \epsilon', c_{\text{pr}}) - x_{\text{pr}} \|^2}

The red term learns the subject-specific representation; the green term (prior-preservation loss) retains prior knowledge of the general class.

Gradient update:

\theta \leftarrow \theta - \eta \frac{\partial}{\partial \theta} \left( \mathbb{E}_{x, c, \epsilon, t} \left[ w_t \| x_\theta( \alpha_t x + \sigma_t \epsilon, c) - x \|^2 \right] + \lambda w_{t'} \| x_\theta( \alpha_{t'} x_{\text{pr}} + \sigma_{t'} \epsilon', c_{\text{pr}}) - x_{\text{pr}} \|^2 \right)

DreamBooth fine-tunes all UNet parameters. See figure in the slide below.
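A hedged sketch of the two-term objective in the x-prediction form used above (the timestep weights \( w_t, w_{t'} \) are dropped for brevity); `x_model`, the schedules, and the prior batch are illustrative assumptions, not DreamBooth's actual training code.

```python
import torch
import torch.nn.functional as F

def dreambooth_loss(x_model, x, c, x_prior, c_prior, alphas, sigmas, lambda_prior=1.0):
    """Subject reconstruction term + class prior-preservation term (sketch)."""
    t = torch.randint(0, len(alphas), (1,)).item()
    t_pr = torch.randint(0, len(alphas), (1,)).item()
    eps, eps_pr = torch.randn_like(x), torch.randn_like(x_prior)

    # Red term: reconstruct the subject images from their noised versions
    x_noisy = alphas[t] * x + sigmas[t] * eps
    loss_subject = F.mse_loss(x_model(x_noisy, t, c), x)

    # Green term: preserve the prior on generic class images (e.g. "a photo of a dog")
    x_pr_noisy = alphas[t_pr] * x_prior + sigmas[t_pr] * eps_pr
    loss_prior = F.mse_loss(x_model(x_pr_noisy, t_pr, c_prior), x_prior)

    return loss_subject + lambda_prior * loss_prior
```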

Personalized T2I generation: DreamBooth

DINO Score (Deep Image Representation Similarity)

\text{DINO} = \frac{1}{N} \sum_{i=1}^{N} \cos( f(x_i), f(x_{\text{real}}) )

\( f(x) \) is the feature embedding extracted from a self-supervised ViT (DINO model).

Higher DINO scores indicate better subject fidelity.

CLIP-I (CLIP Image Similarity)

\text{CLIP-I} = \frac{1}{N} \sum_{i=1}^{N} \cos( \phi(x_i), \phi(x_{\text{real}}) )

\( \phi(x) \) is the CLIP image embedding.

CLIP-T (CLIP Text Similarity)

\text{CLIP-T} = \frac{1}{N} \sum_{i=1}^{N} \cos( \phi(x_i), \psi(c) )

\( \psi(c) \) is the CLIP text embedding of the prompt \( c \). Higher CLIP-T indicates better prompt fidelity.
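All three metrics reduce to an average cosine similarity between embeddings; a minimal sketch, where the features are assumed to come from the DINO ViT, the CLIP image encoder, or the CLIP text encoder as appropriate.

```python
import torch
import torch.nn.functional as F

def mean_cosine_score(gen_feats: torch.Tensor, ref_feat: torch.Tensor) -> float:
    """Average cos(f(x_i), f(x_ref)) over N generated samples.

    gen_feats: (N, D) embeddings of generated images.
    ref_feat:  (D,)   embedding of the real reference image (or of the text prompt).
    """
    sims = F.cosine_similarity(gen_feats, ref_feat.unsqueeze(0), dim=-1)
    return sims.mean().item()
```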

Personalized T2I generation: Textual Inversion

Textual Inversion personalizes pretrained text-to-image diffusion models by learning new word embeddings instead of fine-tuning the model.

A unique pseudo-word is learned that represents a subject, allowing it to be used in new prompts.

The model is trained to learn a new embedding \( v^* \) that represents the subject while keeping the diffusion model frozen.

v^* = \arg\min_{v} \mathbb{E}_{z \sim E(x), y, \epsilon \sim \mathcal{N}(0,1), t} \left[ \|\epsilon - \epsilon_\theta(z_t, t, c_\theta(y))\|^2_2 \right]

Optimize the embedding \( v^* \) so that the model associates it with the subject’s features.

v^* \leftarrow v^* - \eta \frac{\partial}{\partial v} \mathbb{E}_{z, y, \epsilon, t} \left[ \|\epsilon - \epsilon_\theta(z_t, t, c_\theta(y))\|^2_2 \right]

See figure in the slide below

Only the \( v^* \) token embedding is fine-tuned; the diffusion model remains frozen.
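A minimal sketch of one Textual Inversion step: only the new token's embedding receives gradients. The tokenizer and encoder plumbing is elided; `diffusion_loss`, the embedding dimension, and `PSEUDO_TOKEN_POS` are illustrative assumptions.

```python
import torch

# Only trainable parameter: the new pseudo-word embedding v* (dim chosen to match the text encoder)
v_star = torch.nn.Parameter(torch.randn(768) * 0.01)
optimizer = torch.optim.AdamW([v_star], lr=5e-3)
PSEUDO_TOKEN_POS = 4   # position of the pseudo-word in the tokenized prompt (illustrative)

def training_step(x0, prompt_embeds, diffusion_loss):
    """One optimization step on v*; the UNet, VAE, and text encoder stay frozen."""
    cond = prompt_embeds.clone()
    cond[:, PSEUDO_TOKEN_POS] = v_star        # splice v* into the frozen prompt embeddings
    loss = diffusion_loss(x0, cond)           # || eps - eps_theta(z_t, t, c_theta(y)) ||_2^2
    optimizer.zero_grad()
    loss.backward()                           # gradients flow only into v_star
    optimizer.step()
    return loss.item()
```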

Personalized T2I generation: Textual Inversion

FID (Fréchet Inception Distance)

FID measures how similar the distribution of generated images is to the distribution of real images using deep feature embeddings.

\text{FID} = \| \mu_r - \mu_g \|^2 + \text{Tr}(\Sigma_r + \Sigma_g - 2 (\Sigma_r \Sigma_g)^{\frac{1}{2}})

\( (\mu_r, \Sigma_r) \) are the mean and covariance of real images in the feature space of a pretrained Inception network.
\( (\mu_g, \Sigma_g) \) are the mean and covariance of generated images in the same feature space.

Peak Signal-to-Noise Ratio (PSNR)

LPIPS (Learned Perceptual Image Patch Similarity)

\text{LPIPS} = \mathbb{E}_{x} [ d(f(x_i), f(x_{\text{real}})) ]

\( f(x) \): features from a pretrained CNN
\( d(\cdot) \): distance function between the feature activations
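A sketch of the FID formula applied to precomputed Inception features, using SciPy's matrix square root; feature extraction itself is omitted.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid_from_features(real_feats: np.ndarray, gen_feats: np.ndarray) -> float:
    """FID = ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2 (Sigma_r Sigma_g)^{1/2})."""
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_g = np.cov(gen_feats, rowvar=False)
    covmean = sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):            # numerical noise can introduce tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))
```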

Unlearnable Examples to counter personalization

General Framework:

Goal: Prevent a diffusion model from learning and reproducing specific images while maintaining its generalization capability.

Approach: Add adversarial noise perturbations so that the protected images become unlearnable.

\min_{\|\delta^{u}\| \leq \rho_u} D(p_{\theta}^{u}(x), q(x)) \quad \textcolor{green}{\text{Imperceptible}}
\text{s.t. } \max_{\delta^*} \mathbb{E}_{t,\, x' \sim p_{\theta}^{u}(x),\, x \sim q^{c}(x)} \left[ \| \mathcal{L}_{DM}(x') - x \|_2^2 \right] \quad \textcolor{green}{\text{Unlearnable for LDM}}
\delta^* = \arg\min_{\delta} D(p_{\theta}^{u}(x), q(x))
\text{where } x \sim q(x),\ p_{\theta}^{u}(x) \sim x + \delta,\ \|\delta\| \leq \rho_u

Related work: AdvDM (Mist)

Gradient Estimation with Monte Carlo

\nabla_{x_0} \mathbb{E}_{x_{1:T} \sim u(x_{1:T})} \mathcal{L}_{DM}(\theta) \approx \mathbb{E}_{x_{1:T} \sim u(x_{1:T})} \nabla_{x_0} \mathcal{L}_{DM}(\theta).
x_0^{(i+1)} = x_0^{(i)} + \alpha \, \text{sgn} \left( \nabla_{x_0^{(i)}} \mathcal{L}_{DM}(\theta) \Big|_{x_{1:T}^{(i)} \sim u(x_{1:T}^{(i)})} \right)

Projected Gradient Descent (PGD) keeps \( x_0^{(i+1)} \) within the perturbation budget.
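A hedged sketch of this Monte-Carlo, sign-gradient ascent with an \( L_\infty \) projection; `ldm_loss` denotes the diffusion training loss evaluated on a frozen model at a randomly drawn timestep, and the step size and budget are illustrative.

```python
import torch

def advdm_perturb(x0, ldm_loss, steps=40, alpha=1/255, eps_budget=8/255):
    """Sign-gradient ascent on the diffusion loss, projected to an L_inf ball around x0."""
    x_adv = x0.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = ldm_loss(x_adv)                 # Monte-Carlo estimate over sampled t, eps (theta frozen)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                        # maximize the training loss
            x_adv = x0 + (x_adv - x0).clamp(-eps_budget, eps_budget)   # project back into the budget
            x_adv = x_adv.clamp(-1.0, 1.0)                             # keep a valid image range
    return x_adv.detach()
```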


Datasets: LSUN, WikiArt
Metrics: FID, Precision, Recall

Related work: Anti-DreamBooth

Needs access to a clean reference set to maintain structural similarity.
Incurs an additional space requirement.
Fine-tunes the UNet.


Datasets: CelebA-HQ, VGGFace2
Metrics: FDFR, ISM, SER-FQA, BRISQUE

Related work: SimAC


Datasets: CelebA-HQ, VGGFace2
Metrics: FDFR, ISM, SER-FQA, BRISQUE

Related work: Metacloak

Takeaway: Low robustness of existing methods

Related work: VCPro

Takeaway: Low invisibility of existing methods

Limitations of current works

Invisibility
Robustness to diffusion-based attacks
Parameter efficiency
Plug-and-play applicability

Ideas and Experiments

Idea 1: Textual Inversion for invisible adversarial patch generation.
Idea 2: Linear and non-linear transformations in the latent subspace for unlearning.

Adversarial training in image space.
Builds on VCPro (VCPro code not available).
Takeaways: low robustness, visible perturbation.


Fine-tuning the VAE decoder in the LDM.
Similar to Stable Signature.
Takeaways: medium robustness, visible perturbation.


Fine-tuning TI token embeddings for \( \tau^* \) steps.
Takeaways: robust, invisible, high training time.

Qualitative comparison with Mist (see figure).


Additional qualitative results comparing the original images with ours (see figure).


Difference images and Img2Img outputs for this setting (see figure).


Performance under attacks

Takeaway: Latent-level training is more robust to diffusion attacks.


Idea 2: Learning transformations in latent space.

Recap: Forward Diffusion

q(x_t | x_0) = \mathcal{N}(x_t; \sqrt{\bar{\alpha}_t} x_0, (1 - \bar{\alpha}_t) I)
\text{where } \bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)

x_t = \sqrt{\bar{\alpha}_t} x_0 + \sqrt{1 - \bar{\alpha}_t} \epsilon
\text{where } \epsilon \sim \mathcal{N}(0, I)


Recap: Unlearning task

\text{1. } I_c + \delta \sim I_c
\text{2. } \text{Img2Img}(I_c + \delta) \sim \mathcal{N}(0,I)

\( I_c + \delta \) has no specific requirement as long as it is out-of-distribution (OOD) for the Img2Img pipeline.

Idea: introduce the difference in \( I_c \) in latent space by learning a transformation at timestep \( t \).



Unlearning at latent level



Related existing research: LOCO-Edit (NeurIPS 2024)


LOCO-Edit exploits the local linearity of the posterior mean predictor (PMP) for one-step, training-free, and supervision-free editing.


Advantages of this method:

1. Works for several timesteps \( t \).
2. \( f_\theta \) exists, can be learned, and can be applied in inference pipelines as well.
3. A closed-form solution for \( f_\theta \) can be derived under the assumption that \( \epsilon_\theta \) is linear.
4. One-step denoising via LCMs and flow-matching models, without changes to the diffusion pipeline.
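To make the latent-transformation idea concrete, a minimal sketch of learning a linear map \( f_\theta(z) = A z + b \) at one timestep so that the one-step denoised estimate is pushed away from the clean latent while \( f_\theta(z_0) \) stays close to \( z_0 \). This is an illustrative formulation rather than the final method; `eps_model`, the loss weighting, and the training loop are assumptions.

```python
import torch
import torch.nn.functional as F

class LatentTransform(torch.nn.Module):
    """Linear transformation f_theta(z) = A z + b applied channel-wise to the latent."""
    def __init__(self, channels: int):
        super().__init__()
        self.A = torch.nn.Parameter(torch.eye(channels))   # initialized to identity
        self.b = torch.nn.Parameter(torch.zeros(channels))

    def forward(self, z):                                   # z: (B, C, H, W)
        z_flat = z.permute(0, 2, 3, 1)                      # (B, H, W, C)
        out = z_flat @ self.A.T + self.b
        return out.permute(0, 3, 1, 2)

def train_step(f, optimizer, eps_model, z0, t, alpha_bars, lam=0.1):
    """Push the one-step denoised estimate away from z0 while keeping f(z0) close to z0."""
    eps = torch.randn_like(z0)
    a_bar = alpha_bars[t]
    z_t = a_bar.sqrt() * f(z0) + (1 - a_bar).sqrt() * eps        # forward-diffuse the transformed latent
    eps_hat = eps_model(z_t, t)                                  # frozen denoiser
    z0_hat = (z_t - (1 - a_bar).sqrt() * eps_hat) / a_bar.sqrt() # one-step x0 estimate
    loss = -F.mse_loss(z0_hat, z0) + lam * F.mse_loss(f(z0), z0) # unlearnable vs. imperceptible trade-off
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```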


Experiments:

  1. \( f_\theta \) for misclassification (sanity check) (done)
  2. \( f_\theta \) for LDM unlearning (quantitative results tabulated)
  3. Strength of \( f_\theta \) to control the invisibility of the unlearnable pattern (ongoing)
  4. Results under attacks (to do)