Jinseong Park

[CVPR 2024] Data Synthesis for Privacy

In-distribution Public Data Synthesis with Diffusion Models for Differentially Private Image Classification

Please open the link below to see the paper. Paper Link

Abstract: To alleviate the utility degradation of deep learning image classification with differential privacy (DP), employing extra public data or pre-trained models has been widely explored. Recently, the use of in-distribution public data has been investigated, where tiny subsets of datasets are released publicly. In this paper, we investigate a framework that leverages recent diffusion models to amplify the information of public data. Subsequently, we identify data diversity and the generalization gap between public and private data as critical factors in addressing the limited public data. While assuming 4% of training data as public, our method achieves 85.48% on CIFAR-10 with a privacy budget of ε=2, without employing extra public data for training.
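To make the "4% of training data as public" setting concrete, here is a minimal sketch of how such a split could be drawn for CIFAR-10, whose training set has 50,000 images. The function name `split_public_private`, the uniform random sampling, and the seed are assumptions for illustration; the paper may select its public subset differently (for example, class-balanced).

```python
import numpy as np

def split_public_private(num_train, public_fraction, seed=0):
    """Randomly partition training indices into a small public set and a private set.

    Hypothetical helper: uniform sampling is an assumption, not the paper's stated procedure.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_train)
    num_public = int(round(num_train * public_fraction))
    public_idx = np.sort(perm[:num_public])    # released publicly, usable without DP noise
    private_idx = np.sort(perm[num_public:])   # must only be accessed under DP training
    return public_idx, private_idx

# CIFAR-10 has 50,000 training images; 4% of them gives 2,000 public images.
public_idx, private_idx = split_public_private(50_000, 0.04)
print(len(public_idx), len(private_idx))
```

In the setting the abstract describes, only the public indices would be available for training the diffusion model that synthesizes additional data, while the private indices are reserved for the differentially private classifier training.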

Slides 1 to 12