Advisor: De Cock, Martine
Author: Golob, Steven
Date: 2024-04-26 (issued 2024)
File: Golob_washington_0250O_26550.pdf
URI: http://hdl.handle.net/1773/51330
Description: Thesis (Master's)--University of Washington, 2024

Abstract: Synthetic data generation (SDG) promises to augment, enhance, and safeguard real data, which in many applications is scarce. When acting as a privacy-enhancing technology, SDG aims to exclude any personally identifiable information present in the underlying real data, while maintaining the important statistical properties that keep the data useful to consumers. Many SDG algorithms provide robust differential privacy guarantees. However, we show that those that preserve marginal probability statistics of the underlying data leak more information about individuals than was previously understood. We demonstrate this by conducting a novel membership inference attack, MAMA-MIA, on three state-of-the-art differentially private SDG algorithms: MST, PrivBayes, and RAP. We present here the heuristic behind our attack on marginals-based SDG algorithms. It assumes knowledge of auxiliary "population" data, as well as knowledge of which SDG algorithm was used. We use this information to adapt the recent DOMIAS attack to MST, PrivBayes, and RAP. Our approach went on to win the international SNAKE challenge in November 2023.

Format: application/pdf
Language: en-US
Rights: CC BY
Subjects: Differential privacy; Membership inference; Synthetic data; Computer science; Computer science and systems
Title: Privacy Vulnerabilities in Marginals-based Synthetic Data
Type: Thesis
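The DOMIAS-style attack the abstract describes scores a target record by the ratio of its estimated density under the synthetic data to its density under auxiliary population data. A minimal sketch of that idea, approximating each density as a product of one-way marginals (all function names, the one-way-marginal factorization, and the smoothing constant are illustrative assumptions, not the thesis's actual MAMA-MIA implementation):

```python
from collections import Counter

def marginal_probs(records, attr):
    """Empirical one-way marginal distribution of one attribute."""
    counts = Counter(r[attr] for r in records)
    n = len(records)
    return {v: c / n for v, c in counts.items()}

def density(record, records, attrs, eps=1e-6):
    # Approximate the joint density as a product of one-way marginals;
    # eps smooths values never seen in `records` (assumed choice).
    p = 1.0
    for a in attrs:
        p *= marginal_probs(records, a).get(record[a], eps)
    return p

def domias_score(target, synthetic, population, attrs):
    # Higher ratio: the target looks more typical of the synthetic
    # (training-derived) data than of the population, suggesting membership.
    return density(target, synthetic, attrs) / density(target, population, attrs)
```

For example, a record whose attribute value is overrepresented in the synthetic data relative to the population receives a score above 1 and would be flagged as a likely member.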