Abstract
Outsourcing data to external parties for analysis is risky, as the privacy of confidential variables can easily be violated. To eliminate this threat, the values of these variables should be perturbed before the data is released. However, the perturbation itself may significantly change the underlying properties of the data and so affect the analysis results. What is required is a subtle transformation that generates perturbed data which maintains, as far as possible, the statistical properties and effectiveness (i.e., the utility) of the original data while preserving privacy. We examine privacy-preserving transformations in the context of data clustering. In particular, this paper demonstrates how non-metric multidimensional scaling (MDS) can be profitably used as a perturbation tool and how the perturbed data can be used effectively in clustering analysis without compromising privacy or utility. We apply the proposed technique to real datasets and compare the clustering results with those obtained from the original data; in some circumstances the two were exactly the same.
Original language | English |
---|---|
Title of host publication | Intelligent Data Engineering and Automated Learning - IDEAL 2011 |
Editors | H Yin, W Wang, V Rayward-Smith |
Publisher | Springer |
Pages | 287-298 |
Number of pages | 12 |
Publication status | Published - 2011 |