A Survey on Privacy Preserving Data Mining Techniques

Authors

  • Aziza Shamis Aldfaii, Department of Information Systems, College of Economics, Management, and Information Systems, University of Nizwa, Nizwa, Sultanate of Oman
  • Rabie Ramadan, Department of Information Systems, College of Economics, Management, and Information Systems, University of Nizwa, Nizwa, Sultanate of Oman

Keywords:

privacy preserving, data mining, anonymization, perturbation

Abstract

Privacy-preserving data mining (PPDM) has become a significant area of research, enabling the sharing and analysis of sensitive information while protecting privacy. This paper investigates methods for maintaining data confidentiality while retaining the attributes necessary for analysis. The authors assess the efficacy of various PPDM techniques against criteria such as performance, data usability, and levels of uncertainty. The key findings and limitations of each approach are reviewed and summarized. Each PPDM technique presents distinct advantages alongside certain limitations. Anonymization protects the identity of data owners but remains vulnerable to linking attacks. Perturbation protects attributes independently but does not allow the original values to be reconstructed from the altered data. Randomization provides robust privacy protection but diminishes data utility because it introduces noise. Cryptographic methods offer strong security and utility but tend to be less efficient than other strategies. No single technique outperforms the others on all criteria; rather, each is more effective under particular circumstances. This paper delivers a comparative analysis of PPDM techniques, emphasizing their strengths and weaknesses, and offers insights into their applicability across different scenarios.
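
As an illustration of the anonymization trade-off described above, the following is a minimal sketch that assumes k-anonymity as the anonymization model; the abstract itself does not name a specific model. The records, the field names (`age`, `zip`, `diagnosis`), and the helper functions `k_anonymity` and `generalize` are hypothetical and are not taken from the paper. The sketch measures the smallest group of records that share the same quasi-identifier values, before and after those values are coarsened.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same quasi-identifier values; the table is k-anonymous for that k."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize(record):
    """Coarsen quasi-identifiers: age to a decade band, ZIP code to a
    three-digit prefix. The sensitive attribute is left unchanged."""
    decade = record["age"] // 10 * 10
    return {
        **record,
        "age": f"{decade}-{decade + 9}",
        "zip": record["zip"][:3] + "**",
    }


# Hypothetical toy records (illustrative only, not data from the paper).
records = [
    {"age": 23, "zip": "61101", "diagnosis": "flu"},
    {"age": 27, "zip": "61105", "diagnosis": "asthma"},
    {"age": 34, "zip": "61210", "diagnosis": "flu"},
    {"age": 38, "zip": "61217", "diagnosis": "diabetes"},
]
quasi = ["age", "zip"]

k_raw = k_anonymity(records, quasi)
k_generalized = k_anonymity([generalize(r) for r in records], quasi)
print(k_raw, k_generalized)
```

In this toy table every raw record is unique on (`age`, `zip`), so anyone holding an external list with the same two fields could link a name to a diagnosis. After generalization each combination covers two records, which blurs that link at the cost of coarser values for analysis.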

Published

2025-05-01

How to Cite

Aldfaii, A., & Ramadan, R. (2025). A Survey on Privacy Preserving Data Mining Techniques. PLOMS AI, 5(1), 12. Retrieved from https://www.plomscience.com/journals/index.php/PLOMSAI/article/view/24

Issue

Section

Cybersecurity