ijcrb.webs.com

INTERDISCIPLINARY JOURNAL OF CONTEMPORARY RESEARCH IN BUSINESS

JUNE 2011

VOL 3, NO 2

DATA LOSS AND PROBABILITY OF THREATS ASSOCIATED WITH K-ANONYMITY

Tariq Sadad

MSCS, Department of Computer Science, International Islamic University, Islamabad

Azhar Rauf

Assistant Professor, Department of Computer Science, University of Peshawar, Peshawar

Ayyaz Hussain

Assistant Professor, Department of Computer Science, International Islamic University, Islamabad

Abstract

K-anonymity is a model for protecting the confidentiality of individuals. It presents the data so that each tuple in the released table belongs to a group of at least k tuples that share the same quasi-identifier values. In this paper, we calculate the total data loss in a table after applying K-anonymity and give a measure that computes data loss for any value of K. The probability of threats after applying K-anonymity is measured in the same way, as is the strength of security. These measurements help a Database Administrator or Security Officer select a value of K based on the security and data loss requirements of the organization. It is proved that the value of K is directly proportional to data loss and security, and inversely proportional to the probability of threats.

Key words: K-anonymity, Quasi-identifier, Data Loss, Probability of Threats
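The K-anonymity property described in the abstract can be illustrated with a minimal Python sketch. It is not the measure proposed in this paper. The table, the column names (`age`, `zip`, `diagnosis`) and the helper functions are hypothetical, and the 1/k figure is the common simplification that an attacker who knows only the quasi-identifier values of a person in the table can single out that person's tuple with probability at most one over the size of the smallest group.

```python
from collections import Counter


def equivalence_class_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    sizes = equivalence_class_sizes(rows, quasi_identifiers)
    return all(size >= k for size in sizes.values())


def max_reidentification_probability(rows, quasi_identifiers):
    """Assumed threat model: worst-case chance of singling out one tuple,
    taken as 1 / (size of the smallest group). Requires a non-empty table."""
    sizes = equivalence_class_sizes(rows, quasi_identifiers)
    return 1 / min(sizes.values())


# Hypothetical released table: age and zip are already generalized.
released = [
    {"age": "20-29", "zip": "250**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "250**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "251**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "251**", "diagnosis": "diabetes"},
]
quasi_ids = ["age", "zip"]

print(is_k_anonymous(released, quasi_ids, 2))
print(is_k_anonymous(released, quasi_ids, 3))
print(max_reidentification_probability(released, quasi_ids))
```

Under this simplified model, a larger K forces larger groups and therefore a smaller worst-case probability, which is consistent with the inverse relationship stated in the abstract.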

1. INTRODUCTION

Organizations such as hospitals currently publish their raw, non-aggregated data (also called microdata) for research and a variety of other purposes. However, such data may contain private information, as in the case of medical records, where the identities of individuals should be kept secret. In 1996, TIME/CNN conducted a telephone poll in the United States in which 88% of respondents replied that medical information about themselves should not be released without their permission. In a second survey, 87% of respondents said that organizations should be restricted from giving out medical information without a patient’s permission. People prefer that only employees and directly involved people have access to their records, and that organizations be bound by ethical and legal standards that prohibit further disclosure of their data (Sweeney, 1997).

Currently, the disclosure of health information is strictly regulated in many jurisdictions. Organizations are required to apply privacy protection to health data before releasing it to researchers. For example, the Health Insurance Portability and Accountability Act (HIPAA) in the United States and the Personal Health Information Protection Act (PHIPA) in Canada are well-known privacy regulations that protect the confidentiality of healthcare information.

In order to protect the data of respondents, data holders often remove or encrypt explicit identifiers such as names, SSN and phone numbers before releasing (Ciriani, Foresti and

COPY RIGHT © 2011 Institute of Interdisciplinary Business Research 302
