Cyber Defense eMagazine December Edition for 2021

Will you stay one step ahead of Cyber Scrooge this year? Learn new ways to protect your family, job, company & data. December Cyber Defense eMagazine: Cyber Deception Month is here...Defeat Cyber Scrooge!

Cyber Defense Magazine December Edition for 2021 in online format #CDM #CYBERDEFENSEMAG @CyberDefenseMag by @Miliefsky, a world-renowned cyber security expert and the Publisher of Cyber Defense Magazine as part of the Cyber Defense Media Group, along with Yan Ross, US Editor-in-Chief, Pierluigi Paganini, International Editor-in-Chief, and many more writers, partners and supporters who make this an awesome publication! Thank you all and to our readers! OSINT ROCKS! #CDM #CDMG #OSINT #CYBERSECURITY #INFOSEC #BEST #PRACTICES #TIPS #TECHNIQUES

See you at RSA Conference 2022 - Our 10th Year Anniversary @RSAC #RSACONFERENCE #USA - Thank you so much!!! - Team CDMG

CDMG is a Carbon Negative and Inclusive Media Group.

4 Pillars of Privacy-Preserving AI

Understanding the privacy challenges that chatbots face requires, first and foremost, a general understanding of the privacy challenges for machine learning systems as a whole. There are four pillars to privacy-preserving AI:

1) Training data privacy: making sure that sensitive or personal information in the training data can't be reconstructed,

2) Input privacy: privacy of the individual whose data you're inferring upon,

3) Model weights privacy: privacy of the model of a particular corporation, institution, or individual who created it. This is about IP protection, but also training data privacy, since it is possible to determine information about the training data from model weight updates,

4) Output privacy: also about protecting the privacy of the individual whose data you're inferring upon.

By collecting private conversations with identifiable individuals and training its models on them, ScatterLab first violated (2) input privacy, then (1) training data privacy, and possibly (4) output privacy.

Training Data Privacy

Much of the research and development these days focuses on training data privacy, in part because of how likely deep learning models are to memorize training data, with the potential of spewing it out in production to unknown parties. The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks [Nicholas Carlini, Chang Liu, Úlfar Erlingsson, Jernej Kos, and Dawn Song. 2019. The secret sharer: Evaluating and testing unintended memorization in neural networks. In 28th USENIX Security Symposium, pages 267–284, Santa Clara, CA. USENIX Association.] by Carlini et al. (2019) is a pivotal paper discussing the problem. They placed a fake social security number into the Penn Treebank dataset as a canary and then trained a character language model on the dataset. They then measured the perplexity of various sequences of numbers and found that the model was less surprised to see the sequences of numbers that made up the canary; i.e., given the training data, the language model had learned that it was more likely to encounter the canary than other random numbers. This is a problem because it shows that the language model memorized the secret.
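The ranking idea behind the canary test can be sketched in a few lines. This is a minimal illustration, not the paper's full pipeline: it assumes we already have negative log-likelihoods for the canary and for a set of decoy sequences from some trained model (all numbers below are made up), and computes the exposure-style score of log2 of the candidate-space size minus log2 of the canary's rank.

```python
import math

def exposure(canary_nll, candidate_nlls):
    # Rank of the canary among all candidate sequences, ordered by model
    # loss (lower negative log-likelihood = less "surprising" to the model).
    rank = 1 + sum(1 for nll in candidate_nlls if nll < canary_nll)
    # Score is maximal when the model finds the inserted canary more
    # likely than every decoy, and 0 when the canary ranks dead last.
    total = len(candidate_nlls) + 1
    return math.log2(total) - math.log2(rank)

# Hypothetical losses: the canary is less surprising than all 9 decoys,
# so it ranks first -- strong evidence the model memorized it.
print(exposure(2.1, [4.0, 3.9, 4.2, 4.5, 3.8, 4.1, 4.4, 4.3, 4.6]))
```

A model that never saw the canary should rank it no better than chance among the decoys, driving the score toward zero.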

Another paper, Extracting Training Data from Large Language Models by Carlini et al. (2020), demonstrates how GPT-2 was actually memorizing data from the pre-training dataset. [Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, et al. 2020. Extracting training data from large language models. arXiv preprint arXiv:2012.07805.] It had memorized addresses, names, and other information that could be considered sensitive had the data not been publicly available. It is important to keep in mind that these very models will be memorizing that same kind of information from chatbot training data. The paper showed that an extra-large GPT-2 model already started memorizing information after seeing only 33 examples.
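As a rough illustration of what "memorized verbatim" means in practice, one can scan a model's generations for word n-grams that occur verbatim in the training corpus. This is a toy stand-in for the paper's actual extraction and membership-inference pipeline, and the corpus and sample generation below are invented for the example:

```python
def leaked_ngrams(generated: str, training_docs: list[str], n: int = 5) -> list[str]:
    """Return word n-grams of `generated` that appear verbatim in the
    training corpus -- a crude proxy for training-data extraction."""
    corpus = "\n".join(training_docs)
    words = generated.split()
    return [" ".join(words[i:i + n])
            for i in range(len(words) - n + 1)
            if " ".join(words[i:i + n]) in corpus]

# Hypothetical data: the generation regurgitates part of a training line.
docs = ["my social security number is 078 05 1120 thanks"]
sample = "she said my social security number is 078 05 1120 ok"
print(leaked_ngrams(sample, docs))
```

Real extraction attacks are far more subtle (they rank generations by likelihood and filter with a second model), but even this crude substring check surfaces the kind of regurgitated spans the paper reports.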

Privacy issues have also been raised about training non-contextual word embeddings on data containing sensitive information in Exploring the privacy-preserving properties of word embeddings: Algorithmic

Cyber Defense eMagazine – December 2021 Edition 105
Copyright © 2021, Cyber Defense Magazine. All rights reserved worldwide.
