Cyber Defense eMagazine December Edition for 2021
Will you stay one step ahead of Cyber Scrooge this year? Learn new ways to protect your family, job, company & data. December Cyber Defense eMagazine: Cyber Deception Month is here...Defeat Cyber Scrooge!
Cyber Defense Magazine December Edition for 2021 in online format #CDM #CYBERDEFENSEMAG @CyberDefenseMag by @Miliefsky, a world-renowned cyber security expert and the Publisher of Cyber Defense Magazine as part of the Cyber Defense Media Group, as well as Yan Ross, US Editor-in-Chief, Pierluigi Paganini, International Editor-in-Chief, and many more writers, partners and supporters who make this an awesome publication! Thank you all and to our readers! OSINT ROCKS! #CDM #CDMG #OSINT #CYBERSECURITY #INFOSEC #BEST #PRACTICES #TIPS #TECHNIQUES
See you at RSA Conference 2022 - Our 10th Year Anniversary - Our 10th Year @RSAC #RSACONFERENCE #USA - Thank you so much!!! - Team CDMG
CDMG is a Carbon Negative and Inclusive Media Group.
4 Pillars of Privacy-Preserving AI

Understanding the privacy challenges that chatbots face requires, first and foremost, an understanding of the privacy challenges that machine learning systems face in general. There are four pillars to privacy-preserving AI:

1) Training data privacy: making sure that sensitive or personal information within the training data cannot be reconstructed;

2) Input privacy: protecting the privacy of the individual whose data you're inferring upon;

3) Model weights privacy: protecting the model of the particular corporation, institution, or individual who created it. This is about IP protection, but also about training data privacy, since it is possible to determine information about the training data from model weight updates;

4) Output privacy: also about protecting the privacy of the individual whose data you're inferring upon.
By collecting private conversations with identifiable individuals and training their models on them, ScatterLab first violated (2) input privacy, then (1) training data privacy, and possibly (4) output privacy.
Training Data Privacy
Much research and development these days focuses on training data privacy, in part because of how likely deep learning models are to memorize training data, with the potential of spewing it out in production to unknown parties. The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks [Nicholas Carlini, Chang Liu, Úlfar Erlingsson, Jernej Kos, and Dawn Song. 2019. The secret sharer: Evaluating and testing unintended memorization in neural networks. In 28th USENIX Security Symposium, pages 267–284, Santa Clara, CA. USENIX Association.] by Carlini et al. (2019) is a pivotal paper discussing the problem. The authors placed a fake social security number into the Penn Treebank dataset as a canary and then trained a character-level language model on the dataset. They then measured the perplexity of various sequences of numbers and found that the model was less surprised to see the sequences of numbers that made up the canary; i.e., given the training data, the language model had recorded that the canary was more likely to be encountered than other random numbers. This is a problem because it shows that the language model memorized the secret.
Another paper, Extracting Training Data from Large Language Models by Carlini et al. (2020), demonstrates how GPT-2 was actually memorizing data from its pre-training dataset. [Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, et al. 2020. Extracting training data from large language models. arXiv preprint arXiv:2012.07805.] It had memorized addresses, names, and other information that could be considered sensitive had the data not been publicly available. It is important to keep in mind that these very models will memorize that same kind of information from chatbot training data. The paper showed that an extra-large GPT-2 model already started memorizing information after seeing only 33 examples.
Privacy issues have also been raised about training non-contextual word embeddings on data containing sensitive information in Exploring the privacy-preserving properties of word embeddings: Algorithmic
Cyber Defense eMagazine – December 2021 Edition
Copyright © 2021, Cyber Defense Magazine. All rights reserved worldwide.