Budapest University of Technology and Economics
Department of Telecommunications

Privacy enhancing protocols for wireless networks

Ph.D. Dissertation
of
Tamás Holczer

Supervisor:
Levente Buttyán, Ph.D.

TO BE ON THE SAFE SIDE

Budapest, Hungary
2012
Alulírott Holczer Tamás kijelentem, hogy ezt a doktori értekezést magam készítettem, és abban csak a megadott forrásokat használtam fel. Minden olyan részt, amelyet szó szerint, vagy azonos tartalomban, de átfogalmazva más forrásból átvettem, egyértelműen, a forrás megadásával megjelöltem.

I, the undersigned Tamás Holczer, hereby declare that this Ph.D. dissertation was made by myself, and that I only used the sources given at the end. Every part that was quoted word-for-word, or was taken over with the same content, is explicitly noted by giving the reference of the source.

A dolgozat bírálatai és a védésről készült jegyzőkönyv a Budapesti Műszaki és Gazdaságtudományi Egyetem Villamosmérnöki és Informatikai Karának dékáni hivatalában elérhetőek.

The reviews of the dissertation and the report of the thesis discussion are available at the Dean’s Office of the Faculty of Electrical Engineering and Informatics of the Budapest University of Technology and Economics.

Budapest, . . . . . . . . . . . . . . . . . . . . . . . .

Holczer Tamás
Abstract

Wireless networks are used in our everyday life. We use wireless networks to call each other, to download our emails at home, or to enter a building with a proximity card. In the near future, wireless networks will be used in many new fields, such as vehicular ad hoc networks or critical infrastructure protection.

The use of wireless networks instead of wired networks opens up new research challenges. These challenges include mobility, coping with unreliable links, resource constraints, and the security and privacy aspects of wireless networks. In this thesis, some privacy aspects of different wireless networks are investigated.

In Chapter 2, private authentication methods are proposed and analyzed for radio frequency identification (RFID) systems, where the provers are low-cost RFID tags and the number of tags can potentially be very large. I study the problem of private authentication in such systems, and propose two methods: privacy-efficient key-tree based authentication and group based authentication.
The first key-tree based private authentication protocol was proposed by Molnar and Wagner as a neat way to efficiently solve the problem of privacy preserving authentication based on symmetric key cryptography. However, in the key-tree based approach, the level of privacy provided by the system to its members may decrease considerably if some members are compromised. In this thesis, I analyze this problem and show that careful design of the tree can help to minimize this loss of privacy. First, I introduce a benchmark metric for measuring the resistance of the system to a single compromised member. This metric is based on the well-known concept of anonymity sets. Then, I show how the parameters of the key-tree should be chosen in order to maximize the system’s resistance to single member compromise under some constraints on the authentication delay. In the general case, when any member can be compromised, I give a lower bound on the level of privacy provided by the system. I also present simulation results showing that this lower bound is quite sharp. The results of Chapter 2 can be directly used by system designers to construct optimal key-trees in practice.
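The trade-off can be made concrete with a small illustrative sketch in Python. This is my own hedged formalization, not the exact definitions of Chapter 2: the function names, the normalization of the metric, and the cost model are assumptions. The idea is that compromising one leaf of a key-tree reveals the keys on its path, which splits the remaining members into anonymity sets, one per tree level:

```python
import math

def partition_after_compromise(branching):
    """Anonymity-set sizes after one member's key path is revealed
    in a key-tree with the given branching factor vector."""
    sizes = []
    for i, b in enumerate(branching):
        # members sharing the path above level i, but not the level-i key
        sizes.append((b - 1) * math.prod(branching[i + 1:]))
    sizes.append(1)  # the compromised member itself is fully identifiable
    return [s for s in sizes if s > 0]

def resistance(branching):
    """Expected normalized anonymity-set size (one plausible metric)."""
    n = math.prod(branching)
    return sum(s * s for s in partition_after_compromise(branching)) / (n * n)

def auth_cost(branching):
    """Worst-case number of keys the reader tries during one authentication."""
    return sum(branching)
```

For four members, the flat tree `[4]` and the binary tree `[2, 2]` both cost 4 key trials, but `resistance([4])` is 0.625 while `resistance([2, 2])` is only 0.375: the deeper tree leaks more when a member is compromised, which is exactly the tension the tree-design optimization addresses.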
In the second part of Chapter 2, I propose a novel group based authentication scheme similar to the key-tree based method. This scheme is also based on symmetric-key cryptography, and therefore, it is well-suited to resource constrained applications in large scale environments. I analyze the proposed scheme and show that it is superior to the previous key-tree based approach for private authentication both in terms of privacy and efficiency.
In Chapter 3, I analyze the privacy consequences of inter-vehicular communication. The promise of vehicular communications is to make road traffic safer and more efficient. However, besides the expected benefits, vehicular communications also introduce a privacy risk by making it easier to track the physical location of vehicles. One approach to solve this problem is that the vehicles use pseudonyms that they change with some frequency. In this chapter, I study the effectiveness of this approach. I define a model based on the concept of the mix zone, characterize the tracking strategy of the adversary in this model, and introduce a metric to quantify the level of privacy enjoyed by the vehicles. I also report on the results of an extensive simulation where I used my model to determine the level of privacy achieved in realistic scenarios. In particular, in my simulation, I used a rather complex road map, generated traffic with realistic parameters, and varied the strength of the adversary by varying the number of her monitoring points. My simulation results provide information about the relationship between the strength of the adversary and the level of privacy achieved by changing pseudonyms.
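The kind of privacy metric referred to here can be illustrated with a short sketch (hypothetical code; the exact metric of Chapter 3 may be defined differently). After a vehicle traverses a mix zone, the adversary holds a posterior distribution over the exiting vehicles, and the entropy of that posterior is a natural measure of the privacy gained:

```python
import math

def privacy_bits(posterior):
    """Shannon entropy (in bits) of the adversary's posterior over
    which exiting vehicle corresponds to the tracked entering vehicle;
    0 bits means the adversary tracks the vehicle with certainty."""
    assert abs(sum(posterior) - 1.0) < 1e-9, "posterior must sum to 1"
    return -sum(p * math.log2(p) for p in posterior if p > 0)
```

With four equally likely exit candidates, the tracked vehicle enjoys `privacy_bits([0.25, 0.25, 0.25, 0.25])` = 2 bits of privacy; adding monitoring points typically skews the posterior and drives this value toward zero.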
From the first half of Chapter 3, it can be seen that untraceability of vehicles is an important requirement in future vehicle communication systems. Unfortunately, the heartbeat messages used by many safety applications provide a constant stream of location data, and without any protection measures, they make tracking of vehicles easy even for a passive eavesdropper. Considering a global attacker, changing pseudonyms is effective only if a silent period is kept during the pseudonym change, and several vehicles change their pseudonyms nearly at the same time and at the same location. Unlike other works that proposed explicit synchronization between a group of vehicles and/or required pseudonym change in a designated physical area (i.e., a static mix zone), I propose a much simpler approach that needs neither explicit cooperation between vehicles nor any infrastructure support. My basic idea is that vehicles should not transmit heartbeat messages when their speed drops below a given threshold, and they should change pseudonym during each such silent period. This ensures that vehicles stopping at traffic lights or moving slowly in a traffic jam will all refrain from transmitting heartbeats and change their pseudonyms nearly at the same time and location. Thus, my scheme ensures both silent periods and synchronized pseudonym change in time and space, but it does so in an implicit way. I also argue that the risk of a fatal accident at a slow speed is low, and therefore, my scheme does not seriously impact safety of life. In addition, refraining from sending heartbeat messages when moving at low speed also relieves vehicles of the burden of verifying a potentially large number of digital signatures, and thus makes it possible to implement vehicle communications with less expensive equipment.
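The rule described above is simple enough to sketch in a few lines of Python. This is an illustrative toy under my own naming (`SlowNode` and the 30 km/h default threshold are assumptions); the actual SLOW algorithm is specified and analyzed in Chapter 3:

```python
class SlowNode:
    """Toy model of the SLOW idea: stay silent below a speed threshold
    and switch to a fresh pseudonym before transmissions resume."""

    def __init__(self, pseudonyms, threshold_kmh=30):
        self.threshold = threshold_kmh
        self.pseudonyms = iter(pseudonyms)   # pre-loaded pseudonym pool
        self.current = next(self.pseudonyms)
        self.pending_change = False

    def heartbeat(self, speed_kmh):
        """Return the heartbeat to broadcast, or None while silent."""
        if speed_kmh < self.threshold:
            self.pending_change = True       # change during the silent period
            return None                      # no heartbeats while slow
        if self.pending_change:              # speed rose again: resume under
            self.current = next(self.pseudonyms)  # a brand new pseudonym
            self.pending_change = False
        return ("heartbeat", self.current, speed_kmh)
```

A vehicle cruising at 50 km/h broadcasts under its first pseudonym, falls silent while queuing at a red light, and reappears under the next one, together with every other vehicle that stopped at the same intersection.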
In Chapter 4, I propose protocols that increase the dependability of wireless sensor networks, which are potentially useful building blocks in cyber-physical systems. Wireless sensor networks can be used in many critical applications, such as military or critical infrastructure protection scenarios. In such critical scenarios, the dependability of the monitoring sensor network can be crucial. One interesting aspect of the dependability of a network is how the network can hide its nodes with specific roles from an eavesdropping or active attacker.

In this problem field, I propose protocols that can hide some important nodes of the network. More specifically, I propose two privacy preserving aggregator node election protocols, a privacy preserving data aggregation protocol, and a corresponding privacy preserving query protocol for sensor networks that allow for secure in-network data aggregation by making it difficult for an adversary to identify and then physically disable the designated aggregator nodes. The basic protocol can withstand a passive attacker, while my advanced protocols resist strong adversaries that can physically compromise some nodes. The privacy preserving aggregator election protocols allow electing aggregator nodes within the network without leaking any information about the identity of the elected nodes. The privacy preserving aggregation protocol helps the elected aggregator nodes collect data without revealing which node is actually collecting it. The privacy preserving query protocol enables an operator to collect the aggregated data from the unknown and anonymous aggregators without leaking the identity of the aggregating nodes.
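As a toy illustration of the election goal (explicitly not one of the protocols of Chapter 4, which are more involved and also handle compromised nodes), a node can decide locally and deterministically whether it serves as aggregator in a given round, so that a passive eavesdropper sees no election traffic at all. All names and parameters below are invented:

```python
import random

def self_elects(node_key, round_nr, p=0.1):
    """Decide locally whether this node acts as aggregator in the given
    round. No messages are exchanged, so the election itself leaks
    nothing; a sink knowing every node_key can recompute the outcome."""
    rng = random.Random(f"{node_key}:{round_nr}")  # node-private seed
    return rng.random() < p
```

Each node holds a secret `node_key`; since the decision is a deterministic function of the key and the round number, the operator can recompute who the aggregators are without ever querying them over the air.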
Kivonat

Vezeték nélküli hálózatok a mindennapi élet részét képezik. Ilyen hálózatokat használhatunk például telefonálásra, Interneten elérhető szolgáltatások igénybevételére, vagy kontaktusmentes kártyás beléptető rendszerekben. A közeljövőben a felhasználási területek jelentős mértékben ki fognak bővülni: többek között a gépjárművek is így fognak kommunikálni egymással, és a technológia szerepet fog kapni a kritikus infrastruktúrák védelmében is.

A vezeték nélküli hálózatok széleskörű használata új kutatási problémákat vet fel. Ilyen új problémakör a mobilitás, a megbízhatatlan kapcsolatok kezelése, a szűkös erőforrásokból származó problémák és kihívások, vagy az adatvédelmi és adatbiztonsági kérdések kutatása. Ebben a disszertációban különböző vezeték nélküli hálózatok adatvédelmi kérdéseit vizsgálom.

A disszertáció 2. fejezetében privát hitelesítési módszereket vizsgálok rádiófrekvenciás azonosítási problémák kezelésére. Tipikus alkalmazási terület az RFID rendszerek köre, ahol potenciálisan rengeteg felhasználó olcsó RFID kártyák segítségével hitelesíti magát egy olvasó felé. A két hitelesítési mód a kulcsfa alapú, illetve a csoport alapú azonosítás.

Az első kulcsfa alapú privát hitelesítési protokollt Molnar és Wagner javasolta. Ez a módszer egy hatékony, szimmetrikus kulcsú privát hitelesítő protokoll volt, amely nagyon jól működik mindaddig, amíg nem kompromittálódnak valamelyik felhasználó titkos kulcsai. Ekkor nemcsak a kompromittálódott felhasználó élvez kisebb anonimitást, de az összes többi felhasználó anonimitása is sérül.
A disszertáció 2. fejezetében azt elemzem, hogy a fa paramétereinek gondos megválasztása hogyan tudja minimalizálni az elveszett anonimitást. Először is definiálok egy mértéket, ami azt méri, hogy milyen hatása van annak, ha egy felhasználó kompromittálódik a rendszerben. Ez a mérték az anonimitási halmaz jól ismert fogalmára épül. Ezután megmutatom, hogy kell a kulcsfa paramétereit megválasztani úgy, hogy az előbb definiált mértékben minimális legyen a kompromittálódásból származó veszteség bizonyos külső kényszerek teljesülése mellett. Általános esetben, ahol nem csak egy felhasználó kompromittálódhat, hanem több is, alsó becslést adok a rendszer által biztosított anonimitási szintre. Szimulációkkal megmutatom, hogy ez az alsó becslés jellemzően pontos. A fejezet eredményei közvetlenül felhasználhatók rendszertervezéskor, amikor meg kell találni a feladatnak legjobban megfelelő kulcsfát.

A 2. fejezet második részében egy új, csoport alapú privát hitelesítési módszert javaslok. Ez a módszer is szimmetrikus kulcsokon alapul, így jól alkalmazható erőforrás-korlátozott eszközök esetén is. A fejezetben elemzem a javasolt megoldást, és megmutatom, hogy bizonyos tipikus esetekben jobban működik, mint a fejezet elején bevezetett kulcsfa alapú módszer.
A 3. fejezetben a járműközi kommunikáció adatvédelmi következményeit elemzem. A közeljövőben megvalósuló járműközi kommunikáció biztonságosabb és hatékonyabb közlekedést tesz lehetővé, ugyanakkor egyszerűbbé teszi a járművek követését is, ami jelentősen sértheti a járművezetők privát szféráját. Egy lehetséges megoldás a problémára, ha a járművek nem állandó azonosítókat használnak a kommunikációjuk során, hanem álneveket, amelyeket gyakran le tudnak cserélni. Ebben a fejezetben ennek a megoldásnak a hatékonyságát elemzem. Először egy mix zóna alapú modellt alkotok. Ebben a modellben definiálom a támadó követési stratégiáját, és definiálom a mértéket, ami azt méri, hogy az egyes járművek mennyire követhetők. Ezek után megvizsgálom a modellt egy részletes szimulációban. A szimuláció folyamán egy komplex térképen valósághűen közlekednek járművek, és vizsgálom a forgalom és a támadó erősségének hatását a követhetőségre.
Ahogy a 3. fejezet első részéből látszik, a járművek követhetősége fontos szempont a járműközi kommunikációban. Sajnálatos módon, ahogy láttuk, a folytonosan adott helyzetjelentések könnyen követhetővé teszik a járműveket. Általános megoldás a problémára, ha a járművek váltogatják az azonosítójukat. Ez a váltás persze csak akkor tud hatékony lenni, ha a két különböző azonosító használata között eltelik legalább egy kis idő, amikor a jármű nem ad semmit, és egyszerre több, egymás közelében lévő jármű vált azonosítót. Míg a legtöbb megoldás bonyolult szinkronizációt ír elő, vagy csak statikusan kijelölt helyeken engedi a cserét, addig az én megoldásom ennél sokkal egyszerűbb. Ebben a megoldásban nincs szükség explicit kooperációra vagy külső infrastruktúrára: a járművek egyszerűen abbahagyják az adást egy bizonyos sebesség alatt, majd amikor újra átlépik ezt a küszöbsebességet, ismét elkezdenek adni, de már az új azonosítóval. Ezáltal a közlekedési lámpánál várakozó vagy dugóban araszoló járművek egyszerre maradnak csöndben és cserélnek azonosítót. Így ez a módszer egyszerűen garantálja a szükséges csöndes periódust, és helyileg és időben szinkronizált cserét valósít meg explicit szinkronizáció nélkül. Ez a módszer egyrészt azért szerencsés, mert alacsony sebességnél kicsi az esély súlyos balesetre, tehát épp akkor nem ad jeleket a jármű, amikor nincs is rá szükség; másrészt az egymáshoz közel araszoló járművek nagyon nagy mennyiségű feldolgozandó adatot generálnának, ami így szintén elkerülhető.
A disszertáció 4. fejezetében protokollokat javaslok, amelyek növelni tudják egy vezeték nélküli szenzorhálózat megbízhatóságát. Vezeték nélküli szenzorhálózatokat fel lehet használni kritikus feladatokra is, mint például hadászati vagy kritikus infrastruktúra védelmi alkalmazásokra. Ilyen kritikus feladatokban nagyon fontos lehet a kiemelt szerepű node-ok védelme, illetve elrejtése a támadók elől.

Ezen problématerületen belül javaslok protokollokat, amelyek el tudják rejteni a kulcsfontosságú eszközök kilétét. Pontosabban két privát aggregátorválasztó protokollt, egy privát aggregáló és egy privát lekérdező protokollt javaslok, amelyek használata esetén a szenzorhálózatban támadók nem tudják azonosítani az aggregátor eszközöket. A két megoldás közül az egyszerűbb protokoll passzív lehallgatás ellen nyújt biztonságot, míg a komplexebb protokoll aktív támadások ellen is védelmet nyújt.
Acknowledgement

First of all, I would like to express my gratitude to my supervisor, Professor Levente Buttyán, Ph.D., Department of Telecommunications, Budapest University of Technology and Economics. He gave me guidance in selecting problems to work on, helped in elaborating the problems, and pushed me to publish the results. All three steps were needed to finish this thesis.

I am also grateful to the current and former members of the CrySyS Laboratory: Boldizsár Bencsáth, László Czap, László Csík, László Dóra, Amit Dvir, Gergely Kótyuk, Áron Lászka, Gábor Pék, Péter Schaffer, Vinh Thong Ta, and István Vajda for the illuminating discussions on the different technical problems that I encountered during my research. They also provided a pleasant atmosphere which was a pleasure to work in.

I would also like to thank Petra Ardelean, Naim Asaj, Gildas Avoine, Danny De Cock, Stefano Cosenza, Amit Dvir, László Dóra, Julien Freudiger, Albert Held, Jean-Pierre Hubaux, Frank Kargl, Antonio Kung, Zhendong Ma, Michael Müter, Panagiotis Papadimitratos, Maxim Raya, Péter Schaffer, Elmar Schoch, István Vajda, Andre Weimerskirch, William Whyte, and Björn Wiedersheim for our joint efforts and publications.

The financial support of the Mobile Innovation Centre (MIK) and the support of the SEVECOM (FP6-027795) and WSAN4CIP (FP7-225186) EU projects are gratefully acknowledged.

And last but not least, my thanks go to my wife Nóra, who accepted my being a Ph.D. student. I know it was not always easy.
Contents

1 Introduction 1
1.1 Introduction to RFID systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Introduction to Vehicular Ad Hoc Networks . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Introduction to Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . 4

2 Private Authentication 9
2.1 Introduction to private authentication . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Resistance to single member compromise . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Optimal trees in case of single member compromise . . . . . . . . . . . . . . . . . . 14
2.4 Analysis of the general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 The group-based approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6 Analysis of the group based approach . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.7 Comparison of the group and the key-tree based approach . . . . . . . . . . . . . . 26
2.8 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.10 Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3 Location Privacy in VANETs 29
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Model of local attacker and mix zone . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2.1 The concept of the mix zone . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2.2 The model of the mix zone . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2.3 The operation of the adversary . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2.4 Analysis of the adversary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2.5 The level of privacy provided by the mix zone . . . . . . . . . . . . . . . . . 34
3.3 Simulation of mix zone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Simulation settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.2 Simulation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.4 Global attacker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.5 Framework for location privacy in VANETs . . . . . . . . . . . . . . . . . . . . . . 37
3.6 Attacker Model and the SLOW algorithm . . . . . . . . . . . . . . . . . . . . . . . 38
3.7 Analysis of SLOW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.7.1 Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.7.2 Effects on safety . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.7.3 Effects on computation complexity . . . . . . . . . . . . . . . . . . . . . . . 44
3.8 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.10 Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4 Anonymous Aggregator Election and Data Aggregation in WSNs 49
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2 System and attacker models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3 Basic protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3.1 Protocol description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3.2 Protocol analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3.3 Data forwarding and querying . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.4 Advanced protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.4.1 Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.4.2 Data aggregator election . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4.3 Data aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.4.4 Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4.5 Misbehaving nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.5 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.7 Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5 Application of new results 75

6 Conclusion 77
List of Figures

2.1 Illustration of a key-tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Illustration of single member compromise . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Illustration of several members compromise . . . . . . . . . . . . . . . . . . . . . . 20
2.4 Simulation results for branching factor vectors . . . . . . . . . . . . . . . . . . . . . 22
2.5 System comparison based on approximation . . . . . . . . . . . . . . . . . . . . . . 23
2.6 Operation of the group-based private authentication scheme . . . . . . . . . . . . . 24
2.7 Tree and group based authentication . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.8 Simulation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.1 Mix and observed zone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 Simplified map of Budapest generated for the simulation . . . . . . . . . . . . . . . 35
3.3 Success probabilities of the adversary . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4 Results of the simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.5 Success rate of a tracking attacker . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.6 Example intersection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.7 Success rate of the simple attacker . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.8 Success rate of the simple attacker . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.9 Number of signatures to be verified . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.1 Result of aggregator election protocol . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Probability of being cluster aggregator . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.3 Probability of being cluster aggregator . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.4 Result of balancing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.5 Entropy of the attacker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.6 Connected dominating set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.7 Aggregation example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.8 Query example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.9 Graphical representation of the suitable intervals . . . . . . . . . . . . . . . . . . . 69
4.10 Misbehavior detection algorithm for the query protocol . . . . . . . . . . . . . . . 71
List of Tables

2.1 Illustration of the operation of the recursive function f . . . . . . . . . . . . . . . . 19
3.1 Notation in SLOW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.1 Estimated time of the building blocks on a Crossbow MICAz mote . . . . . . . . . 55
4.2 Optimal γ values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3 Summary of complexity of the advanced protocol . . . . . . . . . . . . . . . . . . . 61
List of Algorithms

1 Optimal branching factor generating algorithm . . . . . . . . . . . . . . . . . . . . 19
2 Basic private cluster aggregator election algorithm . . . . . . . . . . . . . . . . . . 54
Chapter 1

Introduction

In this dissertation, privacy enhancing protocols for wireless networks are proposed. In this chapter, a brief overview is given of those wireless networks to which the work presented in this dissertation is related, namely Radio Frequency Identification systems (RFID systems), Vehicular Ad Hoc Networks (VANETs), and Wireless Sensor Networks (WSNs). The privacy consequences of the usage of such networks and some related problems are sketched. The main reason for choosing these networks is that they are, or will potentially be, used by billions of users, so solving a problem related to these networks can affect the privacy of an extremely large number of users.
Wireless technology is a truly revolutionary paradigm shift, enabling multimedia communications between people and devices from any location. It also enables exciting applications such as sensor networks, smart homes, telemedicine, and automated highways. Comprehensive introductions to wireless networks can be found in [Goldsmith, 2005; Rappaport, 2001].
The security and privacy problems of wireless networks form a well-studied field; however, there are still many open questions worth working on. Overviews of security and privacy in wireless networks can be found in [Buttyán and Hubaux, 2008; Juels, 2006; Raya and Hubaux, 2007; Akyildiz et al., 2002].
A wireless network consists of nodes that can communicate through wireless channels. These channels include Infra Red (IR) or Radio Frequency (RF) channels. From the security point of view, the main difference between wireless and traditional wired networks is that a passive attacker can easily eavesdrop on the wireless channel without detection, while this is harder with wired networks. Harder here means that such attacks require physical access to the network (cables or network elements), and the lack of physical protection in the case of wireless networks makes these attacks easier to carry out. An active attacker can inject, modify, and delete messages in the air with some knowledge of the network and wireless technologies, while again this is harder in a traditional wired network.
In information technology, privacy is defined as the right of an entity to choose which information is revealed about the entity, what information is collected and stored, how that information is used, shared, or published, and also the right to keep control over that information (e.g., the right to delete data from a database if the user wishes to do so). Privacy actually has two facets: data control and data protection. One way to keep control is to keep data secret, e.g., to remain anonymous. According to [Pfitzmann and Köhntopp, 2001], anonymity is the state of being not identifiable within a set of subjects, the anonymity set. In the remaining part of the dissertation, I will use privacy with this information centric meaning, and decisional privacy¹ or intentional privacy² will not be discussed.
¹ This conception of privacy addresses issues related to an individual’s authority to make decisions that affect the individual’s life and body, and that of the individual’s family members, such as end of life issues. [ITLaw]
² This conception of privacy addresses issues related to intimate activities or characteristics that are publicly visible. [ITLaw]
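The Pfitzmann–Köhntopp notion of the anonymity set can be phrased operationally with a toy sketch (the predicate and all names below are invented for illustration): the anonymity set of an observation is simply the set of subjects the attacker cannot rule out.

```python
def anonymity_set(subjects, observation, consistent):
    """Subjects that could plausibly have caused the observation,
    i.e. those the attacker cannot rule out."""
    return {s for s in subjects if consistent(s, observation)}
```

A subject remains anonymous as long as this set stays large; perfect tracking corresponds to the set shrinking to a single element.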
In the remainder of this chapter, the three wireless networks I worked with in this dissertation are introduced.
1.1 Introduction to RFID systems

The following description of RFID systems and their security and privacy problems is based on [Juels, 2006; Langheinrich, 2009; Peris-Lopez et al., 2006]. The interested reader can get a broader view and deeper understanding of RFID systems by reading the cited papers instead of relying only on this short introduction.
RFID (Radio-Frequency IDentification) is a technology for the automated identification of objects or people. An RFID system consists of simple tags, readers, and backend servers. The tags carry unique identifiers, which are read by nearby readers over radio communication. The readers send the obtained identifiers to the backend servers. The goal of an RFID system is the unique identification of the holders of the tags.
Example applications of RFID systems include smart appliances, shopping, interactive objects, or medication compliance. This list can be expanded to hundreds of scenarios [Wu et al., 2009; RFID, 2012].
The main threats to privacy in RFID systems are tracking and inventorying. A tracking attacker can eavesdrop on message exchanges in different parts of the network. If the system is not defended against such attacks, the attacker can link different message exchanges of the same user, and hence can track the user. This is a very important concern in RFID systems, which is why the problem is discussed in Chapter 2 (the problem of tracking is actually not unique to RFID systems, and I will study it in a different context, namely vehicular networks, in Chapter 3).
Inventorying is an attack specific to RFID systems. It relies on the assumption that in the near future most of our objects will be tagged with remotely readable RFID tags. An attacker carrying out an inventorying attack can learn exactly what a user wears or has in her pockets or bag, without the consent of the user.
In Chapter 2, two private authentication methods are given that make it difficult for an attacker to carry out tracking and inventorying attacks.
Another important field of RFID security problems is the authenticity of the tags. In short, the privacy problem is related to malicious readers, while the authenticity problem is related to malicious tags. The main problem is that illegitimate tags can be counterfeited to obtain the same rights as legitimate tags hold. In the following, I assume the presence of malicious readers, but malicious tags are not considered.
When considering the capabilities of RFID tags, the tags on the market can be classified into two main categories: basic tags with no real cryptographic capabilities, and advanced tags with some symmetric key cryptographic capabilities.
Basic tags<br />
Basic RFID tags lack the resources to perform true cryptographic operations. The lack of cryptography<br />
in basic RFID tags is a big impediment to security design; cryptography, after all, is the<br />
main building block of data security. The main approaches to provide privacy to basic tags are the<br />
following: killing, sleeping, renaming, proxying, distance measurement, blocking, and legislation.<br />
Killing and sleeping are very similar approaches. The basic idea is that an authenticated command can switch off the tag, either permanently (killing) or reversibly (sleeping).
Another approach is to divide the identifier space into two separate parts by a modifiable<br />
privacy bit [Juels et al., 2003; Juels and Brainard, 2004]. The two parts are the private and the<br />
public parts. A blocker device can make the scanning of private tags infeasible, and the tags can<br />
be moved between the public and private zones on demand. Another device-based solution is proxying, where the holder of the tag can use a separate device (like a mobile phone) to enforce privacy [Floerkemeier et al., 2005; Juels et al., 2006; Rieback et al., 2005].
The tracking problem stems from the fact that tags use static identifiers. Some proposals suggest that readers rename the tags [Instruments, 2005], or that the tag itself rotates pseudonyms [Juels, 2005a] to make tracking harder. With distance measurement, a tag can roughly estimate its distance to the reader by measuring the signal-to-noise ratio of the channel [Fishkin et al., 2005]. This can be used to avoid distant aggressive scanning.
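The pseudonym-rotation idea can be sketched as follows. This is a toy, single-process illustration; the class and function names are mine and not taken from [Juels, 2005a]: the tag cycles through a small fixed list of random-looking pseudonyms, and only the legitimate reader holds the pseudonym-to-identity map.

```python
import secrets

class RotatingTag:
    # Illustrative tag that emits a different pseudonym on each query
    # instead of a static identifier.
    def __init__(self, pseudonyms):
        self.pseudonyms = pseudonyms
        self.index = 0

    def respond(self):
        p = self.pseudonyms[self.index]
        self.index = (self.index + 1) % len(self.pseudonyms)
        return p

def enroll(identity, k=4):
    # Reader-side setup: each tag identity owns k random pseudonyms;
    # the directory maps every pseudonym back to the identity.
    pseudonyms = [secrets.token_hex(8) for _ in range(k)]
    return RotatingTag(pseudonyms), {p: identity for p in pseudonyms}

tag, directory = enroll("tag-42")
seen = [tag.respond() for _ in range(8)]
# A passive eavesdropper sees 4 distinct values instead of one static
# ID; the reader resolves every response to the same identity.
assert len(set(seen)) == 4
assert all(directory[p] == "tag-42" for p in seen)
```

An attacker without the directory cannot link two different pseudonyms of the same tag; the scheme only raises the bar, though, since observing a full rotation cycle re-enables linking.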
A non-technical approach is legislation: there are efforts to regulate the usage of RFID tags from the privacy point of view [Kelly and Erickson, 2005], but these efforts are far from complete. Ultimately, this approach may be more effective and cost-efficient than any other (e.g., from an economic aspect, tracking is not worthwhile if the tracker can go to jail for doing so). The authentication of basic tags is as hard as providing privacy for them. There is some work [Juels, 2005b] on how the kill PIN can be used to authenticate the tags.
Advanced tags<br />
Advanced tags are capable of simple symmetric key operations. However, weak cryptographic algorithms are targets of successful attacks [Bono et al., 2005]. Another attack type against cryptographically enabled tags is the man-in-the-middle (MiM) attack. In a MiM attack, the attacker relays messages between the tag and the reader, and by doing so he can modify, delete, and inject messages in their communication. This can be done even if the tag and the reader are not in each other's vicinity [Hancke, 2005; Kfir and Wool, 2005].
The privacy of advanced tags is analyzed in depth in Chapter 2. In short, the problem is that the tag is not allowed to send its identifier in order to avoid tracking; therefore, the reader needs many trials to find the right decryption key.
The computational burden on the reader can be partly alleviated with key-trees [Molnar and<br />
Wagner, 2004], synchronization [Ohkubo et al., 2004], or time-memory tradeoffs [Avoine et al.,<br />
2005; Avoine and Oechslin, 2005]. However, all known mitigation techniques lead to degradation<br />
of privacy or efficiency. The degradation of privacy is analyzed in Chapter 2, where efficient<br />
solutions are also proposed.<br />
1.2 Introduction to Vehicular Ad Hoc Networks<br />
The following description of Vehicular Ad Hoc Networks and their security and privacy properties<br />
is based on [Raya and Hubaux, 2005; Raya and Hubaux, 2007; Lin et al., 2008; Blum et al., 2004b;<br />
Dötzer, 2006]. The interested reader can get a broader view and deeper understanding on VANETs<br />
by reading the cited papers instead of only relying on this short introduction.<br />
The main motivations for VANETs are to enhance traffic safety and traffic efficiency, to assist drivers, and to enable infotainment applications. A VANET consists of vehicles equipped with On Board Units (OBUs) and wireless communication equipment, Road Side Units (RSUs), and a backend infrastructure. The vehicles regularly exchange messages with each other and with the infrastructure over wireless communication to achieve the main goals, such as safer roads.
The main vulnerabilities of VANETs come from the wireless nature of the communication and from the sensitive information, such as the location of users, handled by the network. One major vulnerability comes from the wireless nature of the system: the communication can easily be jammed, and messages can be forged. Another problem related to wireless communication is that nodes relaying messages can modify them; this is called In-Transit Traffic Tampering. Yet another problem is that vehicles can impersonate other vehicles with higher privileges, such as emergency vehicles, to gain extra privileges. The problem most relevant to this dissertation is that the privacy of the drivers of the vehicles can be violated. This vulnerability is analyzed in Chapter 3. In general, an attacker can achieve her goals by tampering with the OBU, an RSU, sensor readings, or the wireless channel.
Traditional mechanisms cannot deal with the vulnerabilities discussed above because of the new challenges posed by VANETs. One such challenge is the high network volatility caused by the highly mobile, very large-scale network. Another challenge is that the network must offer liability and privacy at the same time in an efficient way, as the applications are delay sensitive. To make things even worse, the network is very heterogeneous: different vehicles can have different equipment and abilities, so no single solution can solve every problem.
When defining the key vulnerabilities and challenges of vehicular ad hoc networks, it is crucial to first define and characterize the possible attackers. In many papers [Raya and Hubaux, 2007; Hu et al., 2005], the attacker is characterized along the following dimensions:
Insider vs. Outsider: The key difference between an insider and an outsider attacker is that an insider possesses legitimate and valid cryptographic credentials, while an outsider does not have any valid credentials. It is obvious that an insider attacker can mount stronger attacks than an outsider.
Malicious vs. Rational: The main goal of a malicious attacker is to disrupt the normal operation<br />
of the network without any further goal, while a rational attacker wants to make some<br />
profit with his attack. In general, it is easier to handle a rational attacker, because his steps<br />
can be foreseen easier.<br />
Active vs. Passive: A passive attacker only eavesdrops the messages of the vehicles, while an<br />
active attacker can send, modify, or delete messages.<br />
Local vs. Global: A local attacker mounts his attack on a small area (or on some non-contiguous small areas), while a global attacker has influence over broader areas.
In the following, some basic and sophisticated attacks are presented to give the reader an idea<br />
about the threats in vehicular ad hoc networks.<br />
An insider attacker can disseminate bogus information to affect the behavior of other drivers. The source of the information can be a cheated sensor reading or modified location data.
In wireless networking, the wormhole attack [Hu et al., 2006] consists of tunneling packets between two remote nodes. Similarly, in VANETs, an attacker that controls at least two entities remote from each other, and a high-speed communication link between them, can tunnel packets broadcast in one location to another, thus disseminating erroneous (but correctly signed) messages in the destination area.
According to [Kroh et al., 2006], the following security concepts must be used in a vehicular ad hoc network to handle most of the possible attacks: identification and authentication concepts, privacy concepts, integrity concepts, and access control and authorization concepts. The concepts are
introduced in Section 3.8 with a special attention on providing privacy to the users of the system.<br />
In Chapter 3, the privacy of VANETs is analyzed, in particular the privacy provided by pseudonyms against outsider, rational, passive, local attackers. A pseudonym change algorithm is also provided, designed against an outsider, rational, passive, global attacker.
1.3 Introduction to Wireless Sensor Networks<br />
The following description of Wireless Sensor Networks (WSNs) and the related security problems<br />
is based on [Akyildiz et al., 2002; Chan and Perrig, 2003; Li et al., 2009; Lopez, 2008; Perrig et<br />
al., 2004; Sharma et al., 2012; Yick et al., 2008]. The interested reader can get a broader view and<br />
deeper understanding on WSNs by reading the cited papers instead of only relying on this short<br />
introduction.<br />
A sensor network is composed of a large number of sensor nodes, which are typically densely deployed. A sensor node consists of a sensor circuit that can measure some environmental variable, a central processing unit, typically a microcontroller, and a radio circuit that enables communication with other nearby nodes. A wireless sensor network can serve one of many applications: military applications (e.g., battlefield surveillance), environmental applications (e.g., forest fire detection), critical infrastructure protection (e.g., surveillance of water pipes), health applications (e.g., drug administration in hospitals), or home applications (e.g., smart environments).
Some important security challenges in WSNs are secure routing, secure key management, efficient (broadcast) authentication, secure localization, and secure data aggregation. A good introduction to these problems and some countermeasures can be found in [Lopez, 2008].
The privacy-related challenges can be categorized into two main groups [Li et al., 2009]: data-oriented and context-oriented challenges. In data-oriented protection, the confidentiality of the measured data must be preserved. Context-oriented protection covers the location privacy of the source and of significant nodes such as the base station or aggregator nodes:
Data-oriented privacy protection: Data-oriented privacy protection focuses on protecting the privacy of data content. Here "data" refers not only to sensed data collected within a WSN but also to queries posed to a WSN by users.
– Privacy protection during data aggregation: Data aggregation is designed to<br />
substantially reduce the volume of traffic being transmitted in a WSN by fusing or<br />
compressing data in the intermediate sensor nodes (called aggregators). It is an important<br />
technique for preserving resources (e.g., energy consumption) in a WSN. Interestingly,<br />
it is also a common and effective method to preserve private data against<br />
an external adversary, because the process compresses large inputs to small outputs at<br />
the intermediate sensor nodes. On the other hand, a malicious aggregator can modify the measurements of many nodes in one step, or can learn the individual measurements of individual nodes. Some countermeasures are proposed in [He et al., 2007; Zhang et al., 2008].
Cluster-based privacy data aggregation (CPDA): The basic idea of CPDA [He et al., 2007] is to add noise to the raw data sensed in a WSN, such that an aggregator can obtain accurate aggregated information but not individual data points.
Slice-mixed aggregation (SMART): The main idea of SMART [He et al., 2007]<br />
is to slice original data into pieces and recombine them randomly. This is done in<br />
three phases: slicing, mixing, and aggregation.<br />
Generic privacy-preservation solutions for approximate aggregation (GP²S): The basic idea of GP²S [Zhang et al., 2008] is to generalize the values of data transmitted in a WSN, such that although individual data content cannot be decrypted, the aggregator can still obtain an accurate estimate of the histogram of the data distribution, and thereby approximate the aggregates.
– Private data query: The query issued to a WSN (to retrieve the collected data) often also raises critical privacy concerns. To address this challenge, a target-region transformation technique was proposed in [Carbunar et al., 2007] to obfuscate the target region of the query according to predefined transformation functions.
Context-oriented privacy protection: Context-oriented privacy protection focuses on<br />
protecting contextual information, such as the location and timing information of traffic<br />
transmitted in a WSN. Location privacy concerns may arise for such special sensor nodes as<br />
the data source and the base station. Timing privacy, on the other hand, concerns the time<br />
when sensitive data is created at a data source, collected by a sensor node and transmitted<br />
to the base station.<br />
– Location privacy: A major challenge for context-oriented privacy protection is that an adversary may be able to compromise private information even without the ability to decrypt the transmitted data. In particular, since hop-by-hop transmission is required to address the limited transmission range of sensor nodes, an adversary may derive the locations of important nodes and data sources by observing and analyzing the traffic patterns between different hops.
Location privacy of data source: In event-driven networks, an event is generated if something interesting happens in the vicinity of a node. In some networks, the only data sent to the base station is the occurrence of the event; thus the mere presence of communication reveals the location of the event. In some situations, this must be hidden from an attacker. Some approaches are described in the following:
Baseline and probabilistic flooding mechanisms: The basic idea of baseline<br />
flooding is for each sensor to broadcast the data it receives from one neighbor<br />
to all of its other neighbors. The premise of this approach is that all sensors participate in the data transmission, so that it is unlikely for an attacker to track a path of transmission back to the data source [Kamat et al., 2005]. This can be further optimized if not every node rebroadcasts the message, only a probabilistically chosen subset of them.
Random walk mechanisms: According to [Kamat et al., 2005], a random<br />
walk can be performed before the probabilistic flooding to further increase the<br />
uncertainty of the attacker. To improve on the simple random walk, a two-way greedy random walk (GROW) scheme was proposed in [Xi et al., 2006].
Dummy data mechanism: To further protect the location of the data source,<br />
fake data packets can be introduced to perturb the traffic patterns observed by<br />
the adversary. In particular, a simple scheme called Short-lived Fake Source<br />
Routing was proposed in [Kamat et al., 2005] for each sensor to send out a fake<br />
packet with a pre-determined probability.<br />
Fake data sources mechanism: The basic idea of fake data sources is to choose one or more sensor nodes to simulate the behavior of a real data source in order to confuse the adversaries [Mehta et al., 2007].
Location privacy of base station: In a WSN, the base station is not only in charge of collecting and analyzing data, but also serves as the gateway connecting the WSN to an outside wireless or wired network. Consequently, destroying or isolating the base station may lead to the malfunction of the entire network. This can be prevented if the location of the base station is unknown to the adversary.
Defense against local adversaries: The location information or identifier of the base station is sent in the clear in many protocols. This information must be hidden from an eavesdropper, which can be done by traditional cryptographic techniques (encryption). Another problem arises if the attacker can follow the path of packets from the source towards the base station. This can be mitigated by changing the data appearance by re-encryption [Deng et al., 2006a; Dingledine et al., 2004], routing with multiple parents [Deng et al., 2005; Deng et al., 2006a], routing with random walks [Jian et al., 2007], or decorrelating parent-child relationships by randomly selecting sending times [Deng et al., 2006a].
Defense against global adversaries: The techniques discussed above are inefficient against a global attacker. To fight a global attacker, the traffic patterns of the whole network must be modified. This can be done by hiding the traffic pattern by controlling the transmission rate [Deng et al., 2006a], or by propagating dummy data [Deng et al., 2005; Deng et al., 2006a].
– Temporal privacy problem: When an adversary eavesdrops on a message, it can deduce the sending time of the message from the time of interception and the TTL value. In some applications this information must be hidden, which can be achieved by the relaying nodes randomly delaying the messages [Kamat et al., 2007].
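The slice-mix-aggregate idea behind SMART, described earlier in this section, can be sketched as follows. This is a toy, single-process simulation; the function name and parameters are mine, not from [He et al., 2007].

```python
import random

def smart_aggregate(values, num_slices=3, rng=None):
    # Toy version of SMART [He et al., 2007]: every node slices its
    # measurement into additive shares (slicing), routes each share to a
    # randomly chosen node (mixing), and each node forwards only the sum
    # of the shares it received (aggregation).
    rng = rng or random.Random(0)
    n = len(values)
    mixed_totals = [0.0] * n
    for v in values:
        # Slicing: num_slices random shares that sum exactly to v.
        shares = [rng.uniform(-10, 10) for _ in range(num_slices - 1)]
        shares.append(v - sum(shares))
        # Mixing: each share goes to a random node (possibly the owner).
        for s in shares:
            mixed_totals[rng.randrange(n)] += s
    # The aggregator sums the per-node totals: the global sum is intact,
    # but no single forwarded value equals an original measurement.
    return sum(mixed_totals)

readings = [21.5, 19.0, 23.2, 20.1]
assert abs(smart_aggregate(readings) - sum(readings)) < 1e-9
```

The key design point is that the aggregate (here, the sum) is preserved exactly, while an eavesdropper observing any single node's outgoing value learns only a random mixture of shares from several nodes.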
As can be seen from the discussion above, a considerable amount of work has been done in the field of privacy in wireless sensor networks. However, the particular problem of the location privacy of aggregator nodes has received less attention. Therefore, in Chapter 4, I study this problem and propose two anonymous aggregator election protocols, which can hide the identity of the aggregator nodes.
The remainder of the dissertation is organized as follows: In Chapter 2, I propose two private<br />
authentication schemes for resource limited systems, such as RFID systems. The results presented<br />
in Chapter 2 have been published in [Buttyan et al., 2006a; Buttyan et al., 2006b; Avoine et<br />
al., 2007]. In Chapter 3, I analyze the privacy achieved by pseudonym changing techniques in<br />
vehicular ad hoc networks, and propose a pseudonym changing algorithm for VANETs. All results<br />
of Chapter 3 have been published in [Buttyan et al., 2007; Papadimitratos et al., 2008; Holczer et<br />
al., 2009; Buttyan et al., 2009]. In Chapter 4, I analyze how an aggregator node can be elected and<br />
used in wireless sensor networks without revealing its identity. All results of Chapter 4 have been<br />
published in [Buttyán and Holczer, 2009; Buttyán and Holczer, 2010; Holczer and Buttyán, 2011;<br />
Schaffer et al., 2012]. Possible applications of the new results can be found in Chapter 5, while Chapter 6 concludes the dissertation.
Chapter 2<br />
Private Authentication in Resource<br />
Constrained Environments<br />
2.1 Introduction to private authentication<br />
Entity authentication is the process whereby a party (the prover) corroborates its identity to<br />
another party (the verifier). Entity authentication is often based on authentication protocols in<br />
which the parties pass messages to each other. These protocols are engineered in such a way that<br />
they resist various types of impersonation and replay attacks [Boyd and Mathuria, 2003]. However,<br />
less attention is paid to the requirement of preserving the privacy of the parties (typically that of<br />
the prover) with respect to an eavesdropping third party. Indeed, in many of the well-known and<br />
widely used authentication protocols (e.g., [ISO, 2008; Kohl and Neuman, 1993]) the identity of<br />
the prover is sent in cleartext, and hence, it is revealed to an eavesdropper.<br />
One approach to solve this problem is based on public key cryptography, and it consists of<br />
encrypting the identity information of the prover with the public key of the verifier so that no<br />
one but the verifier can learn the prover’s identity [Abadi and Fournet, 2004]. Another approach,<br />
also based on public key techniques, is that the parties first run an anonymous Diffie-Hellman key<br />
exchange and establish a confidential channel, through which the prover can send its identity and<br />
authentication information to the verifier in a second step. An example for this second approach is<br />
the main mode of the Internet Key Exchange (IKE and IKEv2) protocol [Harkins and Carrel, 1998;<br />
Black and McGrew, 2008]. While it is possible to hide the identity of the prover using the above mentioned approaches, they provide an appropriate solution to the problem only if the parties can afford public key cryptography. In many applications, such as low cost RFID tags and contactless smart card based automated fare collection systems in mass transportation, this is not the case, while at the same time, the provision of privacy (especially location privacy) in those systems is strongly desirable.
The problem of using symmetric key encryption to hide the identity of the prover is that<br />
the verifier does not know which symmetric key it should use to decrypt the encrypted identity,<br />
because the appropriate key cannot be retrieved without the identity. The verifier may try all possible keys in its key database until one of them properly decrypts the encrypted identity¹, but this would increase the authentication delay if the number of potential provers is large. Long authentication delays are usually not desirable; moreover, in some cases, they may not even be
acceptable. As an example, let us consider again contactless smart card based electronic tickets<br />
in public transportation: the number of smart cards in the system (i.e., the number of potential<br />
provers) may be very large in big cities, while the time needed to authenticate a card should be<br />
short in order to ensure a high throughput of passengers and avoid long queues at entry points.<br />
¹ This of course requires redundancy in the encrypted message so that the verifier can determine if the decryption was successful.
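The naive linear key-search approach can be sketched as follows. This is a minimal illustration, not the scheme of any cited paper: HMAC-SHA256 stands in for the generic symmetric-key primitive, and the protocol framing and names are mine.

```python
import hmac
import hashlib
import secrets

def make_system(num_members):
    # Each prover is enrolled with its own unique symmetric key.
    return {f"member-{i}": secrets.token_bytes(16) for i in range(num_members)}

def prover_respond(key, challenge):
    # Instead of sending its identity, the prover answers a fresh
    # challenge with a MAC computed under its secret key.
    return hmac.new(key, challenge, hashlib.sha256).digest()

def verifier_identify(keydb, challenge, response):
    # Linear search: worst case, one MAC computation per registered
    # member -- this is the authentication delay the text discusses.
    for identity, key in keydb.items():
        if hmac.compare_digest(prover_respond(key, challenge), response):
            return identity
    return None

keydb = make_system(1000)
challenge = secrets.token_bytes(16)
response = prover_respond(keydb["member-774"], challenge)
assert verifier_identify(keydb, challenge, response) == "member-774"
assert verifier_identify(keydb, challenge, b"\x00" * 32) is None
```

The response reveals nothing to an eavesdropper who lacks the keys, but the verifier's cost grows linearly with the number of potential provers, which motivates the key-tree approach discussed next.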
Some years ago, Molnar and Wagner proposed an elegant approach to privacy protecting authentication<br />
[Molnar and Wagner, 2004] that is based on symmetric key cryptography while still<br />
ensuring short authentication delays. More precisely, the complexity of the authentication procedure<br />
in the Molnar-Wagner scheme is logarithmic in the number of potential provers, in contrast<br />
with the linear complexity of the naïve key search approach. The main idea of Molnar and Wagner<br />
is to use key-trees (see Figure 2.1 for illustration). A key-tree is a tree where a unique key is assigned<br />
to each edge. The leaves of the tree represent the potential provers, which are called members in the sequel. Each member possesses the keys assigned to the edges of the path starting from the
root and ending in the leaf that corresponds to the given member. The verifier knows all keys in<br />
the tree. In order to authenticate itself, a member uses all of its keys, one after the other, starting<br />
from the first level of the tree and proceeding towards lower levels. The verifier first determines<br />
which first level key has been used. For this, it needs to search through the first level keys only.<br />
Once the first key is identified, the verifier continues by determining which second level key has<br />
been used. However, for this, it needs to search through those second level keys only that reside<br />
below the already identified first level key in the tree. This process is continued until all keys are identified, which, at the end, identifies the authenticating member. The key point is that the verifier can reduce the search space considerably each time a key is identified, because it needs to consider only the subtree below the recently identified key.
Figure 2.1: Illustration of a key-tree. There is a unique key assigned to each edge. Each leaf<br />
represents a member of the system that possesses the keys assigned to the edges of the path<br />
starting from the root and ending in the given leaf. For instance, the member that belongs to the<br />
leftmost leaf in the figure possesses the keys k1, k11, and k111.<br />
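The level-by-level search described above can be sketched as follows. This is an illustrative implementation under my own conventions, not the exact protocol of [Molnar and Wagner, 2004]: keys are indexed by their root-to-edge path (e.g. `(2,)`, `(2, 1)`), and HMACs stand in for the generic symmetric-key operation.

```python
import hmac
import hashlib
import secrets

def build_tree(branching):
    # Assign a fresh key to every edge; an edge is named by the path of
    # child indices from the root down to it.
    keys = {}
    def grow(prefix, levels):
        if not levels:
            return
        for child in range(levels[0]):
            path = prefix + (child,)
            keys[path] = secrets.token_bytes(16)
            grow(path, levels[1:])
    grow((), branching)
    return keys

def member_path(index, branching):
    # Map a leaf index to its root-to-leaf sequence of edge choices.
    path = []
    for b in reversed(branching):
        index, r = divmod(index, b)
        path.append(r)
    return tuple(reversed(path))

def mac(key, challenge):
    return hmac.new(key, challenge, hashlib.sha256).digest()

def prove(keys, path, challenge):
    # The member uses all of its keys, level by level, top to bottom.
    return [mac(keys[path[:i + 1]], challenge) for i in range(len(path))]

def identify(keys, branching, challenge, proofs):
    # Level-by-level search: at most sum(b_i) MAC computations, instead
    # of prod(b_i) for the naive linear search over all members.
    prefix = ()
    for level, b in enumerate(branching):
        for child in range(b):
            cand = prefix + (child,)
            if hmac.compare_digest(mac(keys[cand], challenge), proofs[level]):
                prefix = cand
                break
        else:
            return None
    return prefix

branching = [4, 4, 4]          # 64 members; search cost at most 12 MACs
keys = build_tree(branching)
path = member_path(37, branching)
challenge = secrets.token_bytes(16)
assert identify(keys, branching, challenge, prove(keys, path, challenge)) == path
```

For N members arranged in a tree with branching factor b at every level, the verifier performs at most b·log_b(N) MAC computations, which is the logarithmic complexity claimed above.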
The problem of the above described tree-based approach is that upper level keys in the tree are<br />
used by many members, and therefore, if a member is compromised and its keys become known<br />
to the adversary, then the adversary gains partial knowledge of the keys of other members too
[Avoine et al., 2005]. This obviously reduces the privacy provided by the system to its members,<br />
since by observing the authentication of an uncompromised member, the adversary can recognize<br />
the usage of some compromised keys, and therefore its uncertainty regarding the identity of the<br />
authenticating member is reduced (it may be able to determine which subtree the member belongs<br />
to).<br />
One interesting observation is that the naïve, linear key search approach can be viewed as a<br />
special case of the key-tree based approach, where the key-tree has a single level and each member<br />
has a single key. Regarding the above described problem of compromised members, the naïve<br />
approach is in fact optimal, because compromising a member does not reveal any key information<br />
of other members. At the same time, as described above, the authentication delay is the worst in<br />
this case. On the other hand, in the case of a binary key-tree, it can be observed that the compromise of a single member strongly² affects the privacy of the other members, while at the same time,
the binary tree is very advantageous in terms of authentication delay. Thus, there seems to be a<br />
trade-off between the level of privacy provided by the system and the authentication delay, which<br />
depends on the parameters of the key-tree, but it is far from obvious to see how the optimal<br />
² The precise quantification of this effect is the topic of this chapter and will be presented later.
key-tree should look like. In this chapter, I address this problem, and I show how to find optimal<br />
key-trees.<br />
In this chapter, after finding the optimal key-tree, I go further and present a novel symmetric-key private authentication scheme that provides a higher level of privacy and achieves better efficiency than the key-tree based approach. This approach is called the group based approach.
More precisely, the complexity of the group based scheme for the reader can be set to be O(log N)<br />
(i.e., the same as in the key-tree based approach), while the complexity for the tags is always a<br />
constant (in contrast to O(log N) of the key-tree based approach). Hence, the group based scheme<br />
is better than the key-tree based scheme both in terms of privacy and efficiency, and therefore, it<br />
is a serious alternative to the key-tree based scheme to be considered by the RFID community.<br />
More precisely, the main contributions are the following:<br />
I propose a benchmark metric for measuring the resistance of the system to a single compromised<br />
member based on the concept of anonymity sets. To the best of my knowledge,<br />
anonymity sets have not been used in the context of private authentication yet. I prove that<br />
this simply defined metric is equivalent to a metric widely used in cryptography with a much<br />
more complex definition. The real contribution of the metric is that its simple definition makes the metric easier to use without losing any of the detail of the more complex metric.
I introduce the idea of using different branching factors at different levels of the key-tree;<br />
the advantage is that the system’s resistance to single member compromise can be increased<br />
while still keeping the authentication delay short. To the best of my knowledge, key-trees<br />
with variable branching factors have not been proposed yet for private authentication.<br />
I present an algorithm for determining the optimal parameters of the key-tree, where optimal<br />
means that resistance to single member compromise is maximized, while the authentication<br />
delay is kept below a predefined threshold.<br />
In the general case, when any member can be compromised, I give a lower bound on the<br />
level of privacy provided by the system, and present some simulation results that show that<br />
this lower bound is quite sharp. This allows me to compare different systems based on their<br />
lower bounds.<br />
I introduce a group based approach, which is superior to the tree-based approach in many respects.
In summary, I propose practically usable techniques for designers of RFID based authentication<br />
systems.<br />
The outline of the chapter is the following: in Section 2.2, I introduce my benchmark metric<br />
to measure the level of privacy provided by key-tree or group based authentication systems, and<br />
I illustrate, through an example, how this metric can be used to compare systems with different<br />
parameters. By the same token, I also show that key-trees with variable branching factors can be<br />
better than key-trees with a constant branching factor at every level. In Section 2.3, I formulate<br />
the problem of finding the best key-tree with respect to my benchmark metric as an optimization<br />
problem, and I present an algorithm that solves that optimization problem. In Section 2.4, I<br />
consider the general case, when any number of members can be compromised, and I derive a useful<br />
lower bound on the level of privacy provided by the system. After finding the optimal key-tree, I<br />
describe the operation of my group based scheme in Section 2.5, and I quantify the level of privacy<br />
that it provides in Section 2.6. I compare the group based scheme to the key-tree based approach<br />
in Section 2.7. Finally, in Section 2.8, I report on some related work, and in Section 2.9, I conclude<br />
the chapter.<br />
2.2 Resistance to single member compromise<br />
There are different ways to measure the level of anonymity provided by a system [Diaz et al., 2002;<br />
Serjantov and Danezis, 2003]. Here the concept of anonymity sets [Chaum, 1988] is used. The<br />
2. PRIVATE AUTHENTICATION<br />
anonymity set of a member v is the set of members that are indistinguishable from v from the<br />
adversary’s point of view. The size of the anonymity set is a good measure of the level of privacy<br />
provided for v, because it is related to the level of uncertainty of the adversary, provided that all members
of the set are equally likely (otherwise an entropy based metric can be used). Clearly, the
larger the anonymity set is, the higher the level of privacy is. The minimum size of the anonymity<br />
set is 1, and its maximum size is equal to the number of all members in the system. In order to<br />
make the privacy measure independent of the number of members, one can divide the anonymity<br />
set size by the total number of members, and obtain a normalized privacy measure between 0 and<br />
1. Such normalization makes the comparison of different systems easier.<br />
Now, let us consider a key-tree with ℓ levels and branching factors b1, b2, . . . , bℓ at the levels, and<br />
let us assume that exactly one member is compromised (see Figure 2.2 for illustration). Knowledge<br />
of the compromised keys allows the adversary to partition the members into subsets P0, P1, P2, . . .,<br />
where<br />
- P0 contains the compromised member only,
- P1 contains the members whose parent is the same as that of the compromised member, and that are not in P0,
- P2 contains the members whose grandparent is the same as that of the compromised member, and that are not in P0 ∪ P1,
- etc.
Members of a given subset are indistinguishable for the adversary, while it can distinguish between<br />
members that belong to different subsets. Hence, each subset is the anonymity set of its members.<br />
Figure 2.2: Illustration of what happens when a single member is compromised. Without loss<br />
of generality, it is assumed that the member corresponding to the leftmost leaf in the figure is<br />
compromised. This means that the keys k1, k11, and k111 become known to the adversary. This<br />
knowledge of the adversary partitions the set of members into anonymity sets P0, P1, . . . of different<br />
sizes. Members that belong to the same subset are indistinguishable to the adversary, while it can<br />
distinguish between members that belong to different subsets. For instance, the adversary can
recognize a member in subset P1 by observing the usage of k1 and k11 but not that of k111, where
each of these keys is known to the adversary. Members in P3 are recognized by the adversary's not
being able to observe the usage of any of the keys it knows.
The level of privacy provided by the system can be characterized by the level of privacy provided<br />
to a randomly selected member, or in other words, by the expected size of the anonymity set of a<br />
randomly selected member. By definition, the expected anonymity set size is:<br />
S̄ = Σ_{i=0}^{ℓ} (|P_i|/N) · |P_i| = Σ_{i=0}^{ℓ} |P_i|²/N   (2.1)
where N is the total number of members, and |Pi|/N is the probability of selecting a member from<br />
subset Pi. The resistance to single member compromise, denoted by R, is defined as the normalized<br />
expected anonymity set size, which can be computed as follows:<br />
R = S̄/N = Σ_{i=0}^{ℓ} |P_i|²/N²
  = (1/N²) (1 + (bℓ − 1)² + ((b_{ℓ−1} − 1) bℓ)² + . . . + ((b1 − 1) b2 b3 · · · bℓ)²)
  = (1/N²) (1 + (bℓ − 1)² + Σ_{i=1}^{ℓ−1} (b_i − 1)² Π_{j=i+1}^{ℓ} b_j²)   (2.2)

where it is used that

|P0| = 1
|P1| = bℓ − 1
|P2| = (b_{ℓ−1} − 1) bℓ
|P3| = (b_{ℓ−2} − 1) b_{ℓ−1} bℓ
. . .
|Pℓ| = (b1 − 1) b2 b3 · · · bℓ
As its name indicates, R characterizes the loss of privacy due to the compromise of a single<br />
member of the system. If R is close to 1, then the expected anonymity set size is close to the total<br />
number of members, and hence, the loss of privacy is small. On the other hand, if R is close to<br />
0, then the loss of privacy is high, as the expected anonymity set size is small. R is used as a<br />
benchmark metric based on which different systems can be compared.<br />
This metric may seem somewhat ad hoc, but in fact the same metric is used in other
papers, such as [Avoine et al., 2005], with a different, more complex definition:
Theorem 1. The expected anonymity set size based metric (R) is the complement of the tag
tampering based metric (M) defined in [Avoine et al., 2005].
Proof. The metric M used in [Avoine et al., 2005] is defined in that paper as:<br />
1. The attacker has one tag T0 (e.g., her own) she can tamper with and thus obtain its complete<br />
secret. For the sake of calculation simplicity, we assume that T0 is put back into circulation.<br />
When the number of tags in the system is large, this does not significantly affect the results.<br />
2. She then chooses a target tag T. She can query it as much as she wants but she cannot<br />
tamper with it.<br />
3. Given two tags T1 and T2 such that T ∈ {T1, T2}, we say that the attacker succeeds if she<br />
definitely knows which of T1 and T2 is T . We define the probability to trace T as being the<br />
probability that the attacker succeeds. To do that, the attacker can query T1 and T2 as many<br />
times as she wants but, obviously, cannot tamper with them.<br />
In the following, P1, . . . , Pk denote the anonymity sets of the tags after the compromise of some tags
(Σ_{i=1}^{k} |P_i| = N).
In the third step, the attacker can be successful if (and only if) T1 and T2 belong to different
subsets.
The probability of the attacker's success is the probability that two randomly chosen tags
belong to two different subsets. This probability can be calculated as follows:

M = 1 − Pr(T1, T2 are in P1) − . . . − Pr(T1, T2 are in Pk) = 1 − Σ_{i=1}^{k} (|P_i|/N)²

This is the complement of the metric R (M + R = 1).
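The identity M + R = 1 can be checked with a few lines of code. The sketch below is my own illustration, not code from the dissertation; the helper names `traceability` and `resistance_from_partition` are hypothetical. It computes both metrics directly from a partition of the tags into anonymity sets.

```python
def traceability(partition):
    """M from [Avoine et al., 2005]: probability that two randomly
    chosen tags lie in different anonymity sets (sizes in `partition`)."""
    N = sum(partition)
    return 1.0 - sum((p / N) ** 2 for p in partition)

def resistance_from_partition(partition):
    """R: normalized expected anonymity set size for the same partition."""
    N = sum(partition)
    return sum(p * p for p in partition) / (N * N)

# Anonymity set sizes after one member of a (30, 30, 30) key-tree is
# compromised: |P0| = 1, |P1| = 29, |P2| = 29*30, |P3| = 29*900.
sizes = [1, 29, 870, 26100]
print(round(traceability(sizes), 4))  # → 0.0645
```

For any partition the two functions sum to 1, since both are built from the same Σ(|P_i|/N)² term.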
Obviously, a system with greater R is better, and therefore, one would like to maximize R (and<br />
at the same time minimize M). However, there are some constraints. The maximum authentication<br />
delay, denoted by D, is defined as the number of basic operations needed to authenticate any<br />
member in the worst case. The maximum authentication delay in the case of key-tree based
authentication can be computed as D = Σ_{i=1}^{ℓ} b_i. In most practical cases, there is an upper bound Dmax
on the maximum authentication delay allowed in the system. For instance, in the specification<br />
for electronic ticketing systems for public transport applications in Hungary [Berki, 2008], it is<br />
required that a ticket validation transaction should be completed in 250 ms. Taking into account<br />
the details of the ticket validation protocol, one can derive Dmax for electronic tickets from such<br />
specifications. Therefore, in practice, the designer’s task is to maximize R under the constraint<br />
that D ≤ Dmax. This problem is addressed in Section 2.3.<br />
In the remainder of this section, I illustrate how the benchmark metric R can be used to
compare different systems. This exercise will also lead to an important observation: key-trees with
varying branching factors at different levels can provide a higher level of privacy than key-trees
with a constant branching factor, while having the same or even a shorter authentication delay.
Example: Let us assume that the total number N of members is 27000 and the upper bound Dmax<br />
on the maximum authentication delay is 90. Let us consider a key-tree with a constant branching<br />
factor vector B = (30, 30, 30), and another key-tree with branching factor vector B ′ = (60, 10, 9, 5).<br />
Both key-trees can serve the given population of members, since 30³ = 60 · 10 · 9 · 5 = 27000.
In addition, both key-trees ensure that the maximum authentication delay is not longer than<br />
Dmax: for the first key-tree, we have D = 3 · 30 = 90, whereas for the second one, we get<br />
D = 60+10+9+5 = 84. Using (2.2), we can compute the resistance to single member compromise<br />
for both key-trees. For the first tree, we get R ≈ 0.9355, while for the second tree we obtain<br />
R ≈ 0.9672. Thus, we arrive at the conclusion that the second key-tree with variable branching
factors is better, as it provides a higher level of privacy while ensuring a smaller authentication
delay.
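Formula (2.2) is easy to evaluate programmatically. The following sketch (the helper name `resistance` is my own, not from the dissertation) reproduces the values of the example by listing the anonymity set sizes |P_i| level by level.

```python
def resistance(B):
    """Resistance to single member compromise, eq. (2.2).
    B is the branching factor vector (b_1, ..., b_l); N = b_1 * ... * b_l."""
    N = 1
    for b in B:
        N *= b
    sizes = [1]            # |P_0| = 1: the compromised member itself
    below = 1              # running product b_{i+1} * ... * b_l
    for b in reversed(B):  # from the level just above the leaves upwards
        sizes.append((b - 1) * below)
        below *= b
    return sum(s * s for s in sizes) / (N * N)

print(round(resistance([30, 30, 30]), 4))    # → 0.9355
print(round(resistance([60, 10, 9, 5]), 4))  # → 0.9672
```

The two printed values match the example: the variable branching factor vector wins.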
At this point, several questions arise naturally: Is there an even better branching factor vector<br />
than B ′ for N = 27000 and Dmax = 90? What is the best branching factor vector for this case?<br />
How can we find the best branching factor vector in general? I give the answers to these questions<br />
in the next section.<br />
2.3 Optimal trees in case of single member compromise<br />
The problem of finding the best branching factor vector can be described as an optimization<br />
problem as follows: Given the total number N of members and the upper bound Dmax on the<br />
maximum authentication delay, find a branching factor vector B = (b1, b2, . . . bℓ) such that R(B)<br />
is maximal subject to the following constraints:<br />
Π_{i=1}^{ℓ} b_i = N   (2.3)

Σ_{i=1}^{ℓ} b_i ≤ Dmax   (2.4)
This optimization problem is analyzed through a series of lemmas that will lead to an algorithm<br />
that solves the problem. The first lemma states that we can always improve a branching factor<br />
vector by ordering its elements in decreasing order, and hence, in the sequel only ordered vectors<br />
are considered:<br />
Lemma 1. Let N and Dmax be the total number of members and the upper bound on the<br />
maximum authentication delay, respectively. Moreover, let B be a branching factor vector and let<br />
B ∗ be the vector that consists of the sorted permutation of the elements of B in decreasing order.<br />
If B satisfies the constraints of the optimization problem defined above, then B ∗ also satisfies<br />
them, and R(B ∗ ) ≥ R(B).<br />
Proof. B∗ has the same elements as B, therefore the sum and the product of the elements of
B∗ are the same as those of B; so if B satisfies the constraints of the optimization problem,
then B∗ does so too.
Now, let us assume that B∗ is obtained from B with the bubble sort algorithm. The basic step
of this algorithm is to swap two neighboring elements if they are not in the right order. Let us
suppose that bi < bi+1, and thus the algorithm swaps bi and bi+1. Then, using
(2.2), we can express ΔR = R(B∗) − R(B) as follows:
ΔR = (1/N²) ((b_{i+1} − 1)² b_i² Π_{j=i+2}^{ℓ} b_j² + (b_i − 1)² Π_{j=i+2}^{ℓ} b_j²)
   − (1/N²) ((b_i − 1)² b_{i+1}² Π_{j=i+2}^{ℓ} b_j² + (b_{i+1} − 1)² Π_{j=i+2}^{ℓ} b_j²)
 = (Π_{j=i+2}^{ℓ} b_j² / N²) ((b_{i+1} − 1)² b_i² + (b_i − 1)² − (b_i − 1)² b_{i+1}² − (b_{i+1} − 1)²)
 = (Π_{j=i+2}^{ℓ} b_j² / N²) ((b_{i+1} − 1)² (b_i² − 1) − (b_i − 1)² (b_{i+1}² − 1))
 = ((b_i − 1)(b_{i+1} − 1) Π_{j=i+2}^{ℓ} b_j² / N²) ((b_{i+1} − 1)(b_i + 1) − (b_i − 1)(b_{i+1} + 1))

Since b_i ≥ 2 for all i, ΔR is non-negative if

(b_i + 1)/(b_i − 1) ≥ (b_{i+1} + 1)/(b_{i+1} − 1)   (2.5)

But (2.5) must hold, since the function f(x) = (x + 1)/(x − 1) is a monotone decreasing function, and by
assumption, b_i < b_{i+1}. This means that when sorting the elements of B, we improve R(B) in
every step, and thus, R(B∗) ≥ R(B) must hold. ⋄
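Lemma 1 can also be checked numerically. The snippet below (with my own, hypothetical helper `resistance` implementing (2.2)) compares an unsorted branching factor vector with its sorted permutation.

```python
def resistance(B):
    # Eq. (2.2): normalized expected anonymity set size after one compromise.
    N = 1
    for b in B:
        N *= b
    sizes, below = [1], 1
    for b in reversed(B):      # anonymity set sizes |P_i|, leaves upwards
        sizes.append((b - 1) * below)
        below *= b
    return sum(s * s for s in sizes) / (N * N)

B = [5, 10, 9, 60]                      # unsorted branching factors
B_sorted = sorted(B, reverse=True)      # (60, 10, 9, 5)
# Lemma 1: sorting in decreasing order never decreases R
assert resistance(B_sorted) >= resistance(B)
```

Sum and product are permutation invariant, so the sorted vector satisfies the same constraints while giving at least as much privacy.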
The following lemma provides a lower bound and an upper bound on the resistance to single
member compromise:

Lemma 2. Let B = (b1, b2, . . . , bℓ) be a sorted branching factor vector (i.e., b1 ≥ b2 ≥ . . . ≥ bℓ).
We can give the following lower and upper bounds on R(B):

(1 − 1/b1)² ≤ R(B) ≤ (1 − 1/b1)² + 4/(3 b1²)   (2.6)

Proof. By definition,

R = (1/N²) (1 + (bℓ − 1)² + Σ_{i=1}^{ℓ−1} (b_i − 1)² Π_{j=i+1}^{ℓ} b_j²)
  = ((b1 − 1)/b1)² + (1/N²) (1 + (bℓ − 1)² + Σ_{i=2}^{ℓ−1} (b_i − 1)² Π_{j=i+1}^{ℓ} b_j²)   (2.7)

where it is used that N = b1 b2 · · · bℓ. The lower bound in the lemma follows directly from (2.7).
(Note that the slightly better lower bound ((b1 − 1)/b1)² + 1/N² could also be derived from (2.7);
however, it is not needed in this chapter.)
In order to obtain the upper bound, we can write b_i instead of (b_i − 1) in the sum in (2.7):

R < ((b1 − 1)/b1)² + (1/N²) (1 + Σ_{i=2}^{ℓ} Π_{j=i}^{ℓ} b_j²)
  = ((b1 − 1)/b1)² + (1/b1²) (1 + Σ_{i=2}^{ℓ} Π_{j=2}^{i} 1/b_j²)

Since b_i ≥ 2 for all i, we can write 2 in place of b_i in the sum, and we obtain:

R < ((b1 − 1)/b1)² + (1/b1²) (1 + Σ_{i=2}^{ℓ} (1/4)^{i−1})
  < ((b1 − 1)/b1)² + (1/b1²) (1 + Σ_{i=2}^{∞} (1/4)^{i−1})
  = ((b1 − 1)/b1)² + (1/b1²) · 1/(1 − 1/4)
  = ((b1 − 1)/b1)² + 4/(3 b1²)

and this is the upper bound in the lemma. ⋄
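As a sanity check (my own test harness, not part of the dissertation), the bounds of Lemma 2 can be verified for a few concrete sorted vectors:

```python
def resistance(B):
    # Eq. (2.2): normalized expected anonymity set size after one compromise.
    N = 1
    for b in B:
        N *= b
    sizes, below = [1], 1
    for b in reversed(B):
        sizes.append((b - 1) * below)
        below *= b
    return sum(s * s for s in sizes) / (N * N)

for B in ([30, 30, 30], [60, 10, 9, 5], [72, 5, 5, 5, 3], [4, 2, 2, 2]):
    b1, R = B[0], resistance(B)
    lower = (1 - 1 / b1) ** 2          # lower bound of (2.6)
    upper = lower + 4 / (3 * b1 ** 2)  # upper bound of (2.6)
    assert lower <= R <= upper, (B, lower, R, upper)
```

As the lemma predicts, a larger b1 squeezes the two bounds together and pushes R toward 1.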
Let us consider the bounds in Lemma 2. Note that the branching factor vector is ordered,<br />
therefore, b1 is not smaller than any other bi. We can observe that if we increase b1, then the<br />
difference between the upper and the lower bounds decreases, and R(B) gets closer to 1. Intuitively,<br />
this implies that in order to find the solution to the optimization problem, b1 should be maximized.<br />
The following lemma confirms this intuition formally:<br />
Lemma 3. Let N and Dmax be the total number of members and the upper bound on the maximum
authentication delay, respectively. Moreover, let B = (b1, b2, . . . , bℓ) and B′ = (b′1, b′2, . . . , b′ℓ′)
be two sorted branching factor vectors that satisfy the constraints of the optimization problem
defined above. Then, b1 > b′1 implies R(B) ≥ R(B′).
Proof. First, we prove that the statement of the lemma is true if b′1 ≥ 5. We know from
Lemma 2 that

R(B′) < (1 − 1/b′1)² + 4/(3 b′1²)

and

R(B) > (1 − 1/b1)² ≥ (1 − 1/(b′1 + 1))²

where we used that b1 > b′1 by assumption. If we can prove that

(1 − 1/b′1)² + 4/(3 b′1²) ≤ (1 − 1/(b′1 + 1))²   (2.8)

then we have also proved that R(B′) ≤ R(B). Indeed, a straightforward calculation yields that (2.8) is
true if b′1 ≥ 2 + √(15/2) ≈ 4.74, and since b′1 is an integer, we are done.

Next, we can make the observation that a branching factor vector A = (a1, . . . , ak, 2, 2) that
has at least two 2s at the end can be improved by joining two 2s into a 4 and obtaining A′ =
(a1, . . . , ak, 4). It is clear that neither the sum nor the product of the elements changes with this
transformation. In addition, we can use the definition of R to get

N² · R(A) = ((a1 − 1) a2 · · · ak · 2 · 2)² + . . . + ((ak − 1) · 2 · 2)² + ((2 − 1) · 2)² + (2 − 1)² + 1

and

N² · R(A′) = ((a1 − 1) a2 · · · ak · 4)² + . . . + ((ak − 1) · 4)² + (4 − 1)² + 1

Thus, R(A′) − R(A) = (1/N²)(9 − 4 − 1) > 0, which means that A′ is better than A.
Now, I prove that the lemma is also true for b′1 ∈ {2, 3, 4}:

b′1 = 2: Since B′ is an ordered vector where b′1 is the largest element, it follows that every
element of B′ is 2, and thus, N is a power of 2. From Lemma 2, R(B′) < (1 − 1/2)² + 4/(3 · 2²) = 7/12
and R(B) > (1 − 1/b1)². It is easy to see that (1 − 1/b1)² ≥ 7/12 if b1 ≥ 1/(1 − √(7/12)) ≈ 4.23. Since
b1 > b′1, the remaining cases are b1 = 3 and b1 = 4. However, b1 = 3 cannot be the case,
because N is a power of 2. If b1 = 4, then B can be obtained from B′ by joining pairs of
2s into 4s and then ordering the elements. According to the observation above and
Lemma 1, both operations improve the vector. It follows that R(B) ≥ R(B′) must hold.

b′1 = 3: From Lemma 2, R(B′) < (1 − 1/3)² + 4/(3 · 3²) = 16/27 and R(B) > (1 − 1/b1)². It is easy to see
that (1 − 1/b1)² ≥ 16/27 if b1 ≥ 9/(9 − 4√3) ≈ 4.34. Since b1 > b′1, the only remaining case is b1 = 4.
In this case, the vectors are as follows:

B = (4, . . . , 4, 3, . . . , 3, 2, . . . , 2)   with i fours, j threes, and k twos,
B′ = (3, . . . , 3, 2, . . . , 2)   with j threes and 2i + k twos,

where i, j ≥ 1 and k ≥ 0. This means that B can be obtained from B′ by joining i pairs of
2s into 4s and then ordering the elements. However, as we saw earlier, both joining 2s into
4s and ordering the elements improve the vector, and thus, R(B) ≥ R(B′) must hold.

b′1 = 4: Since B′ is an ordered vector where b′1 is the largest element, every element of B′ is
2, 3, or 4, and it follows that N is not divisible by 5. From Lemma 2, R(B′) < (1 − 1/4)² + 4/(3 · 4²) = 31/48
and R(B) > (1 − 1/b1)². It is easy to see that (1 − 1/b1)² ≥ 31/48 if b1 ≥ 1/(1 − √(31/48)) ≈ 5.09.
Since b1 > b′1, the only remaining case is b1 = 5. However, b1 = 5 cannot be the case, because N is not divisible by 5. ⋄
Lemma 3 states that given two branching factor vectors, the one with the larger first element is<br />
always at least as good as the other. The next lemma generalizes this result by stating that given<br />
two branching factor vectors the first j elements of which are equal, the vector with the larger<br />
(j + 1)-st element is always at least as good as the other.<br />
Lemma 4. Let N and Dmax be the total number of members and the upper bound on the maximum
authentication delay, respectively. Moreover, let B = (b1, b2, . . . , bℓ) and B′ = (b′1, b′2, . . . , b′ℓ′)
be two sorted branching factor vectors such that b_i = b′_i for all 1 ≤ i ≤ j, for some j < min(ℓ, ℓ′),
and both B and B′ satisfy the constraints of the optimization problem defined above. Then,
b_{j+1} > b′_{j+1} implies R(B) ≥ R(B′).
Proof. By definition,

R(B) = (1/N²) (1 + (bℓ − 1)² + Σ_{i=1}^{ℓ−1} (b_i − 1)² Π_{j=i+1}^{ℓ} b_j²)
     = ((b1 − 1)/b1)² + (1/b1²) · (1/(N/b1)²) (1 + (bℓ − 1)² + Σ_{i=2}^{ℓ−1} (b_i − 1)² Π_{j=i+1}^{ℓ} b_j²)
     = ((b1 − 1)/b1)² + (1/b1²) · R(B1)

where B1 = (b2, b3, . . . , bℓ). Similarly,

R(B′) = ((b′1 − 1)/b′1)² + (1/b′1²) · R(B′1)

where B′1 = (b′2, b′3, . . . , b′ℓ′). Since b1 = b′1, R(B) ≥ R(B′) if and only if R(B1) ≥ R(B′1). By
repeating the same argument for B1 and B′1, we get that R(B) ≥ R(B′) if and only if R(B2) ≥
R(B′2), where B2 = (b3, . . . , bℓ) and B′2 = (b′3, . . . , b′ℓ′). And so on, until we get that R(B) ≥ R(B′)
if and only if R(Bj) ≥ R(B′j), where Bj = (b_{j+1}, . . . , bℓ) and B′j = (b′_{j+1}, . . . , b′ℓ′). But from
Lemma 3, we know that R(Bj) ≥ R(B′j) if b_{j+1} > b′_{j+1}, and we are done. ⋄
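The decomposition R(B) = ((b1 − 1)/b1)² + R(B1)/b1², on which the proof rests, can be verified numerically. The sketch below uses my own hypothetical helper `resistance` implementing (2.2).

```python
def resistance(B):
    # Eq. (2.2): normalized expected anonymity set size after one compromise.
    N = 1
    for b in B:
        N *= b
    sizes, below = [1], 1
    for b in reversed(B):
        sizes.append((b - 1) * below)
        below *= b
    return sum(s * s for s in sizes) / (N * N)

B = [60, 10, 9, 5]
b1, B1 = B[0], B[1:]
# R(B) = ((b1 - 1)/b1)^2 + R(B1)/b1^2, as in the proof of Lemma 4
lhs = resistance(B)
rhs = ((b1 - 1) / b1) ** 2 + resistance(B1) / b1 ** 2
assert abs(lhs - rhs) < 1e-12
```

Because R decomposes this way level by level, shared prefixes of two vectors cancel out and the comparison reduces to the suffixes, which is exactly how Lemma 4 reduces to Lemma 3.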
I will now present an algorithm that finds the solution to the optimization problem. However,
before doing that, we need to introduce some further notations. Let B = (b1, b2, . . . , bℓ) and
B′ = (b′1, b′2, . . . , b′ℓ′). Then
- Π(B) denotes Π_{i=1}^{ℓ} b_i;
- Σ(B) denotes Σ_{i=1}^{ℓ} b_i;
- {B} denotes the set {b1, b2, . . . , bℓ} of the elements of B;
- B′ ⊆ B means that {B′} ⊆ {B};
- if B′ ⊆ B, then B \ B′ denotes the vector that consists of the elements of {B} \ {B′} in decreasing order;
- if b is a positive integer, then b|B denotes the vector (b, b1, b2, . . . , bℓ).
The algorithm is defined as a recursive function f, which takes two input parameters, a vector<br />
B of positive integers, and another positive integer d, and returns a vector of positive integers. In<br />
order to compute the optimal branching factor vector for a given N and Dmax, f should be called
with the vector that contains the prime factors of N, and with Dmax. For instance, if N = 27000 and
Dmax = 90 (the same parameters as in the example in Section 2.2, so that the naïve
and the algorithmic results can be compared), then f should be called with B = (5, 5, 5, 3, 3, 3, 2, 2, 2) and d = 90.
Function f will then return the optimal branching factor vector.
Function f is defined in Algorithm 1.
The operation of the algorithm can be described as follows: The algorithm starts with a branching<br />
factor vector consisting of the prime factors of N. This vector satisfies the first constraint of<br />
the optimization problem by definition. If it does not satisfy the second constraint (i.e., it does not<br />
respect the upper bound on the maximum authentication delay), then no solution exists. Otherwise,<br />
the algorithm successively improves the branching factor vector by maximizing its elements,<br />
starting with the first element, and then proceeding to the next elements, one after the other. Maximization<br />
of an element is done by joining as yet unused prime factors until the resulting divisor<br />
of N cannot be further increased without violating the constraints of the optimization problem.<br />
18
Algorithm 1 Optimal branching factor generating algorithm

f(B, d):
  if Σ(B) > d then
    exit (no solution exists)
  else
    find B′ ⊆ B such that Π(B′) + Σ(B \ B′) ≤ d and Π(B′) is maximal
  end if
  if B′ = B then
    return (Π(B′))
  else
    return Π(B′) | f(B \ B′, d − Π(B′))
  end if
Theorem 2. Let N and Dmax be the total number of members and the upper bound on the<br />
maximum authentication delay, respectively. Moreover, let B be a vector that contains the prime<br />
factors of N. Then, f(B, Dmax) is an optimal branching factor vector for N and Dmax.<br />
Proof. I will give a sketch of the proof. Let B∗ = f(B, Dmax), and let us assume that there is
another branching factor vector B′ ≠ B∗ that also satisfies the constraints of the optimization
problem and R(B′) > R(B∗). I will show that this leads to a contradiction, hence B∗ must be
optimal.
Let B∗ = (b∗1, b∗2, . . . , b∗ℓ∗) and B′ = (b′1, b′2, . . . , b′ℓ′). Recall that B∗ is obtained by first maximizing
the first element in the vector; therefore, b∗1 ≥ b′1 must hold. If b∗1 > b′1, then R(B∗) ≥ R(B′)
by Lemma 3, and thus, B′ cannot be a better vector than B∗. This means that b∗1 = b′1 must hold.
We know that once b∗1 is determined, the algorithm continues by maximizing the next element
of B∗. Hence, b∗2 ≥ b′2 must hold. If b∗2 > b′2, then R(B∗) ≥ R(B′) by Lemma 4, and thus, B′
cannot be a better vector than B∗. This means that b∗2 = b′2 must hold too.
By repeating this argument, we finally arrive at the conclusion that B∗ = B′ must hold, which
is a contradiction. ⋄
Table 2.1 illustrates the operation of the algorithm for B = (5, 5, 5, 3, 3, 3, 2, 2, 2) and d = 90.<br />
The rows of the table correspond to the levels of the recursion during the execution. The column<br />
labeled with B ′ contains the prime factors that are joined at a given recursion level. The optimal<br />
branching factor vector can be read out from the last column of the table (each row contains one<br />
element of the vector). From this example, we can see that the optimal branching factor vector<br />
for N = 27000 and Dmax = 90 is B ∗ = (72, 5, 5, 5, 3). For the key-tree defined by this vector, we<br />
get R ≈ 0.9725, and D = 90.<br />
Table 2.1: Illustration of the operation of the recursive function f when called with B =<br />
(5, 5, 5, 3, 3, 3, 2, 2, 2) and d = 90. The rows of the table correspond to the levels of the recursion<br />
during the execution.<br />
recursion level B d B ′ ∏ (B ′ )<br />
1 (5, 5, 5, 3, 3, 3, 2, 2, 2) 90 (3, 3, 2, 2, 2) 72<br />
2 (5, 5, 5, 3) 18 (5) 5<br />
3 (5, 5, 3) 13 (5) 5<br />
4 (5, 3) 8 (5) 5<br />
5 (3) 3 (3) 3<br />
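The recursive function f can be rendered in Python roughly as follows. This is my own sketch of Algorithm 1 (a brute-force search over subsets of the prime factors, so it is only practical for short factor lists), not the author's implementation.

```python
from itertools import combinations

def f(B, d):
    """Algorithm 1: build the optimal branching factor vector from the
    list B of prime factors of N, with delay budget d (Dmax at top level).
    Returns the vector as a list, or None if no solution exists."""
    if sum(B) > d:
        return None  # no solution exists
    # Find B' subset of B maximizing prod(B') s.t. prod(B') + sum(B \ B') <= d.
    best_idx, best_prod = None, 0
    for r in range(1, len(B) + 1):
        for idx in combinations(range(len(B)), r):
            p = 1
            for i in idx:
                p *= B[i]
            rest_sum = sum(B) - sum(B[i] for i in idx)
            if p + rest_sum <= d and p > best_prod:
                best_idx, best_prod = set(idx), p
    rest = [B[i] for i in range(len(B)) if i not in best_idx]
    if not rest:             # B' = B: this is the last branching factor
        return [best_prod]
    return [best_prod] + f(rest, d - best_prod)

print(f([5, 5, 5, 3, 3, 3, 2, 2, 2], 90))  # → [72, 5, 5, 5, 3]
```

Running it with the parameters of Table 2.1 reproduces the optimal vector B∗ = (72, 5, 5, 5, 3).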
2.4 Analysis of the general case<br />
So far, we have studied the case of a single compromised member. This already proved to be useful,<br />
because it allowed us to compare different key-trees and to derive a key-tree construction method.<br />
However, one may still be interested in what level of privacy is provided by a system in the general<br />
case when any number of members could be compromised. In this section, I address this problem.<br />
Figure 2.3: Illustration of what happens when several members are compromised. Just as in the<br />
case of a single compromised member, the members are partitioned into anonymity sets, but now<br />
the resulting subsets depend on the number of the compromised members, as well as on their<br />
positions in the tree. Nevertheless, the expected size of the anonymity set of a randomly selected<br />
member is still a good metric for the level of privacy provided by the system, although, in this<br />
general case, it is more difficult to compute.<br />
In what follows, we will need to refer to the non-leaf vertices of the key-tree, and for this reason,<br />
I introduce the labelling scheme that is illustrated in Figure 2.3. In addition, we need to introduce<br />
some further notations. I call a leaf compromised if it belongs to a compromised member, and I<br />
call a non-leaf vertex compromised if it lies on a path that leads to a compromised leaf in the tree.<br />
If vertex v is compromised, then<br />
- Kv denotes the set of the compromised children of v, and kv = |Kv|;
- Pv denotes the set of subsets (anonymity sets) that belong to the subtree rooted at v (see Figure 2.3 for illustration); and
- S̄v denotes the average size of the subsets in Pv.
We are interested in computing S̄⟨−⟩, i.e., the value for the root. We can do that as follows:

S̄⟨−⟩ = Σ_{P∈P⟨−⟩} |P|²/(b1 b2 · · · bℓ)
     = ((b1 − k⟨−⟩) b2 · · · bℓ)²/(b1 b2 · · · bℓ) + Σ_{v∈K⟨−⟩} Σ_{P∈Pv} |P|²/(b1 b2 · · · bℓ)
     = ((b1 − k⟨−⟩) b2 · · · bℓ)²/(b1 b2 · · · bℓ) + (1/b1) Σ_{v∈K⟨−⟩} S̄v   (2.9)

In general, for any vertex ⟨i1, . . . , ij⟩ such that 1 ≤ j < ℓ − 1:

S̄⟨i1,...,ij⟩ = ((b_{j+1} − k⟨i1,...,ij⟩) b_{j+2} · · · bℓ)²/(b_{j+1} · · · bℓ) + (1/b_{j+1}) Σ_{v∈K⟨i1,...,ij⟩} S̄v   (2.10)

Finally, for vertices ⟨i1, . . . , i_{ℓ−1}⟩ just above the leaves, we get:

S̄⟨i1,...,i_{ℓ−1}⟩ = (bℓ − k⟨i1,...,i_{ℓ−1}⟩)²/bℓ + k⟨i1,...,i_{ℓ−1}⟩/bℓ   (2.11)
Expressions (2.9 – 2.11) can be used to compute the expected anonymity set size in the system<br />
iteratively, in case of any number of compromised members. However, note that the computation<br />
depends not only on the number c of the compromised members, but also on their positions in the tree.
This makes the comparison of different systems difficult, because a comprehensive analysis would
have to consider all possible allocations of the compromised members over the leaves of the key-tree.
Therefore, a formula is preferred that depends solely on c but characterizes the effect of the
compromised members on the level of privacy sufficiently well, so that it can serve as a
basis for comparison of different systems. In the following, such a formula is derived based on the
assumption that the compromised members are distributed uniformly at random over the leaves of<br />
the key-tree. In some sense, this is a pessimistic assumption as the uniform distribution represents<br />
the worst case, which leads to the largest amount of privacy loss due to the compromised members.<br />
Thus, the approximation that is derived can be viewed as a lower bound on the expected anonymity<br />
set size in the system when c members are compromised.<br />
Let the branching factor vector of the key-tree be B = (b1, b2, . . . , bℓ), and let c be the number of
compromised leaves in the tree. We can estimate k⟨−⟩ for the root as follows:

k⟨−⟩ ≈ min(c, b1) = k0   (2.12)

If a vertex ⟨i⟩ at the first level of the tree is compromised, then the number of compromised
leaves in the subtree rooted at ⟨i⟩ is approximately c/k0 = c1. Then, we can estimate k⟨i⟩ as
follows:

k⟨i⟩ ≈ min(c1, b2) = k1   (2.13)

In general, if vertex ⟨i1, . . . , ij⟩ at the j-th level of the tree is compromised, then the number
of compromised leaves in the subtree rooted at ⟨i1, . . . , ij⟩ is approximately c_{j−1}/k_{j−1} = c_j, and
we can use this to approximate k⟨i1,...,ij⟩ as follows:

k⟨i1,...,ij⟩ ≈ min(c_j, b_{j+1}) = k_j   (2.14)
Using these approximations in expressions (2.9 – 2.11), we can derive an approximation for
S̄⟨−⟩, which is denoted by S̄0, in the following way:

S̄_{ℓ−1} = (bℓ − k_{ℓ−1})²/bℓ + k_{ℓ−1}/bℓ   (2.15)

S̄_j = ((b_{j+1} − k_j) b_{j+2} · · · bℓ)²/(b_{j+1} · · · bℓ) + k_j S̄_{j+1}/b_{j+1}   (2.16)

S̄_0 = ((b1 − k0) b2 · · · bℓ)²/(b1 · · · bℓ) + k0 S̄_1/b1   (2.17)
Note that expressions (2.15 – 2.17) do not depend on the positions of the compromised leaves
in the tree, but only on the value of c.
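The recursion (2.15 – 2.17) translates directly into code. The sketch below (the function name `approx_anonymity` is my own) computes the approximation S̄0 for a given branching factor vector and number c of compromised members.

```python
def approx_anonymity(B, c):
    """Lower-bound approximation S_0 of the expected anonymity set size,
    eqs. (2.15)-(2.17). B: branching factor vector; c >= 1 compromised."""
    l = len(B)
    # Top-down: k_j = min(c_j, b_{j+1}), c_{j+1} = c_j / k_j (eqs 2.12-2.14)
    k, cj = [], float(c)
    for b in B:
        kj = min(cj, b)
        k.append(kj)
        cj /= kj
    # Subtree sizes: T[j] = b_{j+1} * ... * b_l, with T[l] = 1
    T = [1] * (l + 1)
    for j in range(l - 1, -1, -1):
        T[j] = B[j] * T[j + 1]
    # Bottom-up: S_{l-1} from (2.15), then (2.16) down to S_0 (2.17)
    S = (B[-1] - k[-1]) ** 2 / B[-1] + k[-1] / B[-1]
    for j in range(l - 2, -1, -1):
        S = ((B[j] - k[j]) * T[j + 1]) ** 2 / T[j] + k[j] * S / B[j]
    return S

# For c = 1 the approximation coincides with the exact single-compromise value:
print(round(approx_anonymity([30, 30, 30], 1) / 27000, 4))  # → 0.9355
```

Note that S̄0/N for c = 1 equals the resistance R of Section 2.2, as it should.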
In order to see how well S̄0 estimates S̄⟨−⟩, I ran some simulations. The simulation parameters
are the following:
- total number of members N = 27000;
- upper bound on the maximum authentication delay Dmax = 90;
- two branching factor vectors are considered: (30, 30, 30) and (72, 5, 5, 5, 3);
- the number c of compromised members is varied between 1 and 270 with a step size of one.
2. PRIVATE AUTHENTICATION<br />
For each value of c, I ran 100 simulations⁴. In each simulation run, the c compromised members<br />
were chosen uniformly at random from the set of all members. The exact value of the normalized<br />
expected anonymity set size ¯S⟨−⟩/N was computed using the expressions (2.9 – 2.11). Finally, the<br />
obtained values were averaged over all simulation runs. Moreover, for every c, I also computed the<br />
estimated value ¯S0/N using the expressions (2.15 – 2.17).<br />
The simulation results are shown in Figure 2.4. The figure does not show the confidence<br />
intervals, because they are very small (in the range of 10^−4 for all simulations) and thus they<br />
would be hardly visible. As we can see, ¯S0/N approximates ¯S⟨−⟩/N quite well, and in general it<br />
provides a lower bound on the normalized expected anonymity set size.<br />
Figure 2.4: Simulation results for branching factor vectors (30, 30, 30) (left hand side) and<br />
(72, 5, 5, 5, 3) (right hand side). As we can see, ¯S0/N approximates ¯S⟨−⟩/N quite well, and in<br />
general it provides a lower bound on it.<br />
In Figure 2.5, the value of ¯S0/N is plotted as a function of c for different branching factor<br />
vectors. This figure illustrates how different systems can be compared using the approximation<br />
¯S0/N of the normalized expected anonymity set size. On the left hand side of the figure, we<br />
can see that the value of ¯S0/N is greater for the vector B∗ = (72, 5, 5, 5, 3) than for the vector<br />
B = (30, 30, 30) not only for c = 1 (as we saw before), but for larger values of c too. In fact, B∗<br />
seems to lose its superiority only when the value of c approaches 60, but in this range, the systems<br />
provide nearly no privacy in any case. Thus, we can conclude that B∗ is a better branching factor<br />
vector, yielding more privacy than B in general.<br />
We can make another interesting observation on the left hand side of Figure 2.5: ¯S0/N starts<br />
decreasing sharply as c starts increasing; however, when c gets close to the value of the first element<br />
of the branching factor vector, the decrease of ¯S0/N slows down. Moreover, almost exactly when<br />
c reaches the value of the first element (30 in the case of B, and 72 in the case of B∗), ¯S0/N seems<br />
to become constant, at a very low value. We can conclude that, just as in the case of a<br />
single compromised member, in the general case too, the level of privacy provided by the system<br />
essentially depends on the value of the first element of the branching factor vector. The plot on the<br />
right hand side of the figure reinforces this observation: it shows ¯ S0/N for two branching factor<br />
vectors that have the same first element but that differ in the other elements. As we can see, the<br />
curves are almost perfectly overlapping.<br />
Thus, a practical design principle for key-tree based private authentication systems is to maximize<br />
the branching factor at the first level of the key-tree. Further optimization by adjusting the<br />
branching factors of the lower levels may still be possible, but the gain is not significant; what<br />
really counts is the branching factor at the first level.<br />
4 All computations have been done in Matlab, and for the purpose of repeatability, the source code is available<br />
on-line at http://www.crysys.hu/~holczer/PET2006<br />
Figure 2.5: The value of ¯S0/N as a function of c for different branching factor vectors. The figure<br />
illustrates how different systems can be compared based on the approximation ¯S0/N. On the left<br />
hand side, we can see that the value of ¯S0/N is greater for the vector (72, 5, 5, 5, 3) than for the<br />
vector (30, 30, 30) not only for c = 1 (as we saw earlier), but for larger values of c too. On the<br />
right hand side, we can see that ¯S0/N is almost the same for the vector (60, 5, 5, 3, 3, 2) as for the<br />
vector (60, 30, 15). We can conclude that ¯S0/N is essentially determined by the value of the first<br />
element of the branching factor vector.<br />
2.5 The group-based approach<br />
In the group based authentication scheme, the set of all tags is divided into groups of equal size,<br />
and all tags of a given group share a common group key. Since the group keys do not enable<br />
the reader to identify the tags uniquely, every tag also stores a unique identifier. Keys are secret<br />
(each group key is known only to the reader and the members of the corresponding group), but<br />
identifiers can be public. To avoid impersonation of a tag from the same group, every tag has a<br />
unique secret key as well. This key is only shared between the tag and the reader. To reduce the<br />
storage demands on the reader side, the pairwise key can be generated from a master key using<br />
the identifier of the tag.<br />
In order to authenticate a tag, the reader sends a single challenge to the tag. The answer of the<br />
tag has two parts. In the first part, the tag encrypts, with the group key, the reader's challenge<br />
concatenated with a nonce picked by the tag and with the tag's identifier. In the second part, the<br />
tag encrypts the challenge concatenated with the nonce using its own secret key. Encrypting the<br />
identifier is needed, since the key used for encryption does not uniquely identify the tag. Upon<br />
reception of the answer, the reader identifies the tag by trying all the group keys until the<br />
decryption succeeds. Then it uses the second part to check that the answer was indeed produced<br />
by the identified tag. Without the second part, every tag could impersonate every other tag in the<br />
same group.<br />
The operation of the group-based private authentication scheme is illustrated in Figure 2.6.<br />
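As an illustration, the message flow of Figure 2.6 can be sketched in Python as follows. The XOR-keystream cipher is a toy stand-in for the symmetric encryption EK() (the scheme does not prescribe a particular cipher), and all helper names are mine:<br />

```python
import hashlib
import secrets

def E(key, msg):
    """Toy stand-in for symmetric-key encryption E_K(): XOR with a
    SHA-256-derived keystream (illustrative only, not a real cipher)."""
    stream, ctr = b"", 0
    while len(stream) < len(msg):
        stream += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(m ^ s for m, s in zip(msg, stream))

D = E  # XOR keystream: decryption is the same operation

def key_of(master, tag_id):
    """Per-tag secret key derived from a master key and the public ID,
    as suggested to reduce reader-side storage."""
    return hashlib.sha256(master + tag_id).digest()

def tag_answer(r1, group_key, own_key, tag_id):
    """Tag side of Figure 2.6: E_K(R1|R2|ID), E_KID(R1|R2)."""
    r2 = secrets.token_bytes(8)
    return E(group_key, r1 + r2 + tag_id), E(own_key, r1 + r2)

def reader_identify(r1, part1, part2, group_keys, master):
    """Reader side: try all group keys until decryption succeeds
    (the plaintext starts with R1), then verify the tag's own key."""
    for gk in group_keys:
        plain = D(gk, part1)
        if plain[:8] == r1:
            r2, tag_id = plain[8:16], plain[16:]
            if D(key_of(master, tag_id), part2) == r1 + r2:
                return tag_id
    return None
```

The "decryption succeeds" test here is the redundancy check that the recovered plaintext begins with the reader's own challenge R1; without the second part, any tag knowing the group key could forge the first part for another tag's ID.<br />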
The complexity of the group-based scheme for the reader depends on the number of the groups.<br />
In particular, if there are γ groups, then, in the worst case, the reader must try γ keys. Therefore,<br />
if the upper bound on the worst case complexity is given as a design parameter, then γ is easily<br />
determined. For example, to get the same complexity as in the key-tree based scheme with constant<br />
branching factor, one may choose γ = b · log_b N − 1, where N is the total number of tags and b is<br />
the branching factor of the key-tree. The minus one accounts for the decryption of the second part<br />
of the message.<br />
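As a quick worked example of this choice of γ (the helper function is my own, and it assumes N is an exact power of b):<br />

```python
def matching_group_count(N, b):
    """gamma = b * log_b(N) - 1: the number of groups for which the
    group-based reader's worst case (gamma group keys plus one
    own-key decryption) matches a key-tree with constant branching
    factor b over N tags."""
    depth, total = 0, 1     # integer log_b(N), N assumed a power of b
    while total < N:
        total *= b
        depth += 1
    return b * depth - 1
```

For instance, N = 2^10 with b = 2 gives a depth of 10 and γ = 2 · 10 − 1 = 19, while N = 27000 with b = 30 gives a depth of 3 and γ = 3 · 30 − 1 = 89.<br />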
An immediate advantage of the group-based scheme with respect to the key-tree based approach<br />
is that the tags need to store only two keys and an identifier. In contrast to this, in the key-tree<br />
based scheme, the number of keys stored by the tags depends on the depth of the tree. For instance,<br />
in the case of the Molnar-Wagner scheme, the tags must store log_b N keys. Moreover, by using<br />
only two keys, this scheme also has a smaller complexity for the tag in terms of computation and<br />
communication.<br />
Besides its advantages with respect to complexity, the group-based scheme provides a higher<br />
[Figure 2.6 diagram: Reader R → Tag T: R1; the tag picks R2; Tag T → Reader R: EK(R1|R2|ID), EKID(R1|R2); the reader tries all group keys until K is found, then checks the second part with the tag's own key.]<br />
Figure 2.6: Operation of the group-based private authentication scheme. K is the group key stored<br />
by the tag, KID is the tag's own secret key, ID is the identifier of the tag, R1 and R2 are random<br />
values generated by the reader and the tag, respectively, | denotes concatenation, and EK() denotes<br />
symmetric-key encryption with K.<br />
Figure 2.7: On the left hand side: The tree-based authentication protocol uses a tree, where<br />
the tags correspond to the leaves of the tree. Each tag stores the keys along the path from the<br />
root to the leaf corresponding to the given tag. When authenticating itself, a tag uses all of its<br />
keys. The reader identifies which keys have been used by iteratively searching through the keys<br />
at the successive levels of the tree. On the right hand side: In the group-based authentication<br />
protocol, the tags are divided into groups. Each tag stores its group key and its own key. When<br />
authenticating itself, a tag uses its group key first, and then its own key. The reader identifies<br />
which group key has been used by trying all group keys, and then it checks the tag's own key.<br />
level of privacy than the key-tree based scheme when some of the tags are compromised. I will<br />
show this in Section 2.7.<br />
2.6 Analysis of the group based approach<br />
The metric proposed in Section 2.2 is based on the observation that when some tags are compromised,<br />
the set of all tags becomes partitioned such that the adversary cannot distinguish the tags<br />
that belong to the same subset, but she can distinguish the tags that belong to different subsets.<br />
Hence, the subsets are the anonymity sets of their members. The level R of privacy provided by the<br />
scheme is then characterized as the average anonymity set size normalized with the total number<br />
N of the tags. Formally,<br />
R = (1/N) Σ_i |Pi| · (|Pi|/N) = (1/N²) Σ_i |Pi|²   (2.18)<br />
where |Pi| denotes the size of subset Pi and |Pi|/N is the probability that a randomly chosen tag<br />
belongs to subset Pi.<br />
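This metric is direct to compute from the partition; a minimal Python helper (the function name is mine):<br />

```python
def privacy_level(subset_sizes, N):
    """Level of privacy R (equation 2.18): the normalized average
    anonymity set size for a partition of the N tags, given as the
    list of subset sizes |P_i|."""
    assert sum(subset_sizes) == N
    return sum(s * s for s in subset_sizes) / N ** 2
```

For instance, with N = 1024 tags in γ = 64 groups (group size n = 16), one compromised group yields 16 singleton sets plus one set of size 1008, i.e. R = (16 + 1008²)/1024² ≈ 0.969.<br />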
In the group-based scheme, a similar kind of partitioning can be observed when tags become<br />
compromised. In particular, when a single tag is compromised, the adversary learns the group<br />
key of that tag, which allows her to distinguish the tags within this group from each other (since<br />
the tags use their identifiers in the protocol) and from the rest of the tags in the system. This<br />
means that each member of the compromised group forms an anonymity set of size 1, and the<br />
remaining tags form another anonymity set. In general, when more tags are compromised, we can<br />
observe that the partitioning depends on the number C of the compromised groups, where a group<br />
is compromised if at least one tag that belongs to that group is compromised. More precisely,<br />
when C groups are compromised, we get nC anonymity sets of size 1 and an anonymity set of size<br />
n(γ − C), where γ is the number of groups and n = N/γ is the size of a group. This results in the<br />
following expression for the level R of privacy according to the metric (2.18):<br />
R = (1/N²) ( nC + (n(γ − C))² )   (2.19)<br />
If tags are compromised randomly, then C, and hence, R are random variables, and the level of<br />
privacy provided by the system is characterized by the expected value of R. In order to compute<br />
that, we must compute the expected value of C and that of C². This can be done as follows: let<br />
us denote by Ai the event that at least one tag from the i-th group is compromised, and let IAi<br />
be Ai’s indicator function. The probability of Ai can be calculated as follows:<br />
P(Ai) = 1 − ((N−n choose c) / (N choose c))   (2.20)<br />
= 1 − ∏_{j=0}^{c−1} (1 − n/(N − j))   (2.21)<br />
The expected value of C is the expected value of the sum of the indicator functions:<br />
E[C] = E[ Σ_{i=1}^{γ} I_{Ai} ] = Σ_{i=1}^{γ} P(Ai)   (2.22)<br />
= γ ( 1 − ∏_{j=0}^{c−1} (1 − n/(N − j)) )   (2.23)<br />
Similarly, the second moment of C can be computed as follows:<br />
E[C²] = E[ ( Σ_{i=1}^{γ} I_{Ai} )² ]   (2.24)<br />
= E[ Σ_{i=1}^{γ} I_{Ai} ] + E[ Σ_{i≠j} I_{Ai∩Aj} ]   (2.25)<br />
= E[C] + (γ² − γ) · P(Ai ∩ Aj)   (2.26)<br />
Finally, probability P (Ai ∩ Aj) can be computed in the following way:<br />
P(Ai ∩ Aj) =   (2.27)<br />
= 1 − P(Āi ∩ Āj) − 2 · P(Āi ∩ Aj)   (2.28)<br />
where Āi denotes the complement of the event Ai. The two terms are:<br />
P(Āi ∩ Āj) = (N−2n choose c) / (N choose c)   (2.29)<br />
= ∏_{j=0}^{c−1} (1 − 2n/(N − j))   (2.30)<br />
P(Āi ∩ Aj) = P(Aj | Āi) · P(Āi)   (2.31)<br />
= ( 1 − ∏_{j=0}^{c−1} (1 − n/(N − n − j)) ) ·   (2.32)<br />
· ∏_{j=0}^{c−1} (1 − n/(N − j))   (2.33)<br />
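Putting (2.19 – 2.33) together, the expected privacy level E[R] can be computed exactly from binomial coefficients; the following Python sketch (the function name is mine) mirrors the derivation:<br />

```python
from math import comb

def expected_privacy(N, gamma, c):
    """Expected value of the privacy level R (eq. 2.19) when c tags
    out of N are compromised uniformly at random, with gamma equal
    groups of size n = N / gamma."""
    n = N // gamma
    # P(A_i): at least one tag of a fixed group is compromised (2.20)
    p_Ai = 1 - comb(N - n, c) / comb(N, c)
    EC = gamma * p_Ai                                          # (2.23)
    # P(A_i and A_j) via complements (2.28)-(2.33)
    p_none = comb(N - 2 * n, c) / comb(N, c)                   # (2.30)
    p_one = (1 - comb(N - 2 * n, c) / comb(N - n, c)) \
            * (comb(N - n, c) / comb(N, c))                    # (2.33)
    p_both = 1 - p_none - 2 * p_one                            # (2.28)
    EC2 = EC + (gamma ** 2 - gamma) * p_both                   # (2.26)
    # E[R] = (n E[C] + n^2 E[(gamma - C)^2]) / N^2, from (2.19)
    return (n * EC + n ** 2 * (gamma ** 2 - 2 * gamma * EC + EC2)) / N ** 2
```

Here E[(γ − C)²] is expanded as γ² − 2γE[C] + E[C²], so only the first two moments of C are needed, exactly as in the derivation above.<br />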
Based on the above formulae, I computed the expected value of R as a function of c for N = 2^14<br />
and γ = 64. The results are plotted on the left hand side of Figure 2.8. The same plot also contains<br />
the results of a Matlab simulation with the same parameters, where the c compromised tags were<br />
chosen uniformly at random. For each value of c, I ran 10 simulations, computed the exact value<br />
of the average anonymity set size using (2.19) directly, and averaged the results. As can be<br />
seen in the figure, the analytical results match the results of the simulation. I performed the same<br />
verification for several other values of N and γ, and in each case, I obtained the same matching<br />
results.<br />
2.7 Comparison of the group and the key-tree based approach<br />
In this section, I compare the group-based scheme to the key-tree based scheme. The methodology<br />
is the following: for a given number N of tags and upper bound γ on the worst case complexity for<br />
the reader, I determine the optimal key-tree using the algorithm proposed in Section 2.3. Then, I<br />
compare the level of privacy provided by this optimal key-tree to that provided by the group-based<br />
scheme with γ groups and N tags.<br />
The comparison is performed by means of simulations. A simulation run consists of randomly<br />
choosing c compromised tags and computing the resulting normalized average anonymity set size<br />
R for both the optimal key-tree and the group-based scheme. For the former, we can use the<br />
formulae (2.15 – 2.17), while for the latter, I use formula (2.19) directly. For each value of c, I ran<br />
several simulation runs and averaged the results.<br />
The simulation parameters were the following: for the number N of tags, only powers of 2<br />
were considered, because in practice, that number is related to the size of the identifier space, and<br />
identifiers are usually represented as binary strings. Thus, in the simulations, N = 2^x, and x was<br />
varied between 10 and 15 with a step size of 1. The values for the worst case complexity γ (which<br />
coincides with the number of groups in the group-based scheme) were 64, 128, and 256. Finally,<br />
the number c of compromised tags was varied from 1 to 3γ. For each combination of these values,<br />
100 simulation runs were performed.<br />
The right hand side of Figure 2.8 shows the results obtained for N = 2^10 and γ = 64.<br />
The plots corresponding to the other simulation settings are not included, because they are very<br />
similar to the one in Figure 2.8. As we can see, the group-based scheme provides a higher level of<br />
privacy when the number of compromised tags does not exceed a threshold. Above the threshold,<br />
the key-tree based scheme becomes better, however, in this region, both schemes provide virtually<br />
Figure 2.8: On the left hand side: The analytical results obtained for the expected value of R<br />
match the averaged results of ten simulations. The parameters are: N = 2^14 and γ = 64. On the<br />
right hand side: Results of the simulation aiming at comparing the key-tree based scheme and<br />
the group-based scheme. The curves show the level R of privacy as a function of the number c of<br />
the compromised tags. The parameters are: N = 2^10 and γ = 64. The confidence intervals are<br />
not shown, because they are in the range of 10^−3, and therefore, they would be hardly visible. As<br />
we can see, the group-based scheme achieves a higher level of privacy when c is below a threshold.<br />
Above the threshold, the key-tree based approach is slightly better, however, in this region, both<br />
schemes provide virtually no privacy.<br />
no privacy. Thus, for any practical purposes, the group-based scheme is better than the key-tree<br />
based scheme (even if optimal key-trees are used).<br />
2.8 Related work<br />
The problem of private authentication has been extensively studied in the literature recently, but<br />
most of the proposed solutions are based on public key cryptography. One example is Idemix, which<br />
is a practical anonymous credential system proposed by Camenisch and Lysyanskaya in [Camenisch<br />
and Lysyanskaya, 2001]. Idemix allows for unlinkable demonstration of the possession of various<br />
credentials, and it can be used in many applications. However, it is not applicable in resource-<br />
constrained scenarios, such as low-cost RFID systems. For such applications, solutions based on<br />
symmetric key cryptography seem to be the only viable options. A comprehensive bibliography<br />
of RFID related privacy problems is maintained by Avoine in [Avoine, 2012]. A recent survey of<br />
RFID privacy approaches is published by Langheinrich [Langheinrich, 2009], where he overviews<br />
60 papers in this field. Another important paper, by Syamsuddin et al., surveys the hash chain<br />
based RFID authentication protocols in [Syamsuddin et al., 2008]. In the following, I try to focus<br />
on the methods similar to the ones described in this thesis, and encourage the reader to read the<br />
aforementioned surveys for a broader view.<br />
The key-tree based approach for symmetric key private authentication has been proposed by<br />
Molnar and Wagner in [Molnar and Wagner, 2004]. However, they use a simple b-ary tree, which<br />
means that the tree has the same branching factor at every level. Moreover, they do not analyze<br />
the effects of compromised members on the level of privacy provided. They only mention that<br />
compromise of a member has a wider effect than in the case of public key cryptography based<br />
solutions.<br />
An entropy based analysis of key trees can be found in [Nohara et al., 2005]. Nohara et al.<br />
prove that their K-steps ID matching scheme (which is very similar to [Molnar and Wagner, 2004])<br />
is secure against one compromised tag, if the number of nodes is large enough. They consider<br />
only b-ary trees, not variable branching factors.<br />
Avoine et al. analyze the effects of compromised members on privacy in the key-tree based<br />
approach [Avoine et al., 2005]. They study the case of a single compromised member, as well as<br />
the general case of an arbitrary number of compromised members. However, their analysis is not based on the notion<br />
of anonymity sets. In their model, the adversary is first allowed to compromise some members,<br />
then it chooses a target member that it wants to trace, and it is allowed to interact with the chosen<br />
member. Later, the adversary is given two members such that one of them is the target member<br />
chosen by the adversary. The adversary can interact with the given members, and it must decide<br />
which one is its target. The level of privacy provided by the system is quantified by the success<br />
probability of the adversary.<br />
Beye and Veugen go a little further and analyze what happens if the attacker has access<br />
to side channel information and adapts the attack dynamically [Beye and Veugen, 2012]. They<br />
analyze the case of the key trees described in this chapter as well, and generalize the problem by<br />
setting only a minimum on N in [Beye and Veugen, 2011].<br />
2.9 Conclusion<br />
Key-trees provide an efficient solution for private authentication in the symmetric key setting.<br />
However, the level of privacy provided by key-tree based systems decreases considerably if some<br />
members are compromised. This loss of privacy can be minimized by the careful design of the<br />
tree. Based on the results presented in this chapter, we can conclude that a good practical design<br />
principle is to maximize the branching factor at the first level of the tree such that the resulting<br />
tree still respects the constraint on the maximum authentication delay in the system. Once the<br />
branching factor at the first level is maximized, the tree can be further optimized by maximizing<br />
the branching factors at the successive levels, but the improvement achieved in this way is not<br />
really significant; what really counts is the branching factor at the first level.<br />
In the second part of this chapter, I proposed a novel group based private authentication scheme.<br />
I analyzed the proposed scheme and quantified the level of privacy that it provides. I compared<br />
the group based scheme to the key-tree based scheme originally proposed by Molnar and Wagner,<br />
and later optimized by me in the first half of this chapter. I showed that the group based scheme<br />
provides a higher level of privacy than the key-tree based scheme. In addition, the complexity of<br />
the group based scheme for the verifier can be set to be the same as in the key-tree based scheme,<br />
while the complexity for the prover is always smaller in the group based scheme. The primary application<br />
area of the schemes is that of RFID systems, but they can also be used in applications with similar<br />
characteristics (e.g., in wireless sensor networks).<br />
2.10 Related publications<br />
[Buttyan et al., 2006a] Levente Buttyan, Tamas Holczer, and Istvan Vajda. Optimal key-trees<br />
for tree-based private authentication. In Proceedings of the International Workshop on Privacy<br />
Enhancing Technologies (PET), June 2006. Springer.<br />
[Buttyan et al., 2006b] Levente Buttyan, Tamas Holczer, and Istvan Vajda. Providing location<br />
privacy in automated fare collection systems. In Proceedings of the 15th IST Mobile and Wireless<br />
Communication Summit, Mykonos, Greece, June 2006.<br />
[Avoine et al., 2007] Gildas Avoine, Levente Buttyan, Tamas Holczer, and Istvan Vajda. Group-based<br />
private authentication. In Proceedings of the International Workshop on Trust, Security,<br />
and Privacy for Ubiquitous Computing (TSPUC 2007). IEEE, 2007.<br />
Chapter 3<br />
Location Privacy in Vehicular Ad Hoc<br />
Networks<br />
3.1 Introduction<br />
In this chapter, I investigate what level of privacy a driver can achieve in Vehicular Ad Hoc<br />
Networks (VANET). More specifically, in the first half of this chapter, I investigate how a local<br />
eavesdropping attacker can trace vehicles based on their frequently sent status information. In the<br />
second half of this chapter (from Section 3.4), I go a little further in terms of attacker strength,<br />
and examine what a global eavesdropping attacker can do. After demonstrating its broad capabilities,<br />
I suggest an algorithm that can greatly reduce the attacker's success rate.<br />
Recently, initiatives to create safer and more efficient driving conditions have begun to draw<br />
strong support in Europe [COM], in the US [VSC], and in Japan [ASV]. Vehicular communications<br />
will play a central role in this effort, enabling a variety of applications for safety, traffic<br />
efficiency, driver assistance, and entertainment. However, besides the expected benefits, vehicular<br />
communications also have some potential drawbacks. In particular, many envisioned safety related<br />
applications require that the vehicles continuously broadcast their current position and speed in<br />
so-called heart beat messages. This allows the vehicles to predict the movement of other nearby<br />
vehicles and to warn the drivers if a hazardous situation is about to occur. While this can certainly<br />
be advantageous, an undesirable side effect is that it makes it easier to track the physical location<br />
of the vehicles just by eavesdropping these heart beat messages.<br />
One approach to solve this problem is that the vehicles broadcast their messages under pseudonyms<br />
that they change with some frequency [Raya and Hubaux, 2005]. The change of a pseudonym<br />
means that the vehicle changes all of its physical and logical addresses at the same time. Indeed, in<br />
most of the applications, the important thing is to let other vehicles know that there is a vehicle at<br />
a given position moving with a given speed, but it is not really important which particular vehicle<br />
it is. Thus, using pseudonyms is just as good as using real identifiers as far as the functionality of<br />
the applications is concerned. Obviously, these pseudonyms must be generated in such a way that<br />
a new pseudonym cannot be directly linked to previously used pseudonyms of the same vehicle.<br />
Unfortunately, only changing pseudonyms is largely ineffective against a global eavesdropper<br />
that can hear all communications in the network. Such an adversary can predict the movement of<br />
the vehicles based on the position and speed information in the heart beat messages, and use this<br />
prediction to link different pseudonyms of the same vehicle together with high probability. For<br />
instance, if at time t, a given vehicle is at position ⃗p and moves with speed ⃗v, then after some short<br />
time τ, this vehicle will most probably be at position ⃗p + τ · ⃗v. Therefore, the adversary will know<br />
that the vehicle that reports itself at (or near to) position ⃗p + τ ·⃗v at time t + τ is the same vehicle<br />
as the one that reported itself at position ⃗p at time t, even if in the meantime, the vehicle changed<br />
pseudonym. This problem can be solved with some silent periods. This is discussed in the second<br />
part of this chapter (from Section 3.4).<br />
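To make the linking attack concrete, the following Python sketch is my own simplification of the adversary described above: for every report at time t, it predicts p + τ · v and links the old pseudonym to the new pseudonym reported closest to that prediction.<br />

```python
def link_pseudonyms(old_reports, new_reports, tau):
    """Greedy linking attack on heart beat messages: each pseudonym
    seen at time t (2D position p, velocity v) is linked to the
    pseudonym at time t + tau whose reported position is closest to
    the predicted position p + tau * v."""
    links = {}
    for pid, (p, v) in old_reports.items():
        px, py = p[0] + tau * v[0], p[1] + tau * v[1]  # predicted position
        links[pid] = min(
            new_reports,
            key=lambda q: (new_reports[q][0][0] - px) ** 2
                        + (new_reports[q][0][1] - py) ** 2)
    return links
```

Even though both vehicles may have changed pseudonyms between the two snapshots, the attacker links them correctly whenever the prediction error is smaller than the distance to the nearest other vehicle.<br />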
On the other hand, the assumption that the adversary can eavesdrop all communications in the<br />
network is a very strong one. In many situations, it is more reasonable to assume that the adversary<br />
can monitor the communications only at a limited number of places and only in a limited range.<br />
In this case, if a vehicle changes its pseudonym within the non-monitored area, then there is a<br />
chance that the adversary loses its trace. My goal in the first half of the chapter is to characterize<br />
this chance as a function of the strength of the adversary (i.e., its monitoring capabilities). In<br />
the second part of the chapter, I assume a relatively small area, where global eavesdropping is<br />
reasonable. I analyze what a global attacker can do, and suggest a simple algorithm to reduce the<br />
capabilities of a global attacker. In particular, my main contributions are the following:<br />
I define a model in which the effectiveness of changing pseudonyms can be studied. I emphasize<br />
that while changing pseudonyms has already been proposed in the literature as a<br />
countermeasure to track vehicles [Raya and Hubaux, 2005], to the best of my knowledge, the<br />
effectiveness of this method has never been investigated rigorously in this context. My model<br />
is based on the concept of the mix zone. This concept was first introduced in [Beresford and<br />
Stajano, 2003], but again, to the best of my knowledge, it has not been used in the context<br />
of vehicular networks so far. I characterize the tracking strategy of the adversary in the mix<br />
zone model, and I introduce a metric to quantify the level of privacy provided by the mix<br />
zone.<br />
I report on the results of an extensive simulation where I used my model to determine the<br />
level of privacy achieved in realistic scenarios. In particular, in my simulation, I used a rather<br />
complex road map, generated traffic with realistic parameters, and varied the strength of the<br />
adversary by varying the number of her monitoring points. As expected, my simulation<br />
results confirm that the level of privacy decreases as the strength of the adversary increases.<br />
However, in addition to this, my simulation results provide detailed information about the<br />
relationship between the strength of the adversary and the level of privacy achieved by<br />
changing pseudonyms.<br />
In Section 3.5, I provide a breakdown of the requirements that a system must address in order<br />
to provide privacy. The aim is to provide an analytical framework that future researchers<br />
can use to concisely state which aspects of privacy a new proposal does or does not address.<br />
In Section 3.6, I propose an approach for implementing mix zones that requires neither<br />
extensive RSU support nor complex communication between vehicles, and that does not<br />
endanger safety-of-life to any significant extent, while providing both syntactic mixing and<br />
semantic mixing (in the language of Section 3.5). To my knowledge, this is the first proposal<br />
that provides for semantic mixing while at the same time addressing the safety-of-life concerns<br />
that naturally arise when a vehicle tries to obscure its path. The key insights are simply that<br />
vehicles traveling at a low speed are less likely to cause fatal accidents, and that vehicles will<br />
be traveling at a low speed at natural mix-points such as signalled intersections. The main<br />
body of experimental work in Section 3.6 is therefore an investigation of the consequences<br />
for the untraceability of vehicles if they stop sending heartbeat messages when their speed<br />
drops below a certain threshold and change all their identifiers after such silent periods. I<br />
call my scheme SLOW, which stands for silence at low speeds. (I note that of course SLOW<br />
is not a full solution to untraceability, as it does not cover the safe use of silent periods at<br />
high speeds; other techniques will need to be used to give untraceability in this case).<br />
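The SLOW rule described above can be sketched in a few lines of Python; the class, the threshold value, and the pseudonym format are my own illustrative assumptions, not the dissertation's implementation:<br />

```python
import secrets

class SlowVehicle:
    """Sketch of the SLOW rule: stay silent below a speed threshold,
    and change the pseudonym (standing in for all physical and
    logical identifiers) when leaving a silent period."""

    def __init__(self, threshold=30.0):
        self.threshold = threshold            # speed threshold (assumed unit: km/h)
        self.pseudonym = secrets.token_hex(4)
        self.silent = False

    def tick(self, speed, position):
        """Return the heart beat to broadcast, or None while silent."""
        if speed < self.threshold:
            self.silent = True                # low speed: suppress heart beats
            return None
        if self.silent:                       # leaving a silent period:
            self.pseudonym = secrets.token_hex(4)   # fresh identifiers
            self.silent = False
        return (self.pseudonym, position, speed)
```

A vehicle slowing down at a signalled intersection thus falls silent, and reappears under an unlinkable pseudonym once it speeds up again, which is exactly where natural mixing with other stopped vehicles occurs.<br />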
The organization of the chapter is the following: in Section 3.2, I introduce the mix zone<br />
model, I define the behavior of the adversary in this model, and I introduce my privacy metric.<br />
In Section 3.3, I describe my simulation setting and the simulation results for the mix zones. In<br />
Section 3.4 I introduce the global attacker scenario. Then I introduce my overall analytical<br />
framework in Section 3.5. Next, in Section 3.6, I introduce my attacker model and my proposed<br />
solution, and in Section 3.7, I present the results of my experiments showing that my approach<br />
does indeed make tracing of vehicles hard for the attacker, and that it is usable in the real world.<br />
Finally, I report on some related work in Section 3.8, and conclude the chapter in Section 3.9.<br />
3.2 Model of local attacker and mix zone<br />
3.2.1 The concept of the mix zone<br />
I consider a continuous part of a road network, such as a whole city or a district of a city. I assume<br />
that the adversary installed some radio receivers at certain points of the road network with which<br />
she can eavesdrop the communications of the vehicles, including their heartbeat messages, in a<br />
limited range. On the other hand, outside the range of her radio receivers, the adversary cannot<br />
hear the communications of the vehicles.<br />
Thus, the road network is divided into two distinct regions: the observed zone and the unobserved<br />
zone. Physically, these zones may be scattered, possibly consisting of many observing<br />
spots and a large unobserved area, but logically, the scattered observing spots can be considered<br />
together as a single observed zone. This is illustrated on the left hand side of Figure 3.1.<br />
Figure 3.1: On the left hand side: The figure illustrates how a road network is divided into<br />
an observed and an unobserved zone in the model. In the figure, the observed zone is grey, and<br />
the unobserved zone is white. The unobserved zone functions as a mix zone, because the vehicles<br />
change pseudonyms and mix within this zone making it difficult for the adversary to track them.<br />
On the right hand side: The figure illustrates how the road network on the left can be abstracted<br />
as single mix zone with six ports.<br />
Note that the vehicles do not know where the adversary installed her radio receivers, or in<br />
other words, when they are in the observed zone. For this reason, we can assume that the vehicles<br />
continuously change their pseudonyms 1 . In this part of the chapter, we can abstract away the<br />
frequency of the pseudonym changes, and we can simply assume that it is high enough so that<br />
every vehicle surely changes pseudonym while in the unobserved zone. I intend to relax this<br />
assumption in my future work.<br />
Since the vehicles change pseudonyms while in the unobserved zone, that zone functions as a<br />
mix zone for vehicles (see the right hand side of Figure 3.1 for illustration). A mix zone [Beresford<br />
and Stajano, 2003; Beresford and Stajano, 2004] is similar to a mix node of a mix network [Chaum,<br />
1981], which changes the encoding and the order of messages in order to make it difficult for the<br />
adversary to link message senders and message receivers. In my case, the mix zone makes it<br />
difficult for the adversary to link the vehicles that emerge from the mix zone to those that entered<br />
it earlier. Thus, the mix zone makes it difficult to track vehicles. On the other hand, based on the<br />
observation that I made in the Introduction, I assume that the adversary can track the physical<br />
location of the vehicles while they are in the observed zone, despite the fact that they may change<br />
pseudonyms in that zone too.<br />
1 Otherwise, if the vehicles knew when they are in the unobserved zone, then it would be sufficient to change their<br />
pseudonyms only once while they are in the unobserved zone.<br />
3. LOCATION PRIVACY IN VANETS<br />
Since the vehicles move on roads, they cannot cross the border between the mix zone and the<br />
observed zone at any arbitrary point. Instead, the vehicles cross the border where the roads cross<br />
it. We can model this by assuming that the mix zone has ports, and the vehicles can enter and<br />
exit the mix zone only via these ports. For instance, on the right hand side of Figure 3.1, the ports<br />
are numbered from 1 to 6.<br />
3.2.2 The model of the mix zone<br />
While the adversary cannot observe the vehicles within the mix zone, we can assume that she still<br />
has some knowledge about the mix zone. This knowledge is subsumed in a model that consists of a<br />
matrix Q = [qij] of size M × M, where M is the number of ports of the mix zone, and M² discrete<br />
probability density functions fij(t) (1 ≤ i, j ≤ M). qij is the conditional probability of exiting the<br />
mix zone at port j given that the entry point was port i. fij(t) describes the probability distribution<br />
of the delay when traversing the mix zone between port i and port j. We can assume that time<br />
is slotted; that is why fij(t) is a discrete function. I note here that an attacker is unlikely<br />
to obtain such comprehensive knowledge of the mix zone; however, the needed probabilities and<br />
functions can be approximated with extensive real-world measurements. In the rest of the chapter,<br />
we consider the worst case (as is advisable in the field of security):<br />
the attacker knows the model of the mix zone.<br />
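The model above can be captured directly in code. The following is a minimal Python sketch (with illustrative names and randomly generated values, not data from the dissertation) of the structures Q and fij(t):<br />

```python
import random

# Sketch of the mix zone model of Section 3.2.2 (illustrative values only).
# q[i][j]   : probability of exiting at port j, given entry at port i.
# f[i][j][t]: probability that the i -> j traversal takes t timeslots.
M, T = 4, 20            # number of ports and number of timeslots (arbitrary)
random.seed(0)

def normalized(weights):
    """Scale a list of non-negative weights so that they sum to 1."""
    s = sum(weights)
    return [w / s for w in weights]

# Each row of q and each delay distribution f[i][j] must sum to 1.
q = [normalized([random.random() for _ in range(M)]) for _ in range(M)]
f = [[normalized([random.random() for _ in range(T)]) for _ in range(M)]
     for _ in range(M)]
```

In the worst case assumed in the text, the attacker holds exactly these q and f.<br />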
3.2.3 The operation of the adversary<br />
The adversary knows the model of the mix zone and she observes events, where an event is a<br />
pair consisting of a port (port number) and a time stamp (time slot number). There are entering<br />
events and exiting events corresponding to vehicles entering and exiting the mix zone, respectively.<br />
Naturally, an entering event consists of the port where the vehicle entered the mix zone, and the<br />
time when this happened. Similarly, an exiting event consists of the port where the vehicle left the<br />
mix zone, and the time when this happened.<br />
The general objective of the adversary is to relate exiting events to entering events. More<br />
specifically, in the model, the adversary picks a vehicle v in the observed zone and tracks its<br />
movement until it enters the mix zone. In the following, I denote the port at which v entered<br />
the mix zone by s. Then, the adversary observes the exiting events for a time T such that the<br />
probability that v leaves the mix zone before T is close to 1 (i.e., Pr{t_out < T} = 1 − ϵ, where ϵ is a<br />
small number, typically in the range of 0.005–0.01, and t_out is the random variable denoting the<br />
time at which the selected vehicle v exits the mix zone). For each exiting vehicle v′, the adversary<br />
determines the probability that v ′ is the same as v. For this purpose, she uses her observations and<br />
the model of the mix zone. Finally, she decides which exiting vehicle corresponds to the selected<br />
vehicle v.<br />
The decision algorithm used by the adversary is intuitive and straightforward: the adversary<br />
knows that the selected vehicle v entered the mix zone at port s and in timeslot 0. For each exiting<br />
event k = (j, t) that the adversary observes afterwards, she can compute the probability pjt that<br />
k corresponds to the selected vehicle as pjt = qsjfsj(t) (i.e., the probability that v chooses port<br />
j as its exit port given that it entered the mix zone at port s multiplied by the probability that<br />
it covers the distance between ports s and j in time t). The adversary decides for the vehicle for<br />
which pjt is maximal. The adversary is successful if the decided vehicle is indeed v.<br />
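This decision rule can be sketched as follows (a toy Python example with made-up model values; q and f play the roles of the matrix Q and the delay distributions fij(t)):<br />

```python
# Toy sketch of the adversary's decision rule of Section 3.2.3.
# q[i][j]   : Pr(exit at port j | entry at port i), for M = 2 ports.
# f[i][j][t]: Pr(traversal i -> j takes t timeslots), for T = 3 timeslots.
q = [[0.7, 0.3],
     [0.4, 0.6]]
f = [[[0.1, 0.6, 0.3], [0.2, 0.5, 0.3]],
     [[0.3, 0.4, 0.3], [0.1, 0.2, 0.7]]]

def decide(s, exit_events):
    """Pick the exiting event (j, t) maximizing pjt = q[s][j] * f[s][j][t]."""
    return max(exit_events, key=lambda jt: q[s][jt[0]] * f[s][jt[0]][jt[1]])

# The target entered at port s = 0; two exiting events are observed.
best = decide(0, [(0, 1), (1, 2)])   # pjt values: 0.7*0.6 = 0.42 vs 0.3*0.3 = 0.09
```

Here the adversary decides for the vehicle that exited at port 0 in timeslot 1, because its pjt value is the larger of the two.<br />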
Indeed, the decision algorithm described above realizes the Bayesian decision (see Section<br />
3.2.4 for more details). The importance of this fact is that the Bayesian decision minimizes<br />
the error probability; thus, it is in some sense the ideal decision algorithm for the adversary.<br />
3.2.4 Analysis of the adversary<br />
In this section, I show that the decision algorithm of the adversary described in Subsection 3.2.3<br />
realizes a Bayesian decision. The following notations are used:<br />
k is an index of a vector. Every port-timeslot pair can be mapped to such an index and<br />
k can be mapped back to a port-timeslot pair. Therefore indices and port-timeslot pairs<br />
are interchangeable, and in the following discussion, I always use the one which makes the<br />
presentation simpler.<br />
k ∈ {1, . . . , M · T}, where M is the number of ports, and T is the length of the attack measured<br />
in timeslots.<br />
C = [ck] is a vector, where ck is the number of cars leaving the mix zone at k during the<br />
attack.<br />
N is the number of cars leaving the mix zone before timeslot T (i.e., N = ∑_{k=1}^{MT} ck).<br />
ps(k) is the probability of the event that the target vehicle leaves the mix zone at k (port<br />
and time) conditioned on the event that it enters the zone at port s at time 0. The attacker<br />
knows exactly which port s is. Probability ps(k) can be computed as ps(k) = qsj fsj(t),<br />
where port j and timeslot t correspond to index k.<br />
p(k) is the probability of the event that a vehicle leaves the mix zone at k (port and time).<br />
This distribution can be calculated from the input distribution and the transition probabilities:<br />
p(k) = ∑_{s=1}^{M} ps(k).<br />
Pr(k|C) is the conditional probability that the target vehicle left the mix zone at time and<br />
port defined by k, given that the attacker’s observation is vector C.<br />
We must determine for which k probability Pr(k|C) is maximal. Let us denote this k with k ∗ .<br />
The probability Pr(k|C) can be rewritten using the Bayes rule:<br />
<br />
Pr(k|C) = Pr(C|k) ps(k) / Pr(C)<br />
<br />
Then k* can be computed as:<br />
<br />
k* = arg max_k Pr(C|k) ps(k) / Pr(C) = arg max_k Pr(C|k) ps(k)<br />
<br />
Pr(C|k) has a multinomial distribution, with the condition that at least one vehicle (the target of<br />
the attacker) must leave the mix zone at k:<br />
<br />
Pr(C|k) = [N! / (c1! · · · c_{k−1}! (ck − 1)! c_{k+1}! · · · c_{MT}!)] · p(k)^{ck − 1} · ∏_{j=1, j≠k}^{MT} p(j)^{cj}<br />
<br />
Pr(C|k) can be multiplied and divided by p(k)/ck to simplify the equation:<br />
<br />
Pr(C|k) = (ck / p(k)) · ( N! / (c1! · · · c_{MT}!) · ∏_{j=1}^{MT} p(j)^{cj} )<br />
<br />
where the bracketed part is a constant, which does not have any effect on the maximization, thus<br />
it can be omitted:<br />
<br />
k* = arg max_k (ck / p(k)) ps(k) = arg max_k (ck / (p(k) N)) ps(k) = arg max_k (pk / p(k)) ps(k)<br />
<br />
where pk is the empirical distribution of k (i.e., pk = ck/N). If the number of vehicles in the<br />
mix zone is large enough, then pk/p(k) ≈ 1. Thus the correctness of the intuitive algorithm described in<br />
Subsection 3.2.3 holds:<br />
<br />
k* = arg max_k ps(k)<br />
This means that if many vehicles are traveling in the mix zone, then the attacker must choose<br />
the vehicle with the highest ps(k) probability.<br />
3.2.5 The level of privacy provided by the mix zone<br />
There are various metrics to quantify the level of privacy provided by the mix zone (and the<br />
fact that the vehicles continuously change pseudonyms). A natural metric in the model is the<br />
success probability of the adversary when making her decision as described above. If the success<br />
probability is large, then the mix zone and changing pseudonyms are ineffective. On the other<br />
hand, if the success probability of the adversary is small, then tracking is difficult and the system<br />
ensures location privacy.<br />
We can note that the level of privacy is often measured using the anonymity set size as the<br />
metric [Chaum, 1988]; however, in this case, this approach cannot be used. The problem is that,<br />
as described above, with probability ϵ, the selected vehicle v is not in the set V of vehicles exiting<br />
the mix zone during the experiment of the adversary, and therefore, by definition, V cannot be the<br />
anonymity set for v. Although the size of V could be used as a lower bound on the real anonymity<br />
set size, there is another problem with the anonymity set size as a privacy metric. Namely, it is an<br />
appropriate privacy metric only if each member of the set is equally likely to be the target of the<br />
observation, however, as we will see in Section 3.3, this is not the case in my model.<br />
Obviously, the success probability of the adversary is very difficult to determine analytically<br />
due to the complexity of the model. Therefore, I ran simulations to determine its empirical value<br />
in realistic situations. The simulation setting and parameters, as well as the simulation results are<br />
described in the next section.<br />
3.3 Simulation of mix zone<br />
The purpose of the simulation is to get an estimation of the success probability of the attacker in<br />
realistic scenarios. In this section, I first describe the simulation settings, and then, I present the<br />
simulation results.<br />
3.3.1 Simulation settings<br />
The simulation was carried out in three main phases. In the first phase, I generated a realistic map,<br />
where the vehicles moved during the simulation. This map was generated by MOVE [Karnadi et<br />
al., 2005], a tool that allows the user to quickly generate realistic mobility models for vehicular<br />
network simulations. My map is illustrated in Figure 3.2. In fact, it is a simplified map of Budapest,<br />
the capital of Hungary, and it contains the main roads of the city. I believe that despite the<br />
simplifications, this map is still complex enough to get realistic traffic scenarios.<br />
The second phase of the simulation was to generate the movement of the vehicles on the<br />
generated map. This was done by SUMO [Krajzewicz et al., 2002], which is an open source microtraffic<br />
simulator, developed by the Center for Applied Informatics (ZAIK) and the Institute of<br />
Transport Research at the German Aerospace Center. SUMO dumps the state of the simulation in<br />
every time step into files. This state dump contains the location and the velocity of every vehicle<br />
during the simulation.<br />
In the third phase of the simulation, I processed the state dump generated by SUMO, and<br />
simulated the adversary. This part of the simulation was written in Perl, because Perl scripts can<br />
easily process the XML files generated by SUMO. Note that for the purpose of repeatability, I<br />
made the source code available on-line at http://www.crysys.hu/~holczer/ESAS07.<br />
I implemented the adversary as follows. First, I defined the observation spots (position and<br />
radius) of the adversary in a configuration file. Then, I let the adversary build her model of the mix<br />
zone (i.e., the complement of its observation spots) by allowing her to track the vehicles as if they<br />
do not change their pseudonyms. In effect, the adversary’s knowledge is represented by a set of two<br />
dimensional tables. Each table K^(i) corresponds to a port i of the mix zone, and contains empirical<br />
Figure 3.2: Simplified map of Budapest generated for the simulation.<br />
probabilities. More specifically, the entry K^(i)_jt of table K^(i) contains the empirical probability that<br />
a vehicle exits the mix zone at port j in time t given that it entered the mix zone at port i at time<br />
0. The size of the tables is M × T , where M is the number of the ports of the mix zone and T is<br />
the duration of the learning procedure defined as the time until which every observed vehicle left<br />
the mix zone. Once the adversary’s knowledge is built, she could use that for making decisions as<br />
described above in Section 3.2. I executed several simulation runs in order to get an estimation<br />
for the success probability of the adversary.<br />
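The learning phase can be sketched as follows (a Python illustration with fabricated observations; the real implementation in the dissertation was a Perl script processing SUMO dumps):<br />

```python
from collections import defaultdict

# Sketch of how the adversary builds her empirical tables K^(i) (Section 3.3.1).
# Each observation is (entry_port, exit_port, delay_in_timeslots), recorded
# while tracking vehicles as if they did not change pseudonyms.
observations = [(0, 1, 2), (0, 1, 2), (0, 2, 3), (1, 2, 1)]  # fabricated data

counts = defaultdict(lambda: defaultdict(int))
totals = defaultdict(int)
for i, j, t in observations:
    counts[i][(j, t)] += 1
    totals[i] += 1

# K[i][(j, t)]: empirical probability that a vehicle exits at port j in
# timeslot t, given that it entered at port i at time 0.
K = {i: {jt: n / totals[i] for jt, n in cnt.items()}
     for i, cnt in counts.items()}
```

Once such tables are built, the adversary can use them in place of the exact model when making the decisions described in Section 3.2.<br />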
I made experiments with adversaries of different strengths, where the strength of the adversary<br />
depends on the number of her eavesdropping receivers. In the simulations, all receivers were<br />
deployed in the middle of road junctions. The eavesdropping radius of the receivers was<br />
set to 50 meters. The number of receivers varied between 5 and 59 with a step size of 5 (note<br />
that the map contains 59 junctions). The junctions with the highest traffic were always chosen as<br />
the observation spots of the adversary (for instance, when the adversary had ten receivers, I chose<br />
the ten junctions with the largest traffic).<br />
In addition to the strength of the adversary, the intensity of the traffic was varied. More specifically,<br />
I simulated three types of traffic: low, medium, and high. Low traffic means that in each<br />
time step 250 vehicles are emitted into the traffic flow; for medium traffic, 500 vehicles<br />
are emitted; and for high traffic, 750 vehicles are emitted.<br />
For each simulation setting (strength of the adversary and intensity of the road traffic) 100<br />
simulations were performed.<br />
3.3.2 Simulation results<br />
Figure 3.3 contains the resulting success probabilities of the adversary as a function of her strength.<br />
The different curves belong to different traffic intensities. The results are quite intuitive: we can<br />
conclude that the stronger the adversary, the higher her success probability. Note, however, that<br />
above a given strength, the success probability saturates at about 60%. Higher success<br />
probabilities cannot be achieved, because the order of the vehicles may change between junctions<br />
without the adversary being capable of tracking that. Note also that the saturation point is<br />
reached when only half of the junctions are controlled. The intensity of the traffic is a much less<br />
important parameter than the strength of the attacker: above a given attacker strength, the success<br />
probability of the attacker is nearly independent of the traffic intensity.<br />
[Plot: success probability of an attack [%] versus number of attacker antennas (0–60), with curves for low, medium, and high traffic.]<br />
Figure 3.3: Success probabilities of the adversary as a function of her strength. The three curves<br />
represent three different scenarios (the darker the line, the more intensive the traffic).<br />
The dark bars in Figure 3.4 show how the size of the set V of the vehicles that exit the mix<br />
zone during the observation period, and from which the adversary has to identify the selected<br />
vehicle, varies with the strength of the adversary. The three sub-figures are related to the three<br />
different traffic situations (low traffic – left, medium traffic – middle, high traffic – right). While<br />
the size of V seems to be large (which seemingly makes the adversary’s decision difficult), it is<br />
also interesting to examine how uniform this set V is in terms of the probabilities assigned to the<br />
vehicles in V . Recall that the adversary computes a probability pjt for each vehicle v ′ in V , which<br />
is the probability of v ′ = v. These probabilities can be normalized to obtain a distribution, and the<br />
entropy of this distribution can be computed. From this entropy, I computed the effective size of V<br />
(i.e., the size to which V can be compressed due to the non-uniformity of the distribution over its<br />
members), and the light bars in the figure illustrate the obtained values. As we can see, the effective<br />
size of V is much smaller than its real size, which means that the distribution corresponding to<br />
the members of V is highly non-uniform. This is the reason why the adversary can be successful.<br />
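The effective size used here can be computed from the entropy of the normalized distribution; a short Python sketch (the function name is mine, not the dissertation's):<br />

```python
import math

# Sketch of the effective anonymity set size of Section 3.3.2: normalize the
# probabilities assigned to the exiting vehicles, compute the entropy H of the
# resulting distribution, and take 2**H as the effective size of V.
def effective_size(probs):
    total = sum(probs)
    dist = [p / total for p in probs if p > 0]
    entropy = -sum(p * math.log2(p) for p in dist)
    return 2 ** entropy

# A uniform distribution gives the real set size; a skewed one gives much less.
uniform = effective_size([0.25, 0.25, 0.25, 0.25])   # 4.0
skewed = effective_size([0.90, 0.05, 0.03, 0.02])    # well below 4
```

The gap between the real size of V and this effective size is exactly the non-uniformity that the adversary exploits.<br />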
3.4 Global attacker<br />
In the following part of this chapter, I assume a global eavesdropping attacker instead of a local<br />
attacker. A global eavesdropping attacker can hear all of the messages sent by the vehicles. Defending<br />
against such an attacker is a more challenging task than in the local attacker scenario. My work is inspired by the<br />
work of [Freudiger et al., 2007]. However, [Freudiger et al., 2007] requires the use of significant<br />
infrastructure. By replacing [Freudiger et al., 2007]’s cryptographic mix zones with zones of silence<br />
I address semantic mixing and infrastructure requirements simultaneously. In the following, in<br />
Section 3.5, I give a framework in which the minimal requirements for providing privacy for vehicles<br />
are analyzed. Next, in Section 3.6, I introduce my attacker model and my proposed solution, and in<br />
Section 3.7, I present the results of my experiments showing that my approach does indeed make<br />
tracing of vehicles hard for the attacker, and that it is usable in the real world.<br />
Figure 3.4: The dark bars show how the size of the set V of the vehicles that exit the mix zone<br />
during the observation period varies with the strength of the adversary (y axis: number of attacker<br />
antennas). The three sub-figures are related to the three different traffic situations (low traffic –<br />
left, medium traffic – middle, high traffic – right). The light bars illustrate the effective size of V .<br />
As we can see, the effective size is much smaller than the real size, which means that the distribution<br />
corresponding to the members of V is highly non-uniform.<br />
3.5 Framework for location privacy in VANETs<br />
Any system that aims to provide privacy for vehicles must address the following areas 2 :<br />
Syntactic privacy. In brief, all vehicles that use pseudonyms must change those pseudonyms<br />
from time to time. This area includes:<br />
N1 Pseudonymity: An identifier that is available to an eavesdropper must not be directly linkable<br />
to the vehicle (for example, it must not contain the VIN, the driver’s name, or anything else<br />
an eavesdropper might know).<br />
N2 Change of identifiers: Identifiers must change with some frequency 3 .<br />
N3 Local synchronization of change of identifiers: All identifiers, up and down the network<br />
stack, must change simultaneously. (This is not a communications issue as such, but a local<br />
engineering issue; however, it must be addressed).<br />
N4 Cooperative synchronization of change of identifiers or syntactic mixing: A vehicle in an<br />
observed area must change its identifier at the same time as at least one other vehicle and<br />
the two (or more) changing vehicles must do so in a way that allows semantic privacy as<br />
defined below 4 .<br />
N5 Pseudonym use: This covers two intermingled areas:<br />
2 This section is mainly based on the work of my coauthor, William Whyte [Buttyan et al., 2009].<br />
3 The frequency of change that provides privacy to the level expected by a user will in practice often depend on local regulation.<br />
4 Otherwise, an attacker who sees, for instance, identifiers (A, B, C, D) at time t and (A, B, C, E) at time t + 1 will<br />
know that D and E refer to the same vehicle.<br />
N5.1 Pseudonym format: What cryptographic mechanism is used by pseudonym owners to<br />
authenticate that they are valid units within the system?<br />
N5.2 Pseudonym issuance and renewal: How are pseudonyms issued? How does a vehicle<br />
avoid running out of them? (The answer to this may involve the identifier change<br />
frequency, N2.) What assumptions are necessary about the infrastructure to ensure<br />
that a vehicle is not left without pseudonyms?<br />
Semantic privacy. This captures the idea that vehicles must not be traceable by reconstructing<br />
the trajectories implied by their heartbeat messages. This area includes:<br />
M1 Semantic unlinkability: A vehicle’s stream of heartbeat messages must be interrupted at some<br />
frequency for some period of time.<br />
M2 Semantic mixing: Semantic unlinkability is valuable mainly in so far as it creates ambiguity<br />
for an attacker about whether a resumed stream of heartbeats comes from vehicle A or vehicle<br />
B.<br />
Robust privacy. This captures how misbehaving entities within the system may affect privacy and<br />
security. This area includes:<br />
R1 Privacy-preserving bad-actor removal: How is a misbehaving entity removed? Does this<br />
removal affect the privacy of its transmissions before it began to misbehave? Does its removal<br />
affect the privacy of other entities in the system?<br />
R2 Privacy against insider attacks: How is privacy protected against bad actors in Law Enforcement<br />
or at a Certificate Authority (CA)?<br />
This part of the chapter explicitly contributes in the area of syntactic mixing (N4), semantic<br />
mixing (M2), and semantic unlinkability (M1). The results are based on the assumption that<br />
pseudonyms are changed whenever the criteria are met. This will be fairly frequent, on the order<br />
of once every few minutes for urban driving, implicitly addressing N2. An identifier change frequency<br />
this high may require frequent reissuance of pseudonyms, limiting the choices possible in<br />
areas N5.1 and N5.2. To the best of my understanding, the following proposal is compatible with<br />
any reasonable solution for N1, N3, R1, or R2.<br />
3.6 Attacker Model and the SLOW algorithm<br />
A global attacker is assumed who can get mass coverage. Conceptually, the attacker might be the<br />
RSU network operator that has access to messages received by all RSUs, or the attacker might<br />
have set up a network covering an entire city 5 . This is clearly an extremely powerful attack model,<br />
perhaps too powerful to be plausible, but we can use this because if the system is secure in the<br />
face of this attacker it will be secure in the face of other, weaker attackers too.<br />
The attacker can use two basic mechanisms to link transmissions from a vehicle: (1) linking<br />
pseudonyms or other identifiers between heartbeat messages (syntactic linking), and (2) using the<br />
position and velocity information in the heartbeat messages to reconstruct the trajectory of the<br />
vehicle (semantic linking).<br />
We assume no supporting infrastructure in terms of an RSU network, therefore, vehicles must<br />
have a strategy to create their own mix zones, and that strategy must work even in the case where<br />
the attacker has 100% coverage. The defender’s mechanism is to turn off radio transmissions (to<br />
make semantic linking difficult) and change pseudonyms (to make syntactic linking difficult) while<br />
the radio is turned off without endangering safety of life.<br />
More precisely, the proposed solution, which is called SLOW for Silence at LOW speeds, works<br />
as follows. We can choose a threshold speed vT , say vT = 30 km/h. A vehicle will not broadcast<br />
5 Fraunhofer Institute has established that the hardware cost (ignoring the backhaul connections) to set up receivers<br />
covering all 900 km² of Berlin is about 250,000 Euros.<br />
any heartbeat message, or any other message containing location or trajectory data in the clear,<br />
if it is traveling below speed vT, unless this is necessary for safety-of-life reasons. If the vehicle<br />
has not sent a message for a certain period of time, then it changes pseudonyms (identifiers at all<br />
layers of the network stack and related certificates) before the next transmission. Traffic signals<br />
in a crowded urban area seem like an ideal location for such a pseudonym change: whenever a<br />
crowd of vehicles stop at a traffic signal, they may go into one of several lanes, they may choose<br />
to turn or not turn, and so on. Thus, mix-zones are created at the point where there is maximum<br />
uncertainty about exactly where a vehicle is and exactly what it is going to do next. This is also<br />
a safe set of circumstances under which to stop transmitting. Only 5% of pedestrians struck by a<br />
vehicle at 20 km/h die [Leaf and Preusser, 1999] while at 50 km/h the figure is 40%. Presumably,<br />
vehicle-to-vehicle collisions where both cars are traveling at 30 km/h result in even fewer fatalities.<br />
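The per-vehicle logic of SLOW can be sketched in a few lines of Python (the threshold and silence parameters are illustrative, and the safety_override flag stands for the exception cases; the pseudonym is abstracted to a counter):<br />

```python
# Sketch of the per-timestep SLOW logic of Section 3.6 (illustrative parameters).
V_THRESHOLD = 30.0   # km/h: no heartbeats below this speed
MIN_SILENCE = 5.0    # seconds of silence required before a pseudonym change

class SlowVehicle:
    def __init__(self):
        self.silent_for = 0.0
        self.pseudonym = 0

    def step(self, speed_kmh, dt, safety_override=False):
        """Advance one timestep; return True if a heartbeat is sent."""
        if speed_kmh < V_THRESHOLD and not safety_override:
            self.silent_for += dt          # stay silent at low speed
            return False
        if self.silent_for >= MIN_SILENCE:
            self.pseudonym += 1            # change all identifiers after silence
        self.silent_for = 0.0
        return True
```

A vehicle crawling through a signalled intersection thus falls silent, and re-emerges with a fresh pseudonym once it speeds up again.<br />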
Situations can be defined as exceptions. For instance, if vehicle A is stopped at a signal, but<br />
vehicle B coming up behind it emits a heartbeat that lets vehicle A know that there is a risk of<br />
a collision, then vehicle A can send out a heartbeat to warn vehicle B to brake. We can note<br />
that the simulations do not include this exception case, because in practice these cases come up<br />
only rarely. Future research based on SLOW will investigate this exception case in greater detail.<br />
We can also note that an attacker can abuse exception cases to break the silent period, but this<br />
attacker (unless it is an inside attacker) can be tracked down by standard methods and revoked.<br />
Besides being very simple to implement, SLOW has other advantages. Traffic jams and slow<br />
traffic lead to a large number of vehicles in transmission range and therefore require extensive<br />
processing power to verify the digital signatures of all incoming heartbeat messages. By refraining<br />
from sending heartbeat messages, SLOW avoids the necessity of extensive signature verifications<br />
in traffic jams and slow traffic, and thus, reduces hardware cost. A more detailed analysis of<br />
the impact on computation complexity, as well as the level of privacy and safety provided by the<br />
scheme will be presented in the next section.<br />
3.7 Analysis of SLOW<br />
3.7.1 Privacy<br />
It must be intuitively clear that a vehicle frequently sending out heartbeat messages is easy to<br />
trace, but to the best of my knowledge, no accurate experiment confirms this statement in VANET<br />
settings. As field experiments cannot be done due to the lack of envisioned VANET infrastructure,<br />
simulations were carried out to measure the level of traceability in an urban setting. The SUMO<br />
[Krajzewicz et al., 2002] simulation environment was used, as it is a realistic, microscopic urban<br />
traffic simulator. SUMO was set to use a 100 Hz frequency for the internal update of vehicle positions<br />
and velocities, and every Nth position (N depending on the heartbeat frequency) was considered<br />
to be available to the attacker as a heartbeat.<br />
Note that tracing vehicles in an urban setting is essentially a multitarget tracking problem,<br />
which has an extensive literature; however, that literature is mostly related to radar development in the fields of<br />
aviation and sailing [Gruteser and Hoh, 2005]. Yet, the following tracking approach, consisting of<br />
three steps, can be adopted to the vehicular setting too: first, the actual position and speed of the<br />
targets are recorded by eavesdropping the heartbeat messages. Based on the position and speed<br />
information, a predicted new position is calculated, which can be further refined by the help of side<br />
information such as the layout of the streets, lanes etc. At the next heartbeat, the new positions<br />
are eavesdropped and matched with the predicted positions.<br />
We implemented an attacker that tracked the vehicles in the SUMO output based on the tracking<br />
approach described above. The attacker uses the last two heartbeats to calculate the acceleration<br />
of the vehicles, making the prediction of the next position more accurate. The vehicles are tracked<br />
from their departure to their destination. Tracking is considered successful if the attacker has not<br />
lost the target throughout its entire journey.<br />
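The three-step tracking loop can be sketched as follows; the constant-acceleration predictor follows the description above, while the greedy nearest-neighbour matching is my assumption about the implementation details:<br />

```python
import math

def predict(prev, curr, dt):
    """Predict the next position from the last two heartbeats,
    assuming constant acceleration; a heartbeat is (x, y, vx, vy)."""
    ax = (curr[2] - prev[2]) / dt  # acceleration estimated from two beats
    ay = (curr[3] - prev[3]) / dt
    x = curr[0] + curr[2] * dt + 0.5 * ax * dt * dt
    y = curr[1] + curr[3] * dt + 0.5 * ay * dt * dt
    return (x, y)

def match(predictions, observations):
    """Greedily link each predicted position to the nearest not-yet-used
    observed heartbeat; returns a dict: target id -> observation index."""
    links, used = {}, set()
    for i, p in predictions.items():
        best, best_d = None, float("inf")
        for j, o in enumerate(observations):
            d = math.hypot(p[0] - o[0], p[1] - o[1])
            if j not in used and d < best_d:
                best, best_d = j, d
        if best is not None:
            links[i] = best
            used.add(best)
    return links
```

The attacker repeats predict-then-match at every heartbeat period; side information such as the street layout would further restrict the candidate observations.<br />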
The results of the tracking of 50 vehicles are shown in Figure 3.5. As we can see, if the<br />
beaconing frequency is 5-10 Hz, which is needed for most of the safety applications, then 75-80%<br />
of the vehicles are tracked successfully.<br />
Figure 3.5: Success rate of an attacker performing vehicle tracking by semantic linking of heartbeat<br />
messages when no defense mechanisms are in use.<br />
By evaluating the unsuccessful cases, we can observe that<br />
the target vehicles were lost at their destinations. More precisely, in the vast majority of the<br />
unsuccessful cases, when the target vehicle V1 arrived at its destination and stopped sending<br />
messages, if another vehicle V2 was in its vicinity, then the attacker continued tracking V2 as if<br />
it were V1. I counted this as an unsuccessful case, because the attacker erroneously determined the<br />
destination of the target vehicle (i.e., it concluded that the destination of V1 was that of V2, and<br />
those two destinations have virtually never been the same). However, during the movement of the<br />
target vehicles (i.e., before they reached their destination), the attacker was able to track them<br />
with a remarkable 99% success rate. This confirms that semantic linking is a real problem.<br />
In any case, from a privacy point of view, a system where the users are traceable with probability<br />
0.75-0.8 is not acceptable. My proposed silent period scheme, where the vehicles stop sending<br />
heartbeat messages below a given speed, mitigates this problem. It must be clear that the tracking<br />
algorithm described above does not work when the vehicles stop sending heartbeats regularly.<br />
Yet, the attacker may use other side information, such as the probability of turning to a given<br />
direction in an intersection, to improve the success probability of tracking despite the absence of<br />
the heartbeats. Thus, we need a new attacker model that also accounts for such side knowledge of<br />
the attacker.<br />
We can formalize the knowledge of the attacker as follows (for a summary of notations the<br />
reader is referred to Table 3.1): first, each intersection is modeled with a binary matrix J, where<br />
each row corresponds to an ingress lane and each column corresponds to an egress lane of the<br />
intersection, and Jij (the entry in the i-th row and j-th column) is 1 if it is possible to traverse<br />
the intersection by arriving in ingress lane i and leaving in egress lane j. As an example, consider<br />
the intersection shown in Figure 3.6 and its corresponding matrix J defined in matrix (3.1).<br />
⎛ 0 0 0 1 1 ⎞<br />
⎜ 0 0 1 0 0 ⎟<br />
J = ⎜ 1 1 0 1 1 ⎟ (3.1)<br />
⎜ 0 0 1 0 0 ⎟<br />
⎝ 1 1 0 0 0 ⎠<br />
Table 3.1: Notation in SLOW<br />
vT : threshold speed<br />
J : junction descriptor matrix<br />
m : number of ingress lanes (towards the junction)<br />
n : number of egress lanes (from the junction)<br />
T : probability distribution of the target's ingress lanes<br />
W : number of waiting vehicles per lane<br />
w : total number of waiting vehicles in the junction<br />
L : list of egress events<br />
ℓ : decision of the attacker<br />
ℓ∗ : the target's real egress event<br />
LS : list of suspect events<br />
Second, we can assume that the accuracy of GPS receivers does not make it possible to decide with certainty<br />
which lane of a road a given vehicle is using. Therefore, we can also assume that the attacker<br />
knows on which road a target vehicle enters the intersection, but it does not know which ingress<br />
lane it is using. Nevertheless, the attacker may have some a priori knowledge on the probability<br />
of an incoming vehicle choosing a given ingress lane on a given road in a given intersection; such<br />
knowledge may be acquired by visually observing the traffic in that intersection for some time.<br />
These probabilities can be arranged in an m dimensional vector T , where the i-th element Ti is<br />
the probability of choosing ingress lane i when entering the intersection on the road that contains<br />
ingress lane i. As an example, consider the intersection in Figure 3.6, and the vector<br />
T = (0.6, 0.4, 1, 0.8, 0.2)<br />
This would mean that vehicles arriving to the intersection on the road that contains ingress lanes<br />
1 and 2 choose lane 1 with probability 0.6 and lane 2 with probability 0.4. Note that vehicles<br />
arriving on the road that contains only ingress lane 3 have no choice, hence T3 in this example is<br />
1.<br />
Third, when multiple possible egress lanes correspond to a given ingress lane (i.e., there is<br />
more than one 1 in a given row of matrix J), we can assume that vehicles choose any of those egress<br />
lanes uniformly at random. For example, a vehicle arriving in ingress lane 1 of the intersection in<br />
Figure 3.6 can leave the intersection in egress lane 4 or 5 with equal probability.<br />
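Putting the three modeling assumptions together, the attacker's a priori knowledge can be encoded as follows; the values are the example matrix (3.1) and vector T from above, and the sampling helper is only an illustration of the uniform-egress assumption:<br />

```python
import random

# Junction descriptor from matrix (3.1): J[i][j] == 1 iff a vehicle may
# arrive in ingress lane i and leave in egress lane j (lanes 0-indexed).
J = [[0, 0, 0, 1, 1],
     [0, 0, 1, 0, 0],
     [1, 1, 0, 1, 1],
     [0, 0, 1, 0, 0],
     [1, 1, 0, 0, 0]]

# T[i]: probability of choosing ingress lane i on the road containing lane i.
T = [0.6, 0.4, 1.0, 0.8, 0.2]

def egress_lanes(i):
    """Egress lanes reachable from ingress lane i according to J."""
    return [j for j, allowed in enumerate(J[i]) if allowed]

def sample_egress(i, rng=random):
    """A vehicle arriving in ingress lane i picks one of its feasible
    egress lanes uniformly at random (the third assumption above)."""
    return rng.choice(egress_lanes(i))
```

For instance, a vehicle in ingress lane 1 has a single feasible egress lane (lane 2), while a vehicle in ingress lane 0 leaves through lane 3 or 4 with equal probability.<br />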
Finally, when the target vehicle arrives at an intersection, there may already be some other<br />
vehicles waiting or moving below the threshold speed in that intersection. The number of such<br />
silent vehicles in ingress lane i is denoted by Wi, and the m dimensional vector containing all<br />
Wi values is denoted by W . Note that due to the previous assumption that the attacker is not<br />
always able to precisely determine the ingress lane used by an incoming vehicle, it is also unable to<br />
determine the exact values of all Wi’s; nevertheless, it can use its experimental knowledge on the<br />
probabilities of choosing a given lane, represented by vector T , to at least estimate the Wi values.<br />
Let us denote by L the list of vehicles that leave the intersection (and thus restart sending<br />
heartbeats) after the target entered the intersection (and thus stopped sending more heartbeats).<br />
More precisely, each element Lk of list L is a (timestamp, road) pair (t, r) that represents a vehicle<br />
reappearing on road r at time t. The objective of the attacker is to decide which Lk corresponds<br />
to the target vehicle. Let us denote by ℓ the list element chosen by the attacker, and let ℓ∗ be the<br />
list element that really corresponds to the target vehicle. The attacker is successful if and only if<br />
ℓ = ℓ∗.<br />
In theory, the optimal decision is the following:<br />
ℓ = arg max_k Pr(Lk | J, T, W, L)<br />
where Pr(Lk|J, T, W, L) is the probability of Lk being the right decision given all the knowledge<br />
of the attacker. However, it seems to be difficult to calculate (or estimate) all these conditional<br />
probabilities, as they have to be determined for every possible intersection (J), number of waiting<br />
vehicles in the intersection (W), and observation of egress events (L).<br />
Figure 3.6: An example intersection; the corresponding matrix is given in (3.1).<br />
Hence, I assume a simpler attacker that uses the following tracking algorithm: let us<br />
denote by w the total number of silent vehicles in the intersection when the target vehicle arrives<br />
and stops sending heartbeats. The attacker decides on the w-th element of L, unless that entry<br />
surely cannot correspond to the target (e.g., it is not possible to leave the intersection on the road<br />
in the w-th element of L given the road on which the target arrived to the intersection). When<br />
the w-th element of L must be excluded, the attacker chooses the next element on the list L that<br />
cannot be excluded.<br />
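A minimal sketch of this decision rule; the 0-based indexing convention and the feasibility predicate (which would be derived from matrix J and the target's arrival road) are my assumptions about the details:<br />

```python
def fifo_decision(L, w, feasible):
    """FIFO heuristic: the w silent vehicles already in the junction are
    expected to leave first, so the target should be the next egress
    event (index w, 0-based). If that event is infeasible given J and
    the road on which the target arrived, take the next feasible event
    on the list instead."""
    for event in L[w:]:
        if feasible(event):
            return event
    return None  # no feasible event left: the attacker lost the target
```

Each event is a (timestamp, road) pair; the predicate returns False for events that surely cannot correspond to the target.<br />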
Our simple attacker model essentially assumes that traffic at an intersection follows the FIFO<br />
(First In First Out) principle. While this is clearly not the case in practice, the attacker still<br />
achieves a reasonable success rate in a single intersection as shown in Figure 3.7. One can see, for<br />
instance, that when the total number of vehicles is 100, the attacker can still track a target vehicle<br />
through a single intersection with probability around 1/2.<br />
Figure 3.8 shows the success rate of the attacker in the general case, when the target traverses<br />
multiple intersections between its starting and destination points. As expected, the tracking capabilities<br />
of the attacker in this case are worse than in the single intersection case. The quantitative<br />
results of the simulation experiments suggest that only around 10% of the vehicles can be tracked<br />
fully by the attacker when the threshold speed is larger than 22 km/h (approximately 6 m/s).<br />
The effectiveness of the attacker depends on the vT threshold speed and the density of the<br />
vehicles. In general the higher the threshold speed at which vehicles stop sending heartbeats,<br />
the higher the chance that the attacker loses the target (i.e., the lower the chance of successful<br />
tracking). Moreover, in a dense network, it is more difficult to track vehicles. Note, however, that<br />
there is an important difference in practice between the traffic density and the threshold speed,<br />
namely, that the threshold speed can be influenced by the owner of the vehicle, while the traffic<br />
density cannot be.<br />
Figure 3.7: Success rate of the simple attacker in a single intersection. Different curves belong to<br />
different experiments with the total number of vehicles given in the legend.<br />
Figure 3.8: Success rate of the simple attacker in the general case, when the target traverses multiple<br />
intersections between its starting and destination points. Different curves belong to different<br />
experiments with the total number of vehicles given in the legend.<br />
3.7.2 Effects on safety<br />
The main objective of vehicular communications is to increase road safety. However, refraining<br />
from sending heartbeat messages may seem to be in contradiction with this objective. Note,<br />
however, that I propose to refrain from sending heartbeats only below a given threshold speed,<br />
and I argue below that this may not endanger the objective of road safety.<br />
According to [Leaf and Preusser, 1999], only 5% of pedestrians struck by a vehicle at 20 km/h<br />
die, while this figure is 40% at 50 km/h. In [Kloeden et al., 1997], it is shown that in a 60 km/h<br />
speed limit area, the risk of involvement in a casualty crash doubles with each 5 km/h increase in<br />
traveling speed above 60 km/h. In [Baruya, 1998], it is shown that 1 km/h change in speed can<br />
influence the probability of an accident by 3.45%.<br />
The statistical figures above show that at lower speed the probability of an accident is lower<br />
too. This is because usually vehicles go at lower speed in areas where the drivers need to be more<br />
careful (hence the speed limit). Thus, it makes sense to rely more on the awareness of the drivers<br />
to avoid accidents at lower speeds. On the other hand, at higher speeds, accidents can be more<br />
severe, and warning from the vehicular safety communication system can play a crucial role in<br />
avoiding fatalities.<br />
3.7.3 Effects on computation complexity<br />
A great challenge in V2V communication deployment is the processing power of the vehicles [Kargl<br />
et al., 2008]. The most demanding task of the On Board Unit (OBU) is the verification of the<br />
signatures on the received heartbeat messages. This problem can be partially handled by not<br />
attaching certificates to every heartbeat message [Calandriello et al., 2007], but this does not solve<br />
the problem of verifying the signatures on the messages.<br />
In principle, the heavier the traffic, the more vehicles are in each other's communication range.<br />
More vehicles send more heartbeats, overwhelming each other. The number of vehicles in communication<br />
range depends on the average speed of the traffic, assuming that the vehicles keep a safety<br />
distance between each other depending on their speed.<br />
In Figure 3.9, the results of some simple calculations can be seen showing the number of<br />
signature verifications performed as a function of the average speed. In this calculation, vehicles<br />
are assumed to follow each other within 2 seconds. The communication range is assumed to be<br />
100 m and the heartbeat frequency is 10 Hz. It can be seen in the figure that, in a traffic jam on<br />
an 8-lane road, each vehicle must verify as many as approximately 8,000 signatures per second. If<br />
SLOW is used with a threshold speed of around 30 km/h (approximately 8 m/s), then the vehicles<br />
never need to verify more than 1,000 signatures per second (assuming all other parameters are the<br />
same as before). This approach also works well in combination with congestion control where the<br />
transmission power is reduced in high density traffic scenarios. My approach therefore makes the<br />
hardware requirements of the OBU much lower and enables the use of less expensive devices.<br />
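The back-of-the-envelope calculation behind these numbers can be sketched as follows; the 5 m minimum spacing at standstill and the neighbour-counting convention below are my assumptions, so this sketch reproduces the trend of Figure 3.9 rather than its exact values:<br />

```python
def signatures_per_second(speed, lanes, comm_range=100.0,
                          heartbeat_hz=10.0, headway_s=2.0,
                          min_spacing=5.0):
    """Estimate how many signatures per second a vehicle must verify.

    Vehicles keep a speed-dependent safety distance (headway_s seconds
    of travel, but at least min_spacing metres at standstill), so the
    number of neighbours within the communication range (comm_range
    metres in both directions) grows as the traffic slows down.
    """
    spacing = max(speed * headway_s, min_spacing)   # metres between vehicles
    vehicles_per_lane = 2.0 * comm_range / spacing  # within +/- comm_range
    neighbours = vehicles_per_lane * lanes - 1      # excluding the verifier
    return neighbours * heartbeat_hz
```

With these parameters, signatures_per_second(8.0, 8) evaluates to 990, consistent with the claim that a threshold speed of about 8 m/s keeps the load below 1,000 verifications per second.<br />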
3.8 Related work<br />
The privacy of VANETs is a recent research topic. Many authors have addressed VANETs and their security and<br />
privacy (for example in [Aoki and Fujii, 1996; Luo and Hubaux, 2004; McMillin et<br />
al., 1998; Chisalita and Shahmehri, 2002; El Zarki et al., 2002; Dötzer, 2006; Hubaux et al., 2004;<br />
Raya and Hubaux, 2005; Raya and Hubaux, 2007; Gerlach, 2006; Ma et al., 2010; Wiedersheim et<br />
al., 2010]). A good online bibliography on the security of VANETs can be found in [Lin and Lu,<br />
2012]. The problem of providing location privacy for VANETs is categorized into classes in<br />
[Gerlach, 2006]. The difference between the classes is the goal and the strength of the attacker. In<br />
[Choi et al., 2005], Choi et al. investigate how to obtain a balance between privacy and audit<br />
requirements in vehicular networks using only symmetric primitives. Ren et al. analyze the<br />
location privacy problems in VANETs with attack trees in [Ren et al., 2011].<br />
Many privacy-preserving techniques have been suggested for on-line transactions (for example in<br />
[Chaum, 1988; Gulcu and Tsudik, 1996]). They are mainly based on mix networks [Kesdogan<br />
et al., 1998; Reiter and Rubin, 1998], which were originally proposed by Chaum in 1981 [Chaum,<br />
1981]. A single mix collects messages, mixes them, and sends them towards their destinations. A mix<br />
network consists of single mixes which are linked together. In a mix network, some misbehaving<br />
mixes cannot break the anonymity of the senders/receivers.<br />
Figure 3.9: Number of signatures to be verified as a function of the average speed. The communication<br />
range is 100 m, and the heartbeat frequency is 10 Hz. The curves correspond to 2, 4, 6, and 8 lanes;<br />
the safety distance between the vehicles depends on their speed.<br />
An evident extension of mix networks to the off-line world is the concept of mix zones, proposed by<br />
Beresford et al. in [Beresford and Stajano, 2003; Beresford and Stajano, 2004]. A mix zone is a<br />
place where the users of the network are mixed; thus, after leaving the mix zone, they cannot be<br />
distinguished from each other.<br />
The problem of providing location privacy in wireless communication is well studied by Hu and<br />
Wang in [Hu and Wang, 2005]. They built a transaction-based wireless communication system<br />
in which transactions are unlinkable, and gave detailed simulation results. Their solution can<br />
provide location privacy for real-time applications as well.<br />
To qualify the operation of the mix zones, the offered anonymity must be measured. The first<br />
metric, proposed by Chaum [Chaum, 1988], was the size of the anonymity set. It is a good metric<br />
only if any user leaving the mix zone is the target with the same probability. If the probabilities<br />
are different, then an entropy based metric should be used. Entropy based metrics were suggested by<br />
Díaz et al. [Diaz et al., 2002] and Serjantov et al. [Serjantov and Danezis, 2003] at the same time.<br />
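The difference between the two metrics can be illustrated with a short, self-contained example; the formulas below are the standard ones (anonymity-set size versus Shannon entropy of the exit probabilities):<br />

```python
import math

def anonymity_set_size(probs):
    """Chaum's metric: the number of users with non-zero probability
    of being the target."""
    return sum(1 for p in probs if p > 0)

def anonymity_entropy(probs):
    """Entropy-based metric (Diaz et al.; Serjantov and Danezis):
    H = -sum p * log2(p) over the exit probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

For four users leaving with probabilities (0.97, 0.01, 0.01, 0.01), the anonymity set still has size 4, but the entropy is only about 0.24 bits instead of the 2 bits of the uniform case, so the entropy based metric reflects the attacker's knowledge much better.<br />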
To the best of my knowledge, one of the papers most relevant to SLOW is that of Sampigethaya<br />
et al. [Sampigethaya et al., 2005; Sampigethaya et al., 2007]. In the paper, they study the<br />
problem of providing location privacy in VANETs in the presence of a global adversary. A location<br />
privacy scheme called CARAVAN is also proposed. The main idea of the scheme is that random<br />
silent periods [Huang et al., 2005] are used in the communication to avoid continuous traceability.<br />
The solution is evaluated only in a freeway model and in a randomly generated Manhattan street<br />
model. Lu et al. arrive at similar conclusions as SLOW, namely, that the pseudonyms should<br />
be changed at intersections with high traffic in [Lu et al., 2012]. The main difference between the<br />
two approaches is that in their paper, the vehicles are aware of the possible zones from a predefined<br />
map, so the mix zones are defined a priori. They use a game theoretic approach to analyze their<br />
model.<br />
The change of pseudonyms may also have a detrimental effect, especially on the efficiency of<br />
routing and the packet loss ratio. In [Schoch et al., 2006], Schoch et al. investigated this problem<br />
and proposed some approaches that can guide system designers to achieve both a given level of<br />
privacy protection as well as a reasonable level of performance.<br />
Another proposed approach provides multiple certificates in vehicles based on the combination<br />
of group signatures and multiple self-issued certificates [Calandriello et al., 2007; Armknecht et<br />
al., 2007]. The disadvantage is that On Board Units (OBUs) need to perform expensive group<br />
signature verification operations, and that OBUs are empowered to mount Sybil attacks. [Studer<br />
et al., 2008] uses group signatures to request temporary certificates from a CA in an anonymous<br />
manner without the disadvantages of the previous scheme, but at the cost of an available connection<br />
to the CA. My solution suggested in Section 3.6 accounts for a global attacker without the support<br />
of the RSU infrastructure.<br />
3.9 Conclusion<br />
In the first half of this chapter from Section 3.2, I studied the effectiveness of changing pseudonyms<br />
to provide location privacy for vehicles in vehicular networks. The approach of changing pseudonyms<br />
to make location tracking more difficult was proposed in prior work, but its effectiveness<br />
has not been investigated yet. In order to address this problem, I defined a model based on the<br />
concept of the mix zone. I assumed that the adversary has some knowledge about the mix zone,<br />
and based on this knowledge, she tries to relate the vehicles that exit the mix zone to those that<br />
entered it earlier. I also introduced a metric to quantify the level of privacy enjoyed by the vehicles<br />
in this model. In addition, I performed extensive simulations to study the behavior of the model in<br />
realistic scenarios. In particular, in the simulation, I used a rather complex road map, generated<br />
traffic with realistic parameters, and varied the strength of the adversary by varying the number of<br />
her monitoring points. My simulation results provided detailed information about the relationship<br />
between the strength of the adversary and the level of privacy achieved by changing pseudonyms.<br />
I abstracted away the frequency with which the pseudonyms are changed, and I simply assumed<br />
that this frequency is high enough so that every vehicle surely changes pseudonym while in the mix<br />
zone. It seems that changing the pseudonyms frequently has some advantages as frequent changes<br />
increase the probability that the pseudonym is changed in the mix zone. On the other hand, the<br />
higher the frequency, the larger the cost that the pseudonym changing mechanism induces on the<br />
system in terms of management of cryptographic material (keys and certificates related to the<br />
pseudonyms). In addition, if for a given frequency, the probability of changing pseudonym in the<br />
mix zone is already close to 1, then there is no point in increasing the frequency further as it will<br />
no longer increase the level of privacy, while it will still increase the cost. Hence, there seems to<br />
be an optimal value for the frequency of the pseudonym change. Unfortunately, this optimal value<br />
depends on the characteristics of the mix zone, which is ultimately determined by the observing<br />
zone of the adversary, which is not known to the system designer.<br />
In the second half of the chapter from Section 3.4, I proposed a simple and effective privacy<br />
preserving scheme, called SLOW, for VANETs. SLOW requires vehicles to stop sending heartbeat<br />
messages below a given threshold speed (this explains the name SLOW that stands for “silence<br />
at low speeds”) and to change all their identifiers (pseudonyms) after each such silent period. By<br />
using SLOW, the vicinities of intersections and traffic lights become dynamically created mix zones,<br />
as there are usually many vehicles moving slowly at these places at a given moment in time. In<br />
other words, SLOW implicitly ensures a synchronized silent period and pseudonym change for<br />
many vehicles both in time and space, and this makes it effective as a location privacy enhancing<br />
scheme. Yet, SLOW is remarkably simple, and it has further advantages. For instance, it relieves<br />
vehicles of the burden of verifying a potentially large amount of digital signatures when the vehicle<br />
density is large, as this usually happens when the vehicles move slowly in a traffic jam or stop at<br />
intersections. Finally, the risk of a fatal accident at a slow speed is low, and therefore, SLOW does<br />
not seriously impact safety-of-life.<br />
I evaluated SLOW in a specific attacker model that seems to be realistic, and it proved to be<br />
effective in this model, reducing the success rate of tracking a target vehicle from its starting point<br />
to its destination down to the range of 10–30%.<br />
As a conclusion of this chapter, I analyzed what a local and a global eavesdropping attacker<br />
can do when trying to trace vehicles in VANETs, and gave an efficient countermeasure against the<br />
stronger global attacker.<br />
3.10 Related publications<br />
[Buttyan et al., 2007] Levente Buttyan, Tamas Holczer, and Istvan Vajda. On the effectiveness of<br />
changing pseudonyms to provide location privacy in VANETs. In Proceedings of the Fourth European<br />
Workshop on Security and Privacy in Ad hoc and Sensor Networks (ESAS 2007). Springer, 2007.<br />
[Papadimitratos et al., 2008] Panagiotis Papadimitratos, Antonio Kung, Frank Kargl, Zhendong<br />
Ma, Maxim Raya, Julien Freudiger, Elmar Schoch, Tamas Holczer, Levente Buttyán, and<br />
Jean-Pierre Hubaux. Secure vehicular communication systems: design and architecture. IEEE<br />
Communications Magazine, 46(11):100–109, 2008.<br />
[Holczer et al., 2009] Tamas Holczer, Petra Ardelean, Naim Asaj, Stefano Cosenza, Michael<br />
Müter, Albert Held, Björn Wiedersheim, Panagiotis Papadimitratos, Frank Kargl, and Danny De<br />
Cock. Secure vehicle communication (SeVeCom). Demonstration. MobiSys, June 2009.<br />
[Buttyan et al., 2009] Levente Buttyan, Tamas Holczer, Andre Weimerskirch, and William<br />
Whyte. SLOW: A practical pseudonym changing scheme for location privacy in VANETs. In Proceedings<br />
of the IEEE Vehicular Networking Conference, pages 1–8. IEEE, October 2009.<br />
Chapter 4<br />
Anonymous Aggregator Election and Data<br />
Aggregation in Wireless Sensor Networks<br />
4.1 Introduction<br />
Wireless sensor and actuator networks are potentially useful building blocks for cyber-physical<br />
systems. Those systems must typically guarantee high-confidence operation, which induces strong<br />
requirements on the dependability of their building blocks, including the wireless sensor and actuator<br />
network. Dependability means resistance against both accidental failures and intentional<br />
attacks, and it should be addressed at all layers of the network architecture, including the networking<br />
protocols and the distributed services built on top of them, as well as the hardware and<br />
software architecture of the sensor and actuator nodes themselves. Within this context, in this<br />
chapter, I focus on the security aspects of aggregator node election and data aggregation protocols<br />
in wireless sensor networks.<br />
Data aggregation in wireless sensor networks helps to improve the energy efficiency and the<br />
scalability of the network. It is typically combined with some form of clustering. A common<br />
scenario is that sensor readings are first collected in each cluster by a designated aggregator node<br />
that aggregates the collected data and sends only the result of the aggregation to the base station.<br />
In another scenario, the base station may not be present permanently in the network, and the<br />
aggregated data must be stored by the designated aggregator node in each cluster temporarily<br />
until the base station can eventually fetch the data. In both cases, the amount of communication,<br />
and hence, the energy consumption of the network can be greatly reduced by sending aggregated<br />
data, instead of individual sensor readings, to the base station.<br />
While data aggregation in wireless sensor networks is clearly advantageous with respect to<br />
scalability and efficiency, it introduces some security issues. In particular, the designated aggregator<br />
nodes that collect and store aggregated sensor readings and communicate with the base station are<br />
attractive targets of physical node destruction and jamming attacks. Indeed, it is a good strategy<br />
for an attacker to locate those designated nodes and disable them, because he can prevent the<br />
reception of data from the entire cluster served by the disabled node. Even if the aggregator role<br />
is changed periodically by some election process, some security issues remain, in particular in the<br />
case when the base station is off-line and the aggregator nodes must store the aggregated data<br />
temporarily until the base station goes on-line and retrieves them. More specifically, in this case,<br />
the attacker can locate and attack the node that was aggregator in a specific time epoch before<br />
the base station fetches its stored data, leading to permanent loss of data from the given cluster<br />
in the given epoch.<br />
In order to mitigate this problem, I introduced the concept of private aggregator node election,<br />
and I proposed the first private aggregator node election protocol. Briefly, the first protocol<br />
ensures that the identity of the elected aggregator remains hidden from an attacker who observes<br />
the execution of the election process. However, this protocol ensures only protection against an<br />
external eavesdropper that cannot compromise sensor nodes, and it does not address the problem<br />
of identifying the aggregator nodes by means of traffic pattern analysis after the election phase.<br />
In the second protocol, I addressed the shortcomings of the first scheme: I proposed a new<br />
private aggregator node election protocol that is resistant even to internal attacks originating<br />
from compromised nodes, and I also proposed a new private data aggregation protocol and a new<br />
private query protocol which preserved the anonymity of the aggregator nodes during the data<br />
aggregation process and when they provide responses to queries of the base station. In the second<br />
private aggregator node election protocol, each node decides locally in a probabilistic manner to<br />
become an aggregator or not, and then the nodes execute an anonymous veto protocol to verify if<br />
at least one node became aggregator. The anonymous veto protocol ensures that non-aggregator<br />
nodes learn only that there exists at least one aggregator in the cluster, but they do not learn<br />
any information on its identity. Hence, even if such a non-aggregator node is compromised, the<br />
attacker learns no useful information regarding the identity of the aggregator.<br />
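The local, probabilistic decision step described above can be illustrated with a short sketch; the helper functions below are my own illustration of the underlying probability calculation, not part of the protocol specification:<br />

```python
def p_at_least_one(p, n):
    """Probability that at least one of n nodes elects itself when each
    node becomes an aggregator independently with probability p."""
    return 1.0 - (1.0 - p) ** n

def min_probability(n, target):
    """Smallest per-node probability p such that some node is elected
    with probability at least `target` in a cluster of n nodes
    (solve 1 - (1 - p)**n >= target for p)."""
    return 1.0 - (1.0 - target) ** (1.0 / n)
```

For example, in a cluster of 10 nodes, choosing p = min_probability(10, 0.99) ≈ 0.37 means the anonymous veto round reports "no aggregator" in only about 1% of the epochs.<br />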
The protocols can be used to protect sensor network applications that rely on data aggregation<br />
in clusters, and where locating and then disabling the designated aggregator nodes is highly<br />
undesirable. Such applications include high-confidence cyber-physical systems where sensors and<br />
actuators monitor and control the operation of some critical physical infrastructure, such as an<br />
energy distribution network, a drinking water supply system, or a chemical pipeline. A common<br />
feature of these systems is that they have a large geographical span, and therefore, the sensor<br />
network must be organized into clusters and use in-network data aggregation in order to ensure<br />
scalability and energy efficient operation. Moreover, due to the mission critical nature of these<br />
applications, it is desirable to prevent the identification of the aggregator nodes in order to limit<br />
the impact of a successful attack against the sensor network. The first protocol, which resists only an<br />
external eavesdropper, is less complex than the second protocol, which works in a stronger attacker<br />
model. Hence, the first protocol can be used in case of strong resource constraints or when the<br />
risk of compromising sensor nodes is limited (e.g., it may be difficult to obtain physical access to<br />
the nodes). The second protocol is needed when the risk of compromised and misbehaving nodes<br />
cannot be eliminated by other means.<br />
The remainder of the chapter is organized as follows: in Section 4.2, I introduce my system<br />
and attacker models. In Section 4.3, I present my basic aggregator election protocol which can<br />
withstand external attacks, while in Section 4.4, I introduce my advanced protocols, which can<br />
withstand internal aggregator identification and scamming attackers as well. In Section 4.5, I give<br />
an overview of some related work, and in Section 4.6, I conclude the chapter and sketch some<br />
future research directions.<br />
4.2 System and attacker models<br />
A sensor network consists of sensor nodes that communicate with each other via wireless channels.<br />
Every node can generate sensor readings, and store them or forward them to another node. Each node can<br />
directly communicate with the nodes within its radio range; those nodes are called the (one-hop)<br />
neighbors of the node. In order to communicate with distant nodes (outside the radio range),<br />
the nodes use multi-hop communications. The sensor network also has an operator, who can<br />
communicate with some of the nodes through a special node called the base station, or can communicate<br />
directly with the nodes if the operator moves close to the network.<br />
Throughout the chapter, a data driven sensor network is envisioned, where every sensor node<br />
sends its measurement to a data aggregator regularly. Such data driven networks are used for<br />
regular inspection of monitored processes, notably in critical infrastructures. Event driven networks<br />
can be used for reporting special, usually dangerous but infrequent events, like a fire in a building.<br />
There is no need for clustering and data aggregation in event based systems, thus private cluster<br />
aggregator election and data aggregation is not applicable there. The third kind of network is the<br />
query driven network, where the operator sends a query to the network, and the network sends a<br />
response. This kind of functionality can be combined with data driven networks, and can have privacy<br />
implications, e.g., the identity of the answering node should remain hidden.<br />
In the following, it is assumed that time is slotted, and one measurement is sent to the<br />
data aggregator in each time slot. Time synchronization between the nodes is not discussed<br />
here; a comprehensive survey can be found in [Faizulkhakov, 2007].<br />
It is assumed that every node shares some cryptographic credentials with the operator. These<br />
credentials are unique for every node; the operator can store them in a lookup table, or they can<br />
be generated from a master key and the node’s identifier on demand. The exact definition of the<br />
credentials can be found in Section 4.3.1 and in Section 4.4.1.<br />
The nodes may be aware of their geographical locations, and they may already be partitioned<br />
into well defined geographical regions. In this case, these regions are the clusters, and the objective<br />
of the aggregator election protocol is to elect an aggregator within each geographical region. We<br />
call this approach location based clustering; an example would be the PANEL protocol [Buttyán<br />
and Schaffer, 2010].<br />
A generalization of the location based election is the preset case, where the nodes know<br />
the cluster ID they belong to before any communication. Here, the goal of the election is to elect<br />
one node in every preset cluster. This approach is used in [Buttyán and Holczer, 2010].<br />
Alternatively, the nodes may be unaware of their locations or cluster IDs, and know only their<br />
neighbors. In this case, the clusters are not pre-determined, but they are dynamically constructed<br />
parallel to the election of the aggregators. Basically, any node may announce itself as an aggregator,<br />
and the nodes within a certain number of hops on the topology graph may join that node as cluster<br />
members. We call this approach topology based clustering; an example would be the LEACH<br />
protocol [Heinzelman et al., 2000].<br />
The location based and the topology based approaches are illustrated in Figure 4.1.<br />
Figure 4.1: Result of a location based (left), and topology based (right) one-hop aggregator election<br />
protocol. Solid dots represent the aggregators, and empty circles represent cluster members.<br />
Both approaches may use controlled flooding of broadcast messages. In case of location based<br />
or preset clustering, the scope of a flood is restricted to a given geographic region or preset cluster.<br />
Nodes within that region re-broadcast the message to be flooded when they receive it for the first<br />
time. Nodes outside of the region or having different preset cluster IDs simply drop the message.<br />
In case of topology based clustering, it is assumed that the broadcast messages have a Time-to-<br />
Live (TTL) field that controls the scope of the flooding. Any node that receives a broadcast message<br />
with a positive TTL value for the first time will automatically decrement the TTL value and rebroadcast<br />
the message. Duplicates and messages with TTL smaller than or equal to zero are silently<br />
discarded. When I say that a node broadcasts a message, I mean such a controlled flooding (either<br />
location based, preset or topology based, depending on the context). In Section 4.4, connected<br />
dominating sets (CDS) are used to implement efficient broadcast messaging. The concept of CDS<br />
will be introduced there.<br />
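The topology based controlled flooding rule described above can be sketched in a few lines of Python. This is a minimal, synchronous simulation; the toy chain topology and the convention that the origin sets the TTL to the desired hop radius are my illustrative assumptions:<br />

```python
from collections import deque

def controlled_flood(adj, origin, ttl):
    """Topology based controlled flooding: a node re-broadcasts a
    message only on its first reception and only while the TTL is
    positive; duplicates and exhausted messages are silently dropped."""
    received = {origin}                # nodes that accepted the message
    queue = deque([(origin, ttl)])     # (node, TTL it broadcasts with)
    while queue:
        node, t = queue.popleft()
        if t <= 0:                     # TTL exhausted: no re-broadcast
            continue
        for neighbor in adj[node]:
            if neighbor not in received:   # duplicates are discarded
                received.add(neighbor)
                queue.append((neighbor, t - 1))
    return received

# Toy chain topology 0-1-2-3-4: a flood started by node 0 with TTL 2
# reaches exactly its 2-hop neighborhood.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(controlled_flood(adj, 0, 2)))   # -> [0, 1, 2]
```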
4. ANONYMOUS AGGREGATOR ELECTION AND DATA AGGREGATION IN WSNS<br />
The set of nodes which are (in the location based and the preset case) or can potentially be<br />
(in the topology based case) in the same cluster as a node S is called the set of cluster peers of S.<br />
Hence, in the location based case, the cluster peers of S are the nodes that reside within the same<br />
geographic region as node S. In the preset case, the cluster peers are the nodes sharing the same<br />
cluster ID. In the topology based case, the set of cluster peers of S usually consists of its n-hop<br />
neighborhood, for some parameter n. The nodes may not explicitly know all their cluster peers.<br />
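In the topology based case, the n-hop neighborhood can be computed with a breadth-first search, assuming the connectivity information is available; a minimal sketch (the graph and parameter values are illustrative):<br />

```python
from collections import deque

def n_hop_peers(adj, s, n):
    """Cluster peers of node s in the topology based case: all nodes
    within n hops of s on the topology graph, excluding s itself."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if dist[u] == n:          # do not expand beyond n hops
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist) - {s}

# Toy chain topology 0-1-2-3-4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(n_hop_peers(adj, 2, 1)))   # -> [1, 3]
```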
The main functional requirement of any clustering algorithm is that either node S or at least<br />
one of the cluster peers of S will be elected as aggregator.<br />
The leader of each cluster is called cluster aggregator, or simply aggregator. In the following I<br />
will use aggregator, cluster aggregator and data aggregator interchangeably.<br />
As mentioned in Section 4.1, an attacker can gain much more information by attacking an<br />
aggregator node than attacking a normal node. To attack a data aggregator node either physically<br />
or logically, first the attacker must identify that node. In this chapter I assume that the attacker’s<br />
goal is to identify the aggregator (which means that simply preventing, jamming or confusing the<br />
aggregation is not the goal of the attacker). In Section 4.4.5 I go a little further, and analyze what<br />
happens if a compromised node does not follow the proposed protocols in order to mislead the<br />
operator.<br />
An attacker who wants to discover the identity of the aggregators can eavesdrop on the communication<br />
between any nodes, can actively participate in the communication (by deleting, modifying,<br />
and inserting messages), and can physically compromise some of the nodes. A compromised node<br />
is under the full control of the attacker: the attacker can fully inspect the inner state of that node,<br />
and can control the messages sent by that node.<br />
Compromising a node is a much harder challenge for an attacker than simply eavesdropping on the<br />
communication. It requires physical contact with the node and some advanced knowledge; however,<br />
it is far from impossible for an attacker with a good electrical engineering and laboratory background [Anderson<br />
and Kuhn, 1996]. Therefore, I propose two solutions. The first, basic protocol can fully withstand a passive<br />
eavesdropper, but a compromising attacker can gain some knowledge about the identities of the<br />
cluster aggregators. The second, advanced protocol can withstand a compromising attacker as well,<br />
leaking information only about the compromised nodes themselves.<br />
In case of a passive adversary, a rather simple solution could be based on a common shared<br />
global key. Using that shared global key as a seed of a pseudo random number generator, every<br />
node can construct locally (without any communications) the same pseudo randomly ordered list<br />
of all nodes. These lists will be identical for every node because all nodes use the same seed and the<br />
same pseudo random number generator. Then, the first A nodes of the list are elected aggregators,<br />
where A is chosen such that every node can communicate with a cluster aggregator and no smaller<br />
set of aggregators would cover the whole system. An illustration of the result of this algorithm can be seen in Figure 4.1 for location<br />
based and topology based cluster aggregator election.<br />
The problem with this solution is that it is not robust: compromising a single node would leak<br />
the common key, and the adversary could compute the identifier of all cluster aggregators. While I<br />
do not want to fully address the problem of compromised nodes in the first protocol, I still aim at<br />
a more robust solution than the one described above. In particular, the compromise of just a single<br />
node or a few nodes should not make the whole system collapse.<br />
The second protocol can withstand the compromise of some nodes without the degradation<br />
of the privacy of the cluster aggregators. This protocol meets the following goals and has the<br />
following limitations:<br />
• The identity of the non-compromised cluster aggregators remains secret even in the presence<br />
of passive and active attackers or compromised nodes.<br />
• The attacker can learn whether a compromised node is an aggregator.<br />
• An attacker can force a compromised node to become an aggregator, but does not learn anything<br />
about the existence or identity of the other aggregators.<br />
• The attacker cannot achieve that no aggregator is elected in the cluster; however, all the<br />
elected aggregator(s) may be compromised nodes.<br />
The main difference between the first and second protocol is the following. The first protocol<br />
is very simple, but not perfect, as a compromised node can reveal the identity of the aggregators.<br />
The second protocol requires more complex computations, but offers anonymity in case of node<br />
compromise as well. In some cases, such complex computations are beyond the capabilities of the<br />
nodes (or the probability of compromise is low), while anonymity is still required by the system;<br />
in these cases, I suggest using the first protocol. If the probability of node compromise is not<br />
negligible, then the use of the second protocol is recommended.<br />
4.3 Basic protocol<br />
In this section, I describe the basic protocol that I propose for private aggregator node election.<br />
First, I give a brief overview of the basic principles of the protocol, and then present the details.<br />
After that, the basic protocol is analyzed in Section 4.3.2, where I also<br />
describe how to set the parameters of the protocol.<br />
4.3.1 Protocol description<br />
I assume that the nodes are synchronized (see [Faizulkhakov, 2007] for a survey on time synchronization<br />
mechanisms for sensor networks), and each node starts executing the protocol roughly<br />
at the same time. The protocol terminates after a predefined fixed amount of time. During the<br />
execution of the protocol, any node that has not received any aggregator announcement yet may<br />
decide to become an aggregator, in which case, it broadcasts an aggregator announcement message<br />
announcing itself as a cluster aggregator. This message is broadcast among the cluster peers of<br />
the node sending the announcement (see Section 4.2). Upon reception of a cluster aggregator<br />
announcement, any node that has neither announced itself as a cluster aggregator nor received any<br />
such announcement yet will consider the sender of the announcement as its cluster aggregator. In<br />
order to prevent an external observer from learning the identity of the cluster aggregators, all messages<br />
sent in the protocol are encrypted such that only the nodes for whom they are intended can decrypt<br />
them. For this, it is assumed that each node shares a common key with all of its cluster peers (an<br />
overview of available key establishment mechanisms for sensor networks can be found in [Lopez<br />
and Zhou, 2008]). In addition, in order to prevent message originators from being identified as cluster<br />
aggregators, the nodes that will be cluster members are required to send dummy messages that<br />
cannot be distinguished from the announcements by an external observer (i.e., they are encrypted<br />
and disseminated in the same way as the announcements).<br />
Note that the proposed basic protocol considers only either pairwise keys between the neighboring<br />
nodes or group keys shared between sets of neighboring nodes, so no global key is assumed.<br />
Such pairwise or group keys can be established by the techniques proposed in [Lopez and Zhou,<br />
2008]. The key establishment can be based on randomly selected key sets. In such a protocol, the<br />
probability that neighboring nodes share a common key is high, and the unused keys are deleted<br />
[Chan et al., 2003]. The key establishment can also be based on a common initial key, which is deleted<br />
after a short time, once the neighbors are discovered [Zhu et al., 2003]. Any node that owns<br />
the common key can generate a pairwise key with a node that owns or previously owned the<br />
common key. The basic method for exchanging a group/cluster key with the neighboring nodes is<br />
to send the same random key to each neighbor, encrypted with the previously exchanged pairwise<br />
keys.<br />
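This last step can be illustrated as follows. This is only a sketch of the message flow: the XOR cipher with a SHA-256-derived keystream is a toy stand-in for a real authenticated encryption scheme, and the pairwise keys are illustrative values:<br />

```python
import hashlib
import secrets

def toy_encrypt(key, plaintext):
    """Toy XOR 'encryption' with a SHA-256-derived keystream.
    Placeholder only -- a real deployment would use a proper
    authenticated cipher; this just illustrates the message flow."""
    stream = hashlib.sha256(key).digest()
    return bytes(p ^ s for p, s in zip(plaintext, stream))

toy_decrypt = toy_encrypt  # XOR is its own inverse

def distribute_cluster_key(pairwise_keys):
    """A node generates a random group key and sends it to each
    neighbor, encrypted under the pairwise key shared with that
    neighbor."""
    group_key = secrets.token_bytes(16)
    messages = {nbr: toy_encrypt(k, group_key)
                for nbr, k in pairwise_keys.items()}
    return group_key, messages

# Illustrative pairwise keys shared with neighbors A and B.
pairwise = {"A": b"\x01" * 32, "B": b"\x02" * 32}
gk, msgs = distribute_cluster_key(pairwise)
assert toy_decrypt(pairwise["A"], msgs["A"]) == gk
assert toy_decrypt(pairwise["B"], msgs["B"]) == gk
```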
The pseudo-code of the protocol is given in Algorithm 2, and a more detailed explanation<br />
of the protocol’s operation is presented below. The protocol consists of two rounds, where the<br />
length of each round is τ. The nodes are synchronized, they all know when the first round begins,<br />
and what the value of τ is. At the beginning, each node starts two random timers, T1 and T2,<br />
where T1 expires in the first round (uniformly at random) and T2 expires in the second round<br />
(uniformly at random). Each node also initializes at random a binary variable, called announFirst,<br />
that determines in which round the node would like to send a cluster aggregator announcement.<br />
Algorithm 2 Basic private cluster aggregator election algorithm<br />
start T1, expires in rand(0,τ)    // timer, expires in round 1<br />
start T2, expires in rand(τ,2τ)   // timer, expires in round 2<br />
announFirst = (rand(0,1) ≤ γ)<br />
CAID = -1                         // ID of the cluster aggregator of the node<br />
while T1 NOT expired do<br />
    if receive ENC(announcement) AND (CAID = -1) then<br />
        CAID = ID of sender of announcement<br />
    end if<br />
end while<br />
// T1 expired<br />
if announFirst AND (CAID = -1) then<br />
    broadcast ENC(announcement)<br />
    CAID = ID of node itself<br />
else<br />
    broadcast ENC(dummy)<br />
end if<br />
while T2 NOT expired do<br />
    if receive ENC(announcement) AND (CAID = -1) then<br />
        CAID = ID of sender of announcement<br />
    end if<br />
end while<br />
// T2 expired<br />
if (NOT announFirst) AND (CAID = -1) then<br />
    broadcast ENC(announcement)<br />
    CAID = ID of node itself<br />
else<br />
    broadcast ENC(dummy)<br />
end if<br />
Table 4.1: Estimated time of the building blocks on a Crossbow MICAz mote<br />
Algorithm                                 Generation [ms]   Verification [ms]<br />
SHA-1 [Ganesan et al., 2003]                          1.4                  –<br />
RSA 1024 bit [Piotrowski et al., 2006]              12040                470<br />
RC4 [Ganesan et al., 2003]                            0.1                0.1<br />
RC5 [Ganesan et al., 2003]                            0.4                0.4<br />
The probability that announFirst is set to the first round is γ, which is a system parameter. The<br />
setting of γ is elaborated in Section 4.3.2.<br />
In the first round, every node S waits for its first timer T1 to expire. If S receives an announcement<br />
before T1 expires, then the sender of the announcement will be the cluster aggregator of S.<br />
When T1 expires, S broadcasts a message as follows: if announFirst is set to the first round and<br />
S has not received any announcement yet, then S sends an announcement, in which it announces<br />
itself as a cluster aggregator. Otherwise, S sends a dummy message. In both cases, the message is<br />
encrypted (denoted by ENC() in the algorithm) such that only the cluster peers of S can decrypt<br />
it.<br />
The second round is similar to the first round. When T2 expires, S broadcasts a message as<br />
follows: if announFirst is set to the second round and S has not received any announcement yet,<br />
then S sends an announcement; otherwise, S sends a dummy message. In both cases, the message<br />
is encrypted.<br />
It is easy to see that at the end of the second round each node is either a cluster aggregator or<br />
it is associated with a cluster aggregator whose ID is stored in variable CAID. Without the second<br />
round, a node can remain unassociated, if it sends and receives only dummy messages in the first<br />
round. In addition, a passive observer only sees that every node sends two encrypted messages,<br />
one in each round. This makes it difficult for the adversary to identify who the cluster aggregators<br />
are (see also more discussion on this in the next section). In addition, if a node is compromised,<br />
the adversary learns only the identity of the cluster aggregators whose announcements have been<br />
received by the compromised node.<br />
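The behavior described above can be checked with a compact simulation of Algorithm 2. Encryption is omitted, since it does not affect the outcome of the election; the single fully connected cluster, the timer model, and the parameter values are my illustrative assumptions:<br />

```python
import random

def elect(node_ids, gamma, rng):
    """One run of the basic election in a single, fully connected
    cluster: timers fire in increasing order, and a node announces
    when its timer fires iff it is still unassociated and its
    announFirst flag selects the current round."""
    node_ids = list(node_ids)
    t1 = {s: rng.uniform(0, 1) for s in node_ids}     # round-1 timers
    t2 = {s: rng.uniform(1, 2) for s in node_ids}     # round-2 timers
    announ_first = {s: rng.random() <= gamma for s in node_ids}
    caid = {s: -1 for s in node_ids}                  # -1: unassociated

    events = sorted([(t1[s], 1, s) for s in node_ids] +
                    [(t2[s], 2, s) for s in node_ids])
    for _, rnd, s in events:
        wants_round = announ_first[s] if rnd == 1 else not announ_first[s]
        if wants_round and caid[s] == -1:
            for peer in node_ids:          # announcement is heard by all
                if caid[peer] == -1:       # still-unassociated cluster peers
                    caid[peer] = s
        # otherwise s sends a dummy message (no effect on the election)
    return caid

rng = random.Random(1)
result = elect(range(10), gamma=0.167, rng=rng)
print(sorted(set(result.values())))
```

In a single fully connected cluster, every run ends with exactly one aggregator, and every node is associated with it, matching the correctness argument above.<br />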
In WSNs, it must be analyzed what happens if some messages are delayed or lost on the noisy,<br />
unreliable channel. Two cases must be analyzed: dummy messages and announcements. If a<br />
dummy message is delayed or not delivered successfully to all recipients, then the result of the<br />
protocol is not modified, as dummy messages serve only to cover the announcements. If an<br />
announcement is delayed or not delivered to a node, then the recipient will not select the sender as its<br />
cluster aggregator. It will instead select a node that sends an announcement later, or it elects itself<br />
and sends an announcement. Message loss may modify the resulting set of cluster aggregators,<br />
but it harms neither the anonymity of the elected aggregators, nor the original goal of cluster<br />
aggregator election (a node must either be a cluster aggregator or a cluster aggregator must be<br />
elected from the node’s cluster peers).<br />
Note that two neighboring nodes can send an announcement at the same time with some small<br />
probability. This is not a problem in the protocol: the only result is that both nodes<br />
will be cluster aggregators independently. As this does not conflict with the original goal of cluster<br />
aggregator election, this infrequent situation does not need any special attention.<br />
The overhead introduced by the basic protocol is sending two encrypted messages for each<br />
election round. Other protocols [Buttyán and Schaffer, 2010; Heinzelman et al., 2000] use one (or<br />
zero) unencrypted messages to elect an aggregator, so the number of messages sent in the election<br />
phase is slightly larger compared to other solutions. The symmetric encryption also causes some<br />
extra overhead (for details, see Table 4.1, rows RC4 and RC5).<br />
4.3.2 Protocol analysis<br />
In this section the previously suggested basic protocol is analyzed. As defined in Section 4.2, the<br />
main goal of the attacker is to reveal the identity of the cluster aggregators. To do so, the attacker<br />
can eavesdrop, modify, and delete messages, and can capture some nodes.<br />
First, logical attacks are analyzed, where the attacker does not capture any nodes; then, the<br />
consequences of node capture are discussed.<br />
The attacker’s main goal is to reveal the identity of the cluster aggregators. As all the inter-node<br />
communication is encrypted and authenticated, the attacker cannot get any information from the messages<br />
themselves, but it can get some side information from simple traffic and topology analysis.<br />
Density based attack<br />
Thanks to the dummy messages and the encryption in the basic protocol, an external observer<br />
cannot trivially identify the cluster aggregators; however, it can still use side information and<br />
suspect some nodes to be cluster aggregators with higher probability than other nodes. One<br />
such piece of side information is the number of cluster peers of the nodes. This number correlates with<br />
the local density of the nodes, which is why this attack is called the density based attack. Indeed, the<br />
probability of becoming a cluster aggregator depends on the number of the cluster peers of the<br />
node. For instance, if a node does not have any cluster peers, it will be a cluster aggregator with<br />
probability one. On the other hand, if the node has a larger number of cluster peers, then the<br />
probability of receiving an announcement from a cluster peer is large, and hence, the probability<br />
that the node itself becomes cluster aggregator is small. Note also that the number of cluster peers<br />
can be deduced from the topology of the network, which may be known to the adversary.<br />
The probability of becoming a cluster aggregator is approximately inversely proportional to the<br />
number of cluster peers:<br />
Pr(CA(S)) ≅ 1 / D(S)    (4.1)<br />
where CA(S) is the event of S being elected cluster aggregator, and D(S) is the number of cluster<br />
peers of node S. Figure 4.2 illustrates this proportionality, where the curve corresponds to Equation 4.1<br />
and the plotted dots correspond to simulation results (100 nodes, random deployment, one hop<br />
communication, topology based clustering). It can be seen that Equation 4.1 is quite accurate:<br />
it is very close to the simulated results.<br />
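This inverse proportionality can be checked with a small Monte Carlo experiment in a simplified model. My assumption here: every node always announces when its timer fires, so the node whose timer fires first in its closed neighborhood wins; the peer count and trial count are illustrative:<br />

```python
import random

def election_probability(d, trials, rng):
    """Estimate the probability that a node with d cluster peers becomes
    aggregator, in a simplified model where the node whose random timer
    fires first among itself and its peers sends the announcement."""
    wins = 0
    for _ in range(trials):
        my_timer = rng.random()
        peer_timers = [rng.random() for _ in range(d)]
        if all(my_timer < t for t in peer_timers):
            wins += 1
    return wins / trials

rng = random.Random(42)
p = election_probability(9, 200_000, rng)
# A node with 9 peers should win with probability 1/(9+1) = 0.1,
# close to the 1/D(S) approximation of Equation 4.1.
print(round(p, 3))
```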
Two approaches can be used to mitigate this problem. One is to take the number of cluster<br />
peers of the nodes into account when generating the random timers for the protocol. The second<br />
is to balance the logical network topology in such a way that every node has the same number of<br />
cluster peers. In the following a possible solution for both approaches is introduced.<br />
The first approach is the fine tuning of the timer distributions. It is not analyzed in depth here,<br />
because it can only slightly modify the probabilities of becoming cluster aggregator, so it has little<br />
effect. An example can be seen in Figure 4.3, where the 10th power of D(S) is used as a normalizing<br />
factor when γ (the probability of sending an announcement in the first round) is computed. The<br />
coefficients of the polynomial are set such that the resulting curve is the closest to the uniform distribution. It can<br />
be seen that modifying γ on a per node basis does not reach its goal: the normalized<br />
distribution is far from uniform. However, modifying γ can mitigate the other attack discussed in the next<br />
section, so here I propose a solution that does not adjust the γ parameter.<br />
The second approach modifies the number of cluster peers of each node such that it becomes a<br />
common value, denoted by α, for all of them. In theory, this common value can be anything<br />
between 1 and the total number N of the nodes in the network. In practice, it should be around<br />
the average number of cluster peers, which can be estimated locally by the nodes. For example,<br />
assuming one-hop communications (meaning that the cluster peers are the radio neighbors), the<br />
following formula can be used:<br />
Figure 4.2: Probability of being cluster aggregator as a function of the number of cluster peers.<br />
α = (N − 1) R²π / A + 1 ≃ E(D(S))    (4.2)<br />
where R is the radio range, and A is the size of the total area of the network. The formula is based<br />
on the fact that the number of cluster peers is proportional to the ratio between radio coverage<br />
and total area. Similar formulae can be derived for the general case of multi-hop communication.<br />
If a node S has more than α cluster peers, it can simply discard the messages from D(S) − α<br />
randomly chosen cluster peers. If S has fewer than α cluster peers, it must acquire new cluster peers with<br />
the help of its current cluster peers (if S originally has no cluster peers at all, then it will always<br />
become a cluster aggregator). The new cluster peers can be selected from the set of cluster peers of<br />
the original cluster peers. To explore the potential new cluster peers, every node can broadcast its<br />
list of cluster peers within its few hop neighborhood before running the basic protocol. From the<br />
lists of the received cluster peers, every node can select its α − D(S) new cluster peers uniformly<br />
at random. Then, the basic aggregator election protocol can be executed using the balanced set of<br />
cluster peers. An example for this balancing is shown in Figure 4.4 (70 nodes, random deployment,<br />
one hop communication, topology based clustering).<br />
After running the balancing protocol, every node approaches the envisioned α value. The<br />
advantage of the balancing protocol is that even though an attacker can gather information about<br />
the number of cluster peers, this number is effectively balanced after the protocol. The drawback of<br />
this solution is that it requires the original cluster peers to relay messages between distant nodes.<br />
One can think of this solution as selectively increasing the TTL of protocol messages, creating much<br />
larger neighborhoods.<br />
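The balancing step itself can be sketched as follows, from the local view of one node. The peer sets and the candidate pool, which in the protocol come from the peer lists broadcast by the original cluster peers, are illustrative:<br />

```python
import random

def balance_peers(peers, candidates, alpha, rng):
    """Adjust a node's cluster peer set to the common size alpha:
    discard randomly chosen peers when there are too many, or adopt
    random peers-of-peers from the candidate pool when too few."""
    peers = set(peers)
    if len(peers) > alpha:
        drop = rng.sample(sorted(peers), len(peers) - alpha)
        peers -= set(drop)        # messages from dropped peers are ignored
    elif len(peers) < alpha:
        pool = sorted(set(candidates) - peers)
        add = rng.sample(pool, min(alpha - len(peers), len(pool)))
        peers |= set(add)         # new peers reached via the original ones
    return peers

rng = random.Random(7)
print(len(balance_peers({1, 2, 3, 4, 5, 6}, set(), 4, rng)))   # -> 4
print(len(balance_peers({1, 2}, {3, 4, 5, 6}, 4, rng)))        # -> 4
```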
Order based attack<br />
Another important side information an attacker can use is the order in which the nodes send<br />
messages in the first round of the protocol. Indeed, the sender of the i-th message will be cluster<br />
aggregator if none of the previous i − 1 messages are announcements (but dummies) and the i-th<br />
message is an announcement. Thus, the probability Pi that the sender of the i-th message becomes<br />
cluster aggregator depends on i and parameter γ:<br />
Figure 4.3: Probability of being cluster aggregator as a function of the number of cluster peers.<br />
The analytical values come from Equation 4.1, while the simulated values come from simulations<br />
where the γ probabilities are normalized with the number of cluster peers of the nodes.<br />
Pi = (1 − γ)^(i−1) γ,   1 ≤ i ≤ n<br />
The (n + 1)-th element of the distribution is the probability that no announcement is sent in<br />
the first round:<br />
Pn+1 = (1 − γ)^n<br />
in which case the sender of the first message of the second round must be a cluster aggregator.<br />
The entropy of this distribution characterizes the uncertainty of the attacker who wants to<br />
identify the cluster aggregator using the order information. Assuming that the number of cluster<br />
peers has been already balanced, this entropy can be calculated as follows:<br />
Figure 4.4: Result of balancing. The 70 nodes are represented on the x axis. The number of cluster<br />
peers before (left), and after (right) the balancing are represented on the y axis.<br />
H = − Σ_{i=1}^{n+1} Pi log Pi = − Σ_{i=1}^{n} ( (1 − γ)^(i−1) γ log((1 − γ)^(i−1) γ) ) − (1 − γ)^n log((1 − γ)^n)    (4.3)<br />
where γ is the probability of sending an announcement in the first round and n is the balanced<br />
number of cluster peers.<br />
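Each node can find the entropy-maximizing γ numerically; a sketch using a simple grid search over formula (4.3), with base-2 logarithms to match Hmax = log₂ n (the grid resolution is illustrative):<br />

```python
import math

def entropy(gamma, n):
    """Formula (4.3): entropy (in bits) of the position of the first
    announcement among the n + 1 possible outcomes."""
    probs = [(1 - gamma) ** (i - 1) * gamma for i in range(1, n + 1)]
    probs.append((1 - gamma) ** n)    # no announcement in the first round
    return -sum(p * math.log2(p) for p in probs if p > 0)

def best_gamma(n, steps=10_000):
    """Grid search for the gamma that maximizes the attacker's entropy."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda g: entropy(g, n))

g = best_gamma(10)
print(round(g, 3), round(entropy(g, 10), 3))
```

For n = 10, the search returns values close to the ˆγ ≈ 0.167 and H(ˆγ) ≈ 3.28 reported in Table 4.2.<br />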
Figure 4.5: Entropy of the attacker as a function of the probability of sending an announcement in the first round (γ).<br />
Number of nodes in one cluster: 10.<br />
In Figure 4.5, I plotted formula (4.3). If γ is large, then the uncertainty of the attacker is low,<br />
because one of the first few senders will become the cluster aggregator with very high probability.<br />
If γ is very small, then the uncertainty of the attacker is small again, because no cluster aggregator<br />
will be elected in the first round with high probability, and therefore, the first sender of the second<br />
round will be the cluster aggregator. The ideal γ value corresponds to the maximum entropy,<br />
which can be easily computed by the nodes locally from formula (4.3). For instance, Table 4.2<br />
shows some ideal γ values for different numbers of nodes in one cluster. The fifth row (Hmax)<br />
shows the maximal entropy (uncertainty) that any kind of election protocol can achieve with the<br />
given number of nodes; this is achieved if, from the viewpoint of the attacker, every node is<br />
elected with equal probability. This value is closely approached by H(ˆγ), where ˆγ is very close to the optimal<br />
solution (the difference between the found value and the optimal value can be arbitrarily small,<br />
and depends on the number of iterations the estimation algorithm uses). Using the found ˆγ value,<br />
the order of the messages carries practically no information for the attacker.<br />
Node capture attacks<br />
If an attacker can compromise a node, it can reveal some sensitive information, even when the<br />
system uses the local key based protocol. If the compromised node is a cluster aggregator, then<br />
all the previously stored messages can be revealed. The attacker can decide to demolish the node,<br />
modify the stored values, simply use the captured data, or modify the aggregation functions.<br />
Table 4.2: Optimal γ values (ˆγ) for different numbers of nodes in one cluster, the achieved entropy<br />
(H(ˆγ)), and the maximal entropy (Hmax = log₂ n)<br />
n 10 25 50 100<br />
ˆγ 0.167 0.082 0.049 0.027<br />
nˆγ 1.67 2.05 2.45 2.7<br />
H(ˆγ) 3.281 4.410 5.312 6.218<br />
Hmax 3.322 4.644 5.644 6.644<br />
If the compromised node is not a cluster aggregator, then the attacker can reveal the cluster<br />
aggregator of that node, which can result in the same situation described in the previous paragraph.<br />
4.3.3 Data forwarding and querying<br />
The problem of forwarding the measured data to the aggregators without revealing the identity<br />
of the aggregators is a well known problem in the literature, called anonymous routing [Seys and<br />
Preneel, 2006; Zhang et al., 2006; Rajendran and Sreenaath, 2008].<br />
Anonymous routing lets us route packets in the network without revealing the destination of<br />
the packet. A short overview of anonymous routing can be found in Section 4.5.<br />
With anonymous routing, any node can send its measurements to the aggregators without<br />
revealing the identity of the aggregator. An operator can query the aggregator with the help of an ordinary<br />
node, which uses anonymous routing towards the aggregator.<br />
Anonymous routing introduces significant traffic overhead. However, this can be partially mitigated by synchronizing the data transmissions. Instead of suggesting such an approach, in Section 4.4.3 of this chapter I elaborate a more challenging setting, where the identity of the aggregators is unknown even to the cluster members. The clear advantage is that even if a node is compromised, its aggregator cannot be identified.<br />
4.4 Advanced protocol<br />
The advanced private data aggregation protocol is designed to withstand the compromise of some<br />
nodes without revealing the identities of the aggregators. The protocol consists of four main parts.<br />
The first part is the initialization, which provides the required communication channel. The second<br />
part is needed for the data aggregator election. This subprotocol must ensure that the cluster does<br />
not remain without a cluster aggregator. This must be done without revealing the identity of the<br />
elected aggregator. The third part is needed for the data aggregation. This subprotocol must be<br />
able to forward the measured data to the aggregator without knowing its identifier. The last part<br />
must support the queries, where an operator queries some stored aggregated data.<br />
In the following, the description of each subprotocol follows the same pattern. First the goal<br />
and the requirements of the subprotocol are discussed, then the subprotocol itself is presented.<br />
After the presentation of the subprotocol, I analyze how it achieves its goal even in the presence<br />
of an attacker, and what data and services it provides to the next subprotocol.<br />
At the end of this section, misbehavior is analyzed. I discuss what an attacker can achieve if its goal is not to identify the aggregators of the cluster, but to disrupt the operation of the protocols.<br />
In the following, it is assumed that every node knows which cluster it belongs to. The protocol descriptions consider only one cluster; separate instances of the protocol run independently in different clusters.<br />
The complexity of each subprotocol is summarized in Table 4.3. This table gives an overview of the message complexity of the subprotocols, so the bandwidth requirements can be calculated from it. It can be seen that the rarely used election protocol has the highest complexity, while the frequently used aggregation is the most lightweight protocol in use.<br />
Table 4.3: Summary of the complexity of the advanced protocol. N is the number of nodes in the cluster.<br />
                         Election  Aggregation  Query<br />
Message complexity       O(N²)     O(N)         O(N)<br />
Modular exponentiations  4N¹       0            0<br />
Hash computations        0         0            1<br />
4.4.1 Initialization<br />
The initialization phase is responsible for providing the medium for authenticated broadcast communication. In the following, I briefly review the approaches to broadcast authentication in wireless sensor networks, and give some efficient methods for broadcast communication.<br />
The initialization relies on some data stored on each node before deployment. Each node has some unique cryptographic credentials to enable authentication, and is aware of the identifier of the cluster it belongs to. In the following, it is assumed without further mention that each message contains the cluster identifier. Every message addressed to a cluster different from the one a node belongs to is discarded by the node. First, I briefly review the state of the art in broadcast authentication, then I propose a connected dominating set based broadcast communication method, which fits well with the following aggregation and query phases.<br />
Broadcast authentication<br />
Broadcast authentication enables a sender to broadcast authenticated messages efficiently to a large number of potential receivers. In the literature, this problem is solved with either digital signatures or hash chains. In this section, I review some solutions from both approaches.<br />
For the sake of completeness, Message Authentication Codes (MAC) must also be mentioned<br />
here [Preneel and Oorschot, 1999]. MACs are based on symmetric cryptographic primitives, which<br />
enable very efficient computation. Unfortunately, the verifier of a MAC must also possess the<br />
same cryptographic credential the generator used for generating the MAC. This means that, to verify every message broadcast in the network, every node must know every credential in the network. This full knowledge can be exploited by an attacker who compromises a node. The attacker can impersonate any other honest node, which means that if only one node is compromised, message authenticity can no longer be ensured.<br />
One solution to node compromise is hop by hop authentication of the packets. In hop by hop authentication, every packet's authentication information is regenerated by every forwarder. In this case, it is enough for a node to share keys only with its direct neighbors. In case of node compromise, only the node itself and its direct neighbors can be impersonated. Such a neighborhood authentication is provided by Zhu et al. in LEAP [Zhu et al., 2003], where it is based on so called cluster keys.<br />
To make the authentication scheme robust against node compromise, one approach is the usage<br />
of asymmetric cryptography, namely digital signatures.<br />
Digital signatures are asymmetric cryptographic primitives, where only the owner of a private<br />
key can compute a digital signature over a message, but any other node can verify that signature.<br />
Computing a digital signature is a time consuming task for a typical sensor node, but there exist<br />
some efficient elliptic curve based approaches in the literature [Liu and Ning, 2008; Szczechowiak<br />
et al., 2008; Oliveira et al., 2008; Xiong et al., 2010].<br />
One of the first publicly available implementations was the TinyECC module written by Liu and Ning [Liu and Ning, 2008]. A more efficient implementation is the NanoECC module, proposed by Szczechowiak et al. [Szczechowiak et al., 2008], which is based on the MIRACL cryptographic library [mir, ]. Up to now, to the best of my knowledge, the fastest implementations are the TinyPBC by Oliveira et al. [Oliveira et al., 2008], which is based on the RELIC toolkit [rel, ], and the TinyPairing proposed by Xiong et al. in [Xiong et al., 2010].<br />
¹ 4 exponentiations for generating the two messages with knowledge proofs and 4N−4 exponentiations for checking the received knowledge proofs.<br />
Another approach is proposed for broadcast authentication in wireless sensor networks by Perrig<br />
et al. in [Perrig et al., 2002]. The µTESLA scheme is based on delayed release of hash chain<br />
values used in MAC computations. The scheme needs secure loose time synchronization between<br />
the nodes. The µTESLA scheme is efficient if it is used for authenticating many messages, but inefficient if the messages are sparse. Consequently, if only the rarely sent election messages must be authenticated, then the time synchronization itself can cause a heavier workload than simple digital signatures. If the aggregation messages must also be authenticated, then µTESLA can<br />
be an efficient solution. A DoS resistant version specially adapted for wireless sensor networks is<br />
proposed by Liu et al. in [Liu et al., 2005]. A faster but less secure modification is proposed by<br />
Huang et al. in [Huang et al., 2009].<br />
In the following, it is assumed that an efficient broadcast authentication scheme is in use, without indicating it explicitly in every message description.<br />
Broadcast communication<br />
Broadcast communication is a method that enables sending information from one source to every<br />
other participant of the network. In wireless networks it can be implemented in many ways, like<br />
flooding the network or with a sequence of unicast messages.<br />
A natural question is why broadcast communication is so important to the advanced protocol. The reason is that only broadcast communication can hide the traffic patterns of the communication, and thus reveal no information about the aggregators.<br />
An efficient way of implementing broadcast communication in wireless sensor networks is the use of a connected dominating set (CDS). A connected dominating set S of a graph G is a subset of the vertices of G such that every vertex in G − S is adjacent to at least one member of S, and S is connected. A graphical representation of a CDS can be found in Figure 4.6. The minimum connected dominating set (MCDS) is a connected dominating set with minimum cardinality. Finding an MCDS in a graph is an NP-hard problem; however, there are some efficient solutions which can find a close to minimal CDS in WSNs. For a thorough review of the state of the art of CDS in WSNs, the interested reader is referred to [Blum et al., 2004a] and [Jacquet, 2004].<br />
In the following, it is assumed that a connected dominating set is given in each cluster, and a minimum spanning tree is generated between the nodes in the CDS. Finding a minimum spanning tree in a connected graph has been a well known problem for decades; efficient polynomial algorithms are suggested in [Kruskal, 1956; Prim, 1957]. This kind of two layer communication architecture enables the efficient implementation of the different kinds of broadcast-like communication required by the following protocols. The spanning tree is used in the aggregation protocol in Section 4.4.3.<br />
The simple all node broadcast communication can be implemented easily: if a node sends a packet to the broadcast address, then every node in the CDS forwards this message to the broadcast address. The CDS members are connected, and every non CDS member is connected to at least one CDS member by definition, so the message will be delivered to every recipient in the network. This approach is more efficient than simple flooding, as only a subset of the nodes forwards the message, but the properties of the CDS ensure that every node in the cluster will eventually receive the broadcast information. Here, the notion of CDS parent (or simply parent) must be introduced. The CDS parent of node A is a node which is within communication range of A and is a member of the CDS.<br />
The complexity of such a broadcast communication is O(N), but in fact it takes |S| messages to broadcast some information, where |S| is the number of nodes in the connected dominating set. If the CDS algorithm is accurate, then |S| can be very close to the minimum number of transmissions required for broadcast communication.<br />
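As an illustration of these definitions (not of the algorithms cited above), the following sketch checks the two CDS properties on a small hand-made topology and compares |S| with the cluster size N:<br />

```python
# Toy illustration of the CDS definition on a fixed 7-node graph;
# the topology and the set S are hand-picked for this example.
adj = {
    0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 4},
    3: {1, 5}, 4: {2, 6}, 5: {3}, 6: {4},
}

def is_cds(S, adj):
    # every vertex outside S must have a neighbour in S ...
    dominated = all(adj[v] & S for v in adj if v not in S)
    # ... and S itself must be connected (BFS restricted to S)
    seen, todo = set(), [next(iter(S))]
    while todo:
        v = todo.pop()
        if v in seen:
            continue
        seen.add(v)
        todo += [u for u in adj[v] & S]
    return dominated and seen == S

S = {1, 2, 3, 4}
# broadcasting via the CDS takes |S| forwards instead of N with flooding
print(is_cds(S, adj), len(S), len(adj))   # → True 4 7
```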
In the following, broadcast communication is used frequently to avoid that an attacker can gain<br />
some knowledge about the identity of the aggregators from the traffic patterns inside the network.<br />
Obviously not every message is broadcast in the network, because that would shortly lead to<br />
battery depletion and inoperability of the sensor network. Instead of automatically broadcasting every message, as much information as possible is aggregated in each message to preserve energy. In the following sections, I will use the given CDS in different ways, and each particular usage will be described in the corresponding section.<br />
Figure 4.6: Connected dominating set. Solid dots represent the dominating set, and empty circles represent the remaining nodes. The connections between the non CDS nodes of the network are not displayed in the figure.<br />
The used communication patterns are closely related to and inspired by the Echo algorithm<br />
published by Chang in [Chang, 2006]. The Echo algorithm is a Wave algorithm [Tel, 2000], which<br />
enables the distributed computation of an idempotent operator in trees. It can be used in arbitrary<br />
connected graphs, and generates a spanning tree as a side result.<br />
4.4.2 Data aggregator election<br />
The main goal of the aggregator node election protocol is to elect a node that can store the measurements of the whole cluster in a given epoch, but in such a way that its identity remains hidden. The election is successful if at least one node is elected. The protocol is unsuccessful if<br />
no node is elected, thus no node stores the data. In some cases, electing more than one node can<br />
be advantageous, because the redundant storage can withstand the failure of some nodes. In the<br />
following, I propose an election protocol, where the expected number of elected aggregators can<br />
be determined by the system operator, and the protocol ensures that at least one aggregator is<br />
always elected.<br />
The election process relies on the initialization subprotocol discussed in Section 4.4.1. It requires<br />
an authenticated broadcast channel among the cluster members, which is exactly what the<br />
initialization part offers.<br />
The election process consists of two main steps: (i) Every node decides, whether it wants to<br />
be an aggregator, based on some random values. This step does not need any communication,<br />
the nodes compute the results locally. (ii) In the second step, an anonymous veto protocol is run,<br />
which reveals only the information that at least one node elected itself to be aggregator node. If<br />
no aggregator is elected, it will be clear for every participant, and every participant can run the<br />
election protocol again.<br />
Step (i) can be implemented easily. Every node elects itself aggregator with a given probability p. The result of the election is kept secret; the participants only want to know that the number c of aggregators is not zero, without revealing the identity of the cluster aggregators. This is advantageous because, in case of node compromise, the attacker learns only whether the compromised node is an aggregator, but nothing about the identity or the number of the other aggregators. Let us denote the random variable representing the number of elected aggregators by C. It is easy to see that the distribution of C is binomial (N is the total number of nodes in one cluster):<br />
Pr(C = c) = (N choose c) · p^c · (1 − p)^(N−c)<br />
The expected number of aggregators after the first step is cE = Np. So if, on average, ĉ cluster aggregators are needed, then p should be set to ĉ/N (this formula will be slightly modified after considering the results of the second step).<br />
The probability that no cluster aggregator is elected is (1 − p)^N. To avoid this situation, the nodes must run step (ii), which proves that at least one node is elected as aggregator node, while the identity of the aggregator remains secret. This problem can be solved by an anonymous veto protocol. Such a protocol is suggested by Hao and Zieliński in [Hao and Zielinski, 2006].<br />
Hao and Zieliński's approach has many advantageous properties compared to other solutions [Brandt, 2006; Chaum, 1988], such as requiring only 2 communication rounds.<br />
The anonymous veto protocol requires knowledge proofs. Informally, a knowledge proof allows a prover to convince a verifier that he knows a solution of a hard-to-solve problem without revealing any useful information about the knowledge. A detailed explanation of the problem can be found in [Camenisch and Stadler, 1997].<br />
A well known example of a knowledge proof is given by Schnorr in [Schnorr, 1991]. The proposed method gives a non interactive proof of knowledge of a logarithm without revealing the logarithm itself. The operation can be described briefly as follows. The proof of knowledge of the exponent of g^xi consists of the pair {g^v, r = v − xi·h}, where h = H(g, g^v, g^xi, i) and H is a secure hash function. This proof of knowledge can be verified by anyone by checking whether g^v and g^r · (g^xi)^h are equal.<br />
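Schnorr's construction can be sketched as follows; the toy group parameters (p, q, g) and the use of SHA-256 as H are assumptions made only for this illustration:<br />

```python
# Sketch of Schnorr's non-interactive proof of knowledge of x in g^x.
# Toy Schnorr group (p = 2q + 1, g of order q); SHA-256 stands in for H.
import hashlib
import secrets

p, q, g = 2039, 1019, 4   # illustrative parameters, far too small for real use

def H(*vals):
    """Hash the transcript into Z_q (assumed instantiation of H)."""
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x, i):
    """Produce the pair {g^v, r = v - x*h} with h = H(g, g^v, g^x, i)."""
    v = secrets.randbelow(q - 1) + 1
    gv = pow(g, v, p)
    h = H(g, gv, pow(g, x, p), i)
    return gv, (v - x * h) % q

def verify(gx, proof, i):
    """Accept iff g^v == g^r * (g^x)^h  (mod p)."""
    gv, r = proof
    h = H(g, gv, gx, i)
    return gv == pow(g, r, p) * pow(gx, h, p) % p

x = secrets.randbelow(q - 1) + 1
print(verify(pow(g, x, p), prove(x, 1), 1))   # → True
```

Correctness follows from g^r · (g^x)^h = g^(v − xh + xh) = g^v, since exponents are reduced modulo the order q of g.<br />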
The operation of the anonymous veto protocol consists of two consecutive rounds (G is a publicly agreed group of order q with generator g):<br />
1. First, every participant i selects a secret random value xi ∈ Zq. Then g^xi is broadcast with a knowledge proof. The knowledge proof is needed to ensure that the participant knows xi without revealing the value of xi. Without the knowledge proof, the node could choose g^xi in a way that influences the result of the protocol (it is widely believed that for a given g^xi (mod p) it is hard to find xi; this is known as the discrete logarithm problem). Then every participant checks the knowledge proofs, and computes a special product of the received values:<br />
g^yi = ∏_{j=1}^{i−1} g^xj / ∏_{j=i+1}^{N} g^xj<br />
2. Then g^(ci·yi) is broadcast with a knowledge proof (the knowledge proof is needed to ensure that the node cannot influence the election maliciously afterwards). Here ci is set to xi for non aggregators, and to a random value ri for aggregators.<br />
The product P = ∏_{i=1}^{N} g^(ci·yi) equals 1 if and only if no cluster aggregator is elected (no one vetoed the question: Is the number of elected cluster aggregators zero?). If no aggregator is elected, then this will be clear to all participants, and the election can be done again. If P differs from 1, then some nodes have announced themselves to be cluster aggregators, and this is known by all the nodes.<br />
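The two rounds above can be sketched as follows; this is an honest-execution model that omits the knowledge proofs, and the toy group parameters are assumptions made only for illustration:<br />

```python
# Sketch of the Hao-Zielinski anonymous veto round structure (honest
# participants, knowledge proofs omitted); toy group, not for real use.
import secrets

p, q, g = 2039, 1019, 4   # toy Schnorr group: g has order q modulo p

def round1(n):
    """Round 1: every participant i picks x_i and publishes g^x_i."""
    xs = [secrets.randbelow(q - 1) + 1 for _ in range(n)]
    return xs, [pow(g, x, p) for x in xs]

def gy(i, gxs):
    """g^y_i = (prod_{j<i} g^x_j) / (prod_{j>i} g^x_j)  (mod p)."""
    num = 1
    for v in gxs[:i]:
        num = num * v % p
    den = 1
    for v in gxs[i + 1:]:
        den = den * v % p
    return num * pow(den, -1, p) % p   # modular inverse via 3-arg pow

def round2(xs, gxs, aggregators):
    """Round 2: aggregators veto with a random c_i; others use c_i = x_i."""
    out = []
    for i, x in enumerate(xs):
        c = secrets.randbelow(q - 1) + 1 if i in aggregators else x
        out.append(pow(gy(i, gxs), c, p))
    return out

def no_aggregator(broadcasts):
    """P = prod g^(c_i y_i) equals 1 iff nobody vetoed."""
    prod = 1
    for v in broadcasts:
        prod = prod * v % p
    return prod == 1

xs, gxs = round1(5)
print(no_aggregator(round2(xs, gxs, set())))   # → True (no aggregator elected)
print(no_aggregator(round2(xs, gxs, {2})))     # almost surely False
```

The first print is always True because of the identity Σ xi·yi = 0, so the product collapses to g^0 = 1 when every ci = xi.<br />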
If we consider the effect of the second step (a new election is run if no aggregator is elected), the expected number of aggregators is slightly higher than in the binomial case. The expected number of aggregators is:<br />
cE = Np / (1 − (1 − p)^N)<br />
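A quick numeric check of this correction, under illustrative parameters:<br />

```python
# Expected number of aggregators for N = 100 nodes and p = 0.03,
# with and without the "re-run if nobody is elected" step (a sketch).
N, p = 100, 0.03
plain = N * p                              # first step only: E[C] = Np
adjusted = N * p / (1 - (1 - p) ** N)      # conditioned on at least one winner
print(round(plain, 4), round(adjusted, 4))
```

The conditioned value is always slightly larger than Np, since the re-election discards exactly the zero-aggregator outcomes.<br />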
The anonymity of the election subprotocol depends on its parts. Obviously, the random number generation does not leak any information about the identity of the aggregator nodes if the random number generator is secure. A cryptographically secure random number generator, called TinyRNG, is proposed in [Francillon and Castelluccia, 2007] for wireless sensor networks. With a secure random number generator, it is unpredictable which nodes elect themselves to be aggregators.<br />
The anonymity analysis of the anonymous veto protocol can be found in [Hao and Zielinski, 2006]. The anonymity is based on the decisional Diffie-Hellman assumption, which is considered to be a hard problem.<br />
The message complexity of the election is O(N 2 ), which is acceptable as the election is run<br />
infrequently (N is the number of nodes in the cluster).<br />
If this overhead, together with the 4 modular exponentiations (see Table 4.3 for the complexities and Table 4.1 for the estimated running times; note that RSA is based on modular exponentiation), is too big for the application, then the application can use the basic protocol described in Section 4.3.1, where only symmetric key encryption is used.<br />
In wireless sensor networks, the links are in general not reliable; packet losses occur from time to time. Reliability can be introduced by the link layer or by the application. As it is crucial to run the election protocol without any packet loss, a reliable link layer protocol is required for this subprotocol. Such protocols are suggested in [Iqbal and Khayam, 2009; Wan et al., 2002] for wireless sensor networks.<br />
As a summary, after the election subprotocol every node is equally likely to be an aggregator node. The election subprotocol ensures that at least one aggregator is elected, and the elected node(s) are aware of their status. An outside attacker knows neither the identity of the aggregators nor the actual number of elected aggregator nodes. An attacker who compromised one or more nodes can decide whether the compromised nodes are aggregators, but cannot be certain about the other nodes.<br />
4.4.3 Data aggregation<br />
The main goal of the WSN is to measure some data from the environment, and store the data<br />
for later use. This section describes how the data is forwarded to the aggregator(s) without the<br />
explicit knowledge of the identifier(s) of the aggregator(s).<br />
The data aggregation and storage procedure uses the broadcast channel. If the covered area is so small or the radio range so large that the nodes can reach each other directly, then the aggregation can be implemented simply: every node broadcasts its measurement on the common channel, and the cluster aggregator(s) can aggregate and store the measurements. If the covered area is bigger (which is the more realistic case), a connected dominating set based solution is proposed.<br />
In each timeslot, each ordinary node (not a member of the CDS) sends its measurement to one neighboring CDS member (its parent) by unicast communication. When the epoch has elapsed and all the measurements from the nodes have been received, the CDS nodes aggregate the measurements and use a modification of the Echo algorithm on the given spanning tree to compute the gross aggregated measurement in the following way: each CDS member waits until all but one CDS neighbor has sent its subaggregate to it, and after some random delay it sends the aggregate to the remaining neighbor. This means that the leaf nodes of the tree start the communication, and then the communication wave is propagated towards the root of the spanning tree. This behavior is the same as the second phase of the Echo algorithm. When one node receives the subaggregates from all of its neighbors, and thus cannot send them onward, it can compute the gross aggregated value of<br />
Figure 4.7: Aggregation example. The subfigures from left to right represent the consecutive steps of an average computation: (i) The measured data is ready to send, stored in the format actual average; number of data items. Non CDS nodes send the average to their parents. (ii) The CDS nodes start to send the aggregated value to their parents. (iii) A CDS node receives an aggregate from all of its neighbors, and starts to broadcast the final aggregated value. Nodes willing to store the value can do so. (iv) Other CDS nodes receiving the final value rebroadcast it. Nodes willing to store the value can do so.<br />
the network. Then, this value is distributed among the cluster members by every CDS member broadcasting it.<br />
This second phase is needed so that every member of the cluster can be aware of the gross aggregated value, and the anonymous aggregators can store it, while the others simply discard it. Besides the value itself, the stored data includes the timeslot in which the aggregate was computed, and the name of the environmental variable if more than one variable (e.g. temperature and humidity) is recorded.<br />
The aggregation function can be any statistical function of the measured data. Some easily implementable<br />
and widely used functions are the minimum, maximum, sum or average. In Figure 4.7,<br />
the aggregation protocol is visualized with five nodes and two aggregators using the average as an<br />
aggregation function.<br />
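The combination of subaggregates for the average function can be sketched as follows, using the same five measurements as Figure 4.7 (the flat combination order stands in for the tree structure):<br />

```python
# Sketch of combining (average; count) subaggregates, as in Figure 4.7.
def merge(a, b):
    """Combine two (average, count) subaggregates into one."""
    (ma, na), (mb, nb) = a, b
    n = na + nb
    return ((ma * na + mb * nb) / n, n)

leaves = [(1, 1), (3, 1), (2, 1), (3, 1), (4, 1)]   # five measurements
agg = leaves[0]
for leaf in leaves[1:]:
    agg = merge(agg, leaf)
print(agg)   # → (2.6, 5)
```

Because merge is associative on (sum, count) pairs, the CDS nodes can combine subaggregates in whatever order the Echo wave delivers them.<br />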
The anonymity analysis of the aggregation subprotocol is quite simple. After the aggregation,<br />
every node possesses the same information as an external attacker can get. This information is<br />
the aggregated data itself, without knowing anything about the identity of the aggregators. If the<br />
operator wants to hide the aggregated data, it can use some techniques discussed in Section 4.5.<br />
The message complexity of the aggregation is O(N), where N is the number of nodes in the<br />
cluster. This is the best complexity achievable, because to store all the measurements by a single<br />
aggregator, all nodes must send the measurements towards the aggregator, which leads to O(N)<br />
message complexity. In terms of latency, the advanced protocol doubles the time until the aggregated measurement arrives at the aggregator, compared to a naive system where the identity of the aggregators is known to every participant. This latency is acceptable, as in most WSN applications the time between the measurements is much longer than the time required to aggregate the data.<br />
As mentioned in the election subprotocol, the protocol must be prepared for packet losses due to the nature of wireless sensor networks. In the aggregation subprotocol two kinds of packet loss can be envisioned: a packet can be lost before or after the final aggregate is computed. Both cases can be detected by timers, and a resend request can be sent. If the resend is unsuccessful after several attempts, the aggregation must be run without those messages. If the lost message contains a measurement or subaggregate, then the final aggregate will be computed without that data, leading to an inaccurate measurement. If the lost message contained the gross aggregate, then some nodes will not receive the gross aggregate. Here it is very useful that the network can have multiple aggregators, because if at least one aggregator receives the data, the data can be queried by the operator.<br />
4.4.4 Query<br />
The ultimate goal of the sensor network is to make the measured data available to the operator<br />
upon request. While the aggregation subprotocol ensures that the measured data is stored by the<br />
aggregators, the goal of the query subprotocol is to provide the requested data to the operator and<br />
keep the aggregators’ identity hidden at the same time.<br />
One solution would be that the operator visits all the nodes, and connects to them by wire.<br />
While this solution would leak no information about the identities of the aggregators to any eavesdropping<br />
attacker, the execution would be very time consuming and cumbersome. Moreover, the<br />
accessibility of some nodes may be difficult or dangerous (for example in a military scenario).<br />
Therefore, I propose a solution where it is sufficient for the operator to get within wireless communication range of any of the nodes. This node does not need to be an aggregator, as actually no one, not even the operator, knows who the aggregator nodes are.<br />
As a first step, the operator authenticates itself to the selected node O using the key kO. After<br />
that, node O starts the query protocol by sending out a query, obtains the response to the query<br />
from the cluster, and makes the response available to the operator. In the following, it is assumed<br />
that O is not a CDS node. (If it is indeed a CDS node, then the first and last transmission of the<br />
query protocol can be omitted.)<br />
Node O broadcasts the query data Q with the help of the CDS nodes in the cluster. This<br />
is done by sending Q to the CDS parent, and then every CDS member rebroadcasts Q as it is<br />
received. The query Q describes what information the operator is interested in. It includes a<br />
variable name, a time interval, and a field for collecting the response to the query. It also includes<br />
a bit, called “aggregated”, which will later be used in the detection of misbehaving nodes. For the<br />
details of misbehaving node detection, the reader is referred to Section 4.4.5; here we assume that<br />
the “aggregated” bit is always set meaning that aggregation is enabled.<br />
The idea of the query protocol is that each node i in the cluster contributes to the response by<br />
a number Ri, which is computed as follows:<br />
Ri = h(Q|ki) for non-aggregators, and Ri = h(Q|ki) + M for aggregators    (4.4)<br />
where M is the stored measurement (available only if the node is an aggregator), h is a cryptographic<br />
hash function, and ki is the key shared by node i and the operator. Thus, non-aggregators<br />
contribute with a pseudo-random number h(Q|ki) computed from the query and the key ki, which<br />
can later also be computed by the operator, while aggregator nodes contribute with the sum of a pseudo-random number and the requested measurement data. The sum is ordinary fixed-precision addition, which can overflow if the hash value is large.<br />
The goal is that the querying node O receives back the sum of all these Ri values. For this<br />
reason, when the query Q is received by a non CDS node from its CDS parent, it computes its<br />
Ri value and sends it back to the CDS parent in the response field of the query token. When a<br />
CDS parent receives back the query tokens with the updated response field from its children, it<br />
computes the sum of the received Ri values and its own, and, after inserting the identifiers of the nodes, sends the result back to its parent. This is repeated until the query token reaches back to the CDS parent of node O, which can forward the response R = ∑ Ri and the list of responding nodes to node O, where the sum is computed by ordinary fixed-precision addition. This operation is illustrated in Figure 4.8.<br />
When receiving R from O, the operator can calculate the stored data as follows. First of all, the<br />
operator can regenerate each hash value h(Q|ki), because it stores (or can compute from a master<br />
key on-the-fly) each key ki, and it knows the original query data Q. The operator can subtract the<br />
hash values from R (note that the list of responding nodes is present in the response), and it gets the result R′ = cM, where c is the actual number of aggregators in the cluster². Unfortunately, this number c is unknown to the operator, as it is unknown to everybody else. Nevertheless, if M is<br />
² Note that each aggregator contributed the measurement M to the response; that is why, at the end, the response will be c times M, where c is the number of aggregators.<br />
Figure 4.8: Query example. The subfigures from left to right represent the consecutive steps of a query: (i) The operator sends the query Q to node O. This node forwards it to its CDS parent. The CDS parent broadcasts the query. (ii) The CDS nodes broadcast the query, so every node in the network is aware of Q. (iii) Every non CDS node (except O) sends its response to its parent. (iv) The sum of the responses is propagated back to the parent of O (including the list of responding nodes, not shown in the figure), which forwards it to the operator through O.<br />
restricted to lie in an interval [A, B] such that the intervals [iA, iB] for i = 1, 2, . . . , N are non-overlapping,<br />
then cM can fall only into the interval [cA, cB], and hence, c can be uniquely determined<br />
by the operator by checking which interval R′ belongs to. Then, dividing R′ by c gives the<br />
requested data M.<br />
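The operator-side decoding can be sketched as follows; a non-authoritative illustration under the same assumptions as before (SHA-256 truncated to L bits as the keyed hash, hypothetical keys).<br />

```python
import hashlib

MOD = 2 ** 16  # L = 2^16; responses are summed modulo 2^L

def h(query: bytes, key: bytes) -> int:
    """Keyed hash h(Q|k_i), truncated to L bits (the hash choice is an assumption)."""
    return int.from_bytes(hashlib.sha256(query + key).digest(), "big") % MOD

def decode(R: int, query: bytes, responder_keys: list, A: int, B: int, N: int):
    """Recover (c, M) from the aggregated response R.

    Subtracting the responders' hashes leaves R' = c*M; since the intervals
    [iA, iB] are non-overlapping, c is the unique i with iA <= R' <= iB.
    """
    r_prime = R
    for k in responder_keys:
        r_prime = (r_prime - h(query, k)) % MOD
    if r_prime == 0:
        return 0, None  # every aggregator's response was lost
    for c in range(1, N + 1):
        if c * A <= r_prime <= c * B:
            return c, r_prime // c
    raise ValueError("R' lies outside every decodable interval")
```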
More specifically, and for practical reasons, the following three criteria need to be satisfied by<br />
the interval [A, B] for my query scheme to work: (i) as we have seen before, for unique decoding<br />
of cM, the intervals [iA, iB] for i = 1, 2, . . . , N must be non-overlapping, (ii) in order to fit in<br />
the messages and to avoid integer overflow³, the highest possible value of cM, i.e., NB, must be<br />
smaller than a pre-specified number L, and (iii) it must be possible to map a pre-specified<br />
number D of different values into [A, B].<br />
The first criterion (i) is met if the lower end of each interval is larger than the higher end of<br />
the preceding interval:<br />
0 < iA − (i − 1)B = i(A − B) + B, i = 1, . . . , N<br />
Note that if the above inequality holds for i = N, then it holds for every i, because A − B is a<br />
negative constant and B is a positive constant. So it is enough to consider only the case of i = N:<br />
0 < N(A − B) + B, i.e., B < (N/(N − 1))A (4.5)<br />
The second criterion (ii) means that<br />
NB < L, i.e., B < L/N (4.6)<br />
while the third criterion (iii) can be formalized as<br />
D < B − A, i.e., B > A + D (4.7)<br />
Figure 4.9 shows an example for a graphical representation of the three criteria, where the<br />
crossed area represents the admissible (A, B) pairs. It can also be easily seen in this figure that<br />
a solution exists only if the B coordinate of the intersection of inequalities (4.5) and (4.7) meets<br />
criterion (4.6); in other words, ND < L/N, i.e., N²D < L, must hold.<br />
³ In case of overflow, the result is not unique.<br />
Figure 4.9: Graphical representation of the suitable intervals.<br />
As a numerical example, let us assume that we want to measure at least 100 different values<br />
(D = 99), the micro-controller is a 16-bit controller (L = 2¹⁶), and we have at most 20 nodes in<br />
each cluster (N = 20). Then a suitable interval that satisfies all three criteria is [A, B] =<br />
[2000, 2100]. Checking that this interval indeed meets the requirements is left for the interested<br />
reader. Finally, note that any real measurement interval can be easily mapped to this interval<br />
[A, B] by simple scaling and shifting operations, and my solution requires that such a mapping is<br />
performed on the real values before the execution of the query protocol.<br />
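The check left to the interested reader is easy to automate; a small sketch of criteria (4.5)–(4.7):<br />

```python
def admissible(A: int, B: int, D: int, L: int, N: int) -> bool:
    """Check criteria (4.5)-(4.7) for a candidate interval [A, B]."""
    no_overlap = N * (A - B) + B > 0  # (4.5): the intervals [iA, iB] are disjoint
    fits = N * B < L                  # (4.6): N*B stays below L (no overflow)
    enough_codes = B - A > D          # (4.7): room for at least D+1 distinct values
    return no_overlap and fits and enough_codes

print(admissible(2000, 2100, 99, 2 ** 16, 20))  # → True
```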
My proposed protocol has many advantageous properties. First, the network can respond to<br />
a query if at least one aggregator can successfully participate in the subprotocol. Second, the<br />
operator does not need to know the identity of the aggregators, thus even the operator cannot<br />
leak that information accidentally (although, after receiving the response, the operator learns the<br />
actual number of aggregator nodes). Third, the protocol does not leak any information about<br />
the identity of the aggregators: an attacker can eavesdrop on the query information Q and the<br />
pseudo-random numbers Ri, but cannot deduce the identity of the aggregators from them. Finally,<br />
the message complexity of the query is O(N), where N is the number of nodes in the cluster. This<br />
is the best complexity achievable when the originator of the query does not know the identity of<br />
the aggregator(s). The latency of the query protocol depends on the longest path of the network<br />
rooted at node O.<br />
As mentioned for the previous subprotocols, the protocol must be prepared for packet losses<br />
due to the nature of wireless sensor networks. Because of packet losses, the final sum R covers<br />
only the responding nodes, which are a subset of all nodes. That is why the identifiers must be<br />
included in the responses. The operator can calculate cM independently of the actual subset of<br />
responders. If at least one response from an aggregator reaches the operator, it can calculate M in<br />
the previously described way. If cM = 0, then it is clear to the operator that every aggregator's<br />
response was lost.<br />
4.4.5 Misbehaving nodes<br />
In this section, I look beyond my initial goal. I briefly analyze what happens if a compromised<br />
node deviates from the protocol to achieve some goals other than just learning the identity of the<br />
aggregators.<br />
In the election process, a compromised node may elect itself to be aggregator in every election.<br />
This can be a problem if this node is the only elected aggregator, because a compromised node may<br />
not store the aggregated values. Unfortunately, this situation cannot be avoided in any election<br />
protocol, because an aggregator can be compromised after the election, and the attacker can erase<br />
the memory of that node. Actually, my protocol is partially resistant to this attack, because more<br />
than one aggregator may be elected with some probability, and the attacker cannot be sure whether<br />
the compromised node is the single aggregator node in the cluster.<br />
During the aggregation, a misbehaving node can modify its readings, or modify the values it<br />
aggregates. The modification of others’ values can be prevented by some broadcast authentication<br />
schemes discussed in Section 4.4.1. The problem of reporting false values can be handled by<br />
statistical approaches discussed in [Buttyán et al., 2006; Wagner, 2004; Buttyán et al., 2009].<br />
The most interesting subprotocol from the perspective of misbehaving nodes is the query protocol.<br />
In this protocol, a compromised node can easily modify the result of the query in the following<br />
way. A compromised node can add an arbitrary number X to the hash in Equation (4.4) instead of<br />
using 0 or M. It is easy to see that if X is selected from the interval [A, B], then after subtracting<br />
the hashes, the resulting sum R′ will be an integer in the interval [(c+1)A, (c+1)B] (c is the actual<br />
number of aggregators; c + 1 nodes act as aggregators: the c aggregators and the compromised<br />
node). A compromised node can further increase its influence by choosing X from the interval<br />
[iA, iB]. This means that the resulting sum R′ will be in the interval [(c + i)A, (c + i)B]. If X is<br />
not selected from an interval [jA, jB], j = 1 . . . N, then the result can fall outside the decodable<br />
intervals. This can be immediately detected by the operator (see Figure 4.10).<br />
If the result is in a legitimate interval (∃j : R′ ∈ [jA, jB]), then the operator can further check the<br />
consistency by calculating R′ mod j. If the result is zero, then it is possible that no misbehaving<br />
node is present in the network. If the result is non-zero, the operator can be sure that, apart<br />
from the zeros and Ms, some node sent a different value, thus a misbehaving node is present in<br />
the network. It is hard for the attacker to guess j, because it neither knows the actual number of<br />
aggregators, nor can it calculate R′ from R by subtracting the unknown hashes.<br />
If the modulus is zero, but the operator is still suspicious about the result, it can further test<br />
the cluster for misbehaving nodes with the help of the aggregated bit in the queries. This further<br />
testing can be done regularly, randomly, or upon receiving suspicious results. If the aggregated bit is<br />
cleared in a query Q, then the CDS nodes do not sum the incoming replies, but forward them<br />
towards the agent node O as they are received. So if the operator wants to check whether a misbehaving<br />
node is present in the network, it can run a query Q with the aggregated bit set, and then run the<br />
same query with the aggregated bit cleared. If the two results are different, then the operator can be<br />
sure that a node wants to hide its malicious activity from the operator. If the two sums are equal,<br />
then the operator can further check the individual results from the second round. If the values are all equal<br />
after subtracting the hashes (not considering the zero values), then no misbehavior is detected,<br />
otherwise some node(s) misbehave in the cluster.<br />
Note here that this algorithm does not find every misbehavior, but the misbehaviors not<br />
detected by this algorithm do not influence the operator. For example, two nodes can misbehave<br />
such that the first adds S to its hash and the second adds −S. It is clear that this misbehavior<br />
does not affect the result computed by the operator, because S − S = 0. Another misbehavior not<br />
detected by the algorithm is when a compromised non-aggregator node sends M instead of 0. This is<br />
not detected by the algorithm, but it does not modify the result the operator computes. The operation<br />
of the misbehavior detection algorithm is depicted in Figure 4.10. This algorithm only detects that some<br />
misbehavior has occurred in the cluster, but does not necessarily find the misbehaving node. I leave<br />
the elaboration of this problem for future work.<br />
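The first two tests of the detection procedure (interval membership and the mod-j check on R′) can be sketched as follows; the follow-up query with the aggregated bit cleared is omitted from this sketch.<br />

```python
def check_query_result(r_prime: int, A: int, B: int, N: int) -> str:
    """Classify R' = R minus the sum of the responders' hashes.

    Returns "misbehavior" when R' proves that some node injected a forged
    value, and "consistent" when no misbehavior is detectable from R' alone
    (the operator may still run the unaggregated follow-up query).
    """
    for j in range(1, N + 1):
        if j * A <= r_prime <= j * B:  # R' lies in a legitimate interval
            return "consistent" if r_prime % j == 0 else "misbehavior"
    return "misbehavior"               # R' is outside every [jA, jB]
```

For example, with [A, B] = [2000, 2100] and N = 20, a sum R′ = 4100 (two aggregators reporting M = 2050) passes both tests, while R′ = 4101 falls in [2·2000, 2·2100] but fails the mod-2 check and is flagged.<br />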
4.5 Related work<br />
A survey on privacy protection techniques for WSNs is provided in [Li et al., 2009], where they are<br />
classified into two main groups: data-oriented and context-oriented protection. In this section, I<br />
briefly review these techniques, with an emphasis on those solutions that are closely related to my<br />
work.<br />
In data-oriented protection, the confidentiality of the measured data must be preserved. It is<br />
Figure 4.10: Misbehavior detection algorithm for the query protocol.<br />
also a research direction how the operator can verify whether the received data is correct. The main<br />
focus is on confidentiality in [He et al., 2007], while the verification of the received data is also<br />
ensured in [Sheng and Li, 2008].<br />
According to [Li et al., 2009], context-oriented protection covers the location privacy of the<br />
source and the base station. Source location privacy is mainly a problem in event-driven<br />
networks, where the existence and location of the event is the information that must be hidden.<br />
The location privacy of the base station is discussed in [Deng et al., 2006b]. The main difference<br />
between hiding the base station and hiding the in-network aggregators is that a WSN typically contains<br />
only one base station, which is a predefined node, while several in-network aggregators are used<br />
in one network at the same time, and the nodes used as aggregators are periodically changed.<br />
The problem of private cluster aggregator election in wireless sensor networks is strongly related<br />
to anonymous routing in WSNs. The main difference between anonymous routing and anonymous<br />
aggregation is that anonymous routing supports any traffic pattern and generally handles external attackers,<br />
while anonymous aggregation supports aggregation-specific traffic patterns and can handle<br />
compromised nodes as well. In [Seys and Preneel, 2006], an efficient anonymous on-demand routing<br />
scheme called ARM is proposed for mobile ad hoc networks. For the same problem, another solution<br />
is given in [Zhang et al., 2006] (MASK), where a detailed simulation is also presented for the<br />
proposed protocol. A more efficient solution is given in [Rajendran and Sreenaath, 2008], which<br />
uses low cryptographic overhead and addresses some drawbacks of the two papers above. In [Choi<br />
et al., 2007], a privacy preserving communication system (PPCS) is proposed. PPCS provides a<br />
comprehensive solution to anonymize communication endpoints, keep the location and identifier<br />
of a node unlinkable, and mask the existence of communication flows.<br />
The security of different aggregator node election protocols is surveyed in [Schaffer et al., 2012].<br />
Most protocols either provide no security for the election, or aim at the non-manipulability<br />
of the election. Such protocols can withstand passive attacks [Kuhn et al., 2006] or active<br />
attacks as well [Sirivianos et al., 2007; Gicheol, 2010].<br />
4.6 Conclusion<br />
In wireless sensor networks, in-network data aggregation is often used to ensure scalability and<br />
energy efficient operation. However, as we saw, this also introduces some security issues: the<br />
designated aggregator nodes that collect and store aggregated sensor readings and communicate<br />
with the base station are attractive targets of physical node destruction and jamming attacks. In<br />
order to mitigate this problem, in this chapter, I proposed two private aggregator node election<br />
protocols for wireless sensor networks that hide the elected aggregator nodes from the attacker,<br />
who, therefore, cannot locate and disable them. My basic protocol provides fewer guarantees than<br />
my advanced protocol, but it may be sufficient in cases where the risk of physical compromise of<br />
nodes is low. My advanced protocol hides the identity of the elected aggregator nodes even from<br />
insider attackers, thus it handles node compromise attacks too.<br />
I also proposed a private data aggregation protocol and a corresponding private query protocol<br />
for the advanced version, which allow the aggregator nodes to collect sensor readings and respond to<br />
queries of the operator, respectively, without revealing any useful information about their identity.<br />
My aggregation and query protocols are resistant to both external eavesdroppers and compromised<br />
nodes participating in the protocol. The communication in the advanced protocol is based on the<br />
concept of the connected dominating set, which suits wireless sensor networks well.<br />
In this chapter I went beyond the goal of only hiding the identity of the aggregator nodes. I<br />
also analyzed what happens if a malicious node wants to exploit the anonymity offered by the<br />
system, and tries to mislead the operator by injecting false reports. I proposed an algorithm that<br />
can detect if any of the nodes misbehaves in the query phase. I only detect the fact of misbehavior<br />
and leave the identification of the misbehaving node itself for future work.<br />
In general, my protocols increase the dependability of sensor networks, and therefore, they can<br />
be applied in mission critical sensor network applications, including high-confidence cyber-physical<br />
systems where sensors and actuators monitor and control the operation of some critical physical<br />
infrastructure.<br />
4.7 Related publications<br />
[Buttyán and Holczer, 2009] Levente Buttyán and Tamás Holczer. Private cluster head election<br />
in wireless sensor networks. In Proceedings of the Fifth IEEE International Workshop on Wireless<br />
and Sensor Networks Security (WSNS 2009), pages 1048–1053. IEEE, 2009.<br />
[Buttyán and Holczer, 2010] Levente Buttyán and Tamas Holczer. Perfectly anonymous data<br />
aggregation in wireless sensor networks. In Proceedings of The 7th IEEE International Conference<br />
on Mobile Ad-hoc and Sensor Systems (WSNS 2010), San Francisco, November 2010. IEEE.<br />
[Holczer and Buttyán, 2011] Tamas Holczer and Levente Buttyán. Anonymous aggregator<br />
election and data aggregation in wireless sensor networks. International Journal of Distributed<br />
Sensor Networks, page 18, 2011. Article ID 828414.<br />
[Schaffer et al., 2012] Péter Schaffer, Károly Farkas, Ádám Horváth, Tamás Holczer, and Levente<br />
Buttyán. Secure and reliable clustering in wireless sensor networks: A critical survey. Elsevier<br />
Computer Networks, 2012.<br />
Chapter 5<br />
Application of new results<br />
In this dissertation three different wireless network based systems are considered: Radio Frequency<br />
Identification Systems, Vehicular Ad Hoc Networks, and Wireless Sensor Networks. In this chapter,<br />
a brief overview is given of where these systems are used, and how my new results fit in them.<br />
Radio Frequency Identification Systems The application of RFID is very widespread; some<br />
application areas are [Wu et al., 2009; RFID, 2012]:<br />
Payment by mobile phones Many companies, like MasterCard or Nokia, are working on mobile<br />
phones with embedded RFID capabilities to enable payment by such devices.<br />
Inventory systems RFID systems can provide accurate knowledge of the current inventory,<br />
which helps save labor costs, and enables self-checkout in shops.<br />
Access control RFID tags can be used as identification badges to enable access control in office<br />
buildings, or can be used as tickets in automated fare collection systems.<br />
Transportation and logistics In transportation, RFID tags can help identify cargo, its owner<br />
or destination.<br />
Passport Many countries include RFID tags in passports, to speed up passport control at the<br />
borders, and to make illegitimate replication harder.<br />
Hospitals and healthcare Hospitals began implanting patients with RFID tags and using RFID<br />
systems, usually for workflow and inventory management [Fisher, 2006].<br />
Libraries Libraries are using RFID to replace the barcodes on library items. An RFID system<br />
may replace or supplement bar codes and may offer another method of inventory management<br />
and self-service checkout by patrons [Molnar and Wagner, 2004].<br />
Any usage of RFID systems where the holder of the tag is a human being might breach<br />
the privacy of the holder. The solutions proposed in Chapter 2 can be used in such situations.<br />
An example application is automated fare collection, where the pass for the mass<br />
transportation system can contain an RFID tag. In such a system, the system designer might<br />
consider the usage of key trees or group based private authentication, in particular if the legal<br />
environment requires the usage of some kind of privacy enhancing technology.<br />
Vehicular Ad Hoc Networks The application of Vehicular Ad Hoc Networks is very widespread,<br />
but can be categorized into three main categories: safety related applications, transport efficiency,<br />
and information/entertainment applications [Hartenstein and Laberteaux, 2008; Willke<br />
et al., 2009]. Hundreds of possible applications can be envisioned or are under construction. Such<br />
an application is cooperative forward collision warning, which helps avoid rear-end collisions<br />
with the use of beacon messages. Traffic efficiency can be increased, for example, by a traffic<br />
light optimal speed advisory application, which can assist the driver to arrive during a green<br />
phase. An example of the information gathering applications is remote wireless diagnosis,<br />
which makes the state of the vehicle accessible to a remote diagnostic service.<br />
Most of the safety and traffic efficiency related applications are based on the beacon messages,<br />
which are frequent messages containing the location, heading, identifier, and some other attributes<br />
of the vehicle. These messages can enable the tracking of individual vehicles, which is an undesirable<br />
side effect of the usage of VANETs. This side effect is analyzed in Chapter 3, and a countermeasure<br />
is proposed as well. The countermeasure algorithm is compatible with the framework proposed by<br />
the Car 2 Car Communication Consortium [Consortium, 2012].<br />
Most of the results of Chapter 3 were part of the results of the SeVeCom¹ project, funded by<br />
the European Commission. The results were delivered to and accepted by the European Commission.<br />
Wireless Sensor Networks Wireless sensor networks can be used in many scenarios. In Chapter<br />
4, I proposed two anonymous aggregation schemes, which hide the identity of the aggregator node.<br />
In the following, a few applications are given based on [Akyildiz et al., 2002], with special attention<br />
to the possible need for hiding some special nodes: wireless sensor networks can be an integral part<br />
of military command, control, communications, computing, intelligence, surveillance, reconnaissance<br />
and targeting (C4ISRT) systems, where there is a clear motivation for an attacker to disturb<br />
the normal functioning of the network by eliminating some special nodes. Another example can<br />
be the protection of critical infrastructure. The problem is that some critical infrastructures, like<br />
electrical lines or drinking water pipes, are so large in scale that it is impossible to protect them<br />
with traditional methods. WSNs can be a possible protection and surveillance system, where the<br />
disturbance of normal operation by the elimination of aggregator nodes must be avoided.<br />
In the above mentioned applications, there is a clear need for aggregation, and the loss of<br />
the aggregator might have undesirable consequences. Hence, in these applications, the anonymous<br />
aggregator election, aggregation, and query schemes proposed in Chapter 4 can be used.<br />
The goal of the Wireless Sensor and Actuator Networks for Critical Infrastructure Protection<br />
project (WSAN4CIP²), funded by the European Commission, was to make critical infrastructures<br />
more dependable by the use of WSNs. Some of the results of Chapter 4 were an integral part of that<br />
project.<br />
In summary, it can be seen that the results of Chapters 2–4 can be used in real applications,<br />
and the problems discussed in those chapters are important for society.<br />
¹ http://www.sevecom.org/<br />
² http://www.wsan4cip.eu<br />
Chapter 6<br />
Conclusion<br />
In this thesis, I proposed several privacy enhancing protocols for wireless networks. I dealt with<br />
three different types of networks, namely RFID systems, vehicular ad hoc networks, and wireless<br />
sensor networks.<br />
In Chapter 2 I proposed a key-tree and a group based private authentication protocol for RFID<br />
systems. Both approaches use only symmetric key based cryptographic primitives, which suit<br />
resource-limited RFID systems well.<br />
Key-trees provide an efficient solution for private authentication, however, the level of privacy<br />
provided by key-tree based systems decreases considerably if some members are compromised.<br />
This loss of privacy can be minimized by the careful design of the tree. Based on my results<br />
presented in this dissertation, I can conclude that a good practical design principle is to maximize<br />
the branching factor at the first level of the tree such that the resulting tree still respects the<br />
constraint on the maximum authentication delay in the system. Once the branching factor at the<br />
first level is maximized, the tree can be further optimized by maximizing the branching factors<br />
at the successive levels, but the improvement achieved in this way is not really significant; what<br />
really counts is the branching factor at the first level.<br />
In the second part of Chapter 2, I proposed a novel group based private authentication scheme.<br />
I analyzed the proposed scheme and quantified the level of privacy that it provides. I compared<br />
my group based scheme to the key-tree based scheme. I showed that the group based scheme<br />
provides a higher level of privacy than the key-tree based scheme. In addition, the complexity of<br />
the group based scheme for the verifier can be set to be the same as in the key-tree based scheme,<br />
while the complexity for the prover is always smaller in the latter scheme. The primary application<br />
area of my schemes is that of RFID systems, but they can also be used in applications with similar<br />
characteristics (e.g., in wireless sensor networks).<br />
Possible future work includes the usage of different metrics, like an entropy based metric, or of<br />
different constraints, like the minimal size of the anonymity sets, when selecting a structure (such<br />
as the groups) for the users. These new metrics or constraints can make the resulting optimization<br />
problem complex, which may require heuristic solutions as well. A general framework that could<br />
solve the optimization problem for different metrics and constraints could be a future research<br />
direction.<br />
The most criticized part of any key tree or group based solution is the difficulty of the key<br />
update. Hence, a challenging future work could be the implementation of a key update scheme in<br />
a tree based solution.<br />
In the first half of Chapter 3, I studied the effectiveness of changing pseudonyms to provide<br />
location privacy for vehicles in vehicular networks. The approach of changing pseudonyms to<br />
make location tracking more difficult was proposed in prior work, but its effectiveness has not<br />
been investigated yet. In order to address this problem, I defined a model based on the concept of<br />
the mix zone. I assumed that the adversary has some knowledge about the mix zone, and based<br />
on this knowledge, she tries to relate the vehicles that exit the mix zone to those that entered<br />
it earlier. I also introduced a metric to quantify the level of privacy enjoyed by the vehicles in<br />
this model. In addition, I performed extensive simulations to study the behavior of my model in<br />
realistic scenarios. In particular, in my simulation, I used a rather complex road map, generated<br />
traffic with realistic parameters, and varied the strength of the adversary by varying the number of<br />
her monitoring points. My simulation results provided detailed information about the relationship<br />
between the strength of the adversary and the level of privacy achieved by changing pseudonyms.<br />
I abstracted away the frequency with which the pseudonyms are changed, and I simply assumed<br />
that this frequency is high enough so that every vehicle surely changes pseudonym while in the mix<br />
zone. It seems that changing the pseudonyms frequently has some advantages as frequent changes<br />
increase the probability that the pseudonym is changed in the mix zone. On the other hand, the<br />
higher the frequency, the larger the cost that the pseudonym changing mechanism induces on the<br />
system in terms of management of cryptographic material (keys and certificates related to the<br />
pseudonyms). In addition, if for a given frequency, the probability of changing pseudonym in the<br />
mix zone is already close to 1, then it makes no sense to increase the frequency further, as it will<br />
no longer increase the level of privacy, while it will still increase the cost. Hence, there seems to<br />
be an optimal value for the frequency of the pseudonym change. Unfortunately, this optimal value<br />
depends on the characteristics of the mix zone, which is ultimately determined by the observing<br />
zone of the adversary, which is not known to the system designer.<br />
In the second half of Chapter 3, I proposed a simple and effective privacy preserving scheme,<br />
called SLOW, for VANETs. SLOW requires vehicles to stop sending heartbeat messages below<br />
a given threshold speed (this explains the name SLOW that stands for “silence at low speeds”)<br />
and to change all their identifiers (pseudonyms) after each such silent period. By using SLOW,<br />
the vicinities of intersections and traffic lights become dynamically created mix zones, as there are<br />
usually many vehicles moving slowly at these places at a given moment in time. In other words,<br />
SLOW implicitly ensures a synchronized silent period and pseudonym change for many vehicles<br />
both in time and space, and this makes it effective as a location privacy enhancing scheme. Yet,<br />
SLOW is remarkably simple, and it has further advantages. For instance, it relieves vehicles of<br />
the burden of verifying a potentially large number of digital signatures when the vehicle density is<br />
high, which usually happens when the vehicles move slowly in a traffic jam or stop at intersections.<br />
Finally, the risk of a fatal accident at a slow speed is low, and therefore, SLOW does not seriously<br />
impact safety-of-life.<br />
I evaluated SLOW in a specific attacker model that seems to be realistic, and it proved to be<br />
effective in this model, reducing the success rate of tracking a target vehicle from its starting point<br />
to its destination down to the range of 10–30%.<br />
Future work could include a detailed analysis of the effect of SLOW on the safety of vehicles,<br />
or the analysis of the exceptional cases where the vehicles are forced to send a beacon message<br />
below the threshold speed.<br />
In Chapter 4 I proposed two private aggregation algorithms for wireless sensor networks. In<br />
wireless sensor networks, in-network data aggregation is often used to ensure scalability and energy<br />
efficient operation. However, this also introduces some security issues: the designated aggregator<br />
nodes that collect and store aggregated sensor readings and communicate with the base station<br />
are attractive targets of physical node destruction and jamming attacks. In order to mitigate this<br />
problem, I proposed two private aggregator node election protocols for wireless sensor networks<br />
that hide the elected aggregator nodes from the attacker, who, therefore, cannot locate and disable<br />
them. My basic protocol provides fewer guarantees than my advanced protocol, but it may be<br />
sufficient in cases where the risk of physically compromising nodes is low. My advanced protocol<br />
hides the identity of the elected aggregator nodes even from insider attackers, thus it handles node<br />
compromise attacks too.<br />
I also proposed a private data aggregation protocol and a corresponding private query protocol<br />
for the advanced version, which allow the aggregator nodes to collect sensor readings and respond to<br />
queries of the operator, respectively, without revealing any useful information about their identity.<br />
My aggregation and query protocols are resistant to both external eavesdroppers and compromised<br />
nodes participating in the protocol. The communication in the advanced protocol is based on the<br />
concept of the connected dominating set, which suits wireless sensor networks well.<br />
At the end of Chapter 4 I went beyond the goal of only hiding the identity of the aggregator<br />
nodes. I also analyzed what happens if a malicious node wants to exploit the anonymity offered by<br />
the system, and tries to mislead the operator by injecting false reports. I proposed an algorithm<br />
that can detect if any of the nodes misbehaves in the query phase. The algorithm detects only the fact of<br />
misbehavior, and leaves the identification of the misbehaving node itself for future work. A more<br />
challenging future work is the reduction of the message or computational complexity of the election<br />
subprotocol.<br />
List of Acronyms<br />
CA Cluster Aggregator<br />
CDS Connected Dominating Set<br />
CH Cluster Head<br />
DSRC Dedicated Short-Range Communications<br />
ID IDentifier<br />
IR Infrared<br />
MAC Message Authentication Code<br />
OBU On Board Unit<br />
RF Radio Frequency<br />
RFID Radio Frequency IDentification<br />
RSA Rivest Shamir Adleman algorithm<br />
RSU Road Side Unit<br />
SEVECOM Secure Vehicular Communication<br />
SLOW Silence at LOW speeds<br />
SUMO Simulation of Urban MObility<br />
TTL Time To Live<br />
VANET Vehicular Ad Hoc Network<br />
VIN Vehicle Identification Number<br />
WSAN4CIP Wireless Sensor and Actuator Networks for Critical Infrastructure Protection<br />
WSN Wireless Sensor Network<br />
List of publications<br />
[Avoine et al., 2007] Gildas Avoine, Levente Buttyan, Tamas Holczer, and Istvan Vajda. Group-based<br />
private authentication. In Proceedings of the International Workshop on Trust, Security,<br />
and Privacy for Ubiquitous Computing (TSPUC 2007). IEEE, 2007.<br />
[Buttyán and Holczer, 2009] Levente Buttyán and Tamas Holczer. Private cluster head election<br />
in wireless sensor networks. In Proceedings of the Fifth IEEE International Workshop on Wireless<br />
and Sensor Networks Security (WSNS 2009), pages 1048–1053. IEEE, 2009.<br />
[Buttyán and Holczer, 2010] Levente Buttyán and Tamas Holczer. Perfectly anonymous data aggregation<br />
in wireless sensor networks. In Proceedings of The 7th IEEE International Conference<br />
on Mobile Ad-hoc and Sensor Systems (WSNS 2010), San Francisco, November 2010. IEEE.<br />
[Buttyan et al., 2004] Levente Buttyan, Tamas Holczer, and Peter Schaffer. Incentives for cooperation<br />
in multi-hop wireless networks. Híradástechnika, LIX(3):30–34, March 2004. (in Hungarian).<br />
[Buttyan et al., 2005] Levente Buttyan, Tamas Holczer, and Peter Schaffer. Spontaneous cooperation<br />
in multi-domain sensor networks. In Proceedings of the 2nd European Workshop on Security<br />
and Privacy in Ad-hoc and Sensor Networks (ESAS), Visegrád, Hungary, July 2005. Springer.<br />
[Buttyan et al., 2006a] Levente Buttyan, Tamas Holczer, and Istvan Vajda. Optimal key-trees<br />
for tree-based private authentication. In Proceedings of the International Workshop on Privacy<br />
Enhancing Technologies (PET), June 2006. Springer.<br />
[Buttyan et al., 2006b] Levente Buttyan, Tamas Holczer, and Istvan Vajda. Providing location<br />
privacy in automated fare collection systems. In Proceedings of the 15th IST Mobile and Wireless<br />
Communication Summit, Mykonos, Greece, June 2006.<br />
[Buttyan et al., 2007] Levente Buttyan, Tamas Holczer, and Istvan Vajda. On the effectiveness<br />
of changing pseudonyms to provide location privacy in vanets. In Proceedings of the Fourth<br />
European Workshop on Security and Privacy in Ad hoc and Sensor Networks (ESAS2007).<br />
Springer, 2007.<br />
[Buttyan et al., 2009] Levente Buttyan, Tamas Holczer, Andre Weimerskirch, and William Whyte.<br />
Slow: A practical pseudonym changing scheme for location privacy in vanets. In Proceedings of<br />
the IEEE Vehicular Networking Conference, pages 1–8. IEEE, October 2009.<br />
[Dora and Holczer, 2010] Laszlo Dora and Tamas Holczer. Hide-and-lie: Enhancing application-level<br />
privacy in opportunistic networks. In Proceedings of the Second International Workshop<br />
on Mobile Opportunistic Networking ACM/SIGMOBILE MobiOpp 2010, Pisa, Italy, February<br />
22-23 2010.<br />
[Dvir et al., 2011] Amit Dvir, Tamas Holczer, and Levente Buttyán. Vera - version number and<br />
rank authentication in rpl. In Proceedings of the 7th IEEE International Workshop on Wireless<br />
and Sensor Networks Security (WSNS 2011). IEEE, 2011.<br />
[Holczer and Buttyán, 2011] Tamas Holczer and Levente Buttyán. Anonymous aggregator election<br />
and data aggregation in wireless sensor networks. International Journal of Distributed Sensor<br />
Networks, 18 pages, 2011. Article ID 828414.<br />
[Holczer et al., 2009] Tamas Holczer, Petra Ardelean, Naim Asaj, Stefano Cosenza, Michael Müter,<br />
Albert Held, Björn Wiedersheim, Panagiotis Papadimitratos, Frank Kargl, and Danny De Cock.<br />
Secure vehicle communication (sevecom). Demonstration. Mobisys, June 2009.<br />
[Papadimitratos et al., 2008] Panagiotis Papadimitratos, Antonio Kung, Frank Kargl, Zhendong<br />
Ma, Maxim Raya, Julien Freudiger, Elmar Schoch, Tamas Holczer, Levente Buttyán, and Jean<br />
pierre Hubaux. Secure vehicular communication systems: design and architecture. IEEE Communications<br />
Magazine, 46(11):100–109, 2008.<br />
[Schaffer et al., 2012] Péter Schaffer, Károly Farkas, Ádám Horváth, Tamás Holczer, and Levente<br />
Buttyán. Secure and reliable clustering in wireless sensor networks: A critical survey. Computer<br />
Networks, 2012.<br />
Bibliography<br />
[Abadi and Fournet, 2004] M. Abadi and C. Fournet. Private authentication. Theoretical Computer<br />
Science, 322(3):427–476, 2004.<br />
[Akyildiz et al., 2002] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless<br />
sensor networks: a survey. Computer networks, 38(4):393–422, 2002.<br />
[Anderson and Kuhn, 1996] R. Anderson and M. Kuhn. Tamper resistance: a cautionary note. In<br />
Proceedings of the 2nd conference on Proceedings of the Second USENIX Workshop on Electronic<br />
Commerce-Volume 2, page 1. USENIX Association, 1996.<br />
[Aoki and Fujii, 1996] M. Aoki and H. Fujii. Inter-vehicle communication: Technical issues on<br />
vehicle control application. Communications Magazine, IEEE, 34(10):90–93, 1996.<br />
[Armknecht et al., 2007] F. Armknecht, A. Festag, D. Westhoff, and K. Zeng. Cross-layer privacy<br />
enhancement and non-repudiation in vehicular communication. In 4th Workshop on Mobile<br />
Ad-Hoc Networks (WMAN), 2007.<br />
[ASV, ] Advanced safety vehicle program. ”http://www.ahsra.or.jp/demo2000/eng/demo_e/<br />
ahs_e7/iguchi/iguchi.html”.<br />
[Avoine and Oechslin, 2005] G. Avoine and P. Oechslin. A scalable and provably secure hash-based<br />
rfid protocol. In Pervasive Computing and Communications Workshops, 2005. PerCom 2005<br />
Workshops. Third IEEE International Conference on, pages 110–114. IEEE, 2005.<br />
[Avoine et al., 2005] G. Avoine, E. Dysli, and P. Oechslin. Reducing time complexity in rfid systems.<br />
In Proceedings of the 12th Annual Workshop on Selected Areas in Cryptography (SAC’05),<br />
pages 291–306. Springer, 2005.<br />
[Avoine, 2012] Gildas Avoine. Bibliography on security and privacy in rfid systems.<br />
http://www.epfl.ch/~gavoine/rfid/, 2012.<br />
[Baruya, 1998] A. Baruya. Speed-accident relationship on different kinds of european roads. MAS-<br />
TER Deliverable 7, September 1998.<br />
[Beresford and Stajano, 2003] A.R. Beresford and F. Stajano. Location privacy in pervasive computing.<br />
Pervasive Computing, IEEE, 2(1):46–55, 2003.<br />
[Beresford and Stajano, 2004] A.R. Beresford and F. Stajano. Mix zones: User privacy in locationaware<br />
services. In Pervasive Computing and Communications Workshops, 2004. Proceedings of<br />
the Second IEEE Annual Conference on, pages 127–131. IEEE, 2004.<br />
[Berki, 2008] Z. Berki. Development of Traffic Models on the basis of Passenger Demand Surveys:<br />
Thesis of the PhD dissertation. PhD thesis, Budapest University of Technology and Economics,<br />
2008.<br />
[Beye and Veugen, 2011] M. Beye and T. Veugen. Improved anonymity for key-trees? Technical<br />
report, Cryptology ePrint Archive, Report 2011/395, 2011.<br />
[Beye and Veugen, 2012] M. Beye and T. Veugen. Anonymity for key-trees with adaptive adversaries.<br />
Security and Privacy in Communication Networks, pages 409–425, 2012.<br />
[Black and McGrew, 2008] David L. Black and David A. McGrew. The internet key exchange<br />
(ikev2) protocol. 2008.<br />
[Blum et al., 2004a] Jeremy Blum, Min Ding, Andrew Thaeler, and Xiuzhen Cheng. Connected<br />
dominating set in sensor networks and manets. In D.-Z. Du and P. Pardalos, editors, Handbook<br />
of Combinatorial Optimization, pages 329–369. Kluwer Academic Publishers, 2004.<br />
[Blum et al., 2004b] J.J. Blum, A. Eskandarian, and L.J. Hoffman. Challenges of intervehicle ad<br />
hoc networks. Intelligent Transportation Systems, IEEE Transactions on, 5(4):347–351, 2004.<br />
[Bono et al., 2005] S. Bono, M. Green, A. Stubblefield, A. Juels, A. Rubin, and M. Szydlo. Security<br />
analysis of a cryptographically-enabled rfid device. In 14th USENIX Security Symposium,<br />
volume 1, page 16, 2005.<br />
[Boyd and Mathuria, 2003] C. Boyd and A. Mathuria. Protocols for authentication and key establishment.<br />
Springer Verlag, 2003.<br />
[Brandt, 2006] F. Brandt. Efficient cryptographic protocol design based on distributed El Gamal<br />
encryption. Lecture Notes in Computer Science, 3935:32, 2006.<br />
[Buttyán and Holczer, 2010] Levente Buttyán and Tamas Holczer. Perfectly anonymous data aggregation<br />
in wireless sensor networks. In Proceedings of the Sixth IEEE International Workshop<br />
on Wireless and Sensor Networks Security (WSNS’10). IEEE, 2010.<br />
[Buttyán and Hubaux, 2008] Levente Buttyán and Jean Pierre Hubaux. Security and Cooperation<br />
in Wireless Networks. Cambridge University Press, 2008.<br />
[Buttyán and Schaffer, 2010] Levente Buttyán and Peter Schaffer. Panel: Position-based aggregator<br />
node election in wireless sensor networks. International Journal of Distributed Sensor<br />
Networks, 2010.<br />
[Buttyán et al., 2006] Levente Buttyán, Peter Schaffer, and István Vajda. Ranbar: Ransac-based<br />
resilient aggregation in sensor networks. In In Proceedings of the Fourth ACM Workshop on<br />
Security of Ad Hoc and Sensor Networks (SASN), Alexandria, VA, USA, October 2006. ACM<br />
Press.<br />
[Buttyán et al., 2009] Levente Buttyán, Peter Schaffer, and István Vajda. Cora: Correlation-based<br />
resilient aggregation in sensor networks. Elsevier Ad Hoc Networks, 7(6):1035–1050, 2009.<br />
[Calandriello et al., 2007] Giorgio Calandriello, Panos Papadimitratos, Jean-Pierre Hubaux, and<br />
Antonio Lioy. Efficient and robust pseudonymous authentication in vanet. In VANET ’07:<br />
Proceedings of the fourth ACM international workshop on Vehicular ad hoc networks, pages<br />
19–28, New York, NY, USA, 2007. ACM.<br />
[Camenisch and Lysyanskaya, 2001] J. Camenisch and A. Lysyanskaya. An efficient system for<br />
non-transferable anonymous credentials with optional anonymity revocation. Advances in<br />
Cryptology-EUROCRYPT 2001, pages 93–118, 2001.<br />
[Camenisch and Stadler, 1997] Jan Camenisch and Markus Stadler. Proof systems for general<br />
statements about discrete logarithms. Technical report, Department of Computer Science, ETH<br />
Zürich, 1997.<br />
[Carbunar et al., 2007] B. Carbunar, Y. Yu, L. Shi, M. Pearce, and V. Vasudevan. Query privacy<br />
in wireless sensor networks. In Sensor, Mesh and Ad Hoc Communications and Networks, 2007.<br />
SECON’07. 4th Annual IEEE Communications Society Conference on, pages 203–212. IEEE,<br />
2007.<br />
[Chan and Perrig, 2003] H. Chan and A. Perrig. Security and privacy in sensor networks. Computer,<br />
36(10):103–105, 2003.<br />
[Chan et al., 2003] H. Chan, A. Perrig, and D. Song. Random key predistribution schemes for<br />
sensor networks. In IEEE Symposium on Security and Privacy, pages 197–215. IEEE Computer<br />
Society, 2003.<br />
[Chang, 2006] E.J.H. Chang. Echo algorithms: Depth parallel operations on general graphs. Software<br />
Engineering, IEEE Transactions on, (4):391–401, 2006.<br />
[Chaum, 1981] D.L. Chaum. Untraceable electronic mail, return addresses, and digital pseudonyms.<br />
Communications of the ACM, 24(2):84–90, 1981.<br />
[Chaum, 1988] D. Chaum. The dining cryptographers problem: Unconditional sender and recipient<br />
untraceability. Journal of Cryptology, 1(1):65–75, 1988.<br />
[Chisalita and Shahmehri, 2002] L. Chisalita and N. Shahmehri. A peer-to-peer approach to vehicular<br />
communication for the support of traffic safety applications. In Intelligent Transportation<br />
Systems, 2002. Proceedings. The IEEE 5th International Conference on, pages 336–341. IEEE,<br />
2002.<br />
[Choi et al., 2005] J.Y. Choi, M. Jakobsson, and S. Wetzel. Balancing auditability and privacy in<br />
vehicular networks. In Proceedings of the 1st ACM international workshop on Quality of service<br />
& security in wireless and mobile networks, pages 79–87. ACM, 2005.<br />
[Choi et al., 2007] H. Choi, P. McDaniel, and TF La Porta. Privacy Preserving Communication<br />
in MANETs. In 4th Annual IEEE Communications Society Conference on Sensor, Mesh and<br />
Ad Hoc Communications and Networks, pages 233–242, 2007.<br />
[COM, ] Communications for esafety. ”http://www.comesafety.org/”.<br />
[Consortium, 2012] Car 2 Car Communication Consortium. ”http://www.car-to-car.org”,<br />
2012.<br />
[Deng et al., 2005] J. Deng, R. Han, and S. Mishra. Countermeasures against traffic analysis<br />
attacks in wireless sensor networks. In Security and Privacy for Emerging Areas in Communications<br />
Networks, 2005. SecureComm 2005. First International Conference on, pages 113–126.<br />
IEEE, 2005.<br />
[Deng et al., 2006a] J. Deng, R. Han, and S. Mishra. Decorrelating wireless sensor network traffic<br />
to inhibit traffic analysis attacks. Pervasive and Mobile Computing, 2(2):159–186, 2006.<br />
[Deng et al., 2006b] J. Deng, R. Han, and S. Mishra. Decorrelating wireless sensor network traffic<br />
to inhibit traffic analysis attacks. Pervasive and Mobile Computing, 2(2):159–186, 2006.<br />
[Diaz et al., 2002] C. Diaz, S. Seys, J. Claessens, and B. Preneel. Towards measuring anonymity.<br />
In Proceedings of the 2nd international conference on Privacy enhancing technologies, pages<br />
54–68. Springer-Verlag, 2002.<br />
[Dingledine et al., 2004] R. Dingledine, N. Mathewson, and P. Syverson. Tor: The second-generation<br />
onion router. Technical report, DTIC Document, 2004.<br />
[Dötzer, 2006] F. Dötzer. Privacy issues in vehicular ad hoc networks. In Privacy Enhancing<br />
Technologies, pages 197–209. Springer, 2006.<br />
[El Zarki et al., 2002] M. El Zarki, S. Mehrotra, G. Tsudik, and N. Venkatasubramanian. Security<br />
issues in a future vehicular network. In European Wireless, volume 2, 2002.<br />
[Faizulkhakov, 2007] Ya. R. Faizulkhakov. Time synchronization methods for wireless sensor networks:<br />
A survey. Programming and Computing Software, 33(4):214–226, 2007.<br />
[Fisher, 2006] J.A. Fisher. Indoor positioning and digital management. Surveillance and security:<br />
Technological politics and power in everyday life, page 77, 2006.<br />
[Fishkin et al., 2005] K. Fishkin, S. Roy, and B. Jiang. Some methods for privacy in rfid communication.<br />
Security in ad-hoc and sensor networks, pages 42–53, 2005.<br />
[Floerkemeier et al., 2005] C. Floerkemeier, R. Schneider, and M. Langheinrich. Scanning with<br />
a purpose–supporting the fair information principles in rfid protocols. Ubiquitous Computing<br />
Systems, pages 214–231, 2005.<br />
[Francillon and Castelluccia, 2007] Aurélien Francillon and Claude Castelluccia. TinyRNG: A<br />
cryptographic random number generator for wireless sensors network nodes. In Modeling and<br />
Optimization in Mobile, Ad Hoc and Wireless Networks and Workshops, 2007. WiOpt 2007. 5th<br />
International Symposium on, pages 1–7, April 2007.<br />
[Freudiger et al., 2007] J. Freudiger, M. Raya, M. Felegyházi, P. Papadimitratos, and J.-P.<br />
Hubaux. Mix-zones for location privacy in vehicular networks. In Proceedings of the 1st International<br />
Workshop on Wireless Networking for Intelligent Transportation Systems (WiN-ITS<br />
07), 2007.<br />
[Ganesan et al., 2003] P. Ganesan, R. Venugopalan, P. Peddabachagari, A. Dean, F. Mueller, and<br />
M. Sichitiu. Analyzing and modeling encryption overhead for sensor network nodes. In Proceedings<br />
of the 2nd ACM international conference on Wireless sensor networks and applications,<br />
Sep. 2003.<br />
[Gerlach, 2006] M. Gerlach. Assessing and improving privacy in vanets. ESCAR Embedded Security<br />
in Cars, 2006.<br />
[Gicheol, 2010] W. Gicheol. Secure cluster head election using mark based exclusion in wireless<br />
sensor networks. IEICE transactions on communications, 93(11):2925–2935, 2010.<br />
[Goldsmith, 2005] Andrea Goldsmith. Wireless Communications. Cambridge University Press,<br />
New York, NY, USA, 2005.<br />
[Gruteser and Hoh, 2005] M. Gruteser and B. Hoh. On the anonymity of periodic location samples.<br />
In Proceedings of the Second International Conference on Security in Pervasive Computing, pages<br />
179–192. Springer, 2005.<br />
[Gulcu and Tsudik, 1996] C. Gulcu and G. Tsudik. Mixing e-mail with babel. In Network and<br />
Distributed System Security, 1996., Proceedings of the Symposium on, pages 2–16. IEEE, 1996.<br />
[Hancke, 2005] G.P. Hancke. A practical relay attack on iso 14443 proximity cards. Technical<br />
report, University of Cambridge Computer Laboratory, 2005.<br />
[Hao and Zielinski, 2006] F. Hao and P. Zielinski. A 2-round anonymous veto protocol. In Proceedings<br />
of the 14th International Workshop on Security Protocols, Cambridge, UK, 2006.<br />
[Harkins and Carrel, 1998] D. Harkins and D. Carrel. The internet key exchange (ike) protocol.<br />
1998.<br />
[Hartenstein and Laberteaux, 2008] H. Hartenstein and K.P. Laberteaux. A tutorial survey on<br />
vehicular ad hoc networks. Communications Magazine, IEEE, 46(6):164 –171, June 2008.<br />
[He et al., 2007] W. He, X. Liu, H. Nguyen, K. Nahrstedt, and T. Abdelzaher. Pda: Privacy-preserving<br />
data aggregation in wireless sensor networks. In Proceedings of Infocom, pages 2045–<br />
2053. IEEE, 2007.<br />
[Heinzelman et al., 2000] WR Heinzelman, A. Chandrakasan, and H. Balakrishnan. Energy-efficient<br />
communication protocol for wireless microsensor networks. In Proceedings of the 33rd<br />
Annual Hawaii International Conference on System Sciences, page 10, 2000.<br />
[Hu and Wang, 2005] Y.C. Hu and H.J. Wang. A framework for location privacy in wireless networks.<br />
In ACM SIGCOMM Asia Workshop. Citeseer, 2005.<br />
[Hu et al., 2005] Y.C. Hu, A. Perrig, and D.B. Johnson. Ariadne: A secure on-demand routing<br />
protocol for ad hoc networks. Wireless Networks, 11(1-2):21–38, 2005.<br />
[Hu et al., 2006] Y.C. Hu, A. Perrig, and D.B. Johnson. Wormhole attacks in wireless networks.<br />
Selected Areas in Communications, IEEE Journal on, 24(2):370–380, 2006.<br />
[Huang et al., 2005] L. Huang, K. Matsuura, H. Yamane, and K. Sezaki. Enhancing wireless<br />
location privacy using silent period. In Wireless Communications and Networking Conference,<br />
2005 IEEE, volume 2, pages 1187–1192. IEEE, 2005.<br />
[Huang et al., 2009] Y. Huang, W. He, and K. Nahrstedt. ChainFarm: A Novel Authentication<br />
Protocol for High-rate Any Source Probabilistic Broadcast. In Proc. of The 6th IEEE International<br />
Conference on Mobile Ad-hoc and Sensor Systems (IEEE MASS), 2009.<br />
[Hubaux et al., 2004] J.P. Hubaux, S. Capkun, and J. Luo. The security and privacy of smart<br />
vehicles. Security & Privacy, IEEE, 2(3):49–55, 2004.<br />
[Instruments, 2005] Texas Instruments. Securing the pharmaceutical supply chain with rfid and<br />
public-key infrastructure (pki) technologies. Texas Instruments white paper, June 2005.<br />
[Iqbal and Khayam, 2009] Adnan Iqbal and Syed Ali Khayam. An energy-efficient link layer protocol<br />
for reliable transmission over wireless networks. EURASIP J. Wirel. Commun. Netw.,<br />
2009:28:1–28:10, January 2009.<br />
[ISO, 2008] ISO/IEC 9798-2: Mechanisms using symmetric encipherment algorithms, 2008.<br />
[ITLaw, ] ITLaw. Right of privacy. ”http://itlaw.wikia.com/wiki/Right_of_privacy”.<br />
[Jacquet, 2004] Philippe Jacquet. Performance of connected dominating set in olsr protocol. Technical<br />
Report RR-5098, INRIA, 2004.<br />
[Jian et al., 2007] Y. Jian, S. Chen, Z. Zhang, and L. Zhang. Protecting receiver-location privacy in<br />
wireless sensor networks. In INFOCOM 2007. 26th IEEE International Conference on Computer<br />
Communications. IEEE, pages 1955–1963. Ieee, 2007.<br />
[Juels and Brainard, 2004] A. Juels and J. Brainard. Soft blocking: Flexible blocker tags on the<br />
cheap. In Proceedings of the 2004 ACM workshop on Privacy in the electronic society, pages<br />
1–7. ACM, 2004.<br />
[Juels et al., 2003] A. Juels, R.L. Rivest, and M. Szydlo. The blocker tag: Selective blocking of<br />
rfid tags for consumer privacy. In Proceedings of the 10th ACM conference on Computer and<br />
communications security, pages 103–111. ACM, 2003.<br />
[Juels et al., 2006] A. Juels, P. Syverson, and D. Bailey. High-power proxies for enhancing rfid<br />
privacy and utility. In Privacy Enhancing Technologies, pages 210–226. Springer, 2006.<br />
[Juels, 2005a] A. Juels. Minimalist cryptography for low-cost rfid tags. Security in Communication<br />
Networks, pages 149–164, 2005.<br />
[Juels, 2005b] A. Juels. Strengthening epc tags against cloning. In Proceedings of the 4th ACM<br />
workshop on Wireless security, pages 67–76. ACM, 2005.<br />
[Juels, 2006] A. Juels. Rfid security and privacy: A research survey. Selected Areas in Communications,<br />
IEEE Journal on, 24(2):381–394, 2006.<br />
[Kamat et al., 2005] P. Kamat, Y. Zhang, W. Trappe, and C. Ozturk. Enhancing source-location<br />
privacy in sensor network routing. In Distributed Computing Systems, 2005. ICDCS 2005.<br />
Proceedings. 25th IEEE International Conference on, pages 599–608. IEEE, 2005.<br />
[Kamat et al., 2007] P. Kamat, W. Xu, W. Trappe, and Y. Zhang. Temporal privacy in wireless<br />
sensor networks. In Distributed Computing Systems, 2007. ICDCS’07. 27th International<br />
Conference on, pages 23–23. IEEE, 2007.<br />
[Kargl et al., 2008] Frank Kargl, Antonio Kung, Albert Held, Giorgo Calandriello, Ta Vinh Thong,<br />
Björn Wiedersheim, Elmar Schoch, Michael Müter, Levente Buttyán, Panagiotis Papadimitratos,<br />
and Jean-Pierre Hubaux. Secure vehicular communication systems: implementation,<br />
performance, and research challenges. IEEE Communications Magazine, 46(11):110–118, 2008.<br />
[Karnadi et al., 2005] F.K. Karnadi, Z.H. Mo, and K. Lan. Rapid generation of realistic mobility<br />
models for vanet. In Wireless Communications and Networking Conference, 2007. WCNC 2007.<br />
IEEE, pages 2506–2511. IEEE, 2005.<br />
[Kelly and Erickson, 2005] E.P. Kelly and G.S. Erickson. Rfid tags: commercial applications v.<br />
privacy rights. Industrial Management & Data Systems, 105(6):703–713, 2005.<br />
[Kesdogan et al., 1998] D. Kesdogan, J. Egner, and R. Büschkes. Stop-and-go-mixes providing<br />
probabilistic anonymity in an open system. In Information Hiding, pages 83–98. Springer, 1998.<br />
[Kfir and Wool, 2005] Z. Kfir and A. Wool. Picking virtual pockets using relay attacks on contactless<br />
smartcard. In Security and Privacy for Emerging Areas in Communications Networks,<br />
2005. SecureComm 2005. First International Conference on, pages 47–58. IEEE, 2005.<br />
[Kloeden et al., 1997] C.N. Kloeden, A.J. McLean, V.M. Moore, and G. Ponte. Travelling speed<br />
and the risk of crash involvement. NHMRC Road Accident Research Unit, The University of<br />
Adelaide, 1997.<br />
[Kohl and Neuman, 1993] J. Kohl and C. Neuman. Rfc 1510: The kerberos network authentication<br />
service (v5). Published Sep, 1993.<br />
[Krajzewicz et al., 2002] Daniel Krajzewicz, Georg Hertkorn, Christian Rössel, and Peter Wagner.<br />
Sumo (simulation of urban mobility); an open-source traffic simulation. In A Al-Akaidi, editor,<br />
Proceedings of the 4th Middle East Symposium on Simulation and Modelling (MESM2002), pages<br />
183–187, Sharjah, United Arab Emirates, September 2002. SCS European Publishing House.<br />
[Kroh et al., 2006] Rainer Kroh, Antonio Kung, and Frank Kargl. Vanets security requirements<br />
final version. Sevecom D1.1, 2006.<br />
[Kruskal, 1956] Joseph B. Kruskal, Jr. On the shortest spanning subtree of a graph and the<br />
traveling salesman problem. Proceedings of the American Mathematical Society, 7(1):48–50,<br />
1956.<br />
[Kuhn et al., 2006] F. Kuhn, T. Moscibroda, and R. Wattenhofer. Fault-tolerant clustering in ad<br />
hoc and sensor networks. In Distributed Computing Systems, 2006. ICDCS 2006. 26th IEEE<br />
International Conference on, pages 68–68. IEEE, 2006.<br />
[Langheinrich, 2009] M. Langheinrich. A survey of rfid privacy approaches. Personal and Ubiquitous<br />
Computing, 13(6):413–421, 2009.<br />
[Leaf and Preusser, 1999] W.A. Leaf and D.F. Preusser. Literature review on vehicle<br />
travel speeds and pedestrian injuries. National Highway Traffic Safety Administration,<br />
http://www.nhtsa.dot.gov/people/injury/research/pub/HS809012.html, October 1999.<br />
[Li et al., 2009] N. Li, N. Zhang, S.K. Das, and B. Thuraisingham. Privacy preservation in wireless<br />
sensor networks: A state-of-the-art survey. Ad Hoc Networks, 2009.<br />
[Lin and Lu, 2012] Xiaodong Lin and Rongxing Lu. Bibliography on secure vehicular communications.<br />
http://bbcr.uwaterloo.ca/~rxlu/sevecombib.htm, 2012.<br />
[Lin et al., 2008] X. Lin, R. Lu, C. Zhang, H. Zhu, P.H. Ho, and X. Shen. Security in vehicular<br />
ad hoc networks. Communications Magazine, IEEE, 46(4):88–95, 2008.<br />
[Liu and Ning, 2008] An Liu and Peng Ning. Tinyecc: A configurable library for elliptic curve<br />
cryptography in wireless sensor networks. In Proceedings of the 7th International Conference on<br />
Information Processing in Sensor Networks (IPSN 2008), pages 245–256, April 2008.<br />
[Liu et al., 2005] D. Liu, P. Ning, S. Zhu, and S. Jajodia. Practical broadcast authentication in sensor<br />
networks. In Mobile and Ubiquitous Systems: Networking and Services, 2005. MobiQuitous<br />
2005. The Second Annual International Conference on, pages 118–129, 2005.<br />
[Lopez and Zhou, 2008] J. Lopez and J. Zhou. Wireless Sensor Network Security. Cryptology and<br />
Information Security Series, IOS Press, 2008.<br />
[Lopez, 2008] J. Lopez. Wireless sensor network security, volume 1. Ios Pr Inc, 2008.<br />
[Lu et al., 2012] R. Lu, X. Li, T.H. Luan, X. Liang, and X. Shen. Pseudonym changing at social<br />
spots: An effective strategy for location privacy in vanets. Vehicular Technology, IEEE<br />
Transactions on, 61(1):86–96, 2012.<br />
[Luo and Hubaux, 2004] J. Luo and J.P. Hubaux. A survey of inter-vehicle communication. Lausanne,<br />
Switzerland, Tech. Rep, IC/2004/24, 2004.<br />
[Ma et al., 2010] Z. Ma, F. Kargl, and M. Weber. Measuring long-term location privacy in vehicular<br />
communication systems. Computer Communications, 33(12):1414–1427, 2010.<br />
[McMillin et al., 1998] B. McMillin, J. Sirois, R. Mahoney, and F. Budd. Fault-tolerant and secure<br />
intelligent vehicle highway system software a safety prototype. In IEEE International Conference<br />
on Intelligent Vehicles. IEEE, 1998.<br />
[Mehta et al., 2007] K. Mehta, D. Liu, and M. Wright. Location privacy in sensor networks against<br />
a global eavesdropper. In Network Protocols, 2007. ICNP 2007. IEEE International Conference<br />
on, pages 314–323. IEEE, 2007.<br />
[mir, ] MIRACL cryptographic library. http://www.shamus.ie/.<br />
[Molnar and Wagner, 2004] D. Molnar and D. Wagner. Privacy and security in library rfid: Issues,<br />
practices, and architectures. In Proceedings of the 11th ACM conference on Computer and<br />
communications security, pages 210–219. ACM, 2004.<br />
[Nohara et al., 2005] Y. Nohara, S. Inoue, K. Baba, and H. Yasuura. Quantitative evaluation of<br />
unlinkable id matching schemes. In Proceedings of the 2005 ACM workshop on Privacy in the<br />
electronic society, pages 55–60. ACM, 2005.<br />
[Ohkubo et al., 2004] M. Ohkubo, K. Suzuki, and S. Kinoshita. Efficient hash-chain based rfid<br />
privacy protection scheme. In International Conference on Ubiquitous Computing–Ubicomp,<br />
Workshop Privacy: Current Status and Future Directions, 2004.<br />
[Oliveira et al., 2008] Leonardo B. Oliveira, Michael Scott, Julio López, and Ricardo Dahab.<br />
TinyPBC: Pairings for Authenticated Identity-Based Non-Interactive Key Distribution in Sensor<br />
Networks. In Proceedings of the 5th International Conference on Networked Sensing Systems<br />
(INSS’08), pages 173–179, Kanazawa/Japan, June 2008. IEEE.<br />
[Peris-Lopez et al., 2006] P. Peris-Lopez, J. Hernandez-Castro, J. Estevez-Tapiador, and A. Ribagorda.<br />
Rfid systems: A survey on security threats and proposed solutions. In Personal Wireless<br />
Communications, pages 159–170. Springer, 2006.<br />
[Perrig et al., 2002] Adrian Perrig, Ran Canetti, J. D. Tygar, and Dawn Song. The TESLA Broadcast<br />
Authentication Protocol. RSA CryptoBytes, 5(Summer), 2002.<br />
[Perrig et al., 2004] A. Perrig, J. Stankovic, and D. Wagner. Security in wireless sensor networks.<br />
Communications of the ACM, 47(6):53–57, 2004.<br />
[Pfitzmann and Köhntopp, 2001] A. Pfitzmann and M. Köhntopp. Anonymity, unobservability,<br />
and pseudonymity–a proposal for terminology. In Designing privacy enhancing technologies,<br />
pages 1–9. Springer, 2001.<br />
[Piotrowski et al., 2006] K. Piotrowski, P. Langendoerfer, and S. Peter. How public key cryptography<br />
influences wireless sensor node lifetime. In Proceedings of the fourth ACM workshop on<br />
Security of ad hoc and sensor networks, pages 169–176, Nov. 2006.<br />
[Preneel and Oorschot, 1999] B. Preneel and P. van Oorschot. On the security of iterated message<br />
authentication codes. IEEE Transactions on Information theory, 45(1):188–199, 1999.<br />
[Prim, 1957] R.C. Prim. Shortest connection networks and some generalizations. Bell system<br />
technical journal, 36(6):1389–1401, 1957.<br />
[Rajendran and Sreenaath, 2008] T. Rajendran and K. V. Sreenaath. Secure anonymous routing<br />
in ad hoc networks. In Proceedings of the 1st Bangalore Annual Computer Conference. ACM<br />
New York, 2008.<br />
[Rappaport, 2001] Theodore Rappaport. Wireless Communications: Principles and Practice.<br />
Prentice Hall PTR, Upper Saddle River, NJ, USA, 2nd edition, 2001.<br />
[Raya and Hubaux, 2005] M. Raya and J. P. Hubaux. The security of vehicular ad hoc networks.<br />
In Proc. of Third ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN 2005).<br />
ACM, 2005.<br />
[Raya and Hubaux, 2007] M. Raya and J.P. Hubaux. Securing vehicular ad hoc networks. Journal<br />
of Computer Security, 15(1):39–68, 2007.<br />
[Reiter and Rubin, 1998] M.K. Reiter and A.D. Rubin. Crowds: Anonymity for web transactions.<br />
ACM Transactions on Information and System Security (TISSEC), 1(1):66–92, 1998.<br />
[rel, ] RELIC cryptographic toolkit. http://code.google.com/p/relic-toolkit/.<br />
[Ren et al., 2011] D. Ren, S. Du, and H. Zhu. A novel attack tree based risk assessment approach<br />
for location privacy preservation in the vanets. In Communications (ICC), 2011 IEEE<br />
International Conference on, pages 1–5. IEEE, 2011.<br />
[RFID, 2012] Wikipedia. Radio-frequency identification. http://en.wikipedia.org/<br />
wiki/Radio-frequency_identification, 2012.<br />
[Rieback et al., 2005] M. Rieback, B. Crispo, and A. Tanenbaum. RFID Guardian: A battery-powered<br />
mobile device for RFID privacy management. In Information Security and Privacy, pages<br />
259–273. Springer, 2005.<br />
[Sampigethaya et al., 2005] K. Sampigethaya, L. Huang, M. Li, R. Poovendran, K. Matsuura, and<br />
K. Sezaki. CARAVAN: Providing location privacy for VANET. In Embedded Security in Cars<br />
(ESCAR), 2005.<br />
[Sampigethaya et al., 2007] K. Sampigethaya, M. Li, L. Huang, and R. Poovendran. AMOEBA:<br />
Robust location privacy scheme for VANET. IEEE Journal on Selected Areas in Communications,<br />
25(8):1569–1589, 2007.<br />
[Schnorr, 1991] C. P. Schnorr. Efficient signature generation by smart cards. Journal of Cryptology,<br />
4(3):161–174, 1991.<br />
[Schoch et al., 2006] E. Schoch, F. Kargl, T. Leinmüller, S. Schlott, and P. Papadimitratos. Impact<br />
of pseudonym changes on geographic routing in vanets. Security and Privacy in Ad-Hoc and<br />
Sensor Networks, pages 43–57, 2006.<br />
[Serjantov and Danezis, 2003] A. Serjantov and G. Danezis. Towards an information theoretic<br />
metric for anonymity. In Privacy Enhancing Technologies, pages 259–263. Springer, 2003.<br />
[Seys and Preneel, 2006] S. Seys and B. Preneel. ARM: Anonymous routing protocol for mobile<br />
ad hoc networks. In 20th International Conference on Advanced Information Networking and<br />
Applications, AINA, pages 133–137. IEEE, 2006.<br />
[Sharma et al., 2012] S. Sharma, A. Sahu, A. Verma, and N. Shukla. Wireless sensor network<br />
security. Advances in Computer Science and Information Technology. Computer Science and<br />
Information Technology, pages 317–326, 2012.<br />
[Sheng and Li, 2008] B. Sheng and Q. Li. Verifiable privacy-preserving range query in two-tiered<br />
sensor networks. In Proceedings of Infocom, pages 46–50. IEEE, 2008.<br />
[Sirivianos et al., 2007] M. Sirivianos, D. Westhoff, F. Armknecht, and J. Girao. Non-manipulable<br />
aggregator node election protocols for wireless sensor networks. In Modeling and Optimization<br />
in Mobile, Ad Hoc and Wireless Networks and Workshops, 2007. WiOpt 2007. 5th International<br />
Symposium on, pages 1–10. IEEE, 2007.<br />
[Studer et al., 2008] A. Studer, E. Shi, F. Bai, and A. Perrig. TACKing Together Efficient Authentication,<br />
Revocation, and Privacy in VANETs. Technical report, Carnegie Mellon CyLab,<br />
2008.<br />
[Syamsuddin et al., 2008] Irfan Syamsuddin, Tharam Dillon, Elizabeth Chang, and Song Han. A<br />
survey of RFID authentication protocols based on hash-chain method. In Convergence and<br />
Hybrid Information Technology – ICCIT’08, volume 2, pages 559–564. IEEE, 2008.<br />
[Szczechowiak et al., 2008] Piotr Szczechowiak, Leonardo B. Oliveira, Michael Scott, Martin Collier,<br />
and Ricardo Dahab. NanoECC: Testing the limits of elliptic curve cryptography in sensor<br />
networks. In Proceedings of the European conference on Wireless Sensor Networks (EWSN’08),<br />
2008.<br />
[Tel, 2000] Gerard Tel. Introduction to Distributed Algorithms (2nd ed.). Cambridge University<br />
Press, 2000.<br />
[VSC, ] Vehicle Safety Communications project. http://www-nrd.nhtsa.dot.gov/pdf/nrd-12/<br />
CAMP3/pages/VSCC.htm.<br />
[Wagner, 2004] David Wagner. Resilient aggregation in sensor networks. In Proceedings of the 2nd<br />
ACM workshop on Security of ad hoc and sensor networks, SASN ’04, pages 78–87, New York,<br />
NY, USA, 2004. ACM.<br />
[Wan et al., 2002] C.Y. Wan, A.T. Campbell, and L. Krishnamurthy. PSFQ: a reliable transport<br />
protocol for wireless sensor networks. In Proceedings of the 1st ACM international workshop on<br />
Wireless sensor networks and applications, pages 1–11. ACM, 2002.<br />
[Wiedersheim et al., 2010] B. Wiedersheim, Z. Ma, F. Kargl, and P. Papadimitratos. Privacy in<br />
inter-vehicular networks: Why simple pseudonym change is not enough. In Wireless On-demand<br />
Network Systems and Services (WONS), 2010 Seventh International Conference on, pages 176–<br />
183. IEEE, 2010.<br />
[Willke et al., 2009] T. L. Willke, P. Tientrakool, and N. F. Maxemchuk. A survey of inter-vehicle<br />
communication protocols and their applications. IEEE Communications Surveys & Tutorials,<br />
11(2):3–20, 2009.<br />
[Wu et al., 2009] D. L. Wu, W. W. Y. Ng, D. S. Yeung, and H. L. Ding. A brief survey on current RFID<br />
applications. In Machine Learning and Cybernetics, 2009 International Conference on, volume 4,<br />
pages 2330–2335. IEEE, 2009.<br />
[Xi et al., 2006] Y. Xi, L. Schwiebert, and W. Shi. Preserving source location privacy in<br />
monitoring-based wireless sensor networks. In Parallel and Distributed Processing Symposium,<br />
2006. IPDPS 2006. 20th International, 8 pp. IEEE, 2006.<br />
[Xiong et al., 2010] Xiaokang Xiong, Duncan S. Wong, and Xiaotie Deng. TinyPairing: A Fast and<br />
Lightweight Pairing-based Cryptographic Library for Wireless Sensor Networks. In Proceedings<br />
of the IEEE Wireless Communications & Networking Conference. IEEE, 2010.<br />
[Yick et al., 2008] J. Yick, B. Mukherjee, and D. Ghosal. Wireless sensor network survey. Computer<br />
networks, 52(12):2292–2330, 2008.<br />
[Zhang et al., 2006] Y. Zhang, W. Liu, W. Lou, and Y. Fang. MASK: Anonymous on-demand<br />
routing in mobile ad hoc networks. IEEE Transactions on Wireless Communications, 5(9):2376–<br />
2385, 2006.<br />
[Zhang et al., 2008] W. Zhang, C. Wang, and T. Feng. GP^2S: Generic privacy-preservation<br />
solutions for approximate aggregation of sensor data (concise contribution). In Pervasive Computing<br />
and Communications, 2008. PerCom 2008. Sixth Annual IEEE International Conference<br />
on, pages 179–184. IEEE, 2008.<br />
[Zhu et al., 2003] Sencun Zhu, Sanjeev Setia, and Sushil Jajodia. LEAP: Efficient security mechanisms<br />
for large-scale distributed sensor networks. In Proceedings of the 10th ACM conference<br />
on Computer and communications security, pages 62–72. ACM Press, 2003.<br />