4. Synchronization (2).pdf
4. Synchronization (2).pdf
4. Synchronization (2).pdf
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
DISTRIBUTED SYSTEMS<br />
Principles and Paradigms<br />
Second Edition<br />
ANDREW S. TANENBAUM<br />
MAARTEN VAN STEEN<br />
Chapter 6<br />
<strong>Synchronization</strong> (2)<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Plan<br />
• Clock synchronization in distributed<br />
systems<br />
– Physical clocks<br />
– Logical clocks<br />
• Ordered multicasting<br />
– JGroups<br />
• Mutual exclusion<br />
• Election<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups<br />
• Java toolkit for reliable group communication<br />
– Join group<br />
– Send to all or single group members<br />
– Receive messages from group<br />
Unreliable Reliable<br />
Unicast UDP TCP<br />
Multicast IP Multicast JGroups<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups<br />
• Channels<br />
– Similar to (BSD) sockets – pullbased<br />
– Connected to a protocol stack<br />
• Protocol stack<br />
– Bidirectional list of protocol<br />
layers<br />
• Building blocks for higher-level<br />
functionality<br />
– E.g., PullPushAdapter,<br />
RpcDispatcher<br />
• Used, e.g., for replication and<br />
load balancing in a number of<br />
application servers<br />
– E.g., JBoss, Tomcat<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Jgroups Programming Basics<br />
• Processes decide on a<br />
group name<br />
• Each process<br />
1. Creates a channel<br />
2. Connects to that channel<br />
• Using the group name<br />
• This means starting the protocol<br />
stack<br />
3. Sends messages to and<br />
receives messages from the<br />
channel<br />
• The concrete multicast depends<br />
on the protocol stack<br />
<strong>4.</strong> Eventually disconnects from<br />
the channel<br />
• This means stopping the<br />
associated protocol stack<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups<br />
• JGroup’s protocol stack has a number of<br />
ordering possibilities<br />
– None<br />
– org.jgroups.protocols.CAUSAL<br />
– org.jgroups.protocols.TOTAL<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups: Alphabet Example<br />
• Stable process group<br />
• Send parts of alphabet in sequence<br />
– Initiator<br />
• Select address of random member in group<br />
• Multicasts {”A”, }<br />
– Receivers print ”A”, stores ”A”<br />
– Receiver with address <br />
• Select address of random member in group<br />
• Multicasts {”B”, }<br />
– And so on...<br />
• Which ordering do we want<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups: Alphabet Example<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups: Alphabet Example<br />
Create channel<br />
Connect to group<br />
Listen to messages<br />
Listen to group changes<br />
Send an initial message<br />
A message contains a source (null), a destination (null), and<br />
A Serializable object (an array with two values)<br />
Listen to changes in the view<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Receive a massage<br />
Unpack letter and handler<br />
Close connection and terminate<br />
Find a new handler<br />
Send off next letter<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups: Stack setup<br />
Use UDP communication<br />
Ping via mcast<br />
Discover<br />
subgroups<br />
To make the example more interesting…<br />
Heartbeat-based failure detection<br />
Check that FD is correct<br />
And finally causal multicast…<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Plan<br />
• Clock synchronization in distributed<br />
systems<br />
– Physical clocks<br />
– Logical clocks<br />
• Ordered multicasting<br />
– JGroups<br />
• Mutual exclusion<br />
• Election<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion<br />
• We often want to ensure<br />
that only one process<br />
accesses a critical region<br />
• Requirements: ensure<br />
– Mutually exclusive access<br />
– Deadlock freedom<br />
– Starvation freedom<br />
– Fairness<br />
• Types of algorithms<br />
– Permission-based<br />
– Token-based<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Centralized Algorithm<br />
• Simulate how mutual exclusion is done in a<br />
one-processor system<br />
– Let one process act as a coordinator<br />
– Send request to coordinator to ask for permission<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Centralized Algorithm<br />
• Mutually exclusive access<br />
– Coordinator only lets one process in<br />
• Deadlock freedom<br />
– If no process is in critical region, any other<br />
that wants to can be granted<br />
• Starvation freedom<br />
• Fairness<br />
– Access granted in order of receipt at<br />
coordinator<br />
• Only 3 messages per ressource use<br />
– But coordinator is single point of failure and<br />
potential bottleneck<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Decentralized Algorithm<br />
• Sigma: algorithm based on DHTs<br />
– Logically replicate ressource to n coordinators<br />
• Name ith replica ”r_name-i” and let node with id<br />
lookup(r_name-i) be responsible for it<br />
– To enter into criticial region<br />
• Get grant from m > n/2 coordinators<br />
• If cannot get enough grants, give up and retry later<br />
– Problem: failure/recovery of coordinator<br />
• May handle up to f failures where<br />
–<br />
• Probability of violating mutual exclusion is small<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Decentralized Algorithm<br />
• Mutual exclusion<br />
– Average lifetime of node is 10000 seconds<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Distributed Algorithm<br />
• Totally ordered multicast of requests for resources from requester to<br />
receivers<br />
1. Receiver did not send request: Send OK back<br />
2. Receiver sent request, is accessing resource: Wait and queue request<br />
3. Receiver sent request, is not accessing resource<br />
• Receiver has higher timestamp on its request: Send OK back<br />
• Receiver has lower timestamp on its: Wait and queue request<br />
• When requester has OK from all processes it accesses resource<br />
– When done, send OK to all processes in queue<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Distributed Algorithm<br />
• Guarantees<br />
– Mutual exclusion<br />
• Need OK from all processes<br />
– Deadlock freedom<br />
– Starvation freedom<br />
– Fairness<br />
• Timestamp used to decide<br />
• But<br />
– n-points-of-failure (instead of 1)<br />
• Could try to detect whether resource<br />
has failed<br />
– n bottlenecks (instead of 1)<br />
• Could try to just get OK from majority<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Token Ring Algorithm<br />
• Assume processes can be connected in a (logical) ring<br />
– Token circulates on the ring<br />
– Take the token if need to access critical region<br />
– If token not needed, send it to next neighbor<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Mutual Exclusion:<br />
A Token Ring Algorithm<br />
• Guarantees<br />
– Only one token at one process in a given time<br />
• Mutual exclusion, deadlock freedom, starvation freedom,<br />
fairness<br />
• But<br />
– What if token is lost<br />
– Process crashes<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
A Comparison of the Four Algorithms<br />
• Figure 6-17. A comparison of three mutual<br />
exclusion algorithms.<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Election<br />
• There is often a need for selecting a process<br />
with some special role<br />
– E.g., choose coordinator in a centralized protocol<br />
• If processes have unique id<br />
– Try to find process with highest id, make this leader<br />
– Algorithms differ in how they locate process with<br />
highest id<br />
• If processes do not have unique id (or similar)<br />
– Then there is no way to select any of them to be<br />
special<br />
– Well…<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Population Protocol<br />
• All agents start in same state<br />
– Eventually one agent is leader<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
JGroups and Election<br />
• Just use view and choose the first process<br />
in the view to be elected<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
The Bully Algorithm<br />
• Assume that every process knows the ids of<br />
every other process<br />
– Static group, but members may have failed<br />
• P notices that there is no coordinator – P wants<br />
to hold an election<br />
– P sends an ELECTION message to all processes with<br />
higher numbers<br />
– If no one responds, P wins the election and becomes<br />
coordinator<br />
– If one of the higher-ups answers, it takes over – P’s<br />
job is done<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
The Bully Algorithm (1)<br />
• Figure 6-20. The bully election algorithm. (a) Process 4 holds<br />
an election. (b) Processes 5 and 6 respond, telling 4 to stop. (c)<br />
Now 5 and 6 each hold an election.<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
The Bully Algorithm (2)<br />
• Figure 6-20. The bully election algorithm. (d) Process 6<br />
tells 5 to stop. (e) Process 6 wins and tells everyone.<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
A Ring Algorithm<br />
• Build view of processes by circulating message<br />
– Two steps: ELECTION and COORDINATOR<br />
– ELECTION message constructs view of alive processes during<br />
circulation<br />
– Choose process with highest id from view as coordinator<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Elections in Wireless Environments<br />
• Limited range of wireless<br />
• May want to elect node with particular<br />
properties<br />
– High battery-level<br />
– Near the “middle” of the network<br />
– …<br />
• Assume stability and reliability initially<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Elections in Wireless Environments (1)<br />
• Figure 6-22. Election algorithm in a wireless network, with node<br />
a as the source. (a) Initial network. (b)–(e) The build-tree phase<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Elections in Wireless Environments (2)<br />
• Figure 6-22. Election algorithm in a wireless network,<br />
with node a as the source. (a) Initial network. (b)–(e) The<br />
build-tree phase<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Elections in Wireless Environments (3)<br />
• Figure 6-22. (e) The build-tree phase.<br />
(f) Reporting of best node to source.<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Elections in Wireless Environments (4)<br />
• Handling split and merge<br />
– Use probe/reply messages to see if children in<br />
spanning tree are gone<br />
• Handling node crashes and recovery<br />
– Crash is a special case of split<br />
– Recovery is treated as a special case of<br />
merge<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Elections in Large-Scale Systems (1)<br />
• Requirements for superpeer selection:<br />
1. Normal nodes should have low-latency<br />
access to superpeers.<br />
2. Superpeers should be evenly distributed<br />
across the overlay network.<br />
3. There should be a predefined portion of<br />
superpeers relative to the total number of<br />
nodes in the overlay network.<br />
<strong>4.</strong> Each superpeer should not need to serve<br />
more than a fixed number of normal nodes.<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Elections in Large-Scale Systems (2)<br />
• Assume we want L leaders in m-bit Chord<br />
DHT<br />
– Reserve don’t care bits<br />
– p does lookup(p && 11…11000) to detect if it<br />
is the leader<br />
k<br />
• Each super-peer is then responsible for<br />
with high probability<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
Summary<br />
• Mutual exclusion<br />
– Centralized<br />
– Decentralized<br />
– Distributed<br />
– Token-based<br />
• Election<br />
– Bully<br />
– Ring<br />
– Decentral and large-scale<br />
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5