12.07.2015 Views

Chapter 3 Internet Protocol Layer Part II: 3.3 - High Speed Network ...


Chapter 3 Internet Protocol Layer, Part II: 3.3
Ren-Hung Hwang

Control Plane Mechanisms
• Address resolution
• Address configuration
• Error reporting
• Intra-domain routing
• Inter-domain routing
• Multicast routing


Address Resolution
• What is address resolution?
  - Translating an address used at one layer into the address used at another layer
  - For example, host name to IP address, or IP address to Ethernet address
• Why address resolution?
  - MAC address vs. IP address

Address Resolution Protocol
• Protocol operation
  - The source node broadcasts an ARP request packet on the IP subnet
  - All nodes on the subnet receive the ARP request, but only the target node (or a designated server) answers with an ARP reply, sent via unicast
  - The source node receives the reply and learns the MAC address of the target node
  - A cache (with a timer) is used to speed up later lookups


ARP Packet Format

0               8               16              24            31
| Hardware Address Type         | Protocol Address Type         |
| H. Addr Len   | P. Addr Len   | Operation Code                |
| Sender Hardware Address (bytes 0-3)                           |
| Sender Hardware Addr (4-5)    | Sender Protocol Addr (0-1)    |
| Sender Protocol Addr (2-3)    | Target Hardware Addr (0-1)    |
| Target Hardware Address (bytes 2-5)                           |
| Target Protocol Address                                       |

ARP Packet Format
• HARDWARE ADDRESS TYPE
  - Link type: Ethernet=0x0001
• PROTOCOL ADDRESS TYPE
  - Upper layer protocol identifier: IP=0x0800
• HADDR LEN
  - Length of the link layer address: Ethernet=6
• PADDR LEN
  - Length of the network layer address: IP=4


ARP Packet Format
• OPERATION
  - Operation code: ARP request=1, ARP reply=2, RARP request=3, RARP reply=4
• SENDER HADDR
  - Sender link layer address
• SENDER PADDR
  - Sender network layer address
• TARGET HADDR
  - Target link layer address, filled with zeros if unknown
• TARGET PADDR
  - Target network layer address

Encapsulate ARP Packet into MAC Frame
• Protocol id: 0x0806
• Destination address of an ARP request packet: 0xFFFFFFFFFFFF
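The field values listed above can be put together in a short Python sketch that builds the bytes of an ARP request inside an Ethernet frame. The function name `build_arp_request` and the example addresses are assumptions made for illustration; the frame is only constructed, not sent, and padding and the frame check sequence are left out.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Return an Ethernet frame (no padding, no FCS) carrying an ARP request."""
    broadcast = b"\xff" * 6                      # destination 0xFFFFFFFFFFFF
    eth_header = broadcast + sender_mac + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001,                      # hardware address type: Ethernet
        0x0800,                      # protocol address type: IP
        6,                           # HADDR LEN
        4,                           # PADDR LEN
        1,                           # operation: ARP request
        sender_mac,                  # SENDER HADDR
        socket.inet_aton(sender_ip), # SENDER PADDR
        b"\x00" * 6,                 # TARGET HADDR: zeros, still unknown
        socket.inet_aton(target_ip), # TARGET PADDR
    )
    return eth_header + arp


# Hypothetical addresses, chosen only for the example.
frame = build_arp_request(b"\x02\x00\x00\x00\x00\x01", "192.168.2.5", "192.168.2.1")
print(len(frame), frame.hex())
```

The ARP body is 28 bytes for Ethernet and IPv4, so the frame built here is 42 bytes before padding.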


Reverse ARP (RARP)
• Allows a diskless workstation to discover its IP address
• Needs a RARP server on each network
• BOOTP: uses UDP messages, which can be forwarded across routers, to find the file server that holds the mapping

Open Source Implementation: ARP
• Data structure
  - Hash table: arp_table
  - Hash parameters: a primary key and the device interface index
• Functions
  - arp_send(): sets up the ARP header and then transmits
  - arp_rcv(): only deals with the reply and request operations
    - Request: calls ip_route_input(); if the route is local, calls arp_send() to send out an ARP reply. Otherwise, if the host is an ARP proxy, it also sends an ARP reply.
    - Reply: updates the ARP table.
  - __neigh_lookup(): calls neigh_lookup() to search the ARP hash table; if the entry is not found, creates one
  - eth_rebuild_header() (old) or arp_solicit() calls arp_send()


Error Control Protocol
• What is an error control protocol?
  - A protocol for reporting errors or the status of TCP/IP at a remote site (router or host)
• Why an error control protocol?
  - For monitoring the status of TCP/IP at each host/router
  - For reporting errors between hosts or routers

ICMP
• ICMP runs over IP: the ICMP header and ICMP data are carried as the data of an IP datagram, after the IP header


ICMPv4 Packet Format
• Type and Code are used to identify an error event
• Data reports the IP header and the first 8 bytes of the payload of the packet that caused the error

0               8               16                            31
| Type          | Code          | Checksum                      |
| Data                                                          |

Type and Code

Type  Code  Description
0     0     echo reply (ping)
3     0     destination network unreachable
3     1     destination host unreachable
3     2     destination protocol unreachable
3     3     destination port unreachable
3     6     destination network unknown
3     7     destination host unknown
4     0     source quench (congestion control)
8     0     echo request (ping)
9     0     router advertisement
10    0     router solicitation (router discovery)
11    0     TTL expired
12    0     bad IP header
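As an illustration of the Type/Code/Checksum layout above, the following sketch builds an echo request (type 8, code 0) and fills in the 16-bit one's complement checksum. The identifier and sequence number fields are not shown on the slide; they are assumed here from the echo message layout in RFC 792, and the function names are hypothetical.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                        # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMPv4 echo request; the checksum covers header and payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)   # checksum field zero for now
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


packet = build_echo_request(ident=1, seq=1, payload=b"ping")
# A receiver verifies the message by checksumming it again: the result is 0.
print(packet.hex(), internet_checksum(packet))
```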


ICMPv4 Examples
• Echo Request/Reply
  - The source sends an echo request to a destination
  - The destination responds with an echo reply
  - Type and code of Echo Request and Echo Reply are (8, 0) and (0, 0), respectively
  - ping uses echo request and reply
• Destination Unreachable (type=3)
  - Possible errors: network unreachable (code=0), host unreachable (code=1), protocol unreachable (code=2), port unreachable (code=3), source route failed (code=5), destination network unknown (code=6), destination host unknown (code=7)

ICMPv4 Examples
• Fragmentation required
  - If the do-not-fragment bit in the IP header is set to 1 and the packet length is larger than the MTU of the output interface, the router discards the packet and sends a fragmentation required (type=3, code=4) ICMP message to the source.
• Source Quench
  - When the buffer of a router overflows, the router sends a source quench (type=4) to the source.
• Routing redirect
  - If a host forwards a packet to a router and the router finds that the packet should be forwarded by another router within the same physical network, it forwards the packet to that router and sends a redirect (type=5, code=0 for network or 1 for host) ICMP message to the host.


ICMPv4 Example
• Time Exceeded
  - After decreasing the TTL by one, if a router finds that the TTL is less than or equal to zero, it sends a Time Exceeded (type=11) ICMP message to the source.
• traceroute uses this type of ICMP message
  - traceroute sends an ICMP echo request with TTL=1 to the target machine
  - When the first router receives the message, it responds with a time exceeded message
  - traceroute then sends another echo request with TTL=2
  - The message passes the first router, but is discarded by the second router, which returns a time exceeded message
  - traceroute keeps sending echo requests with increasing TTL until it receives an echo reply from the target machine

ICMPv4 Example
• IP header error
  - Wrong IP header, such as a wrong option field (type=12)


ICMPv6
• New types and codes
  - Type 0..127: error report
    - 1: Destination unreachable
    - 2: Packet too big
    - 3: Time Exceeded
    - 4: Parameter problem
  - Type 128..255: informational
    - 128, 129: Echo request and reply
    - 130, 131, 132: Multicast group membership management
    - 133, 134: Router solicitation and advertisement
    - 135, 136: Neighbor solicitation and advertisement
    - 137: Redirect

ICMPv6

Type  Code  Description
1     0     No route to destination
1     1     Communication with destination administratively prohibited
1     3     Address unreachable
1     4     Port unreachable
2     0     Packet too big
3     0     Hop limit exceeded in transit
3     1     Fragment reassembly time exceeded
4     0     Erroneous header field encountered
4     1     Unrecognized Next Header type
4     2     Unrecognized IPv6 option encountered
128   0     Echo request
129   0     Echo reply
130   0     Multicast Listener Query
131   0     Multicast Listener Report
132   0     Multicast Listener Done
133   0     Router Solicitation
134   0     Router Advertisement
135   0     Neighbor Solicitation
136   0     Neighbor Advertisement
137   0     Redirect


Routing
• Task of routing
  - Select a path from the source to the destination
• Goals of routing
  - Stable
  - Robust
  - Efficient (low delay, high throughput, …)

Optimality of IP Routing
• IP uses hop-by-hop routing (forwarding)
  - Each router determines its own routing table
  - Why are packets still delivered to their destinations along the optimal path?
    - If k is an intermediate node on the optimal path from source node s to destination node d, then the part of that path from s to k is also the optimal path from s to k
    - Hence a shortest path tree can be constructed from a source to the rest of the graph


Routing Algorithm Classification
• Global or decentralized information?
  - Link State routing (global): uses the Dijkstra algorithm
  - Distance Vector routing (decentralized): uses the distributed Bellman-Ford algorithm
• Static or dynamic (adaptive)?
  - Static: fixed routing table, set up manually
  - Dynamic: routing table adapts to network status

The Shortest Path Algorithm
• View a network as a graph
  - Nodes are routers
  - Edges are physical links
    - Each link has an associated cost: delay, congestion level, …
• Find the least-cost path from a sending node to the destination node
  - Depends on the information available


Link-State Routing
• Routing information
  - Global information is made available by reliable broadcasting
  - Dynamic: information is exchanged when the topology changes, or periodically
• Path calculation
  - Dijkstra algorithm

Dijkstra Algorithm
  For each v in V-{s} {
    If v is adjacent to s
      C(v) = lc(s,v)
    else
      C(v) = ∞
  }
  T = {s}
  While (T ≠ V) {
    find w not in T s.t. C(w) is the minimum over all nodes in (V-T)
    T = T ∪ {w}
    For each v in V-T
      If (C(w) + lc(w,v) < C(v)) {
        C(v) = C(w) + lc(w,v)
        P(v) = w
      }
  }


Dijkstra Algorithm Example

[Figure: five-node example graph with nodes A, B, C, D, E and link costs 4, 1, 2, 3, 1, 1, 1]

Iteration  T      C(B),p(B)  C(C),p(C)  C(D),p(D)  C(E),p(E)
0          A      4,A        1,A        ∞          ∞
1          AC     3,C                   4,C        2,C
2          ACE    3,C                   3,E
3          ACEB                         3,E
4          ACEBD

Routing Table at Node A

Destination  Cost  NextHop
B            3     C
C            1     C
D            3     C
E            2     C
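The iteration table above can be reproduced with a small Python sketch of Dijkstra's algorithm. The edge list is an assumption: the costs A-B=4, A-C=1, B-C=2, C-D=3, C-E=1 and D-E=1 follow from the table entries, while B-D=1 is only read from the residue of the figure and does not change the result. The function additionally tracks the first hop of each path so that it returns node A's routing table directly.

```python
import heapq

# Assumed reconstruction of the example graph (undirected, symmetric costs).
GRAPH = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 1},
    "C": {"A": 1, "B": 2, "D": 3, "E": 1},
    "D": {"B": 1, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1},
}


def dijkstra(graph, source):
    """Return {destination: (cost, next_hop)} as seen from source."""
    cost = {}
    next_hop = {}
    visited = set()
    heap = [(0, source, None)]               # (path cost, node, first hop on the path)
    while heap:
        c, node, hop = heapq.heappop(heap)
        if node in visited:
            continue                          # stale entry: node already finalized
        visited.add(node)                     # node joins the set T
        cost[node] = c
        if hop is not None:
            next_hop[node] = hop
        for nbr, w in graph[node].items():
            if nbr not in visited:
                first = nbr if node == source else hop
                heapq.heappush(heap, (c + w, nbr, first))
    return {d: (cost[d], next_hop[d]) for d in cost if d != source}


for dest, (c, hop) in sorted(dijkstra(GRAPH, "A").items()):
    print(dest, c, hop)
```

A priority queue replaces the linear "find w with minimum C(w)" step of the pseudocode; the set of finalized nodes plays the role of T.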


Distance Vector Algorithm
• Routing information
  - Only local information is known
    - A node knows the status of its adjacent links and the routing information of its adjacent nodes
  - Dynamic: information is exchanged when a link cost or a shortest path changes
• Path calculation
  - Bellman-Ford

Bellman-Ford Algorithm
  While (1) {
    If x received route update message from y {
      For each (Dest, Distance) pair in y’s report {
        If (Dest is new) {  /* Dest not in routing table */
          Add a new entry for destination Dest
          rt(Dest).distance = Distance+lc(x,y)
          rt(Dest).NextHop = y
        }
        else if ((Distance+lc(x,y))
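The update rule can be sketched in Python as follows. Because the tail of the slide's pseudocode is cut off, the second branch here is an assumption: it uses the usual distance-vector rule of replacing an entry when the path through the reporting neighbor is strictly shorter. The link costs are the same assumed reconstruction of the five-node example graph, and `dv_update` and `converge` are hypothetical names.

```python
# Assumed link costs of the five-node example graph (symmetric).
LINKS = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 1},
    "C": {"A": 1, "B": 2, "D": 3, "E": 1},
    "D": {"B": 1, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1},
}


def dv_update(table, neighbor, link_cost, report):
    """Merge one neighbor's report into a routing table; return True if it changed."""
    changed = False
    for dest, distance in report.items():
        candidate = distance + link_cost
        # New destination, or a strictly shorter path through this neighbor.
        if dest not in table or candidate < table[dest][0]:
            table[dest] = (candidate, neighbor)
            changed = True
    return changed


def converge(links):
    """Exchange reports in synchronous rounds until no table changes."""
    tables = {n: {n: (0, n)} for n in links}
    changed = True
    while changed:
        changed = False
        # Snapshot of what every node advertises in this round.
        reports = {n: {d: c for d, (c, _) in tables[n].items()} for n in links}
        for x in links:
            for y, cost in links[x].items():
                if dv_update(tables[x], y, cost, reports[y]):
                    changed = True
    return tables


for dest, (dist, hop) in sorted(converge(LINKS)["A"].items()):
    print(dest, dist, hop)
```

After the first round node A only knows its neighbors (B at distance 4 via B, C at distance 1 via C); later rounds lower the distance to B to 3 via C. The sketch ignores link cost increases and failures, which is where split horizon and poison reverse become relevant.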


Bellman-Ford Algorithm Example

Initial Routing Table at A

Destination  Distance  NextHop
B            4         B
C            1         C

Final Distance Table at A

Destination  Distance  NextHop
B            3         C
C            1         C
D            3         C
E            2         C

Hierarchical Routing
• Not a flat network: a flat network would need too many routing entries
• Define an AS
  - Routers within an AS are under the same administrative control
• Routing within an AS and between ASs
  - Intradomain routing
  - Interdomain routing


AS
• The Internet consists of Autonomous Systems (AS) interconnected with each other:
  - Stub AS: small corporation
  - Multihomed AS: large corporation (no transit)
  - Transit AS: provider
• Two-level routing:
  - Intra-AS: routing within an AS
  - Inter-AS: routing between ASs

An Example of Hierarchical Routing

[Figure: three domains, Domain A (routers A.1-A.3), Domain B (routers B.1-B.4) and Domain C (routers C.1-C.3); legend: inter-domain routers (exterior gateways) and intra-domain routers (interior gateways)]


Example of Internet Routing Protocols
• Intradomain routing
  - RIP
  - OSPF
• Interdomain routing
  - BGP-4

Intra-domain Routing
• What is intra-domain routing?
  - Routing within a domain (AS)
  - The administrator decides the routing protocol
  - The administrator has total control over all routers
• Why intra-domain routing?
  - Maintain connectivity within a domain


Intra-domain Routing
• Runs Interior Gateway Protocols (IGP)
• Most common IGPs:
  - RIP: Routing Information Protocol
  - OSPF: Open Shortest Path First

RIP
• Originally designed for the Xerox PARC Universal Protocol (used in XNS)
• Adopted by UNIX and TCP/IP in 1982 (e.g., routed of BSD)
• RIP: RFC 1058 [1988]
• RIPv2: RFC 1388 [1993]


RIP
• Distance Vector routing
  - Uses hop count as the cost metric (up to 15)
  - Restricts the diameter of the network to 15 hops
  - Exchanges routing messages (advertisements) every 30 seconds
  - Each advertisement consists of up to 25 routes (destination nets)

RIPv2 Packet Format

0               8               16                            31
| Command       | Version       | Must be zero                  |
| Family of net 1               | Route Tag for net 1           |
| Address of net 1                                              |
| Subnet Mask for net 1                                         |
| Next Hop for net 1                                            |
| Distance to net 1                                             |
| Family of net 2               | Route Tag for net 2           |
| Address of net 2                                              |
| Subnet Mask for net 2                                         |
| Next Hop for net 2                                            |
| Distance to net 2                                             |
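The packet layout above maps directly onto fixed-size binary fields. The sketch below packs a RIPv2 response under a few stated assumptions: command value 2 (response), address family 2 (IP) and metric 16 as "unreachable" are taken from the RIP specifications rather than from the slide, the route tag is left at zero, and the function name and example routes are hypothetical.

```python
import socket
import struct

RIP_INFINITY = 16   # metric meaning "unreachable" (assumed from the RIP RFCs)
MAX_ENTRIES = 25    # routes per advertisement, as stated on the slide


def build_rip_response(routes):
    """routes: list of (network, subnet_mask, next_hop, metric) tuples."""
    if len(routes) > MAX_ENTRIES:
        raise ValueError("a RIP message carries at most 25 routes")
    packet = struct.pack("!BBH", 2, 2, 0)          # command=response, version=2, zero
    for net, mask, next_hop, metric in routes:
        packet += struct.pack(
            "!HH4s4s4sI",
            2,                                     # address family: IP
            0,                                     # route tag (unused here)
            socket.inet_aton(net),
            socket.inet_aton(mask),
            socket.inet_aton(next_hop),
            min(metric, RIP_INFINITY),             # clamp the hop count to infinity
        )
    return packet


msg = build_rip_response([
    ("192.168.2.0", "255.255.255.0", "0.0.0.0", 1),
    ("192.168.3.0", "255.255.255.0", "0.0.0.0", 2),
])
print(len(msg), msg.hex())
```

Each route entry is 20 bytes, so a full advertisement of 25 routes is 4 + 25 × 20 = 504 bytes.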


RIP Packet Format and Stability
• RIP packet format
  - Command: request or reply; version number
  - Up to 25 destination addresses
• Stability
  - Hop count limit: 15 hops; a metric of 16 means infinity (unreachable)
  - Stabilization timer
    - Allows RIP to learn all routes from its neighbors before sending full updates
  - Split horizon
    - No update on the backward route (an update to a neighbor omits routes learned from that neighbor)
  - Poison reverse update
    - An update to a neighbor includes the routes learned from that neighbor, but sets their metric to infinity

Routing Table of RIP
• Taken from a Cisco router at cs.ccu.edu.tw

Destination          Gateway              Flags Ref   Use    Interface
-------------------- -------------------- ----- ----- ------ ---------
127.0.0.1            127.0.0.1            UH    0     26492  lo0
192.168.2.           192.168.2.5          U     2     13     fa0
193.55.114.          193.55.114.6         U     3     58503  le0
192.168.3.           192.168.3.5          U     2     25     qaa0
224.0.0.0            193.55.114.6         U     3     0      le0
default              193.55.114.129       UG    0     143454
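The difference between split horizon and poison reverse, described under Stability above, is easiest to see in code. This is a minimal sketch with hypothetical names (`advertisement_for`, routers R2 and R3, networks N1 and N2); the routing table maps each destination to a (metric, next hop) pair.

```python
RIP_INFINITY = 16   # metric that marks a route as unreachable


def advertisement_for(table, neighbor, poison_reverse=False):
    """Build the {destination: metric} update to send to one neighbor."""
    update = {}
    for dest, (metric, next_hop) in table.items():
        if next_hop == neighbor:
            # Route was learned from this neighbor.
            if poison_reverse:
                update[dest] = RIP_INFINITY   # advertise it back as unreachable
            continue                          # plain split horizon: leave it out
        update[dest] = metric
    return update


table = {"N1": (1, "R2"), "N2": (2, "R3")}
print(advertisement_for(table, "R2"))                       # {'N2': 2}
print(advertisement_for(table, "R2", poison_reverse=True))  # {'N1': 16, 'N2': 2}
```

Poison reverse makes updates larger, but a neighbor that hears an explicit infinity can discard a stale route at once instead of waiting for it to time out.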


Open Source Implementation
• GNU Zebra Project
  - Supports many routing protocols
    - RIP, OSPF, BGP
  - Runs the routing daemon as a user process
    - Communicates with the kernel via netlink

Routing Daemon and Kernel

[Figure: in user space, the routing manager (Zebra, routed, gated, …) handles protocol-specific control packets; in kernel space, the kernel holds the routing table and forwards data packets; packets arrive from the NICs]


Overview of Zebra

[Figure: the routing protocol daemons (RIPd, OSPFd, BGPd, RIPngd) pass routing information via a socket interface to the Zebra daemon, which updates the kernel routing table through ioctl, sysctl, netlink, proc fs, or rtnetlink]

Zebra and Netlink/Rtnetlink

[Figure: routing protocols talk to the Zebra daemon using the Zebra protocol; the Zebra daemon talks to the kernel using netlink / rtnetlink]


Client Server Interaction in Zebra Protocol

[Figure: the client makes a zebra server socket with zclient_init(), installs callback functions, and connects with zclient_connect; over the Zebra connection, the Zebra server APIs invoke the Zebra client APIs and callback functions]

Zebra Client/Server Protocol

  /* Structure for the zebra client. */
  struct zclient
  {
    /* Other data structures here */
    …
    /* Pointer to the callback functions. */
    int (*interface_add) (…);
    int (*interface_delete) (…);
    int (*interface_up) (…);
    int (*interface_down) (…);
    int (*interface_address_add) (…);
    int (*interface_address_delete) (…);
    int (*ipv4_route_add) (…);
    int (*ipv4_route_delete) (…);
  };

Zebra IPv4 route message API
• Zebra server
  - zsend_interface_{add,delete}
  - zsend_interface_address_{add,delete}
  - zsend_interface_{up,down}
  - zsend_ipv4_{add,delete}
  - zsend_ipv4_{add,delete}_multipath
• Zebra client
  - zapi_ipv4_{add, delete}
  - zebra_interface_add_read
  - zebra_interface_state_read
  - zebra_interface_address_{add,delete}_read


RIP Daemon (ripd)

[Figure: ripd modules: Initialization; Scheduling; RIP core (rip_version, rip_default_metric, rip_timers, rip_route, rip_distance); Interface (rip_network, rip_neighbor, rip_passive_interface, ip_rip_version, ip_rip_authentication, rip_split_horizon); RIP Peer (rip_peer_timeout, rip_peer_update, rip_peer_display); routemap; offset; Zebra client connected to the Zebra daemon]

OSPF
• Features
  - Link-state routing protocol
  - Runs internal to a single Autonomous System
  - A shortest-path tree is constructed to build the routing table
    - Dijkstra algorithm
  - Support for equal-cost multipath routing
  - Support for TOS-based routing
  - Support for variable-length subnets
    - Each distributed route has a destination and a mask
  - Integrated unicast and multicast support
    - Multicast OSPF (MOSPF) uses the same topology database as OSPF


OSPF
• Features
  - Two levels of hierarchy: areas within an AS
    - OSPF allows a collection of contiguous networks and hosts to be grouped together, called an area
    - The topology of an area is invisible from outside the area
    - Routing in the AS takes place on two levels: intra-area routing and inter-area routing

[Figure: an AS with a backbone and Areas A, B, and C; labels: AS boundary router, area boundary routers, backbone router, internal routers]


OSPF
• Features
  - Externally derived routing data (via EGP) is advertised throughout the AS
    - Flooded without modification
    - Two types of cost
      - Type 1: comparable with costs within the area; the cost to an external network is the sum of the internal cost and the external cost
      - Type 2: an order of magnitude larger; the cost to an external network is determined solely by the external cost

OSPF
• Features
  - Supports stub areas to reduce broadcasting
    - An area can be configured as a stub when there is a single exit point from the area
    - Virtual links cannot be configured through stub areas
    - AS boundary routers cannot be placed inside stub areas
    - No AS external advertisements are flooded into or through stub areas


[Figure: example OSPF autonomous system divided into Area 1, Area 2, and Area 3 (one labeled as a stub), with routers RT1-RT12, networks N1-N15, host H1, and link costs; legend: internal router, area border router, AS boundary router]

OSPF Hierarchy
• Area border routers: “summarize” distances to nets in their own area and advertise them to other area border routers.
• Backbone routers: run OSPF routing limited to the backbone.
• Boundary routers: connect to other ASs.


OSPF
• Database of area 1

          |RT1|RT2|RT3|RT4|RT5|RT7|
          _________________________
RT1       |   |   |   |   |   |   |
RT2       |   |   |   |   |   |   |
RT3       |   |   |   |   |   |   |
RT4       |   |   |   |   |   |   |
RT5       |   |   |14 | 8 |   |   |
RT7       |   |   |20 |14 |   |   |
N1        | 3 |   |   |   |   |   |
N2        |   | 3 |   |   |   |   |
N3        | 1 | 1 | 1 | 1 |   |   |
N4        |   |   | 2 |   |   |   |
Ia,Ib     |   |   |20 |27 |   |   |
N6        |   |   |16 |15 |   |   |
N7        |   |   |20 |19 |   |   |
N8        |   |   |18 |18 |   |   |
N9-N11,H1 |   |   |29 |36 |   |   |
N12       |   |   |   |   | 8 | 2 |
N13       |   |   |   |   | 8 |   |
N14       |   |   |   |   | 8 |   |
N15       |   |   |   |   |   | 9 |

OSPF
• Summarized area information advertised by RT3 and RT4 to the backbone.

Network  RT3 adv  RT4 adv
N1       4        4
N2       4        4
N3       1        1
N4       2        3


OSPF
• Backbone information advertised into area 1 by RT3 and RT4.

Destination  RT3 adv.  RT4 adv.
Ia, Ib       20        27
N6           16        15
N7           20        19
N8           18        18
N9-11,H1     29        36
RT5          14        8
RT7          20        14

OSPF Daemon of Zebra

[Figure: ospfd modules: Initialization; Scheduling; OSPF core (ip_ospf_interface, ip_ospf_neighbor, ospf_router_id, network_area, show_ip_ospf_cmd); Interface; Network; LSA (Link State Advertisement); LSDB; OSPF Flooding; OSPF SPF calculation; ASE (AS external route calculation); Route; Route Map (route_map_update, route_map_event); zclient connected to the Zebra daemon]


Inter-domain Routing
• Called Exterior Gateway Protocols (EGP)
• Most common EGP:
  - BGP: Border Gateway Protocol

BGP
• Features
  - RFC 1771 (BGP-4)
  - “Path vector” routing
    - Loop-free interdomain routing between ASs
  - Can be used within and between ASs
    - Multiple border routers (BGP speakers) within an AS
    - IBGP: Interior BGP
      - Runs between routers in the same AS
      - All BGP speakers within the AS must be fully meshed (through an IGP protocol)
    - EBGP: Exterior BGP
      - Runs between routers belonging to two different ASs


BGP
• Runs over TCP, port 179
• The routing table keeps all feasible paths to a destination network, but only the optimal path is advertised to neighbors
• Supports information aggregation
  - CIDR
  - Confederation
    - Could also be used to allow multiple ASs within an AS
• Policy routing at the AS level
  - access-list permit or deny (route or path filtering)
• Metric: a combination of different metrics ranked by degree of preference (weight, loc pref, med, …)

BGP
• Messages
  - Open: first message sent after the TCP connection is established, followed by a keepalive message
  - Keepalive: sent often enough to keep the hold-time timer from expiring
  - Update: no periodic refresh of the entire table; only changes to the tables are exchanged
    - Advertises a single feasible route to a peer
    - Withdraws multiple routes previously advertised
    - The message contains path attributes (origin, as_path, next_hop, multi_exit_disc, local_pref, aggregator, …) and NLRI (network layer reachability information)
  - Notification: sent when an error is detected; also used to close the connection


BGP
• Path vector routing
  - Different ASs may have different link cost metrics
  - Loop freedom is very important
  - Policy routing is preferred (different priorities, prohibit lists, …)
  - AS_PATH of the path attribute
    - A list of ASs to the destination
    - A loop is found if the current AS is already in the AS_PATH
  - Next_Hop of the path attribute indicates the next router (need not be a BGP speaker) to the destination
  - NLRI
    - A list of subnets that can be reached by the AS_PATH

BGP
• Path selection
  (1) If Next_Hop is inaccessible, drop the update
  (2) Prefer largest LOCAL_PREF
  (3) Prefer shorter AS_PATH
  (4) Prefer lower origin code (igp


BGP PATH Attributes
• Origin
  - Defines the origin of the path information
    - IGP (i), BGP (e), Incomplete (?) (unknown, e.g., static route)
• AS_PATH
  - An ordered list or a set
• Next_Hop
  - IP address of the next hop to the destination
  - On a multiaccess network, the next hop could be a router other than the BGP speaker
• LOCAL_PREF
  - Indicates the preferred exit router within an AS

BGP PATH Attributes
• Multi_Exit_Disc (MED)
  - When a router has multiple external links to the same AS, the link to the router with the lower MED is preferred.
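The AS_PATH loop check and the path selection order can be sketched as a single comparison in Python. Several parts are assumptions rather than anything stated on the slides: the `Route` record, the default LOCAL_PREF of 100, the origin ranking igp < egp < incomplete (the slide's step (4) is cut off), and the use of MED as a final tie-break across all candidates, which is a simplification because MED is meant to compare links to the same neighboring AS.

```python
from dataclasses import dataclass
from typing import List

# Assumed ranking of origin codes; lower is preferred.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Route:
    prefix: str
    as_path: List[int]
    next_hop: str
    local_pref: int = 100     # hypothetical default, for the example only
    origin: str = "igp"
    med: int = 0


def has_loop(route, local_as):
    """A loop exists if our own AS number already appears in the AS_PATH."""
    return local_as in route.as_path


def best_route(candidates, local_as, reachable_next_hops):
    """Pick the preferred route, or None if no candidate is usable."""
    usable = [
        r for r in candidates
        if r.next_hop in reachable_next_hops and not has_loop(r, local_as)
    ]
    if not usable:
        return None
    return min(
        usable,
        key=lambda r: (-r.local_pref, len(r.as_path), ORIGIN_RANK[r.origin], r.med),
    )


r1 = Route("10.0.0.0/8", [65001, 65002], "192.0.2.1")
r2 = Route("10.0.0.0/8", [65003, 65004, 65005], "192.0.2.2", local_pref=200)
r3 = Route("10.0.0.0/8", [65010, 64512], "192.0.2.3", local_pref=300)
hops = {"192.0.2.1", "192.0.2.2", "192.0.2.3"}
print(best_route([r1, r2, r3], 64512, hops).next_hop)
```

With local AS 64512, r3 is rejected because its AS_PATH contains 64512, and r2 wins over r1 on LOCAL_PREF even though its AS_PATH is longer.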


Multicast
• What is multicast?
• Protocols
  - Internet Group Management Protocol V2
  - Distance Vector Multicast Routing Protocol
  - Protocol-Independent Multicast (PIM), Sparse Mode (SM)
• Open Source Implementation
  - Trace of IGMP
  - Trace of DVMRP


Multicast
• Communication among more than two parties
  - Multi-party video conferencing
  - Distance learning
• Issues
  - Maintain group member information
  - Construct a multicast tree for transmitting packets
  - Many-to-many communication

MBONE
• A virtual network on top of the Internet
• Provides multicast and real-time transmission techniques
• Characteristic of MBONE
  - Bandwidth usage does not increase proportionally when group membership increases
• Goal of MBONE
  - Construct a testbed for multicast applications while mrouters are not ubiquitous in the Internet


MBONE Structure
• Three components of MBONE:
  - Island
  - Mrouter
  - Tunnel
• Islands
  - Networks with IP multicast capability
  - Hosts in the same island can multicast directly, without going through routers

MBONE Structure

[Figure: Islands A, B, C, and D connected by tunnels; legend: member, mrouter, router without multicast capability]


MBONE Structure
• Mrouter
  - Solves the problems caused by routers that do not support multicast routing
  - Runs mrouted (multicast routing daemon)
    - Determines the routing path
    - Forwards multicast packets

MBONE Structure
• Tunnel
  - Constructs a virtual point-to-point link between the local mrouter and a remote mrouter
  - Allows multicast traffic to pass through routers that are not multicast capable
  - Encapsulation: the original multicast packet (multicast header and multicast data) is carried behind a new IP header that holds the tunnel source and destination


MBONE Address
• Multicast address
  - Assigned to a multicast group
  - Senders use it as the destination IP address
• Class D address (224.0.0.0 ~ 239.255.255.255)
  - The high-order four bits are 1110
  - 28-bit multicast group ID

MBONE Communication Protocol
• Multicast routing protocols
  - DVMRP, PIM-DM, PIM-SM, CBT, MOSPF, ...
• IGMP
  - A communication protocol between the mrouter and the hosts in a subnet
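The class D rule above amounts to a simple bit test: the top four bits must be 1110 and the remaining 28 bits are the group ID. A minimal Python sketch, with the hypothetical helper name `multicast_group_id`:

```python
import ipaddress


def multicast_group_id(address: str):
    """Return the 28-bit group ID of a class D address, or None otherwise."""
    value = int(ipaddress.IPv4Address(address))
    if value >> 28 != 0b1110:          # high-order four bits must be 1110
        return None
    return value & 0x0FFFFFFF          # low-order 28 bits: multicast group ID


print(multicast_group_id("224.0.0.1"))      # 1
print(multicast_group_id("192.168.2.5"))    # None
```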


MBONE Application
• Debug tools
  - mtrace
  - map-mbone
• Basic software
  - SDR (Session Directory)
  - Wb (Whiteboard)
  - VAT (Visual Audio Tool)
  - VIC (Video Conference)

Internet Group Management Protocol (IGMPv2)
• RFC 2236
• Used by IP hosts to report multicast group memberships to routers
• Enhances IGMPv1
  - Querier election mechanism
  - IGMPv2 Leave Group message
  - Group-Specific Query message


Protocol Overview
• A multicast router plays one of two roles: Querier or Non-Querier
  - The Querier is responsible for maintaining the membership information of the attached physical network
  - The router with the smallest IP address becomes the Querier
• Routers listen for Query messages to decide which role to take
  - The Querier periodically sends a General Query to solicit membership information
  - A General Query is sent to 224.0.0.1 (ALL-SYSTEMS multicast group)

Protocol Overview
• When a host receives a General Query
  - It delays for a random time in the range [0..Max Response Time] (starts a timer)
    - Max Resp. Time is given in the Query message
  - It sends the report with TTL=1 when the timer expires
  - If the host receives another host’s report before the timer expires, it stops the timer and does not send the report (report suppression)
• The behavior is similar when a host receives a Group-Specific Query


Protocol Overview
• When a router receives a report
  - It adds the group being reported to its list of multicast groups
  - It sets the timer for the membership to [Group Membership Interval] and deletes the membership if no reports are received before this timer expires. (Queries are sent periodically.)
• When a host joins a multicast group
  - It sends an unsolicited report immediately

Protocol Overview
• When a host leaves a multicast group
  - If it was the last host to reply to a Query, it should send a Leave Group message to the all-routers multicast address (224.0.0.2)
• When a router receives a Leave Group message
  - It sends Group-Specific Queries every [Last Member Query Interval] to the group being left, [Last Member Query Count] times
  - If no reports are received before the [Last Member Query Interval] expires, it assumes there are no local members


IGMPv2 message format
• Message format

0               8               16                            31
| Type          | Max. Resp. Time | Checksum                    |
| Multicast Group Address                                       |

• Type
  - 0x11 = Membership Query
    - General Query
    - Group-Specific Query
  - 0x16 = Version 2 Membership Report
  - 0x17 = Leave Group
  - 0x12 = Version 1 Membership Report

IGMPv2 message format
• Max Response Time
  - Only used in Membership Query messages
  - Set to zero in other messages
• Checksum
  - 16-bit one’s complement
• Group address
  - Zero when sending a General Query
  - The group address when sending a Group-Specific Query
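The 8-byte message described above can be packed as in the following sketch. The function names are hypothetical, and the Max Response Time unit (tenths of a second) is an assumption taken from RFC 2236, not from the slide.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv2(msg_type: int, max_resp_time: int, group: str) -> bytes:
    """Pack Type, Max Resp Time, Checksum and Group Address into 8 bytes."""
    group_bytes = socket.inet_aton(group)
    unchecked = struct.pack("!BBH4s", msg_type, max_resp_time, 0, group_bytes)
    csum = internet_checksum(unchecked)
    return struct.pack("!BBH4s", msg_type, max_resp_time, csum, group_bytes)


# General Query: type 0x11, group address zero, 10 s response window (assumed unit: 0.1 s).
general_query = build_igmpv2(0x11, 100, "0.0.0.0")
# Version 2 Membership Report for a hypothetical group; Max Resp Time is zero.
report = build_igmpv2(0x16, 0, "239.1.2.3")
print(general_query.hex(), report.hex())
```

A General Query built this way would be sent to 224.0.0.1, and a Leave Group message (type 0x17) to 224.0.0.2; the destination address belongs to the IP header and is not part of these 8 bytes.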


IGMPv3
• IETF draft-ietf-idmr-igmp-v3-05.txt
• Adds support for “source filtering”
  - A receiver may request to receive packets only from specific source addresses
  - Source addresses are selected by INCLUDE or EXCLUDE
    - IPMulticastListen(socket, interface, multicast-address, filter-mode, source-list)
    - filter-mode: INCLUDE or EXCLUDE

Multicast Routing Protocols
• Two types of multicast tree
  - Source-based tree
  - Core-based tree (shared tree)
  - What’s the difference: a per-(S,G) tree or a per-(*,G) tree
• Multicast protocols
  - DVMRP
  - PIM
    - Sparse mode
    - Dense mode
  - CBT
  - MOSPF
  - BGMP


Distance Vector Multicast Routing Protocol (DVMRP)
• RFC 1075
• Derived from RIP
  - Relies on RIP for unicast routing
• Widely used on the MBONE
  - Enables incremental deployment of IP multicast since it supports tunnels
• Constructs a source-based tree per source
  - Provides a shortest path between the source and the receivers

DVMRP
• Major difference between DVMRP and RIP
  - RIP: concerned with calculating the next hop to a destination
  - DVMRP: concerned with calculating the previous hop back to a source
• Reverse Path Forwarding (RPF) algorithm
  - Reverse Path Broadcast (RPB)
  - Prune to a Reverse Path Multicast (RPM) tree
  - Data is forwarded uni-directionally


RPF Algorithm
• Broadcast on the Reverse Path
  - When a multicast packet is received
    - Forward the packet on all outgoing links only if
      - the packet arrives on the interface that is also the interface of the shortest path back to the sender, and
      - the packet is not a duplicate
    - Otherwise, discard the packet

Reverse Path Broadcasting (RPB)

[Figure: reverse path broadcast from a source, showing links on which packets are forwarded and links on which they are discarded; legend: member, mrouter, router without member, source]
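The forwarding decision above reduces to one comparison against the unicast routing table. A minimal sketch, in which the function name, the interface names, and the table layout (source address mapped to the interface on the shortest path back to it) are assumptions for illustration; duplicate detection is left out.

```python
def rpf_forward(reverse_path_iface, interfaces, source, arrival_iface):
    """Return the interfaces to forward on; an empty list means discard."""
    # Accept only packets that arrive on the shortest path back to the source.
    if reverse_path_iface.get(source) != arrival_iface:
        return []
    # Forward on every other interface (reverse path broadcast).
    return [iface for iface in interfaces if iface != arrival_iface]


interfaces = ["eth0", "eth1", "eth2"]
reverse_path_iface = {"10.1.1.1": "eth0"}       # hypothetical unicast routing state
print(rpf_forward(reverse_path_iface, interfaces, "10.1.1.1", "eth0"))  # ['eth1', 'eth2']
print(rpf_forward(reverse_path_iface, interfaces, "10.1.1.1", "eth1"))  # []
```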


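RPF Algorithm
• Prune
  - Routers that do not lead to any members send prune messages to their upstream routers
  - Routers learn membership information via IGMP

Prune RPB Tree

[Figure: the RPB tree with prune messages sent upstream from branches without members; legend: member, mrouter, router without member, source, forward, prune]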


Example of an RPM tree

[Figure: the resulting RPM tree after pruning; legend: member, router with member, router without member, source, forward]

RPF Drawbacks and Benefits
• Drawbacks:
  - The first packet still has to be flooded
  - Prune state must be refreshed periodically in order to adapt to network topology changes
  - Routers must keep routing state per (source, group) pair
• Benefits:
  - Guarantees efficient delivery
  - Easy to implement


DVMRP’s Problem
• Works well only for groups that are densely represented within a subnet
  - Periodic broadcast will cause performance problems
• Amount of state information stored in mrouters
  - Information for forwarding multicast messages
  - Prune-state information
• Does not scale to support sparsely distributed multicast groups

PIM-SM
• Protocol Overview
• Special Features
• Packet Formats


Protocol Overview
• Documents
  - RFC 2362
  - IETF draft: draft-ietf-pim-sm-v2-new-01.txt
• Terminology
  - DR: Designated Router
  - RP: Rendezvous Point
  - RPT: RP-based Tree
• PIM-SM routes packets in three phases
  - Phase one: RP tree
  - Phase two: Register Stop
  - Phase three: Shortest-Path Tree (optional)

Phase One: RP Tree
• Receiver
  - Sends a join message to the DR using IGMP
  - The DR sends a (*,G) PIM Join message to the RP
    - The Join reaches the RP or converges on a router already on the RPT
    - The Join message is sent periodically (otherwise, it will time out)
• Sender
  - The sender sends a packet with the multicast address as its destination to the DR
  - The DR unicasts the encapsulated packet to the RP
    - PIM Register packets
  - The RP decapsulates it and forwards it onto the RPT


Phase Two: Register Stop
• Motivation
  - Encapsulation and decapsulation are too expensive
• Steps
  - The RP initiates an (S,G) source-specific Join to S
  - All the routers on the path record the (S,G) multicast state
  - Packets start to flow along the (S,G) tree to the RP
  - If a packet reaches a router with (*,G) state, it takes a short-cut to the receivers
  - The RP may now receive duplicate packets: native and encapsulated. The RP discards the encapsulated packets.
  - The RP sends a Register-Stop message to the DR of the source
  - The RP forwards native packets onto the RPT

Phase Three: Shortest-Path Tree
• Motivation
  - The path from the source to the RP and then to the receivers is too long
• Steps
  - A receiver’s DR may optionally initiate a transfer from the RPT to a source-specific tree (SPT)
  - It issues an (S,G) Join to S. The Join message may reach the source or converge at some router.
  - It starts to receive two copies of each packet and drops the one from the RPT
  - It then sends an (S,G) prune message to the RP
    - (S, G, rpt) prune
    - The prune message reaches the RP or converges at some router


Special Issues
• Source-specific Joins
• Multi-access Transit LANs
• RP Discovery

Source-specific Joins
• If a receiver sends a source-specific join using IGMPv3
  - The DR may omit performing a (*,G) join
  - Instead, the DR issues a source-specific (S,G) join
• Multicast addresses for source-specific multicast
  - 232.0.0.0 to 232.255.255.255
  - Only source-specific joins will be accepted for groups in this range


Multi-access Transit LANs
• Problems on a LAN with more than one router
  - Two or more routers issue (*,G) Joins
  - Two or more routers issue (S,G) Joins
  - One router issues a (*,G) Join while another router issues an (S,G) Join
• Routers will observe duplicate join messages
  - PIM Assert messages are used to elect a single forwarder for the LAN
    - Choose the router that sent the (S,G) Join
    - Choose the router with the best metric to the RP or to the source

RP Discovery
• PIM-SM routers need to know how to map a group to an RP
  - Use the bootstrap mechanism
  - In each PIM domain, a router is elected as the Bootstrap Router (BSR)
  - Candidate RPs of the domain unicast their candidacy to the BSR
  - The BSR decides on an RP-set and periodically announces it in a bootstrap message to all routers
  - A router (DR) uses an order-preserving hash function to map the group address into the RP-set


DR Election
• PIM-Hello messages are sent periodically on each PIM-enabled interface
  - Hello messages are used to learn neighboring routers and elect a DR
  - Hello messages are sent to address 224.0.0.13
  - Hello messages contain the DR election priority and Generation Identifier fields
    - The router with the largest DR election priority becomes the DR. Ties are broken by IP address (larger is preferred).
    - The Generation Identifier is randomly generated. A new GenID causes an update of the old Hello information and may cause a new DR election.

BSR Election
• A set of routers are configured as candidate bootstrap routers (C-BSRs)
  - Bootstrap messages are used for BSR election and RP-set distribution
  - The C-BSR with the largest BSR priority is elected as the BSR. Ties are broken by IP address.
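The DR election rule (largest priority wins, larger IP address breaks ties) can be sketched as a single `max` over the routers learned from Hello messages. The function name `elect_dr` and the example addresses are hypothetical; addresses are compared numerically, not as strings.

```python
import ipaddress


def elect_dr(routers):
    """routers: iterable of (ip_address, dr_priority); return the DR's address."""
    winner = max(
        routers,
        key=lambda r: (r[1], int(ipaddress.IPv4Address(r[0]))),  # priority, then IP
    )
    return winner[0]


hello_neighbors = [("10.0.0.9", 1), ("10.0.0.10", 1), ("10.0.0.200", 0)]
print(elect_dr(hello_neighbors))   # 10.0.0.10
```

Comparing the addresses as integers matters: as strings, "10.0.0.9" would sort above "10.0.0.10". The same ordering could be applied to candidate BSRs with the BSR priority in place of the DR priority.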


RP-set
• A set of routers are configured as candidate RPs (C-RPs)
  - Typically the same as the C-BSRs
• Candidate RPs periodically unicast Candidate-RP-Advertisement messages (C-RP-Advs) to the BSR, which include
  - The C-RP address
  - A group address and a mask that indicate the set of groups for which it prefers to be the RP
• The BSR forms the RP-set (for each group prefix)

Hash Function
• A router maintains an up-to-date RP-set
• Choose an RP for a group G as follows
  - Choose the RPs from the RP-set whose group prefix is the longest one that covers G
  - Compute a value by
      Value(G,M,C(i)) = (1103515245 * ((1103515245 * (G&M) + 12345) XOR C(i)) + 12345) mod 2^31
  - Choose the RP with the highest priority and value
  - Ties are broken by IP address
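The group-to-RP mapping can be sketched in Python using the Value formula above. Some choices are assumptions rather than part of the slide: the RP-set is represented as (group prefix, RP address) pairs, RP priority is not modeled (all candidates are treated as having equal priority), and ties on the hash value go to the larger address. The names `rp_hash` and `choose_rp` and all addresses are hypothetical.

```python
import ipaddress


def rp_hash(group: str, mask: str, candidate: str) -> int:
    """Value(G, M, C(i)) from the slide, with addresses taken as 32-bit integers."""
    g, m, c = (int(ipaddress.IPv4Address(x)) for x in (group, mask, candidate))
    return (1103515245 * ((1103515245 * (g & m) + 12345) ^ c) + 12345) % 2**31


def choose_rp(group: str, hash_mask: str, rp_set):
    """rp_set: list of (group_prefix, rp_address); return the selected RP or None."""
    g = ipaddress.IPv4Address(group)
    covering = []
    for prefix, rp in rp_set:
        net = ipaddress.IPv4Network(prefix)
        if g in net:
            covering.append((net.prefixlen, rp))
    if not covering:
        return None
    longest = max(plen for plen, _ in covering)          # longest covering prefix
    candidates = [rp for plen, rp in covering if plen == longest]
    return max(
        candidates,
        key=lambda rp: (rp_hash(group, hash_mask, rp), int(ipaddress.IPv4Address(rp))),
    )


rp_set = [
    ("224.0.0.0/4", "10.0.0.1"),
    ("224.0.0.0/4", "10.0.0.2"),
    ("239.1.0.0/16", "10.0.0.3"),
]
print(choose_rp("239.1.2.3", "255.255.255.252", rp_set))   # 10.0.0.3
print(choose_rp("225.5.5.5", "255.255.255.252", rp_set))
```

Because G is masked with M before hashing, groups that differ only in the bits outside the mask map to the same RP, so every router holding the same RP-set makes the same choice for a whole block of groups.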


Summary
• Source-rooted tree:
  - Advantage: creates an optimal path between sources and receivers
  - Disadvantage: routers must maintain path information for each (S,G) pair
• Shared tree:
  - Advantage: requires a minimum amount of state in each router
  - Disadvantage: the path between sources and receivers may not be optimal
