PCI Express




PLX Overview<br />

<strong>PCI</strong> <strong>Express</strong> & Storage<br />

- Two Fast Growing Markets<br />

- Now over 50% of Company’s Revenue<br />

<strong>PCI</strong> <strong>Express</strong><br />

Switches & Bridges<br />

Storage<br />

Controllers<br />

Connectivity<br />

USB & <strong>PCI</strong> Bridges,<br />

Controllers & UARTs<br />

PLX Revenue Split<br />

by Product Line<br />

Public NASDAQ Company (PLXT)<br />

Financially Solid with Zero Debt, Cash Flow Positive<br />

2


Market Leader<br />

#1 Supplier <strong>PCI</strong> <strong>Express</strong> Interconnect<br />

<strong>PCI</strong> <strong>Express</strong><br />

Switches & Bridges<br />

- Over 55% Marketshare*<br />

- Designs with all Market Leaders<br />

- 4 Million Units Shipped<br />

- Broadest Offering of Switches & Bridges<br />

Communications<br />

Embedded<br />

Server<br />

Storage<br />

PC/Consumer<br />

* 55% for Bridges & Switches Combined. Over 65% for Switches.<br />

3


<strong>PCI</strong>e Switch Is a Basic Building Block<br />

SPARC CPU<br />

with Native <strong>PCI</strong>e<br />

x86 Processors<br />

Network, Security,<br />

Graphic<br />

& Co-Processors<br />

Processors<br />

Chip Sets<br />

ASICs, Logic,<br />

& FPGAs<br />

Communication<br />

& Storage<br />

<strong>PCI</strong> <strong>Express</strong><br />

Switch<br />

<strong>PCI</strong> <strong>Express</strong><br />

Cleaner, Lower Cost, Lower Power<br />

Switch is a basic building block<br />

4


Product Summary<br />

& Road Maps<br />

PLX Products<br />

No NDA Required<br />

5


[Roadmap chart axis: lane counts from 4 to 96]<br />

1st Gen 2 Family<br />

<strong>PCI</strong>e Gen 2 Switch Road Map<br />

Server, Storage & Dual Graphics<br />

PEX 8648<br />

48 Lanes, 12 Ports, NT<br />

PEX 8632<br />

32 Lanes, 12 Ports, NT<br />

2KB, 3HPC, DC, RP, DB<br />

PEX 8624<br />

24 Lanes, 6 Ports, NT<br />

2KB, 3HPC, DC, RP, DB<br />

PEX 8616<br />

16 Lanes, 4 Ports, NT<br />

2KB, 3HPC, DC, RP, DB<br />

PEX 8612<br />

12 Lanes, 3 Ports, NT<br />

2KB, 3HPC, DC, RP, DB<br />

Multi-Host MR<br />

& Multicast<br />

Blade & Rack Servers,<br />

Storage & Networking<br />

PEX 8647<br />

48 Lanes, 3 x16 Ports<br />

2KB, DC, DB, RP<br />

High Port Count, 2VC<br />

, DMA<br />

& SSC for Control Plane<br />

Networking, Embedded & Storage<br />

PEX 8617<br />

16 Lanes, 4 Ports, NT<br />

PEX 8618<br />

16 Lanes, 16 Ports, NT<br />

PEX 8613<br />

12 Lanes, 3 Ports, NT<br />

PEX 8614<br />

12 Lanes, 12 Ports, NT<br />

2VC, DC, RP, SSC, DB<br />

PEX 8608<br />

8 Lanes, 8 Ports, NT<br />

2VC, DC, RP, SSC, DB<br />

PEX 8606<br />

6 Lanes, 6 Ports, NT<br />

PEX 8604<br />

4 Lanes, 4 Ports, NT<br />

2VC, DC, RP, SSC, DB<br />

PEX 8696<br />

96 Lanes, 24 Ports, NT<br />

Multi-Root/Host<br />

PEX 8680<br />

PEX 8619<br />

16 Lanes, 16 Ports, NT<br />

2VC, DC, RP, SSC, DB, DMA<br />

PEX 8615<br />

12 Lanes, 12 Ports, NT<br />

2VC, DC, RP, SSC, DB, DMA<br />

PEX 8609<br />

8 Lanes, 8 Ports, NT<br />

2VC, DC, RP, SSC, DB, DMA<br />

80 Lanes, 20 Ports, NT<br />

Multi-Root/Host<br />

PEX 8664<br />

64 Lanes, 16 Ports, NT<br />

Multi-Root/Host<br />

PEX 8649<br />

48 Lanes, 12 Ports, NT<br />

(MR)/Host & Multicast<br />

Low Lane<br />

Count<br />

General Purpose<br />

Shipping Now<br />

In Development<br />

Planned/Concept<br />

Approximate Timeframes: 2008 – 2010<br />

7


PLX Exclusive Features<br />

‣ Here are the exclusive features we will talk about today:<br />

• visionPAK™ Suite<br />

• Extraction of Receive Data “Eye Width”<br />

• <strong>PCI</strong>e Packet Generator<br />

• Performance Monitoring<br />

• Error Injection<br />

• SerDes Loopback Modes<br />

• performancePAK™ Suite<br />

• Read Pacing<br />

• Multicast<br />

• Dynamic Buffer Allocation<br />

‣ All valuable in both Gen 2 and Gen 1 modes!!<br />

8


visionPAK Suite<br />

System Debug Features<br />

Exclusive to PLX 8600 Switches<br />

9


Best-in-Class Gen 2 SerDes<br />

‣ Best-in-class SerDes (ARM)<br />

• Can support up to 60 inches of trace length for backplanes<br />

‣ Optional programmable features<br />

• Transition amplitude and Non-transitional amplitude<br />

• Pre-emphasis, De-emphasis and drive strength granularity to 50mV<br />

• Receiver Detect and Electrical Idle bits<br />

• SerDes BIST and AC JTAG<br />

• Automatic impedance calibration<br />

Non-Transitional Eye<br />

Transitional Eye<br />

10


Extraction of Receiver Eye Width<br />

‣ What is it?<br />

• PLX Exclusive pre-system design tool<br />

• Used on PLX RDK to test how clean a link is<br />

• Early indicator of potential link issues<br />

• A very open eye means a very reliable link<br />

• A tight eye indicates a weak link<br />

• Supported by all PEX 86xx switches<br />

11


Extraction of Receiver Eye Width<br />

‣ How does it work?<br />

1. User selects port, lane, Rx equalization, dwell time, etc.<br />

2. Software finds the center of the eye<br />

3. Software steps through the eye left & right until data errors occur<br />

4. Determines eye width and displays on the screen<br />

[Eye diagram: software steps left and right from the center of the eye; the error-free span is the eye width]<br />
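The stepping procedure above can be sketched in Python (a simplified model; `sample_ok` is a hypothetical per-offset error check, not a PLX API):<br />

```python
def measure_eye_width(sample_ok, max_steps=64):
    """Sketch of the eye-width search: starting from the center of the
    eye (offset 0), step the sampling point left and right until data
    errors occur, then report the error-free width in steps.

    `sample_ok(offset)` is a hypothetical callback returning True if
    the lane receives error-free data at that sampling offset."""
    left = 0
    while left > -max_steps and sample_ok(left - 1):
        left -= 1          # step left until errors (or range limit)
    right = 0
    while right < max_steps and sample_ok(right + 1):
        right += 1         # step right until errors (or range limit)
    return right - left    # eye width in sampling steps

# Example: an idealized lane that is clean within +/-10 steps of center
width = measure_eye_width(lambda off: abs(off) <= 10)
```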

12


Extraction of Receiver Eye Width<br />

‣ Minimum Eye Width Test<br />

• Customer selects Port, Lanes, and Minimum Eye Width<br />

• Software tests all selected lanes for Minimum Eye Width<br />

• Returns “Pass” or “Fail” for each selected lane<br />

• Extremely convenient and pain-free tool for Customers<br />

13


Extraction of Receiver Eye Width<br />

‣ Auto-Calibrate Feature (Rx = Receiver)<br />

• Customer selects Port and Lane and a range of Rx Equalization settings<br />

• Software steps through each eye and finds optimal setting<br />

• Returns value of Rx Equalization that gives best eye width<br />

• Extremely convenient and useful tool for Customers<br />
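The search itself is a one-liner in a sketch like this (Python; the equalization settings and measured widths are made-up numbers, and `eye_width_for` stands in for a real per-setting eye measurement):<br />

```python
def auto_calibrate(eq_settings, eye_width_for):
    """Sketch of the Auto-Calibrate feature: step through each candidate
    Rx equalization setting, measure the eye width at each, and return
    the setting that yields the widest eye."""
    return max(eq_settings, key=eye_width_for)

# Hypothetical measurements: the eye width peaks at EQ setting 2
widths = {0: 14, 1: 22, 2: 30, 3: 26}
best = auto_calibrate(widths.keys(), widths.get)
```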

14


Extraction of Receiver Eye Width<br />

‣ Customer Benefits<br />

• Allows customer to see how much margin their<br />

link has at the PEX 86xx Receiver<br />

• Convenient features<br />

• Minimum Eye Width Test<br />

• Auto-Calibrate Feature<br />

• Identify potential link issues earlier<br />

…GET TO MARKET FASTER!!!<br />

15


<strong>PCI</strong>e Packet Generator<br />

‣ What is it?<br />

• PLX Exclusive feature that allows customers to create<br />

their own traffic patterns using a PLX switch<br />

• Enables high density traffic<br />

• Up to x16 port saturation (not easy to achieve)<br />

• Can create error messages<br />

• See how software/system reacts to errors<br />

• PLX RDKs can be used as <strong>PCI</strong>e packet generators<br />

• Great alternative to expensive <strong>PCI</strong>e Exercisers<br />

• All-in-one solution<br />

16


<strong>PCI</strong>e Packet Generator<br />

‣ User-programmable traffic<br />

• Memory Reads/Writes<br />

• Payload Size<br />

• <strong>PCI</strong> Address<br />

‣ View command list<br />

‣ Create looped traffic<br />

17


<strong>PCI</strong>e Packet Generator<br />

‣ Customer Benefits<br />

• Convenient, inexpensive way of stress testing their system<br />

• Programmable traffic<br />

• Ability to fully saturate their links<br />

• Test system software<br />

• No external equipment needed<br />

…SAVES $$$!!!<br />

18


Performance Monitor<br />

‣ What is it?<br />

• PLX Exclusive feature that allows customers to monitor<br />

PEX 86xx switch performance in real time<br />

• Displays performance for each individual port<br />

• Displays Ingress and Egress performance<br />

• Completely passive<br />

• Does not impact system performance<br />

19


Performance Monitor<br />

‣ How does it work?<br />

• Customer selects Port to monitor<br />

• Software reads PEX 86xx registers and displays the data<br />

• Link Utilization % shows how much of the link’s capacity is in use (and how much headroom remains)<br />

• Displays total rate, payload rate, reads vs. writes, etc.<br />
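As a rough model of the utilization computation (Python; 500 MB/s per lane per direction is the usual Gen 2 figure after 8b/10b encoding, and the byte-rate value is invented):<br />

```python
def link_utilization(total_byte_rate_mbs, lane_count, mbs_per_lane=500.0):
    """Illustrative computation of the Link Utilization % shown by the
    Performance Monitor: measured byte rate divided by link capacity.
    500 MB/s per lane per direction corresponds to a Gen 2 lane
    (5 GT/s with 8b/10b encoding)."""
    capacity = lane_count * mbs_per_lane
    return 100.0 * total_byte_rate_mbs / capacity

# A x8 Gen 2 port moving 3000 MB/s in one direction is at 75% utilization
util = link_utilization(3000, lane_count=8)
```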

20


Performance Monitor<br />

‣ Graphically ‘see’ the traffic on each port during runtime<br />

[Screenshot: per-port display of Total Byte Rate, % Link Utilization, Payload Byte Rate, and Average Payload Size (Bytes)]<br />

21


Performance Monitor<br />

‣ Count Ingress & Egress TLPs for every port<br />

• Extensive Granularity<br />

• Posted header & dword<br />

• Non-Posted dword<br />

• Completion header & dword<br />

• Filter for various types of Posted and Non-Posted packets<br />

• For example, count just MWr64<br />

‣ Count Ingress & Egress DLLPs for every port<br />

• Extensive Granularity<br />

• Filter for ACKs<br />

• Filter for NAKs<br />

• Filter for UpdateFCs (Posted, Non-Posted, or Completions)<br />

22


Performance Monitor<br />

‣ Customer Benefits<br />

• Allows customers to track real-time link utilization<br />

• Helps find any weak links & potential bottlenecks<br />

• Gives additional visibility into traffic patterns<br />

• Convenient, inexpensive system bring-up tool<br />

…THE COMPLETE SOLUTION!!!<br />

23


Error Injection<br />

‣ What is it?<br />

• Software development tool<br />

• Inject ECC errors into <strong>PCI</strong>e packets<br />

• Inject <strong>PCI</strong>e error messages<br />

‣ Customer Benefits<br />

• Allows a customer to see how their software would<br />

react to these errors<br />

…SPEEDS UP SOFTWARE DEVELOPMENT!!!<br />

24


Loopback<br />

‣ Customer Benefits<br />

• Four convenient ways to test the SerDes and Logic of the<br />

PEX 86xx and/or connected device<br />

• Internal Tx<br />

• External Tx<br />

• Recovered CLK<br />

• Recovered Data<br />

…FOUR LOOPBACK MODES!!!<br />

25


Internal Transmitter Loopback<br />

‣ Used to test PLX SerDes and Logic<br />

26


External Transmitter Loopback<br />

‣ Used to test the link and logic of connected device<br />

27


Recovered Clock Loopback<br />

‣ External Tx test plus also tests PLX Recovered Clock Circuit<br />

28


Recovered Data Loopback<br />

‣ External Tx test plus also tests PLX Logic<br />

29


Easy Debug via I²C<br />

‣ Customers are recommended to design in an I²C connector<br />

• Allows easy connection to PLX RDK via laptop or PC (USB)<br />

• Enables systems without a Windows or Linux OS to run the SDK through a remote system (e.g., a laptop)<br />

• Convenient for debug purposes<br />

• Ideal for on-site FAE support<br />

[Photo: PLX RDK board connected to a laptop over USB — PEX 8548 with Port 0 (x16, upstream), Port 8 (x16) and Port 12 (x16), port status LEDs, manual reset, HD power connector, I²C header, EEPROM, JTAG, RefClk, configuration DIP switch, and PERST#, mounted in a system chassis]<br />

30


performancePAK Suite<br />

Performance Enhancing Features<br />

Exclusive to PLX 8600 Switches<br />

31


PEX 8600 Performance<br />

‣ PLX Leadership in Gen 2 Performance<br />

‣ Achieving >99% theoretical max throughput<br />

in Host-Centric and Peer-to-Peer environments<br />

‣ Proven in simulations & actual measurements<br />

Stay ahead of the pack with PLX!<br />

‣ Featuring performancePAK<br />

• Read Pacing<br />

• Dual Cast & Multicast<br />

• Dynamic Buffer Allocation<br />

• DMA-engine inside<br />

[Charts: x16 Gen 2 Host-Centric Throughput — PEX 8600 bidirectional throughput vs. packet size (64B–2048B) against the theoretical maximum, with % of theoretical max; 10GE Throughput — native vs. with PEX 8624 for one thread and multi-thread, with % of native]<br />

32


Read Pacing<br />

‣ Problem: Reduced endpoint performance caused by:<br />

• Unbalanced upstream/downstream link-widths<br />

• Uneven number of Read Requests made by endpoints<br />

• Leads to one endpoint dominating Root Complex queue<br />

• Other endpoints get starved<br />

‣ Solution: PLX Read Pacing*<br />

• Read Pacing Queues manage incoming Read Requests<br />

• Prevents one endpoint from dominating Root Complex queue<br />

• Ensures no endpoint is starved<br />

• Allows for optimized performance of endpoints<br />

* Patents pending<br />
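A toy round-robin model of the idea (Python; the per-port threshold and request labels are illustrative, not the actual PEX 8600 algorithm):<br />

```python
from collections import deque

def forward_with_pacing(per_port_requests, threshold=1):
    """Toy model of Read Pacing: each downstream port gets its own queue
    of pending Read Requests, and the switch forwards at most
    `threshold` requests per port per round-robin pass, so a port
    issuing many large reads cannot monopolize the Root Complex queue."""
    queues = [deque(reqs) for reqs in per_port_requests]
    forwarded = []
    while any(queues):
        for q in queues:
            for _ in range(threshold):
                if q:
                    forwarded.append(q.popleft())
    return forwarded

# FC HBA (port 0) has four 2KB reads queued; the NIC (port 1) has one 1KB read
order = forward_with_pacing([["2KB"] * 4, ["1KB"]])
# The NIC's request is forwarded second instead of last
```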

34


Without Read Pacing<br />

‣ Performance bottleneck due to mixing of slow<br />

and fast I/Os<br />

1. FC HBA makes multiple 2KB Read Requests<br />

2. Root Complex queues FC HBA requests<br />

3. Ethernet NIC makes one 1KB Read Request<br />

4. Root Complex queues Ethernet NIC request<br />

5. Ethernet NIC must wait for RC to service FC HBA<br />

requests before serving Ethernet NIC request<br />

[Diagram: the FC HBA’s four 2KB Read Requests fill the Root Complex queue ahead of the Ethernet NIC’s single 1KB Read Request (x8 upstream, x4 downstream links), so the Ethernet NIC packets sit at the end of the line]<br />

6. Ethernet NIC is starved<br />

Reduced Ethernet NIC Performance!<br />


35


With PLX Read Pacing<br />

‣ Increased performance due to fair allocation of<br />

bandwidth to downstream ports<br />

1. FC HBA makes multiple 2KB Read Requests<br />

2. Switch allows one FC HBA request at a time to<br />

pass through based on programmable thresholds<br />

3. Ethernet NIC makes one 1KB Read Request<br />

4. Switch allows one request to pass through the<br />

switch based on programmable thresholds<br />

5. Switch continues to allow Ethernet NIC requests to<br />

pass through in front of large FC HBA requests<br />

based on programmable settings<br />

6. Ethernet NIC gets serviced more often with no<br />

impact to FC HBA performance<br />

7. Neither endpoint is starved<br />

Optimized Performance!<br />

[Diagram: with a Read Pacing Queue on each x4 downstream port, the PEX 8600 interleaves the FC HBA’s 2KB Read Requests with the Ethernet NIC’s 1KB Read Requests into the Root Complex queue (x8 upstream), so the Ethernet NIC packets are fairly queued]<br />

36


Read Pacing Measured<br />

‣ Read Pacing Comparison — Throughput (MB/s)<br />

• PLX Packet Generator (“Port Hog”): Standalone 498.355, Read Pacing Off 496.315, Read Pacing On 496.125<br />

• GE NIC (“Starved Port”): Standalone 112.15, Read Pacing Off 20.87, Read Pacing On 112.24<br />

‣ % of Standalone Performance<br />

• GE NIC: 18.6% with Read Pacing OFF, 100.1% with Read Pacing ON<br />

• PLX Packet Generator: 99.6% with Read Pacing OFF, 99.6% with Read Pacing ON<br />

[Test setup: Root Complex — x8 — Switch — x4 to PLX Packet Generator and x4 to Ethernet NIC]<br />

‣ PLX <strong>PCI</strong>e Packet Generator<br />

• Used to mimic a “fast” I/O<br />

(i.e. FC HBA)<br />

• Sending back to back Memory<br />

Read Requests to Host<br />

37


<strong>PCI</strong>e Multicast<br />

‣ Address Based<br />

• Multicast BAR creates a Multicast space<br />

• Posted packets that hit in the Multicast BAR are multicast<br />

‣ Reliability<br />

• <strong>PCI</strong>e has low error rate and hop by hop error free transmission<br />

‣ Supports Legacy<br />

• Any source can send posted packet in Multicast space<br />

• Legacy devices can be multicast targets<br />

‣ Multicast ECN<br />

• Specifies how Switches, Root Complexes, and Endpoints implement Multicast<br />

• Improves flexibility and protection for End Points participating in Multicast<br />

• Also allows use of legacy End Points<br />

38


Multicast – Example<br />

‣ Support for 64 Multicast (MC) Groups<br />

‣ All ports can be programmed as MC source<br />

‣ One source port can have multiple MC groups<br />

‣ One destination port can be part of multiple MC groups<br />

‣ MC can be done across an NT port<br />

[Diagram: two PEX 8680 examples — a CPU/Chipset source port multicasting to destination ports with IO and GPU endpoints]<br />

CPU sending single command to multiple IOs (Group 1)<br />

CPU sending single command to multiple GPUs (Group 2)<br />

39


Multicast Memory Space<br />

[Diagram: starting at Multicast_Base, the Multicast address range is divided into equal windows of 2^Multicast_Index_Position bytes — Multicast Group 0 through Multicast Group n Memory Space]<br />
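The address decode can be sketched as follows (Python; the parameter names follow the Multicast ECN terminology, but the concrete base address and values are invented):<br />

```python
def multicast_group(addr, mc_base, index_position, num_groups):
    """Sketch of Multicast address decode per the PCIe Multicast ECN:
    the Multicast window starts at Multicast_Base and is divided into
    equal regions of 2**Multicast_Index_Position bytes, one per group.
    Returns the group number for a hit, or None for a miss."""
    region = 1 << index_position
    if mc_base <= addr < mc_base + num_groups * region:
        return (addr - mc_base) >> index_position
    return None            # not in the Multicast BAR: normal routing

# 1 MB regions starting at 0xC000_0000: an address 2.5 MB in hits group 2
grp = multicast_group(0xC000_0000 + (5 << 19), 0xC000_0000, 20, 64)
```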

40


Multicast and Address Routing<br />

<strong>PCI</strong>e Standard Address Route<br />

Multicast Address Route<br />

‣ Request that hits a Multicast address<br />

range is routed unchanged to Ports<br />

that are part of the Multicast Group<br />

derived from Request address<br />

‣ <strong>PCI</strong>e standard address route not used<br />

for multicast<br />

• Including default upstream route<br />

41


Peer-to-Peer & Peer Plus Host Multicast<br />

[Diagram: PCIe switch — an upstream P2P bridge from the Root, a Virtual PCI Bus, and four downstream P2P bridges to Endpoints]<br />

42


MC in Graphics & Floating Point Acceleration<br />

‣ Dual-headed graphics<br />

• Each GPU paints ½ the screen<br />

• Multi-Cast commands downstream<br />

• E.g. vector list<br />

• Use peer-to-peer to transfer the bitmap from GPU2 to GPU1<br />

‣ General Floating Point acceleration<br />

• Some GPUs need to see same data<br />

• Push data or commands downstream<br />

to multiple GPUs/FPUs<br />

[Diagram: CPU/Root Complex — x16 — PCIe switch — x16 to GPU1 and x16 to GPU2]<br />

43


MC in Communication Systems<br />

A 40G (GE) Line Card<br />

• May need to split processing over 4 NPUs via MC<br />

• Service card on AMC may need to MC packets to FPGA & RegEX<br />

[Diagram: PEX 8696 line card — CPU attached via RGMII, an AMC service card with FPGA and RegEx, and NPU 1 through NPU 4]<br />

44


MC in Storage<br />

[Diagram: up to 8 processor boards (CPU + chipset + memory, each with a PEX 8624) connected over back planes to two PEX 8664 MR switches with MC-enabled ports, each in an IO drawer with IO cards]<br />

45


PLX PEX 8600 Buffer Allocation<br />

‣ Shared memory pool per 16 lanes<br />


‣ User assigns buffers as per port-width<br />

• Set minimum buffers per port<br />

• Also creates a common pool<br />

‣ Ports dynamically grab buffers as needed<br />

• Grab when utilization of assigned<br />

buffers exceeds user-assigned<br />

thresholds (25% by default)<br />

• Return empty buffers to the pool<br />

[Diagram: x4/x4/x2/x2 ports, each with assigned buffers, sharing a common buffer pool]<br />
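A toy model of the grab/return behavior (Python; the class, buffer counts, and grow-by-one policy are illustrative, not the PEX 8600 register model — only the 25% default threshold comes from the slide):<br />

```python
class BufferPool:
    """Toy model of PLX dynamic buffer allocation: each port keeps a
    minimum assigned allotment and grabs extra buffers from a common
    pool once it is using more than a threshold fraction (25% by
    default) of its current assignment."""
    def __init__(self, common, assigned):
        self.common = common                 # free buffers in the shared pool
        self.assigned = dict(assigned)       # per-port assignment
        self.in_use = {p: 0 for p in assigned}

    def fill(self, port, threshold=0.25):
        """A packet occupies a buffer on `port`, borrowing if needed."""
        self.in_use[port] += 1
        over = self.in_use[port] > threshold * self.assigned[port]
        if over and self.common > 0:
            self.common -= 1                 # grab one from the common pool
            self.assigned[port] += 1

pool = BufferPool(common=8, assigned={"x8": 8, "x1": 2})
for _ in range(6):
    pool.fill("x8")                          # busy port borrows from the pool
```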

46


Dynamic Allocation → Appropriate Buffers<br />

‣ Static Buffers/port<br />

• Unused buffers<br />

• Can not assign based on traffic load<br />

• Can not move buffers between ports<br />

• …LOWER PERFORMANCE<br />

[Diagram: static allocation — a fixed 5 packet buffers per port for all port widths (x1, x4, x8), leaving unused buffers]<br />

‣ PLX Shared Memory Pool<br />

• All buffers usable<br />

• Assign based on traffic load<br />

• Move buffers around between ports<br />

• …HIGHER PERFORMANCE<br />

[Diagram: shared memory pool — buffers assigned to the x1, x4, and x8 ports as needed]<br />

47


Multi-Root Architecture<br />

48


What is Multi-Root?<br />

‣ A root (upstream) port connects in the direction of the CPU<br />

‣ A multi-root device has more than one upstream port<br />

• Provides connection for Two or more CPUs<br />

‣ Note: Endpoint sharing between CPUs is not supported<br />

[Diagram: left — single-root switch with one upstream port to a single CPU; right — multi-root switch with two upstream ports, one per CPU, each side fanning out to PCIe endpoints]<br />

49


Multi-Root Benefits in PEX Switch<br />

‣ Up to Eight upstream ports<br />

• Eight independent subsystems<br />

‣ Efficient use of interconnect<br />

• Mix-and-match number of endpoints to CPU<br />

• Based on performance needs<br />

• Up to 24 ports supported<br />

‣ Failover/Redundancy<br />

• Re-assign endpoints of a failed host<br />

‣ Smaller footprint<br />

• Lower power<br />

[Diagram: three hosts (Root Complexes) and a Manager attached to a PLX MR Switch, each host with its own PCIe End-Points, some behind downstream switches]<br />

50


Transparent and Multi-Root Modes<br />

‣ PEX MR switches support two functional modes<br />

• Mode 0 – Transparent Switch<br />

• Same as today’s switches<br />

• Mode 1 – Multi-Root Switch<br />

• Advanced architecture with multiple upstream ports<br />

[Diagram: Mode 0 — one CPU above a transparent Cygnus switch; Mode 1 — two CPUs above a multi-root Cygnus switch, each with its own downstream PCIe ports]<br />

51


Generic Rack Mount Server<br />

[Diagram: CPU and chip set above a PEX 8696 fanning out to I/O devices]<br />
52


Storage Server with Failover<br />

[Diagram: two CPU/chip set hosts on a PEX 8696 joined through an NT port, sharing I/O for failover]<br />
53


Servers Sharing <strong>PCI</strong>e MR Switch<br />

Each CPU* has its own dedicated IOs isolated from other CPUs<br />

[Diagram: three CPU/chip set hosts on a PEX 8696, each with its own dedicated I/O]<br />

* Up to 8 CPUs supported<br />

54


Use of Mode 2 for Fail-over<br />

[Diagram: two CPU/chip set hosts, each with its own PEX 8664; x4 & x8 links to six endpoints for fail-over]<br />

55


Packet-Ahead<br />

Two Virtual Channels Implementation<br />

Traffic Class re-mapping<br />

56


Traffic Classes and Virtual Channels<br />

‣ Traffic Class (TC)<br />

• Specifies priority for a given <strong>PCI</strong> <strong>Express</strong> packet<br />

• <strong>PCI</strong> <strong>Express</strong> supports Eight TCs: TC0 – TC7<br />

• TC7 Highest Priority<br />

• TC for a Request/Completion pair must be the same<br />

‣ Virtual Channel (VC)<br />

• Buffer entity used to queue <strong>PCI</strong>e packets<br />

• 1 VC = 1 Buffer entity<br />

• 2 VC = 2 Buffer entities<br />

• VC assignment for a given TC according to priority<br />

• Low Priority Traffic shares one VC<br />

• High Priority Traffic has its own VC<br />

57


One Wire, Multiple Traffic Flows<br />

[Diagram: one <strong>PCI</strong> <strong>Express</strong> wire carrying three flows — TC7/VC2, TC6-TC4/VC1, and TC3-TC0/VC0]<br />

‣ Traffic Class determines priority for packets<br />

‣ Packets mapped to VCs according to Priority<br />

• Highest Priority: TC7 → VC2<br />

• Lower Priority: TC6-TC4 → VC1<br />

• Lowest Priority: TC3-TC0 → VC0<br />

‣ One VC → Same priority for ALL packets<br />
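The mapping on this slide, as a small function (Python; the PCIe spec leaves TC/VC mapping configurable per link — this hard-codes the three-VC example above):<br />

```python
def vc_for_tc(tc):
    """Sketch of the 3-VC mapping from the slide: TC7 (highest
    priority) -> VC2, TC6-TC4 -> VC1, TC3-TC0 -> VC0."""
    if not 0 <= tc <= 7:
        raise ValueError("PCIe defines TC0-TC7 only")
    if tc == 7:
        return 2               # highest priority gets its own VC
    return 1 if tc >= 4 else 0 # mid vs. low priority

# TC0..TC7 map to VC 0,0,0,0,1,1,1,2
mapping = [vc_for_tc(t) for t in range(8)]
```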

58


TC – VC Mapping in <strong>PCI</strong>e Hierarchy<br />

‣ Each Device supports different number of VCs<br />

‣ TC/VC mapping according to device capabilities<br />

• On a per link basis<br />

‣ VC arbitration schemes enable QoS<br />

‣ No ordering between VCs; independent buffers<br />

59


Packet-Ahead Feature<br />

‣ Allows the NT port to modify the original Traffic Class<br />

(TC) of a <strong>PCI</strong>e packet<br />

• From TC0 to TCx<br />

• where TCx is TC1 – TC7<br />

‣ Benefits<br />

• Provides two separate data paths for memory traffic<br />

• Low priority, High priority<br />

• Enhanced QoS regardless of CPU single VC limitation<br />

• Differentiation of traffic in single VC systems<br />

• Available in PEX8618, PEX8614 and PEX8608<br />

60


Example Without Packet-Ahead<br />

‣ CPU supports VC0 and TC0 only<br />

• No differentiation of traffic<br />

‣ Endpoints and Switch support two VCs and at least two TCs<br />

‣ Single path to CPU<br />

‣ System is limited by CPU capabilities<br />

[Diagram: CPU above a PEX 8618 with an NT port; three ASICs downstream]<br />
61


Example with Packet-Ahead<br />

‣ Same CPU limitations<br />

• VC0 and TC0 only<br />

‣ Same Endpoint and Switch capabilities<br />

• VC0 – VC1; TC0 – TC1<br />

‣ Two paths to CPU<br />

• Via Upstream Port and NT Port<br />

‣ For packets received on NT Port<br />

• TC is changed from TC0 to TC1<br />

• Packets are mapped to High-Priority VC1<br />

‣ Packets received on the upstream port are unaffected<br />

[Diagram: CPU above a PEX 8618 with an NT port; three ASICs downstream]<br />

62


Packet-Ahead Transaction Details<br />

‣ Posted Traffic (mem writes)<br />

• CPU generates Posted Packet with TC0<br />

• NT port modifies packet with TC1<br />

‣ Non-Posted (Read Requests)<br />

• CPU generates Read Request with TC0<br />

• NT port modifies packet with TC1<br />

• Endpoint sinks requests and provides completion with TC1<br />

• NT port modifies completion packet to original TC0<br />
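The remapping in both directions can be sketched as follows (Python; the dict-based packet representation is purely illustrative):<br />

```python
def nt_port_remap(packet, tcx=1):
    """Sketch of the Packet-Ahead flow above: the NT port promotes
    requests from TC0 to TCx on the way in, and restores the original
    TC0 on completions heading back, so a TC0-only CPU still gets two
    traffic classes (and thus two VCs) beyond the NT port."""
    p = dict(packet)
    if p["kind"] in ("MWr", "MRd") and p["tc"] == 0:
        p["tc"] = tcx            # promote request to the high-priority TC
    elif p["kind"] == "Cpl":
        p["tc"] = 0              # restore the CPU's original TC0
    return p

req = nt_port_remap({"kind": "MRd", "tc": 0})    # request promoted to TC1
cpl = nt_port_remap({"kind": "Cpl", "tc": 1})    # completion restored to TC0
```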

63


Direct Memory Access (DMA)<br />

Inside PLX <strong>PCI</strong>e Switches<br />

64


DMA Benefits<br />

‣ Independent Data Mover<br />

• Can transfer small and large blocks of data<br />

• No CPU involvement<br />

• Can transfer data between all switch ports<br />

‣ Centralized DMA Engine<br />

• Processor/chipset no longer needs to support DMA<br />

• More selection → lower cost<br />

• Software consolidation through multiple platforms<br />

• Software code for one DMA engine<br />

‣ Improves system performance<br />

• Low latency transfers while sustaining Gen 2 speeds<br />

65


<strong>PCI</strong>e Switch with DMA<br />

‣ <strong>PCI</strong>e Switch is a Multi-Function device<br />

• Function 0 P-to-P bridge<br />

• Transparent Switch model<br />

• No Driver Required<br />

• Function 1 DMA endpoint<br />

• Type 0 Configuration Header<br />

• Memory mapped registers<br />

• Requires DMA Driver<br />

• Provided by PLX<br />

‣ Available now<br />

• PEX8619: 16-lane/16-Port<br />

• PEX8615: 12-lane/12-Port<br />

• PEX8609: 8-lane/8-Port<br />

[Diagram: Upstream Port — P-P Bridge (Function 0) and DMA endpoint (Function 1) on a Virtual Bus above four downstream P-P Bridge ports]<br />

66


DMA Implementation<br />

‣ Four DMA Channels – Each Channel:<br />

• Works on one descriptor at a time<br />

• Has a unique Requester ID (RID)<br />

• Has a programmable traffic class (TC) for QoS<br />

‣ DMA descriptor<br />

• Specifies source address, destination address, transfer size, control<br />

• Internal or external<br />

‣ DMA Read Function<br />

• Initiates Read Requests and collects completions by matching RID<br />

and Tag number<br />

‣ DMA Write Function<br />

• Converts Read Completion streams into Memory write streams<br />

67


DMA Descriptor Overview<br />

‣ Descriptors are instructions for DMA<br />

• Written by CPU<br />

• Stored in a ring in Host Memory<br />

OR<br />

• Stored internal to the <strong>PCI</strong>e Switch<br />

‣ 16B standard format<br />

• Supports 32-bit Addressing<br />

• Control/Status information<br />

‣ 16B extended format<br />

• Supports 64-bit Addressing<br />

• Control/Status information<br />

[Diagram: Descriptor Ring in host memory (Descriptor 0 … N-1), 32 bits wide; each descriptor holds DstAddr, SrcAddr, SrcAddrH/DstAddrH, and Transfer Size + Control]<br />
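A sketch of packing such a 16-byte extended-format descriptor (Python; the exact field order and bit widths are assumptions for illustration, not the documented PEX register layout):<br />

```python
import struct

def make_descriptor(src, dst, size, control=0):
    """Illustrative packing of a 16-byte extended-format descriptor as
    four 32-bit little-endian words: destination address, source
    address, the upper address bits of each, and transfer size plus
    control flags."""
    high = ((src >> 32) & 0xFFFF) | (((dst >> 32) & 0xFFFF) << 16)
    return struct.pack("<IIII",
                       dst & 0xFFFFFFFF,              # DstAddr (low 32 bits)
                       src & 0xFFFFFFFF,              # SrcAddr (low 32 bits)
                       high,                          # SrcAddrH, DstAddrH
                       (size & 0x00FFFFFF) | ((control & 0xFF) << 24))

desc = make_descriptor(src=0x1_0000_0000, dst=0x2000, size=4096)
# Each descriptor is exactly 16 bytes
```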

68


DMA Descriptor Prefetch<br />

‣ DMA channel prefetches 1 to 4 descriptors at a time when in<br />

external descriptor mode<br />

‣ Internal buffer support for up to 256 descriptors<br />

Active Channels / Descriptors per Channel:<br />

• 1 channel: 256 descriptors<br />

• 2 channels: 128 descriptors each<br />

• 4 channels: 64 descriptors each<br />

‣ Descriptors are prefetched to internal buffer until filled<br />

• Control in place for # of descriptors to be prefetched<br />

• 1, 4, 8, or max per channel<br />

‣ Invalid descriptors are dropped<br />

• no further descriptor fetch until software clears status<br />

• interrupt optionally enabled<br />
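The even split above can be stated directly (Python; the function name is ours, and the 256-entry total comes from the slide):<br />

```python
def descriptors_per_channel(active_channels, buffer_total=256):
    """Internal descriptor buffer entries available to each active DMA
    channel: the 256-entry buffer is split evenly across channels,
    matching the table above (1 -> 256, 2 -> 128, 4 -> 64)."""
    if active_channels not in (1, 2, 4):
        raise ValueError("the slide lists 1, 2, or 4 active channels")
    return buffer_total // active_channels
```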

69


DMA Runtime Flow – Host to I/O<br />

[Diagram: CPU and Memory behind the Root Complex (RC), a Switch with the DMA engine, and FPGA/ASIC endpoints; steps 1, 2, and 6 below are CPU tasks]<br />

1. CPU Programs Descriptors in RAM<br />

2. CPU enables DMA<br />

3. DMA reads Descriptors in RAM<br />

a. DMA prefetches 1-256 descriptors<br />

4. DMA works on 1 descriptor at a time<br />

a. DMA reads source<br />

b. Completions arrive in switch<br />

c. Completions are converted to Writes<br />

d. DMA Writes to Destination<br />

e. (There can be multiple read/write per<br />

descriptor)<br />

f. Clears Valid Bit on Descriptor after last<br />

write (optional)<br />

g. Interrupt CPU after descriptor (optional)<br />

h. Start next descriptor<br />

5. End of Ring (DMA Done)<br />

6. CPU receives and handles interrupts<br />

70


DMA Performance<br />

‣ Ordering enforced within a DMA channel only<br />

• Descriptors are read in order from Host Memory<br />

• Data within a descriptor is moved in order<br />

• Read Requests (MRd) are strictly ordered<br />

• Partial completions per MRd follow <strong>PCI</strong>e Spec – Out of order tags<br />

will be re-ordered on the chip<br />

• Write Requests (MWr) are strictly ordered<br />

‣ Full Line Rate Throughput<br />

• One channel can saturate one link in one direction<br />

• x8 at 5GT/s with 64B Read Completions<br />

• Two channels can saturate both directions<br />

• x8 at 5GT/s with 64B Read Completions<br />

‣ Programmable Interrupt Control<br />

• Fewer interrupts → lower CPU utilization<br />

‣ Data-rate controls limit the maximum read bandwidth<br />

• Transfer up to Max Read Size every X clocks (programmable)<br />

71


Data Integrity<br />

72


Data Integrity and Error Isolation<br />

‣ PLX devices protect against <strong>PCI</strong>e errors<br />

• providing a robust system through data integrity & error isolation<br />

‣ <strong>PCI</strong>e Error Types<br />

• Malformed packets<br />

• EP (Poisoned TLPs) & ECRC errors<br />

• 1-bit ECC & 2-bit ECC<br />

• LCRC<br />

• PHY Errors: Disparity, 8b/10b Encoding, Scrambler & Framing<br />

• Receiver Overflow, Flow Control Protocol Error<br />

• PLX Device-Specific: ECC, UR overflow<br />

73


Data Integrity<br />

‣ Internal data path protection from ingress to egress<br />

• Complete data protection through ECC<br />

• ECRC on ingress and egress ports<br />

‣ Higher performance by reducing re-transmission<br />

74


Error Isolation – Fatal Errors<br />

‣ User-selectable behavior on fatal errors<br />

‣ Malformed Packet or Internal Fatal Error handling<br />

• Mode 1 (Default)<br />

• Assert FATAL_ERR# pin and send Error Message<br />

• Mode 2<br />

• Generate Internal Reset (equivalent to in-band hot reset)<br />

• Mode 3<br />

• Block all packet transmissions<br />

• Cancel packets in transit with EDB<br />

• Mode 4<br />

• Block all packet transmission<br />

• Bring upstream link down to cause surprise down<br />

75


Error Isolation – EP/ECRC<br />

‣ User-selectable behavior on packet errors<br />

‣ Poisoned packet (EP) & ECRC error handling<br />

• EP/ECRC Mode 1 (Default)<br />

• Forwarded with appropriate logging<br />

• EP/ECRC Mode 2<br />

• Drop EP/ECRC packet<br />

• Not forwarded, only logged<br />

• EP/ECRC Mode 3<br />

• Block violating device<br />

• Drop and Block EP/ECRC packet<br />

76


Gen 1 Electrical & Mechanical Summary<br />

Device | Package Size* (mm²) | Typical Power (W) | Max. Power (W)<br />
PEX 8505 | 15 x 15 | 0.8 | 1.4<br />
PEX 8508 | 19 x 19 | 1.6 | 2.5<br />
PEX 8509 | 15 x 15 | 1.2 | 1.8<br />
PEX 8511 | 15 x 15 | 1.0 | 1.6<br />
PEX 8512 | 23 x 23 | 2.2 | 3.1<br />
PEX 8513 | 19 x 19 | 1.3 | 2.6<br />
PEX 8516 | 27 x 27 | 3.2 | 4.3<br />
PEX 8517 | 27 x 27 | 2.6 | 3.6<br />
PEX 8518 | 23 x 23 | 2.6 | 3.6<br />
PEX 8519 | 19 x 19 | 1.7 | 2.6<br />
PEX 8524 | 31 x 31 | 3.9 | 6.1<br />
PEX 8525 | 31 x 31 | 2.6 | 3.8<br />
PEX 8532 | 35 x 35 | 5.7 | 7.4<br />
PEX 8533 | 35 x 35 | 3.3 | 4.8<br />
PEX 8547 | 37.5 x 37.5 | 4.9 | 7.1<br />
PEX 8548 | 37.5 x 37.5 | 4.9 | 7.1<br />

‣ Voltages – 3 sources<br />
• 1.0 V Core<br />
• 1.5 V SerDes I/O<br />
• 3.3 V I/O<br />

‣ Thermal<br />
• Industrial Temp available on most products<br />

* All PBGA, 1.0 mm pitch<br />
Typical: 35% lane utilization, typical voltages, 25 °C ambient temperature<br />
Maximum: 85% lane utilization, max operating voltages, across the industrial temperature range<br />

77


Gen 2 Electrical & Mechanical<br />

Device | Package Size* (mm²) | Typical Power, Gen 2 (W) | Typical Power, Gen 1** (W) | Max. Power, Gen 2 (W) | Max. Power, Gen 1** (W)<br />
PEX 8648 | 27 x 27 | 3.7 | 3.0 | 8.6 | 7.9<br />
PEX 8647 | 27 x 27 | 2.8 | 2.1 | 7.4 | 6.7<br />
PEX 8632 | 27 x 27 | 2.7 | 2.2 | 6.4 | 5.9<br />
PEX 8624 | 19 x 19 | 1.9 | 1.5 | 4.6 | 4.3<br />
PEX 8616 | 19 x 19 | 1.7 | 1.5 | 4.3 | 4.1<br />
PEX 8619 | 19 x 19 | 1.80 | 1.58 | 4.53 | 4.08<br />
PEX 8618 | 19 x 19 | 1.75 | 1.54 | 4.47 | 4.04<br />
PEX 8617 | 19 x 19 | 1.6 | 1.4 | 4.26 | 3.85<br />
PEX 8612 | 19 x 19 | 1.6 | 1.4 | 4.2 | 4.0<br />
PEX 8615 | 19 x 19 | 1.6 | 1.37 | 4.01 | 3.65<br />
PEX 8614 | 19 x 19 | 1.5 | 1.33 | 3.96 | 3.60<br />
PEX 8613 | 19 x 19 | 1.4 | 1.2 | 3.80 | 3.4<br />
PEX 8609 | 15 x 15 | 1.33 | 1.16 | 3.51 | 3.20<br />
PEX 8608 | 15 x 15 | 1.31 | 1.15 | 3.51 | 3.20<br />
PEX 8606 | 15 x 15 | 1.25 | 1.09 | 3.32 | 3.04<br />
PEX 8604 | 15 x 15 | 1.18 | 1.03 | 3.13 | 2.89<br />

‣ Voltages – 2 sources<br />
• 1.0 V Core & SerDes I/O<br />
• 2.5 V I/O<br />

‣ Thermal<br />
• Commercial Temp<br />

** Preliminary Estimates<br />
Typical: 35% lane utilization, typical voltages, 25 °C ambient, L0s mode<br />
Maximum: 85% lane utilization, L0 mode, max operating voltages<br />

78


End of Presentation<br />

Thank You<br />

www.plxtech.com<br />

79
