PCI Express
PLX Overview<br />
<strong>PCI</strong> <strong>Express</strong> & Storage<br />
- Two Fast Growing Markets<br />
- Now over 50% of Company’s Revenue<br />
<strong>PCI</strong> <strong>Express</strong><br />
Switches & Bridges<br />
Storage<br />
Controllers<br />
Connectivity<br />
USB & <strong>PCI</strong> Bridges,<br />
Controllers & UARTs<br />
PLX Revenue Split<br />
by Product Line<br />
Public NASDAQ Company (PLXT)<br />
Financially Solid with Zero Debt, Cash Flow Positive<br />
2
Market Leader<br />
#1 Supplier <strong>PCI</strong> <strong>Express</strong> Interconnect<br />
<strong>PCI</strong> <strong>Express</strong><br />
Switches & Bridges<br />
- Over 55% Market Share*<br />
- Designs with all Market Leaders<br />
- 4 Million Units Shipped<br />
- Broadest Offering of Switches & Bridges<br />
Communications<br />
Embedded<br />
Server<br />
Storage<br />
PC/Consumer<br />
* 55% for Bridges & Switches Combined. Over 65% for Switches.<br />
3
<strong>PCI</strong>e Switch Is a Basic Building Block<br />
[Diagram: a PCI Express switch at the center, connecting SPARC CPUs with native PCIe, x86 processors, chipsets, ASICs/logic/FPGAs, network/security/graphics co-processors, and communication & storage devices over PCI Express]<br />
Cleaner, Lower Cost, Lower Power<br />
Switch is a basic building block<br />
4
Product Summary<br />
& Road Maps<br />
PLX Products<br />
No NDA Required<br />
5
1st Gen 2 Family<br />
<strong>PCI</strong>e Gen 2 Switch Road Map<br />
Server, Storage & Dual Graphics<br />
PEX 8648<br />
48 Lanes, 12 Ports, NT<br />
PEX 8632<br />
32 Lanes, 12 Ports, NT<br />
2KB, 3HPC, DC, RP, DB<br />
PEX 8624<br />
24 Lanes, 6 Ports, NT<br />
2KB, 3HPC, DC, RP, DB<br />
PEX 8616<br />
16 Lanes, 4 Ports, NT<br />
2KB, 3HPC, DC, RP, DB<br />
PEX 8612<br />
12 Lanes, 3 Ports, NT<br />
2KB, 3HPC, DC, RP, DB<br />
Multi-Host MR<br />
& Multicast<br />
Blade & Rack Servers,<br />
Storage & Networking<br />
PEX 8647<br />
48 Lanes, 3 x16 Ports<br />
2KB, DC, DB, RP<br />
High Port Count, 2VC, DMA<br />
& SSC for Control Plane<br />
Networking, Embedded & Storage<br />
PEX 8617<br />
16 Lanes, 4 Ports, NT<br />
PEX 8618<br />
16 Lanes, 16 Ports, NT<br />
PEX 8613<br />
12 Lanes, 3 Ports, NT<br />
PEX 8614<br />
12 Lanes, 12 Ports, NT<br />
2VC, DC, RP, SSC, DB<br />
PEX 8608<br />
8 Lanes, 8 Ports, NT<br />
2VC, DC, RP, SSC, DB<br />
PEX 8606<br />
6 Lanes, 6 Ports, NT<br />
PEX 8604<br />
4 Lanes, 4 Ports, NT<br />
2VC, DC, RP, SSC, DB<br />
PEX 8696<br />
96 Lanes, 24 Ports, NT<br />
Multi-Root/Host<br />
PEX 8680<br />
80 Lanes, 20 Ports, NT<br />
Multi-Root/Host<br />
PEX 8664<br />
64 Lanes, 16 Ports, NT<br />
Multi-Root/Host<br />
PEX 8649<br />
48 Lanes, 12 Ports, NT<br />
Multi-Root (MR)/Host & Multicast<br />
PEX 8619<br />
16 Lanes, 16 Ports, NT<br />
2VC, DC, RP, SSC, DB, DMA<br />
PEX 8615<br />
12 Lanes, 12 Ports, NT<br />
2VC, DC, RP, SSC, DB, DMA<br />
PEX 8609<br />
8 Lanes, 8 Ports, NT<br />
2VC, DC, RP, SSC, DB, DMA<br />
Low Lane<br />
Count<br />
General Purpose<br />
Shipping Now<br />
In Development<br />
Planned/Concept<br />
2008 2009 2010<br />
Approximate Timeframes<br />
7
PLX Exclusive Features<br />
‣ Here are the exclusive features we will talk about today:<br />
• visionPAK TM Suite<br />
• Extraction of Receive Data “Eye Width”<br />
• <strong>PCI</strong>e Packet Generator<br />
• Performance Monitoring<br />
• Error Injection<br />
• SerDes Loopback Modes<br />
• performancePAK TM Suite<br />
• Read Pacing<br />
• Multicast<br />
• Dynamic Buffer Allocation<br />
‣ All valuable in both Gen 2 and Gen 1 modes!!<br />
8
visionPAK Suite<br />
System Debug Features<br />
Exclusive to PLX 8600 Switches<br />
9
Best-in-Class Gen 2 SerDes<br />
‣ Best-in-class SerDes (ARM)<br />
• Can support up to 60 inches of trace length for backplanes<br />
‣ Optional programmable features<br />
• Transition amplitude and Non-transitional amplitude<br />
• Pre-emphasis, De-emphasis and drive strength granularity to 50mV<br />
• Receiver Detect and Electrical Idle bits<br />
• SerDes BIST and AC JTAG<br />
• Automatic impedance calibration<br />
Non-Transitional Eye<br />
Transitional Eye<br />
10
Extraction of Receiver Eye Width<br />
‣ What is it?<br />
• PLX Exclusive pre-system design tool<br />
• Used on PLX RDK to test how clean a link is<br />
• Early indicator of potential link issues<br />
• A very open eye means a very reliable link<br />
• A tight eye indicates a weak link<br />
• Supported by all PEX 86xx switches<br />
11
Extraction of Receiver Eye Width<br />
‣ How does it work?<br />
1. User selects port, lane, Rx equalization, dwell time, etc.<br />
2. Software finds the center of the eye<br />
3. Software steps through the eye left & right until data errors occur<br />
4. Determines eye width and displays on the screen<br />
Center of the Eye<br />
Eye Width<br />
Steps Left<br />
Steps Right<br />
12
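The four-step sweep above amounts to a simple search outward from the eye center. In this sketch, `bit_errors_at(offset)` is a hypothetical callback standing in for the per-lane data-error check at a given sampling offset; it returns True once the edge of the eye is reached.

```python
def measure_eye_width(bit_errors_at, max_steps=64):
    """Step the sampling point left and right from the eye center
    until data errors occur, then report the eye width in steps."""
    left = 0
    while left < max_steps and not bit_errors_at(-(left + 1)):
        left += 1          # step left until errors occur
    right = 0
    while right < max_steps and not bit_errors_at(right + 1):
        right += 1         # step right until errors occur
    return left + right    # total eye width in sampling steps
```

The Minimum Eye Width Test described on the next slide is then just a comparison of this value against a per-lane threshold.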
Extraction of Receiver Eye Width<br />
‣ Minimum Eye Width Test<br />
• Customer selects Port, Lanes, and Minimum Eye Width<br />
• Software tests all selected lanes for Minimum Eye Width<br />
• Returns “Pass” or “Fail” for each selected lane<br />
• Extremely convenient and pain-free tool for Customers<br />
13
Extraction of Receiver Eye Width<br />
‣ Auto-Calibrate Feature<br />
Rx = Receiver<br />
• Customer selects Port and Lane and range of Rx Equalization<br />
• Software steps through each eye and finds optimal setting<br />
• Returns value of Rx Equalization that gives best eye width<br />
• Extremely convenient and useful tool for Customers<br />
14
Extraction of Receiver Eye Width<br />
‣ Customer Benefits<br />
• Allows customer to see how much margin their<br />
link has at the PEX 86xx Receiver<br />
• Convenient features<br />
• Minimum Eye Width Test<br />
• Auto-Calibrate Feature<br />
• Identify potential link issues earlier<br />
…GET TO MARKET FASTER!!!<br />
15
<strong>PCI</strong>e Packet Generator<br />
‣ What is it?<br />
• PLX Exclusive feature that allows customers to create<br />
their own traffic patterns using a PLX switch<br />
• Enables high density traffic<br />
• Up to x16 port saturation (not easy to achieve)<br />
• Can create error messages<br />
• See how software/system reacts to errors<br />
• PLX RDKs can be used as <strong>PCI</strong>e packet generators<br />
• Great alternative to expensive <strong>PCI</strong>e Exercisers<br />
• All-in-one solution<br />
16
<strong>PCI</strong>e Packet Generator<br />
‣ User-programmable traffic<br />
• Memory Reads/Writes<br />
• Payload Size<br />
• <strong>PCI</strong> Address<br />
‣ View<br />
command list<br />
‣ Create<br />
looped traffic<br />
17
<strong>PCI</strong>e Packet Generator<br />
‣ Customer Benefits<br />
• Convenient, inexpensive way of stress testing their system<br />
• Programmable traffic<br />
• Ability to fully saturate their links<br />
• Test system software<br />
• No external equipment needed<br />
…SAVES $$$!!!<br />
18
Performance Monitor<br />
‣ What is it?<br />
• PLX Exclusive feature that allows customers to monitor<br />
PEX 86xx switch performance real-time<br />
• Displays performance for each individual port<br />
• Displays Ingress and Egress performance<br />
• Completely passive<br />
• Does not impact system performance<br />
19
‣ How does it work?<br />
Performance Monitor<br />
• Customer selects Port to monitor<br />
• Software reads PEX 86xx registers and displays the data<br />
• Link Utilization % indicates how much link capacity is unused<br />
• Displays total rate, payload rate, reads vs. writes, etc.<br />
20
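The Link Utilization figure can be derived from the raw byte counters. This sketch assumes Gen 2 signaling (5 GT/s with 8b/10b encoding), so each lane carries 500 MB/s of usable bandwidth per direction; reading the actual per-port counter registers is abstracted away.

```python
def link_utilization(total_bytes, interval_s, lanes, gtps=5.0):
    """Percent of link capacity used over a sampling interval.

    gtps GT/s with 8b/10b encoding gives gtps * 0.8 / 8 GB/s of usable
    bandwidth per lane per direction; total_bytes stands in for the
    delta of the per-port byte counters over interval_s seconds.
    """
    bytes_per_sec_per_lane = gtps * 0.8 / 8 * 1e9   # 500 MB/s per Gen 2 lane
    capacity = bytes_per_sec_per_lane * lanes * interval_s
    return 100.0 * total_bytes / capacity
```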
Performance Monitor<br />
‣ Graphically ‘see’ the traffic on each port during runtime<br />
Displayed per port: Total Byte Rate, % Link Utilization, Payload Byte Rate, Average Payload Size (Bytes)<br />
21
Performance Monitor<br />
‣ Count Ingress & Egress TLPs for every port<br />
• Extensive Granularity<br />
• Posted header & dword<br />
• Non-Posted dword<br />
• Completion header & dword<br />
• Filter for various types of Posted and Non-Posted packets<br />
• For example, count just MWr64<br />
‣ Count Ingress & Egress DLLPs for every port<br />
• Extensive Granularity<br />
• Filter for ACKs<br />
• Filter for NAKs<br />
• Filter for UpdateFCs (Posted, Non-Posted, or Completions)<br />
22
Performance Monitor<br />
‣ Customer Benefits<br />
• Allows customers to track real-time link utilization<br />
• Helps find any weak links & potential bottlenecks<br />
• Gives additional visibility into traffic patterns<br />
• Convenient, inexpensive system bring-up tool<br />
…THE COMPLETE SOLUTION!!!<br />
23
Error Injection<br />
‣ What is it?<br />
• Software development tool<br />
• Inject ECC errors in <strong>PCI</strong>e packet<br />
• Inject <strong>PCI</strong>e error messages<br />
‣ Customer Benefits<br />
• Allows a customer to see how their software would<br />
react to these errors<br />
…SPEEDS UP SOFTWARE DEVELOPMENT!!!<br />
24
Loopback<br />
‣ Customer Benefits<br />
• Four convenient ways to test the SerDes and Logic of the<br />
PEX 86xx and/or connected device<br />
• Internal Tx<br />
• External Tx<br />
• Recovered CLK<br />
• Recovered Data<br />
…FOUR LOOPBACK MODES!!!<br />
25
Internal Transmitter Loopback<br />
‣ Used to test PLX SerDes and Logic<br />
26
External Transmitter Loopback<br />
‣ Used to test the link and logic of connected device<br />
27
Recovered Clock Loopback<br />
‣ External Tx test plus also tests PLX Recovered Clock Circuit<br />
28
Recovered Data Loopback<br />
‣ External Tx test plus also tests PLX Logic<br />
29
Easy Debug via I²C<br />
‣ Customers are recommended to design in an I²C connector<br />
• Allows easy connection to a PLX RDK via laptop or PC (USB)<br />
• Enables systems without a Windows or Linux OS to run the SDK through a<br />
remote system (e.g. a laptop)<br />
• Convenient for debug purposes<br />
• Ideal for on-site FAE support<br />
[Photo: PLX RDK board in a system chassis – PEX 8548 with Port 0 (x16, upstream), Port 8 (x16) and Port 12 (x16), per-port status LEDs, I²C and USB connectors, EEPROM, JTAG, RefClk, DIP switches for configuration, PERST#, manual reset, and HD power connector]<br />
30
performancePAK Suite<br />
Performance Enhancing Features<br />
Exclusive to PLX 8600 Switches<br />
31
PEX 8600 Performance<br />
‣ PLX Leadership in Gen 2 Performance<br />
‣ Achieving >99% theoretical max throughput<br />
in Host-Centric and Peer-to-Peer environments<br />
‣ Proven in simulations & actual measurements<br />
Stay ahead of the pack with PLX!<br />
‣ Featuring performancePAK<br />
• Read Pacing<br />
• Dual Cast & Multicast<br />
• Dynamic Buffer Allocation<br />
• DMA-engine inside<br />
[Charts: x16 Gen 2 Host-Centric Throughput vs. packet size (64B – 2048B), showing PEX 8600 bidirectional throughput tracking the theoretical maximum; 10GE throughput, native vs. with PEX 8624 (≈100% of native), for one thread and multi-thread]<br />
32
Read Pacing<br />
‣ Problem – Reduced endpoint performance caused by:<br />
• Unbalanced upstream/downstream link-widths<br />
• Uneven number of Read Requests made by endpoints<br />
• Leads to one endpoint dominating Root Complex queue<br />
• Other endpoints get starved<br />
‣ Solution – PLX Read Pacing*<br />
• Read Pacing Queues manage incoming Read Requests<br />
• Prevents one endpoint from dominating Root Complex queue<br />
• Ensures no endpoint is starved<br />
• Allows for optimized performance of endpoints<br />
* Patents pending<br />
34
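The queueing idea can be illustrated with a toy round-robin model: the switch releases at most a threshold number of pending Read Requests per port at a time, so no single endpoint monopolizes the Root Complex queue. The one-request-per-port-per-turn policy here is an assumption for illustration; in the PEX 8600 the thresholds are programmable.

```python
from collections import deque

def service_order(requests_per_port, threshold=1):
    """Return the order in which queued Read Requests reach the Root
    Complex when the switch paces each port round-robin.

    requests_per_port maps a port name to its queued Read Requests.
    """
    queues = {p: deque(reqs) for p, reqs in requests_per_port.items()}
    order = []
    while any(queues.values()):
        for port, q in queues.items():          # round-robin across ports
            for _ in range(min(threshold, len(q))):
                order.append((port, q.popleft()))
    return order
```

With a "port hog" queuing three 2KB requests and a NIC queuing two 1KB requests, the NIC's requests are interleaved rather than stuck at the end of the line.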
Without Read Pacing<br />
‣ Performance bottleneck due to mixing of slow<br />
and fast I/Os<br />
1. FC HBA makes multiple 2KB Read Requests<br />
2. Root Complex queues FC HBA requests<br />
3. Ethernet NIC makes one 1KB Read Request<br />
4. Root Complex queues Ethernet NIC request<br />
5. Ethernet NIC must wait for the RC to service the FC HBA<br />
requests before servicing the Ethernet NIC request<br />
6. Ethernet NIC is starved<br />
Reduced Ethernet NIC Performance!<br />
[Diagram: an FC HBA (x4) sending 2KB Read Requests and an Ethernet NIC (x4) sending 1KB Read Requests through a x8 switch; the Root Complex IN queue fills with 2KB FC HBA requests, leaving the Ethernet NIC packet at the end of the line]<br />
35
With PLX Read Pacing<br />
‣ Increased performance due to fair allocation of<br />
bandwidth to downstream ports<br />
1. FC HBA makes multiple 2KB Read Requests<br />
2. Switch allows one FC HBA request at a time to<br />
pass through based on programmable thresholds<br />
3. Ethernet NIC makes one 1KB Read Request<br />
4. Switch allows one request to pass through the<br />
switch based on programmable thresholds<br />
5. Switch continues to allow Ethernet NIC requests to<br />
pass through in front of large FC HBA requests<br />
based on programmable settings<br />
6. Ethernet NIC gets serviced more often with no<br />
impact to FC HBA performance<br />
7. Neither endpoint is starved<br />
Optimized Performance!<br />
[Diagram: the FC HBA (x4, sending 2KB Read Requests) and the Ethernet NIC (x4, sending 1KB Read Requests) each feed a Read Pacing Queue inside the PEX 8600; the x8 Root Complex IN queue now interleaves 2KB and 1KB Read Requests, so Ethernet NIC packets are fairly queued]<br />
36
Read Pacing Measured<br />
Read Pacing Comparison – Throughput (MB/s)<br />
Endpoint | Standalone | Read Pacing Off | Read Pacing On<br />
PLX Packet Generator (“Port Hog”) | 498.355 | 496.315 | 496.125<br />
GE NIC (“Starved Port”) | 112.15 | 20.87 | 112.24<br />
% of Standalone Performance<br />
Endpoint | Read Pacing OFF | Read Pacing ON<br />
GE NIC | 18.6 | 100.1<br />
PLX Packet Generator | 99.6 | 99.6<br />
[Diagram: Root Complex (x8) above a switch connecting the PLX Packet Generator (x4, the “Port Hog”) and a GE NIC (x4, the “Starved Port”)]<br />
‣ PLX <strong>PCI</strong>e Packet Generator<br />
• Used to mimic a “fast” I/O<br />
(i.e. FC HBA)<br />
• Sending back to back Memory<br />
Read Requests to Host<br />
37
<strong>PCI</strong>e Multicast<br />
‣ Address Based<br />
• Multicast BAR creates a Multicast space<br />
• Posted packets that hit in the Multicast BAR are multicast<br />
‣ Reliability<br />
• <strong>PCI</strong>e has a low error rate and hop-by-hop error-free transmission<br />
‣ Supports Legacy<br />
• Any source can send posted packet in Multicast space<br />
• Legacy devices can be multicast targets<br />
‣ Multicast ECN<br />
• Specifies how Switches, Root Complexes, and End Points implement Multicast<br />
• Improves flexibility and protection for End Points participating in Multicast<br />
• Also allows use of legacy End Points<br />
38
Multicast – Example<br />
‣ Support for 64 Multicast (MC) Groups<br />
‣ All ports can be programmed as MC source<br />
‣ One source port can have multiple MC groups<br />
‣ One destination port can be part of multiple MC groups<br />
‣ MC can be done across an NT port<br />
[Diagram: two systems (CPU + Chipset) above PEX 8680 switches; a source port multicasts to destination ports attached to IOs and GPUs, including across an NT port]<br />
CPU sending single command to multiple IOs (Group 1)<br />
CPU sending single command to multiple GPUs (Group 2)<br />
39
Multicast Memory Space<br />
[Diagram: Multicast memory space starting at Multicast_Base, divided into regions of 2^Multicast_Index_Position bytes – Multicast Group 0 through Multicast Group n memory spaces]<br />
40
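The address decode implied by the layout above is a range check plus a shift: a posted write that lands inside the Multicast window is assigned to the group whose region it falls in. `multicast_group` is a hypothetical helper, with Multicast_Base and Multicast_Index_Position as the BAR parameters and 64 groups as on the preceding slide.

```python
def multicast_group(addr, mc_base, index_pos, num_groups=64):
    """Return the Multicast Group number for an address that hits the
    Multicast window, or None if the address falls outside it.

    The window starts at mc_base and holds num_groups regions of
    2**index_pos bytes each.
    """
    region = 1 << index_pos
    if mc_base <= addr < mc_base + num_groups * region:
        return (addr - mc_base) >> index_pos   # region index = group number
    return None
```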
Multicast and Address Routing<br />
<strong>PCI</strong>e Standard Address Route<br />
Multicast Address Route<br />
‣ Request that hits a Multicast address<br />
range is routed unchanged to Ports<br />
that are part of the Multicast Group<br />
derived from Request address<br />
‣ <strong>PCI</strong>e standard address route not used<br />
for multicast<br />
• Including default upstream route<br />
41
Peer-to-Peer & Peer Plus Host Multicast<br />
[Diagram: Root above a PCIe Switch; P2P bridges on a Virtual PCI Bus fan out to four Endpoints, supporting both peer-to-peer and peer-plus-host multicast]<br />
42
MC in Graphics & Floating Point Acceleration<br />
‣ Dual-headed graphics<br />
• Each GPU paints ½ the screen<br />
• Multicast commands downstream<br />
• E.g. a vector list<br />
• Use peer-to-peer to transfer the bit map<br />
from GPU2 to GPU1<br />
‣ General Floating Point acceleration<br />
• Some GPUs need to see same data<br />
• Push data or commands downstream<br />
to multiple GPUs/FPUs<br />
[Diagram: CPU/Root Complex connected x16 to a PCIe switch, which connects x16 to GPU1 and GPU2]<br />
43
MC in Communication Systems<br />
A 40G (GE) Line Card<br />
• May need to split processing over 4 NPUs via MC<br />
• Service card on AMC may need to MC packets to FPGA & RegEX<br />
[Diagram: a 40G line card with a PEX 8696 connecting a CPU (RGMII), four NPUs, and an AMC service card carrying FPGA and RegEx engines]<br />
44
MC in Storage<br />
[Diagram: up to 8 processor boards (CPU, Chip Set, Mem), each with a PEX 8624, connected over back planes to redundant PEX 8664 MR switches whose MC-enabled ports feed IO cards in IO drawers]<br />
45
PLX PEX 8600 Buffer Allocation<br />
‣ Shared memory pool per 16 lanes<br />
‣ User assigns buffers per port width<br />
• Sets minimum buffers per port<br />
• Also creates a common pool<br />
‣ Ports dynamically grab buffers as needed<br />
• Grab when utilization of assigned<br />
buffers exceeds user-assigned<br />
thresholds (25% by default)<br />
• Return empty buffers to the pool<br />
[Diagram: PLX buffer allocation – x4, x4, x2 and x2 ports, each with its own assigned buffers, surrounding a common buffer pool]<br />
46
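The grab/return behavior above can be sketched as a small model. The 25% default threshold comes from the slide; the one-buffer-at-a-time grab and the return-when-idle policy are assumptions for illustration.

```python
class BufferPool:
    """Toy model of dynamic buffer allocation: each port keeps a
    minimum assignment and borrows from a common pool once utilization
    of its buffers exceeds a programmable threshold."""

    def __init__(self, assigned, common, threshold=0.25):
        self.assigned = dict(assigned)   # minimum buffers per port
        self.common = common             # size of the common pool
        self.threshold = threshold       # 25% by default, per the deck
        self.borrowed = {p: 0 for p in assigned}

    def update(self, port, in_use):
        """Re-balance one port and return its current buffer count."""
        total = self.assigned[port] + self.borrowed[port]
        if self.common > 0 and in_use > self.threshold * total:
            self.common -= 1                       # grab from the pool
            self.borrowed[port] += 1
        elif in_use == 0 and self.borrowed[port] > 0:
            self.common += self.borrowed[port]     # return empty buffers
            self.borrowed[port] = 0
        return self.assigned[port] + self.borrowed[port]
```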
Dynamic Allocation Appropriate Buffers<br />
‣ Static Buffers/port<br />
• Unused buffers<br />
• Cannot assign based on traffic load<br />
• Cannot move buffers between ports<br />
• …LOWER PERFORMANCE<br />
[Diagram: static allocation – a fixed 5 packet buffers for each port, regardless of port width, leaving unused buffers]<br />
‣ PLX Shared Memory Pool<br />
• All buffers usable<br />
• Assign based on traffic load<br />
• Move buffers around between ports<br />
• …HIGHER PERFORMANCE<br />
[Diagram: shared memory pool – buffers assigned to x8, x1, x4 and x8 ports as needed]<br />
47
Multi-Root Architecture<br />
48
What is Multi-Root?<br />
‣ A root (upstream) port connects in the direction of the CPU<br />
‣ A multi-root device has more than one upstream port<br />
• Provides connection for two or more CPUs<br />
‣ Note: Endpoint sharing between CPUs is not supported<br />
One upstream port<br />
(single root)<br />
Two upstream ports<br />
(multi-root)<br />
[Diagram: left, a single-root switch with one upstream port below one CPU; right, a multi-root switch with two upstream ports, one per CPU, each side with its own PCIe downstream ports]<br />
49
Multi-Root Benefits in PEX Switch<br />
‣ Up to Eight upstream ports<br />
• Eight independent subsystems<br />
‣ Efficient use of interconnect<br />
• Mix-and-match number of<br />
endpoints to CPU<br />
• Based on performance needs<br />
• Up to 24-ports supported<br />
[Diagram: three hosts, each with its own Root Complex, plus a Manager, all connected to a PLX MR Switch; each host owns its own PCIe End-Points, some behind downstream switches]<br />
‣ Failover/Redundancy<br />
• Re-assign endpoints of a failed<br />
host<br />
‣ Smaller footprint<br />
• Lower power<br />
50
Transparent and Multi-Root Modes<br />
‣ PEX MR switches support two functional modes<br />
• Mode 0 – Transparent Switch<br />
• Same as today’s switches<br />
• Mode 1 – Multi-Root Switch<br />
• Advanced architecture with multiple upstream ports<br />
[Diagram: Mode 0 – a single CPU above a Cygnus switch with transparent PCIe downstream ports; Mode 1 – two CPUs above one Cygnus switch, each with its own PCIe downstream ports]<br />
51
Generic Rack Mount Server<br />
[Diagram: CPU and chip set above a PEX 8696 fanning out to I/O devices]<br />
52
Storage Server with Failover<br />
[Diagram: two CPU/chip-set nodes above a PEX 8696, with an NT port isolating the hosts for failover over the shared I/O]<br />
53
Servers Sharing <strong>PCI</strong>e MR Switch<br />
Each CPU* has its own dedicated IOs isolated from other CPUs<br />
[Diagram: three CPU/chip-set nodes, each with its own dedicated I/Os, sharing one PEX 8696]<br />
* Up to 8 CPUs<br />
supported<br />
54
Use of Mode 2 for Fail-over<br />
[Diagram: two CPU/chip-set nodes, each above its own PEX 8664; x4 & x8 links cross-connect the two switches to six endpoints for fail-over]<br />
55<br />
Packet-Ahead<br />
Two Virtual Channels Implementation<br />
Traffic Class re-mapping<br />
56
Traffic Classes and Virtual Channels<br />
‣ Traffic Class (TC)<br />
• Specifies priority for a given <strong>PCI</strong> <strong>Express</strong> packet<br />
• <strong>PCI</strong> <strong>Express</strong> supports eight TCs: TC0 – TC7<br />
• TC7 = Highest Priority<br />
• TC for a Request/Completion pair must be the same<br />
‣ Virtual Channel (VC)<br />
• Buffer entity used to queue <strong>PCI</strong>e packets<br />
• 1 VC = 1 Buffer entity<br />
• 2 VC = 2 Buffer entities<br />
• VC assignment for a given TC according to priority<br />
• Low Priority Traffic shares one VC<br />
• High Priority Traffic has its own VC<br />
57
One Wire, Multiple Traffic Flows<br />
[Diagram: TC7/VC2, TC6-TC4/VC1 and TC3-TC0/VC0 flows multiplexed onto one PCI Express wire]<br />
‣ Traffic Class determines priority for packets<br />
‣ Packets mapped to VCs according to Priority<br />
• Highest Priority: TC7 → VC2<br />
• Lower Priority: TC6-TC4 → VC1<br />
• Lowest Priority: TC3-TC0 → VC0<br />
‣ One VC → Same priority for ALL packets<br />
58
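The priority mapping above, for a device with three VCs, can be written as a small lookup; with a single VC every TC collapses to VC0, which is the "same priority for all packets" case.

```python
def tc_to_vc(tc, num_vcs=3):
    """Map a Traffic Class to a Virtual Channel for the 3-VC example:
    TC7 -> VC2, TC6-TC4 -> VC1, TC3-TC0 -> VC0. A single-VC device
    queues every TC in VC0."""
    if not 0 <= tc <= 7:
        raise ValueError("PCI Express defines TC0 - TC7")
    if num_vcs == 1:
        return 0            # one VC: same priority for all packets
    if tc == 7:
        return 2            # highest priority gets its own VC
    return 1 if tc >= 4 else 0
```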
TC – VC Mapping in <strong>PCI</strong>e Hierarchy<br />
‣ Each device supports a different number of VCs<br />
‣ TC/VC mapping according to device capabilities<br />
• On a per link basis<br />
‣ VC arbitration schemes enable QoS<br />
‣ No ordering between VCs; independent buffers<br />
59
Packet-Ahead Feature<br />
‣ Allows the NT port to modify the original Traffic Class<br />
(TC) of a <strong>PCI</strong>e packet<br />
• From TC0 to TCx<br />
• where TCx is TC1 – TC7<br />
‣ Benefits<br />
• Provides two separate data paths for memory traffic<br />
• Low priority, High priority<br />
• Enhanced QoS regardless of CPU single VC limitation<br />
• Differentiation of traffic in single VC systems<br />
• Available in PEX8618, PEX8614 and PEX8608<br />
60
Example Without Packet-Ahead<br />
‣ CPU supports VC0 and TC0 only<br />
• No differentiation of traffic<br />
CPU<br />
‣ Endpoints and Switch support two VCs<br />
and at least two TCs<br />
‣ Single path to CPU<br />
NT<br />
PEX 8618<br />
‣ System is limited by CPU capabilities<br />
ASIC<br />
ASIC<br />
ASIC<br />
61
Example with Packet-Ahead<br />
‣ Same CPU Limitations<br />
• VC0 and TC0 only<br />
‣ Same Endpoint and Switch capabilities<br />
• VC0 – VC1; TC0 – TC1<br />
CPU<br />
‣ Two paths to CPU<br />
• Via Upstream Port and NT Port<br />
NT<br />
PEX 8618<br />
‣ For packets received on NT Port<br />
• TC is changed from TC0 to TC1<br />
• Are mapped to High Priority VC1<br />
ASIC<br />
ASIC<br />
ASIC<br />
‣ Packets received on upstream port are<br />
unaffected<br />
62
Packet-Ahead Transaction Details<br />
‣ Posted Traffic (mem writes)<br />
• CPU generates Posted Packet with TC0<br />
• NT port modifies packet with TC1<br />
‣ Non-Posted (Read Requests)<br />
• CPU generates Read Request with TC0<br />
• NT port modifies packet with TC1<br />
• Endpoint sinks requests and provides completion with TC1<br />
• NT port modifies completion packet to original TC0<br />
63
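The transaction rules above can be sketched as the NT port's TC re-write: requests entering through the NT port are promoted from TC0 to the configured TCx, and completions heading back to the CPU are restored to TC0. The dict-based packet representation is hypothetical.

```python
def nt_port_remap(packet, new_tc=1):
    """Apply Packet-Ahead TC re-mapping at the NT port.

    packet is a hypothetical dict with 'type' ('posted',
    'read_request' or 'completion') and 'tc' fields.
    """
    p = dict(packet)
    if p["type"] in ("posted", "read_request"):
        p["tc"] = new_tc    # TC0 -> TCx, toward the high-priority VC
    elif p["type"] == "completion":
        p["tc"] = 0         # restore the CPU's original TC0
    return p
```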
Direct Memory Access (DMA)<br />
Inside PLX <strong>PCI</strong>e Switches<br />
64
DMA Benefits<br />
‣ Independent Data Mover<br />
• Can transfer small and large blocks of data<br />
• No CPU involvement<br />
• Can transfer data between all switch ports<br />
‣ Centralized DMA Engine<br />
• Processor/chipset no longer needs to support DMA<br />
• More selection → lower cost<br />
• Software consolidation across multiple platforms<br />
• Software code for one DMA engine<br />
‣ Improves system performance<br />
• Low latency transfers while sustaining Gen 2 speeds<br />
65
<strong>PCI</strong>e Switch with DMA<br />
‣ <strong>PCI</strong>e Switch is a Multi-Function device<br />
• Function 0 – P-to-P bridge<br />
• Transparent Switch model<br />
• No Driver Required<br />
• Function 1 – DMA endpoint<br />
• Type 0 Configuration Header<br />
• Memory mapped registers<br />
• Requires DMA Driver<br />
• Provided by PLX<br />
‣ Available now<br />
• PEX8619: 16-lane/16-Port<br />
• PEX8615: 12-lane/12-Port<br />
• PEX8609: 8-lane/8-Port<br />
[Diagram: the upstream port is a P-P bridge (Function 0) alongside the DMA engine (Function 1); a virtual bus connects them to the P-P bridges of the downstream ports]<br />
66
DMA Implementation<br />
‣ Four DMA Channels – Each Channel:<br />
• Works on one descriptor at a time<br />
• Has a unique Requester ID (RID)<br />
• Has a programmable traffic class (TC) for QoS<br />
‣ DMA descriptor<br />
• Specifies source address, destination address, transfer size, control<br />
• Internal or external<br />
‣ DMA Read Function<br />
• Initiates Read Requests and collects completions by matching RID<br />
and Tag number<br />
‣ DMA Write Function<br />
• Converts Read Completion streams into Memory write streams<br />
67
DMA Descriptor Overview<br />
‣ Descriptors are instructions for DMA<br />
• Written by CPU<br />
• Stored in a ring in Host Memory<br />
OR<br />
• Stored internal to the <strong>PCI</strong>e Switch<br />
[Diagram: a descriptor ring in memory, Descriptor 0 through Descriptor N – 1, 32 bits wide]<br />
‣ 16B standard format<br />
• Supports 32-bit Addressing<br />
• Control/Status information<br />
Descriptor fields: DstAddr, SrcAddr, SrcAddrH/DstAddrH, Transfer Size, Control<br />
‣ 16B extended format<br />
• Supports 64-bit Addressing<br />
• Control/Status information<br />
68
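A 16-byte extended-format descriptor packs into four little-endian 32-bit words holding DstAddr, SrcAddr, the upper address halves, and Transfer Size/Control. The exact bit positions of the valid bit and control field below are assumptions for illustration, not the documented register layout.

```python
import struct

def pack_descriptor(src, dst, size, control=0, valid=True):
    """Pack a 16B extended-format DMA descriptor (assumed bit layout:
    23-bit transfer size, 7-bit control at bit 23, valid at bit 31)."""
    word3 = (size & 0x007FFFFF) | ((control & 0x7F) << 23) | (int(valid) << 31)
    return struct.pack(
        "<IIII",
        dst & 0xFFFFFFFF,                                   # DstAddr (low 32 bits)
        src & 0xFFFFFFFF,                                   # SrcAddr (low 32 bits)
        (((src >> 32) & 0xFFFF) << 16) | ((dst >> 32) & 0xFFFF),  # SrcAddrH/DstAddrH
        word3,                                              # Transfer Size / Control / Valid
    )
```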
DMA Descriptor Prefetch<br />
‣ DMA channel prefetches 1 to 4 descriptors at a time when in<br />
external descriptor mode<br />
‣ Internal buffer support for up to 256 descriptors<br />
Active Channels | Descriptors per Channel<br />
1 | 256<br />
2 | 128<br />
4 | 64<br />
‣ Descriptors are prefetched to internal buffer until filled<br />
• Control in place for # of descriptors to be prefetched<br />
• 1, 4, 8, or max per channel<br />
‣ Invalid descriptors are dropped<br />
• No further descriptor fetches until software clears the status<br />
• An interrupt can optionally be enabled<br />
69
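The internal buffer split in the table above is simply the 256 on-chip descriptor slots divided evenly among the active DMA channels:

```python
def descriptors_per_channel(active_channels, total=256):
    """Split the internal descriptor buffer evenly across the active
    DMA channels (1, 2 or 4, per the table above)."""
    if active_channels not in (1, 2, 4):
        raise ValueError("PEX 86xx DMA runs with 1, 2 or 4 active channels")
    return total // active_channels
```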
DMA Runtime Flow – Host to I/O<br />
[Diagram: CPU and Memory behind a Root Complex; a Switch below connects an FPGA and ASICs]<br />
(In the flow below, steps 1, 2 and 6 are CPU tasks)<br />
1. CPU Programs Descriptors in RAM<br />
2. CPU enables DMA<br />
3. DMA reads Descriptors in RAM<br />
a. DMA prefetches 1-256 descriptors<br />
4. DMA works on 1 descriptor at a time<br />
a. DMA reads source<br />
b. Completions arrive in switch<br />
c. Completions are converted to Writes<br />
d. DMA Writes to Destination<br />
e. (There can be multiple reads/writes per<br />
descriptor)<br />
f. Clears Valid Bit on Descriptor after last<br />
write (optional)<br />
g. Interrupt CPU after descriptor (optional)<br />
h. Start next descriptor<br />
5. End of Ring (DMA Done)<br />
6. CPU receives and handles interrupts<br />
70
DMA Performance<br />
‣ Ordering enforced within a DMA channel only<br />
• Descriptors are read in order from Host Memory<br />
• Data within a descriptor is moved in order<br />
• Read Requests (MRd) are strictly ordered<br />
• Partial completions per MRd follow <strong>PCI</strong>e Spec – Out of order tags<br />
will be re-ordered on the chip<br />
• Write Requests (MWr) are strictly ordered<br />
‣ Full Line Rate Throughput<br />
• One channel can saturate one link in one direction<br />
• x8 at 5GT/s with 64B Read Completions<br />
• Two channels can saturate both directions<br />
• x8 at 5GT/s with 64B Read Completions<br />
‣ Programmable Interrupt Control<br />
• Fewer interrupts → lower CPU utilization<br />
‣ Data rate controls cap the maximum read bandwidth<br />
• Transfer Max Read Size every X clocks (programmable)<br />
71
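The "x8 at 5 GT/s with 64B Read Completions" claim can be checked with back-of-envelope arithmetic. The per-TLP overhead figure below is a typical textbook value for Gen 2 framing (STP+sequence, 3DW header, LCRC, END), not a PLX-measured number.

```python
# Back-of-envelope link efficiency for an x8 Gen 2 link with 64B completions.
# Overhead assumptions are generic PCIe textbook values, not from this deck.

LANES = 8
GT_PER_S = 5.0e9           # Gen 2 signaling rate per lane
ENCODING = 8 / 10          # 8b/10b line coding

TLP_OVERHEAD = 20          # bytes/TLP: framing+seq (4) + 3DW header (12) + LCRC (4)
PAYLOAD = 64               # 64B read completions

raw_bytes_per_s = LANES * GT_PER_S * ENCODING / 8   # 4.0 GB/s per direction
efficiency = PAYLOAD / (PAYLOAD + TLP_OVERHEAD)     # ≈ 0.76
effective = raw_bytes_per_s * efficiency            # ≈ 3.05 GB/s payload rate
```

One channel saturating one direction therefore means sustaining roughly 3 GB/s of payload after protocol overhead.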
Data Integrity<br />
72
Data Integrity and Error Isolation<br />
‣ PLX supports protection against PCIe errors<br />
• Providing a robust system through data integrity & error isolation<br />
‣ PCIe Error Types<br />
• Malformed packets<br />
• EP (Poisoned TLPs) & ECRC errors<br />
• 1-bit ECC & 2-bit ECC<br />
• LCRC<br />
• PHY Errors: Disparity, 8b/10b Encoding, Scrambler & Framing<br />
• Receiver Overflow, Flow Control Protocol Error<br />
• PLX Device Specific: ECC, UR overflow<br />
73
Data Integrity<br />
‣ Internal data path protection from ingress to egress<br />
• Complete data protection through ECC<br />
• ECRC on ingress and egress ports<br />
‣ Higher performance by reducing re-transmission<br />
74
Error Isolation – Fatal Errors<br />
‣ User selectable behavior on fatal errors<br />
‣ Malformed Packet or Internal Fatal Error handling<br />
• Mode 1 (Default)<br />
• Assert FATAL_ERR# pin and send Error Message<br />
• Mode 2<br />
• Generate Internal Reset (equivalent to in-band hot reset)<br />
• Mode 3<br />
• Block all packet transmissions<br />
• Cancel packets in transit with EDB<br />
• Mode 4<br />
• Block all packet transmission<br />
• Bring upstream link down to cause surprise down<br />
75
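The four fatal-error modes above map naturally onto a small control field. The enum values and the 2-bit field encoding below are hypothetical placeholders for illustration; the real register layout is in the device data book.

```python
# Sketch of selecting one of the four fatal-error modes described above.
# Field width/offset are invented for this example, not datasheet values.
from enum import IntEnum

class FatalErrMode(IntEnum):
    ASSERT_PIN_AND_MSG = 1   # Mode 1 (default): FATAL_ERR# pin + error message
    INTERNAL_RESET = 2       # Mode 2: equivalent to in-band hot reset
    BLOCK_AND_EDB = 3        # Mode 3: block TX, cancel in-flight packets with EDB
    SURPRISE_DOWN = 4        # Mode 4: block TX, force upstream link down

def encode_fatal_mode(mode: FatalErrMode) -> int:
    """Pack the mode into a hypothetical 2-bit control field (0-3)."""
    return (int(mode) - 1) & 0x3
```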
Error Isolation – EP/ECRC<br />
‣ User selectable behavior with packet error<br />
‣ Poisoned packet (EP) & ECRC error handling<br />
• EP/ECRC Mode 1 (Default)<br />
• Forwarded with appropriate logging<br />
• EP/ECRC Mode 2<br />
• Drop EP/ECRC packet<br />
• Not forwarded, only logged<br />
• EP/ECRC Mode 3<br />
• Block violating device<br />
• Drop and Block EP/ECRC packet<br />
76
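The three EP/ECRC handling modes above amount to a small policy table. The dispatch function below is purely illustrative; in hardware the behavior is selected through switch configuration registers, not software per-packet.

```python
# The three EP/ECRC handling modes above, expressed as a policy sketch.
# Function name and tuple encoding are invented for this illustration.

def handle_poisoned_tlp(mode: int) -> tuple:
    """Return (forwarded, logged, port_blocked) for an EP/ECRC-error packet."""
    if mode == 1:
        return (True, True, False)    # Mode 1: forward with appropriate logging
    if mode == 2:
        return (False, True, False)   # Mode 2: drop the packet, log only
    if mode == 3:
        return (False, True, True)    # Mode 3: drop and block violating device
    raise ValueError("mode must be 1, 2, or 3")
```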
Gen 1 Electrical & Mechanical Summary<br />
Device | Package Size* | Typical Power (W) | Max. Power (W)<br />
PEX 8505 | 15 x 15 mm² | 0.8 | 1.4<br />
PEX 8508 | 19 x 19 mm² | 1.6 | 2.5<br />
PEX 8509 | 15 x 15 mm² | 1.2 | 1.8<br />
PEX 8511 | 15 x 15 mm² | 1.0 | 1.6<br />
PEX 8512 | 23 x 23 mm² | 2.2 | 3.1<br />
PEX 8513 | 19 x 19 mm² | 1.3 | 2.6<br />
PEX 8516 | 27 x 27 mm² | 3.2 | 4.3<br />
PEX 8517 | 27 x 27 mm² | 2.6 | 3.6<br />
PEX 8518 | 23 x 23 mm² | 2.6 | 3.6<br />
PEX 8519 | 19 x 19 mm² | 1.7 | 2.6<br />
PEX 8524 | 31 x 31 mm² | 3.9 | 6.1<br />
PEX 8525 | 31 x 31 mm² | 2.6 | 3.8<br />
PEX 8532 | 35 x 35 mm² | 5.7 | 7.4<br />
PEX 8533 | 35 x 35 mm² | 3.3 | 4.8<br />
PEX 8547 | 37.5 x 37.5 mm² | 4.9 | 7.1<br />
PEX 8548 | 37.5 x 37.5 mm² | 4.9 | 7.1<br />
‣ Voltages – 3 sources<br />
• 1.0 V Core<br />
• 1.5 V SerDes I/O<br />
• 3.3 V I/O<br />
‣ Thermal<br />
• Industrial Temp available on most products<br />
* All PBGA, 1.0 mm pitch<br />
Typical: 35% lane utilization, typical voltages, 25 °C ambient temperature<br />
Maximum: 85% lane utilization, max operating voltages, across industrial temperature range<br />
77
Gen 2 Electrical & Mechanical<br />
Device | Package Size* | Typical Power, Gen 2 (W) | Typical Power, Gen 1** (W) | Max. Power, Gen 2 (W) | Max. Power, Gen 1** (W)<br />
PEX 8648 | 27 x 27 mm² | 3.7 | 3.0 | 8.6 | 7.9<br />
PEX 8647 | 27 x 27 mm² | 2.8 | 2.1 | 7.4 | 6.7<br />
PEX 8632 | 27 x 27 mm² | 2.7 | 2.2 | 6.4 | 5.9<br />
PEX 8624 | 19 x 19 mm² | 1.9 | 1.5 | 4.6 | 4.3<br />
PEX 8616 | 19 x 19 mm² | 1.7 | 1.5 | 4.3 | 4.1<br />
PEX 8619 | 19 x 19 mm² | 1.80 | 1.58 | 4.53 | 4.08<br />
PEX 8618 | 19 x 19 mm² | 1.75 | 1.54 | 4.47 | 4.04<br />
PEX 8617 | 19 x 19 mm² | 1.6 | 1.4 | 4.26 | 3.85<br />
PEX 8612 | 19 x 19 mm² | 1.6 | 1.4 | 4.2 | 4.0<br />
PEX 8615 | 19 x 19 mm² | 1.6 | 1.37 | 4.01 | 3.65<br />
PEX 8614 | 19 x 19 mm² | 1.5 | 1.33 | 3.96 | 3.60<br />
PEX 8613 | 19 x 19 mm² | 1.4 | 1.2 | 3.80 | 3.4<br />
PEX 8609 | 15 x 15 mm² | 1.33 | 1.16 | 3.51 | 3.20<br />
PEX 8608 | 15 x 15 mm² | 1.31 | 1.15 | 3.51 | 3.20<br />
PEX 8606 | 15 x 15 mm² | 1.25 | 1.09 | 3.32 | 3.04<br />
PEX 8604 | 15 x 15 mm² | 1.18 | 1.03 | 3.13 | 2.89<br />
‣ Voltages – 2 sources<br />
• 1.0 V Core & SerDes I/O<br />
• 2.5 V I/O<br />
‣ Thermal<br />
• Commercial Temp<br />
** Preliminary estimates<br />
Typical: 35% lane utilization, typical voltages, 25 °C ambient, L0s mode<br />
Maximum: 85% lane utilization, L0 mode, max operating voltages<br />
78
End of Presentation<br />
Thank You<br />
www.plxtech.com<br />
79