26.12.2014 Views

Fabric Manager Users Guide, Version 6.1, Revision A - QLogic

2–Advanced <strong>Fabric</strong> <strong>Manager</strong> Capabilities<br />

Mesh/Torus Topology Support<br />

Path Record Query<br />

Virtual <strong>Fabric</strong>s<br />

As has been implied by the previous discussion, the dor-updown algorithm uses<br />

multiple LIDs, SLs and VLs. For applications to use the correct route through the<br />

fabric, they must use SA PathRecord queries to obtain the addressing information<br />

for communicating with another Channel Adapter. Unlike with simpler algorithms,<br />

“cheating the standard” by bypassing the SM results in non-optimal<br />

performance. In particular, techniques such as out-of-band LID exchange (used by<br />

many MPI implementations) provide sub-optimal performance.<br />

To permit non-InfiniBand-compliant applications (such as the existing MVAPICH<br />

and OpenMPI implementations for verbs) to function in a Mesh/Torus fabric, the<br />

<strong>QLogic</strong> <strong>Fabric</strong> <strong>Manager</strong> configures the Base LID and the first SL on each Channel<br />

Adapter for the Up/Down route. This route provides reliable, deadlock-free<br />

operation even if Channel Adapters simply exchange LIDs. It also operates<br />

in both complete and disrupted fabrics. However, this route has much<br />

higher latency and lower bandwidth than the routes obtained through proper use of the<br />

SM/SA.<br />
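The difference between the two ways of obtaining addressing information can be illustrated with a small model. This is only a sketch, not Fabric Manager or verbs code: the class and function names are hypothetical, and the second LID and both SL values are assumptions chosen for illustration. The only behavior taken from the text is that the Base LID and first SL select the Up/Down route, while a PathRecord query yields the optimized route.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortAddressing:
    """Addressing assigned to one Channel Adapter port (illustrative model)."""
    base_lid: int   # Base LID: configured for the Up/Down route (per the text)
    updown_sl: int  # first SL: also configured for the Up/Down route
    dor_lid: int    # additional LID for the optimized DOR route (assumed layout)
    dor_sl: int     # SL paired with the DOR route (assumed value)


@dataclass(frozen=True)
class Path:
    dlid: int   # destination LID the sender would use
    sl: int     # service level the sender would use
    route: str  # route the fabric applies to this LID/SL pair


def out_of_band_exchange(dest: PortAddressing) -> Path:
    """Model an application that exchanges LIDs itself and never asks the SA."""
    # Only the Base LID is known and the first SL is used,
    # so traffic takes the fallback Up/Down route.
    return Path(dlid=dest.base_lid, sl=dest.updown_sl, route="updown")


def sa_path_record_query(dest: PortAddressing) -> Path:
    """Model an application that asks the SA for a PathRecord."""
    # In this model the SA returns the LID/SL pair of the optimized route.
    return Path(dlid=dest.dor_lid, sl=dest.dor_sl, route="dor")


if __name__ == "__main__":
    # Hypothetical values for a destination port.
    node_b = PortAddressing(base_lid=0x10, updown_sl=0, dor_lid=0x11, dor_sl=1)
    print(out_of_band_exchange(node_b))
    print(sa_path_record_query(node_b))
```

In this model both applications reach the same destination port, but only the one that queries the SA learns the LID and SL of the optimized route.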

Many applications use IPoIB for path resolution. Since IPoIB makes PathRecord<br />

queries, such applications will be given optimized routes and will function properly.<br />

To permit optimized MPI performance for Mesh/Torus fabrics, the <strong>QLogic</strong> Host<br />

Channel Adapter with its Performance Scaled Messaging (PSM) technology<br />

should be used. PSM can perform PathRecord queries when its path_query<br />

option is enabled (see the <strong>QLogic</strong> OFED+ Host Software User <strong>Guide</strong>).<br />

To ensure scalability when using <strong>QLogic</strong> PSM with PathRecord queries enabled,<br />

the <strong>QLogic</strong> Distributed SA (qlogic_sa) must be enabled on every compute node.<br />

The Distributed SA synchronizes the node-relevant PathRecord information with<br />

each end node so that job startup time is optimized. See the <strong>QLogic</strong> OFED+<br />

Host Software User <strong>Guide</strong> for more information on the Distributed SA.<br />

The <strong>QLogic</strong> <strong>Fabric</strong> <strong>Manager</strong> permits use of Virtual <strong>Fabric</strong>s in a Mesh/Torus fabric.<br />

In such environments, QoS and/or Security can be enabled to separate various<br />

applications or nodes. The QoS Options table in Appendix D shows some of the<br />

possible combinations of QoS options. For more information on Virtual<br />

<strong>Fabric</strong>s, refer to Section 4.<br />

Dispersive and Multi-Path Routing<br />

Mesh/Torus fabrics require at least two LIDs per Channel Adapter. The LMC is set to<br />

1 by default. One LID is used for the optimized DOR route and one LID is used for<br />

the fallback Up/Down route.<br />
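The relationship between the LMC and the number of LIDs can be checked with a few lines of arithmetic. In InfiniBand, a port with a given LMC owns 2^LMC consecutive LIDs starting at its Base LID, so an LMC of 1 yields the two LIDs described above. The sketch below assumes a hypothetical helper name, `lids_for_port`, and an example Base LID of 0x10; neither comes from the guide.

```python
from typing import List


def lids_for_port(base_lid: int, lmc: int) -> List[int]:
    """Return every LID a port answers to, given its Base LID and LMC."""
    if not 0 <= lmc <= 7:
        raise ValueError("LMC is a 3-bit field, so it must be in 0..7")
    count = 1 << lmc  # a port owns 2**LMC consecutive LIDs
    if base_lid % count != 0:
        raise ValueError("Base LID must be aligned to 2**LMC")
    return [base_lid + offset for offset in range(count)]


if __name__ == "__main__":
    # With the default LMC of 1, the port owns two LIDs.
    print(lids_for_port(0x10, 1))  # prints [16, 17]
```

With the default LMC of 1, the Base LID (16 in this example) is the one configured for the Up/Down route, which leaves the other LID for the DOR route. Raising the LMC makes more LIDs, and therefore more distinct paths, available per port.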

It is often desirable to use advanced features such as dispersive routing in MPIs<br />

that use <strong>QLogic</strong> PSM, or in other applications that can take advantage of<br />

multiple paths for redundancy, load balancing or performance.<br />

IB0054608-01 B 2-13
