Dolphin eXpressWare Installation and Reference Guide

For users of SISCI, SuperSockets, and IPoPCIe

Linux version

Dolphin Interconnect Solutions

This document describes the installation and usage of Dolphin's PCI Express software stack (eXpressWare) version 5.19.x in combination with Dolphin MX interconnect hardware.

Published under the GNU General Public License v2

March 8th, 2022


Table of Contents

Abstract
1. Introduction & Overview
1. Who needs a PCI Express Network?
2. eXpressWare
2.1. SuperSockets
2.2. SISCI
2.3. Device Lending
2.4. Transparent Hot Add
2.5. TCP/IP Driver
2.6. eXpressWare Licensing
3. Dolphin PCI Express PX Adapter cards
3.1. PXH810
3.2. PXH830
3.3. PXH820
3.4. PXH840
4. Dolphin PCI Express MX Adapter cards
4.1. MXH830
4.2. MXH930
4.3. MXH940
4.4. MXH950
4.5. MXP924
4.6. MXP908
4.7. MXP909
5. Dolphin PCI Express IX Adapter cards
5.1. IXH610
5.2. IXH620
6. Dolphin PCI Express Switches
6.1. IXS600
6.2. MXS824
6.3. MXS924
7. PCI Express Cables
7.1. iPass Cables
7.2. SFF-8644 / PCIe 3.0 Cables
7.3. SFF-8644 / PCIe 4.0 Cables
8. Supported OEM hardware
8.1. Curtiss-Wright SBCs and switches
8.2. Keysight PXIe Controllers
9. PCI Express networking basics
10. Contents of this Document
11. Terminology
12. Support
2. Quick Installation Guide
3. Requirements and Planning
1. Supported Platforms
1.1. Application recommendations
1.2. Supported Platforms
1.3. Recommended Cluster Node Hardware
1.4. Recommended Cluster Management Hardware
2. Software Support
2.1. Linux
2.2. Windows
2.3. VxWorks
3. Interconnect Planning
3.1. Nodes to Equip with PCI Express Interconnect
4. Dolphin MX Interconnect Configurations and topologies
4.1. Switchless topologies using the MXH830 NTB host adapter card
4.2. PCIe switch based topologies using the MXH830 NTB host adapter card and the MXS824 switch
5. Physical Node Placement
4. Initial Installation
1. Installation Overview
1.1. Requirements
1.2. Installation Variants
1.2.1. Live Installation
1.2.2. Non-GUI Installation
1.3. System installation overview
1.4. Installation Result
2. Software Installation
2.1. Overview
2.2. Starting the Software Installation
2.3. Post installation
2.4. Working with the Dolphin Network Configurator, dis_netconfig
2.4.1. Cluster Edit
2.4.2. Node Arrangement
2.4.3. Cabling Instructions
2.5. Cluster Cabling
2.5.1. Connecting the PCI Express cables
2.5.2. Verifying the Cabling
2.6. Finalising the Software Installation
2.6.1. Static Connectivity Test
2.6.2. SuperSockets Configuration Test
2.6.3. SuperSockets Performance Test
2.7. Handling Installation Problems
2.8. Interconnect Validation using the management GUI
2.8.1. Installing dis_admin
2.8.2. Starting dis_admin
2.8.3. Cluster Overview
2.8.4. PCIe connection Test
2.9. Interconnect Validation using the command line
2.10. Making Cluster Applications use PCI Express
2.10.1. Generic Socket Applications
2.10.2. Kernel Socket Services
2.10.3. IPoPCIe
2.10.4. Native SISCI Applications
5. Update Installation
1. Complete Update
2. Rolling Update
6. Manual Installation
1. Installation under Load
2. Installation of a Heterogeneous Cluster
3. Manual RPM Installation
3.1. RPM Package Structure
3.2. RPM Build and Installation
4. Unpackaged Installation
7. Interconnect Maintenance
1. Verifying Functionality and Performance
1.1. Availability of Drivers and Services
1.2. PCIe Connection Test
1.3. Static PCIe Interconnect Test - dis_diag
1.4. Interconnect Load Test
1.4.1. Test Execution from Dolphin dis_admin GUI
1.5. Interconnect Performance Test
2. Replacing Interconnect Cables
3. Replacing an Adapter
4. Physically Moving Nodes
5. Replacing a Node
6. Adding Nodes
7. Removing Nodes
8. Configuring the adapter card
1. DIP Switches
2. PCIe Prefetch Memory size
2.1. Determine the PCIe Prefetch Memory Size
2.2. Configuring the PCIe Prefetch Memory Size
9. Firmware upgrade
1. Dolphin Adapter card firmware upgrade
2. MXS824 PCIe Gen3 Switch firmware upgrade
3. MXS924 PCIe Gen4 Switch firmware upgrade
10. SISCI API
1. SISCI Documentation and resources
2. Enable applications to use the SISCI API
3. How to compile your own SISCI application
4. SISCI API Demo and Example programs
4.1. SISCI API Example programs
4.1.1. shmem
4.1.2. memcopy
4.1.3. interrupt
4.1.4. data_interrupt
4.1.5. intcb
4.1.6. lsegcb
4.1.7. rsegcb
4.1.8. dma
4.1.9. dmacb
4.1.10. dmavec
4.1.11. rpcia
4.1.12. smartio_example
4.1.13. reflective_memory
4.1.14. reflective_device
4.1.15. reflective_device_receive
4.1.16. reflective_write
4.1.17. probe
4.1.18. query
4.1.19. cuda
4.2. SISCI API demo and benchmark programs
4.2.1. scibench2
4.2.2. scipp
4.2.3. dma_bench
4.2.4. intr_bench
4.2.5. reflective_bench
11. Dolphin PCI Express TCP/IP Driver
1. Who should use the TCP/IP driver
1.1. Enable Linux applications to use the Dolphin PCI Express TCP/IP driver
12. SuperSockets
1. Installation
2. Make Generic Linux Applications use SuperSockets
2.1. Launch via wrapper script
2.2. Launch with LD_PRELOAD
3. SuperSockets Functionality and Performance
3.1. SuperSockets Status
3.2. SuperSockets Benchmarks
3.2.1. latency_bench
3.2.2. sockperf
4. Troubleshooting
5. SuperSockets Utilization
6. Kernel Socket Services
7. Command Reference
13. SmartIO
1. Installation
2. Device Lending
3. Transparent Hot Add
4. SISCI API SmartIO
14. Transparent Board Management
1. Installation
2. Diagnostic
3. Firmware upgrade
15. Advanced Topics
1. Notification on Interconnect Status Changes
1.1. Interconnect Status
1.2. Notification Interface
1.3. Setting Up and Controlling Notification
1.3.1. Configure Notification via the dis_netconfig
1.3.2. Configure Notification Manually
1.3.3. Verifying Notification
1.3.4. Disabling and Enabling Notification Temporarily
2. Managing PCIe and eXpressWare Resources
2.1. Allocating Large SISCI Segments
2.2. Connecting to Large Remote SISCI Segments
2.3. Creating large SISCI reflective memory Segments
2.4. Updates with Modified eXpressWare Configuration Files
3. Using dis_diag
16. FAQ
1. Dolphin PCI Express Hardware
2. Software
A. Self-Installing Archive (SIA) Reference
1. SIA Operating Modes
1.1. Full Cluster Installation
1.2. Heterogeneous Cluster Installation
1.3. Node Installation
1.4. Cluster Management Node Installation
1.5. Installation of Configuration File Editor
1.6. Building RPM Packages Only
1.7. Extraction of Source Archive
2. SIA Options
2.1. Node Specification
2.2. Installation Path Specification
2.3. Installing from Binary RPMs
2.4. Enforce Installation
2.5. Configuration File Specification
2.6. PCIe Link width
2.7. SISCI
2.8. SISCI Development package
2.9. eXpressWare CUDA® integration
2.10. eXpressWare Transparent Board Management
2.11. eXpressWare SmartIO
2.12. SuperSockets
2.13. Batch Mode
2.14. Non-GUI Build Mode
2.15. Software Removal
B. dis_admin Reference
1. Startup
2. Interconnect Status View
2.1. Icons
2.2. Operation
2.2.1. Cluster Status
2.2.2. Node Status
3. Node and Interconnect Control
3.1. Admin Menu
3.2. Cluster Menu
3.3. Node Menu
3.4. Cluster Settings
3.5. Adapter Settings
4. Interconnect Testing & Diagnosis
4.1. Cable Test
4.2. Fabric Test
5. Troubleshooting Best Practice
C. Configuration Files
1. Cluster Configuration
1.1. dishosts.conf
1.1.1. Basic settings
1.1.2. Persistent adapter numbers
1.1.3. SuperSockets settings
1.1.4. Miscellaneous Notes
1.2. networkmanager.conf
1.3. cluster.conf
2. SuperSockets Configuration
2.1. supersockets_profiles.conf
2.2. supersockets_ports.conf
3. eXpressWare Driver Configuration
3.1. dis_mx.conf
3.1.1. Memory Preallocation
3.1.2. Cluster Page Size
3.1.3. Configuring multicast memory
3.2. dis_irm.conf
3.2.1. Resource Limitations
3.2.2. Interrupt and DMA polling
3.2.3. Real time behavior
3.2.4. Memory Preallocation
3.2.5. Logging and Messages
3.3. dis_ssocks.conf
D. Platform Issues and Software Limitations
1. Platforms with Known Problems vs Dolphin PCI Express MX
2. Platforms with Known Problems vs Dolphin PCI Express software
3. IRM
4. SISCI
5. SuperSockets
E. eXpressWare License text

List of Figures

4.1. Cluster Edit dialog of dis_netconfig
4.2. Main dialog of dis_netconfig
4.3. Node dialog of dis_netconfig
4.4. Dolphin dis_admin GUI - Connect to Network Manager
4.5. Dolphin dis_admin GUI - Representation of topology
B.1. Fabric has FAILED due to dead Cluster Nodes
B.2. Options in the Admin menu
B.3. Options in the Cluster menu
B.4. Options in the Node menu
B.5. Cluster configuration in dis_admin
B.6. Advanced settings for a Cluster Node
B.7. Result of running cable test on a good cluster
B.8. Result of cable test on a problematic cluster
B.9. Result of fabric test without installing all the necessary rpms
B.10. Result of fabric test on a proper fabric

List of Tables

B.1. Node or Adapter State
B.2. Link State