2. eXpressWare Driver Configuration

The Dolphin eXpressWare drivers are designed to adapt to the environment they are operating in; therefore, manual configuration is rarely required. The upper limit for memory allocation of the low-level driver is the only setting that may need to be adapted for a cluster, but this is also done automatically during the installation.

Warning

Changing parameters in these files can affect reliability and performance of the PCI Express interconnect. Please carefully read the documentation before you make any changes.

2.1. dis_mx.conf

The dis_mx.conf file is located in the %WinDir%\System32\drivers\etc\DIS directory and contains options for the eXpressWare hardware driver (kernel module).

Warning

Changing values other than those described below may cause the interconnect to malfunction. Only do so if instructed by Dolphin support.

2.1.1. Memory Preallocation

Preallocation of memory is recommended on systems without an IOMMU (like x86 and x86_64). Memory becomes fragmented over time, which can make it difficult to allocate large segments of contiguous physical memory after the system has been running for a while. To overcome this, options have been added that let the IRM driver allocate blocks of memory upon initialization and provide memory from this pool, under certain conditions, when remotely accessible memory segments are allocated.

| Option Name | Description | Unit | Valid Values | Default Value |
|---|---|---|---|---|
| ntb_memory_preallocation_size_mb | Defines the number of megabytes of memory the driver shall try to allocate upon initialization. | MB | 0: disable preallocation; >0: MB to preallocate in as few blocks as possible | 160 |

2.1.2. Cluster Page Size

ntb_cluster_page_size_kb is the cluster page size in KB. The system page size may differ between the hosts in a cluster. In that case, the cluster page size must be set to the largest system page size found on any host, and it must be the same on all hosts in the cluster. The default page size is 4 KB on most systems, but some ARM systems use 64 KB pages. If any host has a page size larger than 4 KB, all hosts communicating with that node must adjust the cluster page size parameter to match the largest page size in the cluster.

| Option Name | Description | Unit | Valid Values | Default Value |
|---|---|---|---|---|
| ntb_cluster_page_size_kb | Defines the cluster page size in kilobytes. | KB | 4 - 64 | 4 |
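The rule described above (use the largest system page size found in the cluster, on every host) can be sketched in Python. This is an illustration only and not part of eXpressWare: the function name and the example host list are hypothetical, and the 4 to 64 KB bounds are taken from the valid values listed for ntb_cluster_page_size_kb.

```python
def cluster_page_size_kb(host_page_sizes_kb):
    """Return the value every host would use for ntb_cluster_page_size_kb.

    host_page_sizes_kb: system page size, in KB, of each host in the cluster.
    """
    if not host_page_sizes_kb:
        raise ValueError("at least one host page size is required")
    # The cluster page size must match the largest system page size.
    largest = max(host_page_sizes_kb)
    # Documented valid range for ntb_cluster_page_size_kb is 4 - 64.
    if not 4 <= largest <= 64:
        raise ValueError(f"page size {largest} KB is outside the 4-64 KB range")
    return largest


# Hypothetical cluster: two x86_64 hosts (4 KB pages) and one ARM host (64 KB pages).
print(cluster_page_size_kb([4, 4, 64]))  # 64
```

In this hypothetical cluster, all three hosts would set ntb_cluster_page_size_kb to 64, including the two hosts whose own page size is 4 KB.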

2.1.3. Configuring multicast memory

By default, the maximum multicast segment size is 2 megabytes and 2 multicast groups are allocated. Modify the following parameters to change the multicast setup, and reboot the system after changing them. If the driver fails to allocate the specified amount of memory, the eXpressWare drivers will fail during initialization. If that happens, add more memory to your server or reduce the requirements. Please take a look at Section 13, “Support” if you have any questions.

| Option Name | Description | Unit | Valid Values | Default Value |
|---|---|---|---|---|
| ntb_mcast_group_size | Defines the maximum size of a single multicast group that should be allocated upon driver initialization. The default setting is 21, i.e. 2^21 = 2 Megabytes. | Integer | 0: disable preallocation; >0: log2(size) | 21 |
| ntb_mcast_max_groups | Defines the number of multicast groups that should be allocated upon driver initialization. | Integer | 1-4 | 2 |
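Because ntb_mcast_group_size is given as log2 of the size in bytes, converting between a size and the setting is a common source of mistakes. The Python sketch below shows the arithmetic. The function names are hypothetical, and the total is an estimate that assumes each group is allocated at the full group size; the source does not state how the driver accounts for this memory.

```python
def mcast_group_size_exponent(size_bytes):
    """Return log2(size_bytes), the form ntb_mcast_group_size expects."""
    # Only exact powers of two can be expressed as an integer exponent.
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a positive power of two")
    return size_bytes.bit_length() - 1


def mcast_total_bytes(group_size_exponent, max_groups):
    """Estimate multicast memory requested at driver initialization.

    Assumption: every group is allocated at the full group size.
    """
    if not 1 <= max_groups <= 4:
        raise ValueError("ntb_mcast_max_groups must be in the range 1-4")
    if group_size_exponent == 0:
        return 0  # 0 disables preallocation
    return max_groups * (1 << group_size_exponent)


print(mcast_group_size_exponent(2 * 1024 * 1024))  # 21
print(mcast_total_bytes(21, 2))                    # 4194304
```

With the default values (21 and 2) the estimate comes to 4 megabytes under the stated assumption.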

2.2. dis_irm.conf

dis_irm.conf is located in the %WinDir%\System32\drivers\etc\DIS directory and contains options for the IRM driver (dis_irm kernel module).

Only a few options are to be modified by the user.

Warning

Changing values in dis_irm.conf other than those described below may cause the interconnect to malfunction. Only do so if instructed by Dolphin support.

Whenever a setting in this file is changed, the driver needs to be reloaded to make the new settings effective. Please note that some of the possible settings are commented out in the dis_irm.conf file. Please remove the leading # to change these settings.
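Removing the leading # from a commented-out setting can be scripted. The Python sketch below is a hypothetical helper, not a Dolphin tool. It only strips the # in front of the named option and leaves the rest of the line untouched, so it makes no assumption about how values are written. The sample text uses a made-up `name = value` layout purely for illustration; check the actual dis_irm.conf for its real syntax.

```python
import re


def uncomment_option(conf_text, option_name):
    """Remove the leading '#' from lines of the form '#<option_name>...'.

    Other comment lines, and options whose names merely start with
    option_name, are left unchanged.
    """
    pattern = re.compile(
        r"^([ \t]*)#[ \t]*(" + re.escape(option_name) + r")(?![\w-])",
        flags=re.MULTILINE,
    )
    return pattern.sub(r"\1\2", conf_text)


# Hypothetical file content; the real file layout may differ.
sample = "# Resource limits\n#max-vc-number = 1024\nwarn-disabled = 0\n"
print(uncomment_option(sample, "max-vc-number"))
```

After editing the file this way, the driver still needs to be reloaded for the new setting to take effect.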

2.2.1. Resource Limitations

These parameters control memory allocations that are only performed on driver initialization.

| Option Name | Description | Unit | Valid Values | Default Value |
|---|---|---|---|---|
| max-vc-number | Maximum number of virtual channels (one virtual channel is needed per remote memory connection; i.e. 2 per SuperSocket connection) | n/a | Integers > 0. The upper limit is the consumed memory; values > 16384 are typically not necessary. | 1024 |
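To judge whether the default of 1024 is sufficient, the number of virtual channels can be estimated from the expected connection counts. The Python sketch below applies the rule stated for max-vc-number: one channel per remote memory connection, two per SuperSocket connection. The function names and the workload numbers are hypothetical, and the estimate ignores any channels the drivers may use internally.

```python
DEFAULT_MAX_VC_NUMBER = 1024  # documented default for max-vc-number


def required_virtual_channels(supersocket_connections, remote_memory_connections=0):
    """Estimate virtual channels needed for a given workload."""
    # Each SuperSocket connection needs 2 virtual channels;
    # each other remote memory connection needs 1.
    return 2 * supersocket_connections + remote_memory_connections


def must_raise_limit(required, current_limit=DEFAULT_MAX_VC_NUMBER):
    """Return True if the estimate exceeds the configured max-vc-number."""
    return required > current_limit


needed = required_virtual_channels(supersocket_connections=600)
print(needed)                    # 1200
print(must_raise_limit(needed))  # True
```

In this hypothetical workload, 600 SuperSocket connections need about 1200 channels, so max-vc-number would have to be raised above its default.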

2.2.2. Interrupt and DMA polling

These parameters control the IRM driver's interrupt polling mechanism, which is used to minimize system interrupt latency. Polling reduces both the remote PCIe fabric interrupt latency and the interrupt latency associated with DMA transfers. Remote interrupt polling and DMA polling can be adjusted independently to meet specific requirements, or turned off. The polling thread consumes CPU resources, but it often reduces DMA latency significantly.

The dis_tool commands "control-intr-polling" and "control-dma-polling" can be used to experiment with and change the values on a running system. Values set in dis_irm.conf are applied automatically each time the driver is reloaded.

| Option Name | Description | Unit | Valid Values | Default Value |
|---|---|---|---|---|
| intr_poll_mode | Controls the interrupt poll mechanism available for optimizing remote interrupts. SISCI, SuperSockets and IPoPCIe communication will benefit from this. | n/a | 0: Disabled; 1: Delayed On; 2: Immediate On; 3: Always On | 2 |
| intr_poll_thresh_on | Controls the delayed on threshold. Interrupt polling will start if the number of interrupts per second exceeds the threshold on value. | interrupts/sec | Integers > 0 | 1000 |
| intr_poll_thresh_off | Controls the delayed off threshold for interrupt polling. Interrupt polling will be turned off when the number of interrupts per second is lower than the threshold off value. | interrupts/sec | Integers > 0 | 100 |
| dma_poll_mode | Controls the interrupt poll mechanism available for optimizing system interrupt latency with DMA transfers. SISCI, SuperSockets and IPoPCIe communication will benefit from this when DMA operations are used. | n/a | 0: Disabled; 1: Delayed On; 2: Immediate On; 3: Always On | 2 |
| dma_poll_thresh_on | Controls the delayed on threshold for DMA interrupt polling. DMA interrupt polling will start if the number of interrupts per second exceeds the threshold on value. | interrupts/sec | Integers > 0 | 1000 |
| dma_poll_thresh_off | Controls the delayed off threshold for DMA interrupt polling. DMA interrupt polling will be turned off when the number of interrupts per second is lower than the threshold off value. | interrupts/sec | Integers > 0 | 100 |
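The on and off thresholds form a hysteresis band: polling starts above the on threshold and stops only once the rate drops below the off threshold. The Python sketch below models that behavior as described in the table. It is a simplified model and not the driver's implementation: the function name and the sample rates are hypothetical, and the differences between the poll modes are not modeled.

```python
def polling_states(rates, thresh_on=1000, thresh_off=100):
    """Return, for each sampled interrupt rate, whether polling is active.

    rates: interrupts per second, one sample per interval.
    Defaults mirror the documented thresh_on / thresh_off defaults.
    """
    polling = False
    states = []
    for rate in rates:
        if not polling and rate > thresh_on:
            polling = True    # rate exceeded the on threshold
        elif polling and rate < thresh_off:
            polling = False   # rate fell below the off threshold
        states.append(polling)
    return states


# Hypothetical interrupt rates sampled once per second.
rates = [50, 500, 1500, 500, 150, 90, 500]
print(polling_states(rates))
# [False, False, True, True, True, False, False]
```

Note that a rate of 500 interrupts per second leaves polling in whatever state it was already in, which is what keeps the mechanism from toggling rapidly around a single threshold.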

2.2.3. Real time behavior

These parameters control some of the driver's real-time settings. Changes here are normally only needed if you run a real-time application or simulation using the SISCI API.

| Option Name | Description | Unit | Valid Values | Default Value |
|---|---|---|---|---|
| linkWatchdogEnabled | Controls the link watchdog behaviour. The link watchdog is a high availability feature to ensure detection of non-operational links. This feature is normally not needed but should be left on for additional high availability. The feature introduces a microsecond-level jitter. Should be turned off for real-time applications. | Seconds | 0: Disabled; Integers > 0: Watchdog period in seconds | 3 |
| sessionHeartbeatsEnabled | Controls the session heartbeats. The session heartbeat mechanism is used for end-to-end internal heartbeating. This feature introduces a microsecond-level jitter. Should be turned off for real-time applications. | n/a | 0: Disabled; 1: Enabled | 1 (enabled) |

2.2.4. Memory Preallocation

Memory preallocation settings have been moved to dis_mx.conf.

2.2.5. Logging and Messages

| Option Name | Description | Unit | Valid Values | Default Value |
|---|---|---|---|---|
| link-messages-enabled | Control logging of non-critical link messages during operation. | n/a | 0: no link messages; 1: show link messages | 0 |
| notes-disabled | Control logging of non-critical notices during operation. | n/a | 0: show notice messages; 1: no notice messages | 1 |
| warn-disabled | Control logging of general warnings during operation. | n/a | 0: show warning messages; 1: no warning messages | 0 |
| dis_report_resource_outtages | Control logging of out-of-resource messages during operation. | n/a | 0: no messages; 1: show messages | 0 |
| notes-on-log-file-only | Control printing of driver messages to the system console. | n/a | 0: also print to console; 1: only print to kernel message log | 0 |