1. Installation under Load

This section describes how to perform the initial Dolphin PCI Express installation on a cluster that is in operation, without having to stop the whole cluster.

With this type of installation, no more than one Cluster Node needs to be offline at a time.

  1. Installing the drivers on the Cluster Nodes

    On all Cluster Nodes, run the SIA with the option --install-node. This is a local operation which will build and install the drivers on the local machine only. If the PCI Express hardware is not yet installed, this operation will report errors which can be ignored. Do not reboot the Cluster Nodes now!

    Tip

    You can speed up this Cluster Node installation by re-using binary RPMs that have been built on another Cluster Node with the same kernel version and the same CPU architecture. To do so, proceed as follows:

    1. After the first installation on a Cluster Node, the binary RPMs are located in the sub-directories node_RPMS and frontend_RPMS of the directory from which you launched the SIA. Copy these sub-directories to a path that is accessible from the other Cluster Nodes.

    2. When installing on another Cluster Node with the same Linux kernel version and CPU architecture, use the --use-rpms option to tell SIA where it can find matching RPMs for this Cluster Node, so it does not have to build them once more.
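
    The check for a matching kernel version and CPU architecture can be scripted. The following is a minimal sketch, not part of the SIA: the SIA file name, the shared path and the marker file BUILD_TAG are hypothetical, and it is assumed that the SIA is started with sh and that --use-rpms takes the path to the copied RPMs as its argument. The script only prints the command it would suggest; it does not run the SIA.

```shell
# Hypothetical names; adjust to your environment.
SIA="${SIA:-./dolphin_sia.sh}"          # placeholder for the SIA file name
RPM_DIR="${RPM_DIR:-/shared/dis_rpms}"  # placeholder for the shared RPM path

# Kernel version and CPU architecture of the local machine.
build_tag() {
    echo "$(uname -r) $(uname -m)"
}

# Succeeds if the tag stored next to the RPMs matches the local machine.
# BUILD_TAG is a hypothetical marker file, written on the first node with:
#   build_tag > "$RPM_DIR/BUILD_TAG"
rpms_match() {
    [ -f "$1/BUILD_TAG" ] && [ "$(cat "$1/BUILD_TAG")" = "$(build_tag)" ]
}

# Print (not run) the SIA command suitable for this Cluster Node.
sia_node_command() {
    if rpms_match "$RPM_DIR"; then
        echo "sh $SIA --install-node --use-rpms $RPM_DIR"
    else
        echo "sh $SIA --install-node"
    fi
}

sia_node_command
```

    If the marker file is missing or does not match, the sketch falls back to a plain --install-node run, which builds the RPMs locally.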

  2. Installing the PCI Express hardware

    For an installation under load, perform the following steps for each Cluster Node one by one:

    1. Shut down your application processes on the current Cluster Node.

    2. Power off the Cluster Node, and install the PCI Express adapter (see .

    3. Power on the Cluster Node and boot it up. The Dolphin PCI Express drivers should load successfully now, although the SuperSockets service will not be configured. Verify this via dis_services:

      # dis_services status
      Dolphin kOSIF 5.5.0 is running
      Dolphin MX 5.5.0 is running
      Dolphin IRM (GX) 5.5.0 ( January 10th 2018 ) is running.
      Dolphin Node Manager is running (pid 3172).
      Dolphin SmartIO 1.0 (rev unknown) is running.
      Dolphin SISCI 5.5.0 ( January 10th 2018 ) is running.
      Dolphin SuperSockets 5.5.0 "Express Train", January 10th 2018 (built January 10th 2018) loaded, but not configured.

    4. Stop the SuperSockets service:

      # service dis_supersockets stop
      Stopping Dolphin SuperSockets drivers                      [  OK  ]

    5. Start all your own applications on the current Cluster Node and make sure the whole cluster operates normally.

    6. Repeat the same procedure for all Cluster Nodes.
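
    When repeating this procedure on many Cluster Nodes, the status check after booting can be reduced to a small filter. The following is a sketch only: the function name is hypothetical, and it assumes that every service line contains the word "Dolphin" and that a running service is reported with the text "is running", as in the sample output above.

```shell
# Read `dis_services status` output on stdin and print every service line
# that does not report "is running". Prints nothing if all services run.
services_not_running() {
    grep 'Dolphin' | grep -v 'is running'
}

# Run the check only where dis_services is installed.
if command -v dis_services >/dev/null 2>&1; then
    dis_services status | services_not_running
fi
```

    Directly after the hardware installation, the only line printed should be the SuperSockets line ("loaded, but not configured"); any other line indicates a driver that failed to load.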

  3. Creating the cluster configuration files

    If you have a Linux machine with X available which can run GUI applications, run the SIA with the --install-editor option to install the tool dis_netconfig. Ideally, this step is performed on the Cluster Management Node. If this is the case, you should create the directory /etc/dis and make it writable for root:

    # mkdir /etc/dis
    # chmod 755 /etc/dis

    After the SIA has completed the installation, start the tool dis_mkconf (default installation location is /opt/DIS/sbin):

    # /opt/DIS/sbin/dis_mkconf

    or dis_netconfig (default installation location is /opt/DIS/sbin) for GUI-based installation:

    # /opt/DIS/sbin/dis_netconfig

    Information on how to work with this tool can be found in Chapter 4, Initial Installation, Section 2.4, “Working with the Dolphin Network Configurator, dis_netconfig”. Make sure you create the cabling instructions needed in the next step.

    If dis_netconfig was run as root on the Cluster Management Node, proceed with the next step. Otherwise, copy the configuration files dishosts.conf and networkmanager.conf, which you have just created, to the Cluster Management Node and place them there under /etc/dis (you may need to create this directory, see above).
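
    Before continuing, it can be useful to confirm on the Cluster Management Node that both configuration files are in place. The following sketch uses a hypothetical helper function; the only facts taken from this section are the two file names and the directory /etc/dis. The host name in the scp comment is a placeholder.

```shell
# Succeed only if both configuration files exist and are non-empty in the
# given directory (default: /etc/dis). Problems are reported on stderr.
check_dis_config() {
    dir=${1:-/etc/dis}
    status=0
    for f in dishosts.conf networkmanager.conf; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return $status
}

# Copying from another machine could look like this (placeholder host name):
#   scp dishosts.conf networkmanager.conf root@mgmt-node:/etc/dis/
check_dis_config /etc/dis || echo "Copy the configuration files to /etc/dis before continuing."
```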

  4. Cable Installation

    Using the cabling instructions created by dis_netconfig in the previous step, the interconnect cables should now be connected (see Chapter 4, Initial Installation, Section 2.5, “Cluster Cabling”).

  5. Cluster Management Node Installation

    On the Cluster Management Node, run the SIA with the --install-frontend option. This will start the Network Manager, which will then configure the whole cluster according to the configuration files created in the previous steps.

  6. Start all services on all the Cluster Nodes:

    # dis_services start
    Starting Dolphin KOSIF 5.5.0                               [  OK  ]
    Starting Dolphin PX 5.5.0                                  [  OK  ]
    Starting Dolphin IRM 5.5.0 ( January 10th 2018 )           [  OK  ]
    Starting Dolphin Node Manager                              [  OK  ]
    Starting Dolphin SISCI 5.5.0 ( January 10th 2018 )         [  OK  ]
    Starting Dolphin SuperSockets drivers                      [  OK  ]

  7. Verify the functionality and performance according to Chapter 4, Initial Installation, Section 1, “Verifying Functionality and Performance”.

  8. At this point, PCI Express and SuperSockets are ready to use, but your application is still running on Ethernet. To make your application use SuperSockets, you need to perform the following steps on each Cluster Node, one by one:

    1. Shut down your application processes on the current Cluster Node.

    2. Refer to Chapter 4, Initial Installation, Section 2.10, “Making Cluster Application use PCI Express” to determine the best way to have your application use SuperSockets. Typically, this can be achieved by simply starting the process via the dis_ssocks_run wrapper script (located in /opt/DIS/bin by default).

    3. Start all your own applications on the current Cluster Node and make sure the whole cluster operates normally. Because SuperSockets fall back to Ethernet transparently, your applications will start up normally regardless of whether the applications on the other Cluster Nodes already use SuperSockets.

    After you have performed these steps on all Cluster Nodes, all applications that have been started accordingly will now communicate via SuperSockets.
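
    Starting an application through the wrapper can be sketched as follows. This is an illustration under assumptions, not the documented invocation: it assumes that dis_ssocks_run takes the command to start, followed by its arguments, as its own arguments. The helper function, the DIS_SSOCKS_RUN variable and the application name are hypothetical.

```shell
# Path of the wrapper script; /opt/DIS/bin is the default location named above.
DIS_SSOCKS_RUN="${DIS_SSOCKS_RUN:-/opt/DIS/bin/dis_ssocks_run}"

# Start a command through the wrapper if it is installed and executable;
# otherwise warn and start the command directly (plain Ethernet sockets).
run_with_supersockets() {
    if [ -x "$DIS_SSOCKS_RUN" ]; then
        "$DIS_SSOCKS_RUN" "$@"
    else
        echo "warning: $DIS_SSOCKS_RUN not found, starting without SuperSockets" >&2
        "$@"
    fi
}

# Example with a hypothetical application name:
#   run_with_supersockets ./my_server --port 5000
```

    Starting the application directly when the wrapper is missing keeps the same start script usable on Cluster Nodes that have not been converted yet; replace the fallback with an error exit if a silent start over Ethernet is not acceptable.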

Note

This single-node installation mode does not adapt the driver configuration dis_irm.conf to optimally fit your cluster. Adapting it might be necessary for clusters with more than 4 Cluster Nodes. Please refer to Appendix C, Configuration Files, Section 3.2, “dis_irm.conf”, to perform the recommended changes, or contact Dolphin support.