System Requirements

This page describes the system requirements for SmartIO. The exact requirements depend on the SmartIO functionality used on a specific node. We recommend that you also refer to the “System Requirements” sections in the Quick Start Guides, Adding Devices to the Pool, and Borrowing Pooled Devices.

Platform Requirements

SmartIO is supported on most x86_64 systems as well as selected ARM64 platforms [1].

Warning

Lending out local devices is not yet supported on Intel Xeon Ice Lake and newer. Please contact support for advice.

Platform | Lending | Borrowing | SISCI
---------|---------|-----------|------
AMD Ryzen | Supported | Supported | Supported
AMD EPYC | Limited, see No irq handler for vector in console | Supported | Supported
AMD Threadripper | Supported | Supported | Supported
Intel Core series | Peer-to-peer support limited, see PCIe peer-to-peer (P2P) support | Supported | Supported
Intel Xeon Cooper Lake and older | Supported | Supported | Supported
Intel Xeon Ice Lake | To be supported in future release | Supported | Supported
Intel Xeon Rocket Lake, Sapphire Rapids, Emerald Rapids and Granite Rapids | To be supported in future release | Supported | Supported
NVIDIA Xavier | Peer-to-peer support limited, see PCIe peer-to-peer (P2P) support | Supported (in 5.24) | Supported

Hint

Server platforms like Intel Xeon and AMD EPYC have more available PCIe lanes than their desktop counterparts. This is important to achieve the full performance of PCIe devices.

IOMMU / VT-d

It’s generally recommended to enable the IOMMU / VT-d on all nodes when using SmartIO. Enabling the IOMMU eases the requirements for Host NTB Adapter Prefetchable Size by allowing a smaller DMA window. The IOMMU isolates the devices from the rest of the system and from the other nodes in the fabric. This can protect against bugs in the target device driver, and it can also be very useful when a device is used with the SISCI API. On the lending side, the target device will only be granted access to designated DMA regions on the borrowing side. On the borrowing side, these DMA regions are mapped as requested by the target device driver while the rest of the region is protected. If the target device misbehaves and tries to DMA to a protected region, the system kernel will typically log this.

When the IOMMU is enabled, all traffic is forced to go through the CPU. This can re-route peer-to-peer traffic between the NTB adapter and a lent device. When the NTB adapter and the lent device are directly connected to CPU-provided PCIe lanes, this causes only a minor performance penalty. Systems with on-board PCIe switches or a PCIe expansion chassis can see a substantial performance penalty. In these cases, disabling the IOMMU on the lender may be advisable.
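
How to enable or disable the IOMMU varies with platform and distribution. The sketch below assumes a typical Linux system using GRUB and that VT-d / AMD-Vi is already enabled in the BIOS / firmware; the kernel parameters shown are standard Linux parameters, not SmartIO-specific.

  # Check whether an IOMMU is currently active (output varies by platform):
  $ ls /sys/class/iommu/
  $ dmesg | grep -i -e DMAR -e AMD-Vi

  # Enable the IOMMU by adding the matching kernel boot parameter to
  # GRUB_CMDLINE_LINUX in /etc/default/grub:
  #   intel_iommu=on   (Intel VT-d; the AMD IOMMU is typically on by default
  #                     when enabled in firmware)
  # To disable the IOMMU on a lender instead, use intel_iommu=off or amd_iommu=off.

  # Regenerate the bootloader configuration and reboot (RHEL-family example;
  # Ubuntu uses "sudo update-grub"):
  $ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
  $ sudo reboot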

ACS and IOMMU groups

ACS (Access Control Services) is a PCIe feature used to enforce isolation when the IOMMU is enabled. The Linux kernel groups devices that cannot be isolated from each other into “IOMMU groups”. To lend a device in an IOMMU group, all of the other devices in the same group must also be added and made available. This prevents those other devices from being used locally while one or more of the devices in the group is borrowed. This is normally not a problem on most server and workstation systems. Systems with on-board PCIe switches or expansion chassis, or devices attached to the chipset, can have IOMMU group issues.
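
To see how the kernel has grouped the devices, the IOMMU groups can be listed from sysfs and each member identified with lspci. This is a generic Linux sketch and requires the IOMMU to be enabled.

  # List the members of every IOMMU group:
  $ for g in /sys/kernel/iommu_groups/*; do \
      echo "IOMMU group ${g##*/}:"; \
      for d in "$g"/devices/*; do lspci -nns "${d##*/}"; done; \
    done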

Hint

On desktop platforms like Intel Core and AMD Ryzen, it’s common for some of the PCIe slots to have lanes attached to the chipset. These slots are more likely to have ACS-related issues.

NUMA and Multi-Socket Systems

Performance can be significantly impacted when peer-to-peer traffic between the NTB adapter and the lent device needs to cross between CPU sockets. The same effect can be seen on some single-CPU systems like AMD EPYC [2]. To achieve optimal performance, it’s recommended that the NTB adapter and the devices to be lent are on the same NUMA node or I/O Die quadrant. Please refer to your system or motherboard manual for more details.

Hint

On NUMA systems, the NUMA node of a PCIe device can be discovered with lspci.

On AMD EPYC systems, the NPS=4 BIOS setting exposes each CPU as 4 NUMA nodes. This can make it easier to discover which PCIe slots are associated with the same I/O Die / NUMA node. This setting can affect performance, including memory bandwidth, so you may want to set NPS back to the default value after determining the optimal PCIe slots. Refer to your system’s manual for more details.
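
Following the hint above, the NUMA node of a device can be read with lspci or directly from sysfs. The PCI address below is only an example.

  # NUMA node of a specific PCIe device:
  $ lspci -vv -s 0000:41:00.0 | grep -i "NUMA node"
  $ cat /sys/bus/pci/devices/0000:41:00.0/numa_node
  # A value of -1 in sysfs means the kernel has no NUMA information for the device.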

PCIe peer-to-peer (P2P) support

The lending side must support PCIe peer-to-peer transactions between the slot where the Dolphin PCIe adapter is installed and the slot where the target device is installed. Peer-to-peer support is not needed to borrow remote devices. In our experience, most AMD and Intel Xeon systems support peer-to-peer. We recommend that you ask your system vendor whether your system supports peer-to-peer. If your system has PCIe switches or a topology with transparent switches, peer-to-peer traffic can take the shortest path between two devices, avoiding going via the CPU/root complex. This shortest-path routing will only happen if the IOMMU is disabled.

On platforms where peer-to-peer is not supported, but there is a direct path between the NTB and the target device via one or more PCIe switches, lending out that device can still work. This requires that the IOMMU is disabled.
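
A simple way to check the lender’s topology is to print the PCIe tree and see whether the NTB adapter and the device to be lent sit behind a common PCIe switch or connect directly to the CPU’s root ports. The exact output is system-specific.

  # Print the PCIe tree (switches show up as bridges in the hierarchy):
  $ lspci -tv
  # Map tree entries to vendor/device IDs if the names are unclear:
  $ lspci -nn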

Hint

In some cases, a system may have only partial support for p2p, for instance in systems with an internal PCIe switch or when using an expansion box. Note that enabling the IOMMU forces p2p transactions to go through the CPU, so the CPU itself must then support p2p. On such systems, p2p may therefore only work when the IOMMU is off.

NTB Cluster Requirements

Supported Host Adapters

Dolphin MXH Host Adapter | Status | Notes
-------------------------|--------|------
MXH530 | Supported in eXpressWare 5.24 and newer | Requires kernel 6.6 or newer [3]
MXH950 | Supported |
MXH940 | Supported |
MXH930 | Supported |
MXH830 | Supported |

Dolphin PXH Host Adapter | Status | Notes
-------------------------|--------|------
PXH840 | Supported |
PXH830 | Supported |
PXH810 | Partially supported | Fabric Attached Devices are not supported.
PXH820 / PXH824 | Partially supported | Fabric Attached Devices are not supported.

Host NTB Adapter Prefetchable Size

It’s recommended to configure a large NTB prefetchable memory size on the NTB host adapter when using SmartIO. The exact size required depends on the device and system configuration. While the default size of 256MiB may be enough in some cases, we recommend increasing the prefetchable memory if possible and configuring all nodes with at least 32GiB of prefetchable memory.

Warning

Make sure 4G decoding is enabled in the BIOS / Firmware on your system before trying to increase the prefetchable size. Increasing the prefetchable size with 4G decoding disabled can cause the system to fail to boot.

When borrowing a device, the borrower’s NTB adapter must map all the BARs of the device. Taking into account alignment requirements and additional space required for general communication, the NTB prefetch size must be larger than the combined size of all the BARs of all the devices that will be borrowed simultaneously. For instance, borrowing a device with one 8GiB BAR2 requires at least 8GiB mapping space, but due to the additional mapping space used for communication as well as alignment requirements, a prefetchable memory size of 8GiB is insufficient. Since the NTB prefetchable size can only be set in powers of two, the next step up is 16GiB which is enough for one such device, but not enough for 2 devices with 8GiB BARs.
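
To estimate the required size, the BARs of the devices to be borrowed can be inspected with lspci. The PCI address and the output lines in the comments below are examples only.

  # Show the memory BARs (regions) and their sizes for a candidate device:
  $ lspci -vv -s 0000:41:00.0 | grep "Region"
  # Example output:
  #   Region 0: Memory at a0000000 (32-bit, non-prefetchable) [size=16M]
  #   Region 2: Memory at 38000000000 (64-bit, prefetchable) [size=8G]
  # Add up the BAR sizes of all devices that will be borrowed simultaneously,
  # allow headroom for alignment and general communication, and round up to
  # the next power of two when choosing the NTB prefetchable size.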

NTB prefetchable memory space is also used on the lender side when a local device is borrowed by another node. The exact size depends on the DMA window size used by the borrower.

Setting the Adapter Prefetchable Size

The prefetchable size is set with physical DIP switches on the MXH adapter. Please refer to the User Guide of your NTB Adapter for specific instructions. For MXS switches with D-Switch topology the prefetch size is set in the switch web interface. Please refer to the MXS User Guide for instructions.

The prefetchable size can be set using the dis_config tool. Refer to Configuring the adapter card for instructions.

Operating System Requirements

Operating system | Lending | Borrowing | SISCI | Notes
-----------------|---------|-----------|-------|------
RHEL 10 (AlmaLinux 10, Rocky Linux 10, CentOS Stream 10) | Supported | Supported | Supported |
RHEL 9 (AlmaLinux 9, Rocky Linux 9, CentOS Stream 9) | Supported | Supported | Supported | MXH500-series supported since kernel-5.14.0-406.el9 [3]
RHEL 8 (AlmaLinux 8, Rocky Linux 8, CentOS Stream 8) | Supported | Supported | Supported | MXH500-series not supported [3]
Ubuntu 24.04 LTS | Supported | Supported | Supported |
Ubuntu 22.04 LTS with Hardware Enablement (HWE) stack | Supported | Supported | Supported |
Ubuntu 22.04 LTS | Supported | Supported | Supported | MXH500-series not supported [3]
Windows | Preview [4] | Not supported | Supported |

Linux Kernel

eXpressWare and SmartIO are designed to work with a wide range of kernel versions, but eXpressWare is only tested and qualified with the kernels shipped by the distributions we support. Please refer to the table above. Contact support for additional information or assistance.

Supported Devices

SmartIO is designed to work with all PCIe-compliant devices and device drivers by implementing support for all the required PCIe features. Some legacy PCI features are not supported, but this does not impact most (if any) PCIe devices. Device types verified to be working include:

  • NVMe drives from multiple vendors

  • NVIDIA GPUs from multiple generations

  • Intel Ethernet adapters

  • Mellanox/NVIDIA ConnectX network adapters

  • Various FPGAs

In general, almost all PCIe devices are supported with SmartIO and device lending. More specifically, SmartIO supports the following features:

Feature | Support Status
--------|---------------
Memory Space Device Registers (BARs), prefetchable and non-prefetchable | Supported
DMA to/from Device to RAM (“Zero-Copy”) | Supported
MSI Interrupts | Supported
MSI-X Interrupts | Supported
Peer-to-peer | Supported. See PCIe peer-to-peer
SR-IOV | Supported. See Lending Virtual Function of an SR-IOV device
Configuration Space | Supported
Legacy Pin-based interrupts (INTx) | Not supported [5]
IO Space Device Registers (BARs) | Not supported [6]

NVIDIA GPU

NVIDIA GPUs are fully functional, including support for CUDA, Unified Memory, peer-to-peer, and graphical applications. The following table lists the verified NVIDIA GPU architectures:

NVIDIA GPU Architecture | Tested on | Notes
------------------------|-----------|------
Blackwell | GeForce RTX 5070, RTX 4050 PRO |
Ada Lovelace | NVIDIA RTX 2000 Ada, GeForce RTX 4070 |
Hopper | Untested |
Ampere | RTX A4500 | See Addressing Limitations on Older NVIDIA GPUs
Volta | Tesla V100 | See Addressing Limitations on Older NVIDIA GPUs
Turing | NVIDIA T600 | See Addressing Limitations on Older NVIDIA GPUs
Pascal | NVIDIA P400 | See Addressing Limitations on Older NVIDIA GPUs

Addressing Limitations on Older NVIDIA GPUs

NVIDIA GPUs before the Ada generation have DMA addressing limitations that can cause issues during lending. These GPUs require one of the following workarounds (a rough check is sketched after the list):

  • Turn on IOMMU on the Lender. The IOMMU is used to remap the high addresses to lower virtual addresses the GPU can address. See IOMMU / VT-d.

  • Set BIOS setting MMIOH to 1TB or lower on the Lender. This forces the NTB’s BAR address to be lower allowing the GPU to address it. This may limit the maximum system memory. Please refer to your system manual.

  • Disable 4G decoding in BIOS of the Lender. This greatly limits the BAR sizes supported by the system and is not recommended.
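
As a rough check for whether these workarounds are needed, the address assigned to the NTB adapter’s prefetchable BAR can be inspected on the lender. The PCI address below is an example, and the exact addressing limit depends on the GPU generation.

  # Show where the lender placed the NTB adapter's BARs:
  $ lspci -vv -s 0000:c1:00.0 | grep "Memory at"
  # If the prefetchable BAR sits above the range the GPU can reach (see the
  # MMIOH guidance above), apply one of the workarounds listed.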

SR-IOV

SmartIO supports lending both SR-IOV Physical Functions and individual Virtual Functions. The Physical Function cannot be lent while SR-IOV is enabled and Virtual Functions have been instantiated. Some SR-IOV Virtual Functions may not work when borrowed because the VF driver expects to run in a Virtual Machine.
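
Virtual Functions are instantiated through the standard Linux sysfs interface before they can be made available for lending. The PCI address and VF count below are examples.

  # Check how many VFs the Physical Function supports:
  $ cat /sys/bus/pci/devices/0000:81:00.0/sriov_totalvfs
  # Instantiate four Virtual Functions:
  $ echo 4 | sudo tee /sys/bus/pci/devices/0000:81:00.0/sriov_numvfs
  # The VFs appear as separate PCI functions and can then be lent individually;
  # the Physical Function itself cannot be lent while VFs are instantiated.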

Device | Lending Physical Function | Lending Virtual Function
-------|---------------------------|--------------------------
Mellanox / NVIDIA ConnectX-5 | Supported | Supported
Samsung PM1725a | Supported | Supported