Quick Start Guides

SmartIO enables many use-cases for sharing and using PCIe devices in PCIe fabrics. The quick start pages describe common usage scenarios and how to achieve them with SmartIO. The guides are not exhaustive, but serve as a jumping-off point.

Sharing PCIe devices in an NTB cluster using Device Lending

Device Lending allows you to share PCIe devices in a cluster by lending local PCIe devices and borrowing them on other nodes. This mechanism enables many different use-cases:

  • Pool GPUs in a cluster and allow a single node to borrow all GPUs for a heavy workload.

  • React to dynamic workload demands by borrowing additional devices from the pool when needed.

  • Enable non-clustered GPU applications to scale beyond the GPUs available in a single machine, for example 3D rendering or large models in ollama.

  • Concurrently share SR-IOV devices in a cluster.

  • Pass pooled non-SR-IOV devices through to guest OSes in a cluster.

See Device Lending to get started. If you are running a multi-tenant cluster, see Device Lending with guest OSes.

Flexible PCIe expansion using Fabric Attached Devices (NTB Hot-Add)

SmartIO allows more flexible PCIe expansion using fabric attached devices. This enables:

  • PCIe surprise hot-add and hot-remove on a system that does not support it.

  • Expanding the number of devices that can be installed in a single system.

  • Using PCIe devices with BARs larger than what the system supports.

See Fabric Attached Devices (NTB Hot-Add) to get started.

Sharing a non-SR-IOV NVMe drive with Dolphin’s dis_nvme driver

Dolphin’s dis_nvme driver allows you to share an NVMe drive without hardware support for SR-IOV. Sharing an NVMe drive in a cluster allows you to:

  • Share an NVMe drive without SR-IOV support.

  • Use a shared-disk filesystem while retaining the high bandwidth and low latency of an NVMe drive.

See Sharing a non-SR-IOV NVMe drive with dis_nvme to get started.

Sharing and Accessing a Device Using SISCI API

SmartIO integrates with the SISCI API, allowing you to write not only cluster applications but also clustered device drivers. This enables you to:

  • Write a user-space device driver for an FPGA (or another device) using the full capabilities of the SISCI API.

  • Use PCIe multicast to efficiently replicate data DMA-ed by your device at the hardware level.

  • Use the DMA engine on the Dolphin NTB Host Adapter to efficiently transfer data to and from your device.

  • Concurrently share a non-SR-IOV device in a cluster.

See Using SISCI API with SmartIO devices to get started.