smartio_tool

smartio_tool - Low level tool for interacting with SmartIO

Synopsis

smartio_tool <command> [options...]

Description

smartio_tool interacts with both the lending side and the borrow side of the SmartIO module. See the sections below for the commands available on each side. For an in-depth description, see the SmartIO in NTB cluster guide.

Common commands

config <option> <value>

Set the configuration variable option to value. This setting is not persistent across reboots.

list

Lists all devices of all connected “borrows”. See Connection to borrowers.

lspci [--raw] [--emulated] <fdid> [lspci args]

Runs the lspci command on a fabric device. The --raw flag can be used to show the raw config space without any caching or filtering. The --emulated flag can be used to see the config space as it would be presented to the kernel when the device is borrowed. For lspci args, see the manpage for lspci(8).

show <fdid>

Show extended information about a fabric device.

Lending side commands

connect <nodeid> <adapter>

Connect to a remote node, allowing it to borrow this node’s devices. See Connection to borrowers.

add <BDF>

Add local device to the list of devices that can be borrowed from this node. This does not interfere with current use of the device.

remove <BDF>

Undo the add command. This removes the given device from the list of devices that can be borrowed from this node.

available <BDF>

Mark an added device as available. This unbinds the local device driver from the device.

unavailable <BDF>

Mark an added device as unavailable. This rebinds the local device driver to the device.

add-filter <filter>

Add multiple devices using a filter. See the lspci-based filters section.

available-filter <filter>

Mark multiple devices as available. See the lspci-based filters section.

unavailable-filter <filter>

Mark multiple devices as unavailable. See the lspci-based filters section.

Borrow side commands

borrow <fdid> [dma-window-size]

Borrow a fabric device and inject as a local device. See section Borrowing devices.

return <fdid>

Undo the borrow command.

enable-p2p <source-fdid> <target-fdid> [BAR]

Requires the device to be borrowed. Enables peer-to-peer access from the source device given by source-fdid to the target device given by target-fdid. If BAR is not given, all BARs of the target device are mapped.

get-vdev <fdid>

Get the virtual BDF of a borrowed device.

scan <nodeid> [adapter [link_no]]

Scan for and initialize transparent devices connected to the NTB adapter. The given nodeid is assigned to the transparent tree, and thus must be unique in the cluster. If the given nodeid is already initialized, the tree will be scanned for changes. The adapter parameter specifies the local adapter number to be used. If not given, adapter 0 is used. If the adapter is configured with multiple links, link_no must be used to specify which port to scan. If link_no is not given, 0 is assumed.
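As an illustration of how get-vdev can be combined with lspci, the sketch below looks up the local virtual device of a borrowed fabric device and prints its lspci entry. The function name show_vdev is hypothetical, and the exact output format of get-vdev is an assumption: the sketch only assumes that the virtual BDF appears somewhere in the output and picks out the first token that looks like a BDF.

```shell
# Print the lspci entry for the local virtual device of a borrowed
# fabric device.
# Assumption: the output of get-vdev contains the virtual BDF somewhere
# in its text; the pattern below picks the first BDF-looking token.
show_vdev() {
    local fdid=$1 vdev
    vdev=$(smartio_tool get-vdev "$fdid" |
        grep -oE '([0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]' |
        head -n 1)
    if [ -z "$vdev" ]; then
        echo "no virtual BDF found for $fdid" >&2
        return 1
    fi
    # lspci -s accepts a BDF with or without the domain part.
    lspci -s "$vdev"
}
```

Calling show_vdev with the fdid of a borrowed device would then print the matching lspci line, or fail with a message if no BDF could be found.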

BDF

Commands accepting a BDF expect the same format as accepted by lspci -s, for example 01:00.0. They also accept the same shorthand notation as lspci when it uniquely identifies a single device, for example 1: instead of 01:00.0.
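Because a shorthand BDF is only accepted when it selects a single device, it can help to check it with lspci first. The following is a minimal sketch of such a check; the function names bdf_match_count and bdf_is_unique are hypothetical helpers, not part of smartio_tool.

```shell
# Count how many devices a BDF spec selects, using lspci's own -s matching.
bdf_match_count() {
    lspci -s "$1" | grep -c .
}

# Succeed only if the spec selects exactly one device.
bdf_is_unique() {
    [ "$(bdf_match_count "$1")" -eq 1 ]
}
```

A script could call bdf_is_unique with the shorthand before passing it on to a command such as add, and fall back to the full BDF when the check fails.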

Connection to borrowers

In SmartIO, nodes connect in pairs, each node having a distinct role: borrower and lender. The lender initiates the connection. The connected-to borrow-side can see and borrow devices from lenders that have connected to it. Two nodes can also connect to each other.

Hint

Each lender will have a loopback connection to its own node’s borrow-side.

Borrowing devices

Fabric devices that are marked as available can be borrowed. The borrow command takes one mandatory argument, the fdid, and an optional DMA window size. The window size controls the amount of RAM the target device driver can make available for the borrowed device at any given time. If the window size is omitted, a reasonable default value will be used.

Resource consumption:

Note

A suitable DMA window size depends on the device, the target device driver and how the device is used. For example, an idle NVMe drive can work with very little DMA space (e.g. 16 MB), but under heavy load it can need much more (e.g. 1 GB or more).

Some target device drivers will handle out-of-mapping resources gracefully, while others may simply crash or behave incorrectly.

Note

Devices cannot be borrowed from the local node (i.e. when the device is in the same node the command is run on). In this case, the borrow command instead performs an unavailable, which behaves similarly.

lspci-based filters

Some smartio_tool commands operate on an lspci-based filter. These commands will operate on all devices that lspci would list given the same filter. Such filters can be tested by running lspci <filter>. Example:

# lspci -d 10de:
09:00.0 VGA compatible controller: NVIDIA Corporation GK107GL [Quadro K420] (rev a1)
# smartio_tool add-filter -d 10de:
Adding 09:00.0 VGA compatible controller: NVIDIA Corporation GK107GL [Quadro K420] (rev a1)

See the manpage for lspci(8) for details.
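Testing a filter with lspci before applying it can also be scripted. The sketch below, using a hypothetical helper named apply_filter, runs lspci with the filter first and only invokes the given smartio_tool filter command when at least one device matches.

```shell
# Run a smartio_tool *-filter command only if lspci matches something.
# Usage: apply_filter add-filter -d 10de:
apply_filter() {
    local cmd=$1 matches
    shift
    # Preview the filter with lspci, exactly as it will be passed on.
    matches=$(lspci "$@") || return
    if [ -z "$matches" ]; then
        echo "filter '$*' matches no devices; nothing done" >&2
        return 1
    fi
    printf 'Filter matches:\n%s\n' "$matches"
    smartio_tool "$cmd" "$@"
}
```

The same helper works for available-filter and unavailable-filter, since all three take the filter in the same form.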

Examples

Adding, setting as available and borrowing a device:

Lending side:

$ smartio_tool connect 4
$ smartio_tool connect 8
$ smartio_tool add 06:00.0
$ smartio_tool list
80002: Non-Volatile memory controller: Intel Corporation Device [unavailable]
$ smartio_tool available 06:00.0
$ smartio_tool list
80002: Non-Volatile memory controller: Intel Corporation Device [available]
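The list output above can be filtered in a script to find devices that are ready to be borrowed. This is a minimal sketch with a hypothetical helper name, available_fdids; it assumes every line has the form "<fdid>: <description> [<state>]" as in the example output, which may not hold for all versions of the tool.

```shell
# Print the fdid of every device that `smartio_tool list` reports as
# available.
# Assumption: each line looks like "<fdid>: <description> [<state>]".
available_fdids() {
    smartio_tool list | awk -F': ' '/\[available\]$/ { print $1 }'
}
```

With the example output above, the helper would print 80002 once the device has been marked as available.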

Borrow side:

$ smartio_tool borrow 80002 512
Name: Non-Volatile memory controller Intel Corporation Device f1a5
Local users: 1
Local virtual device: 0000:04:05.0
Bound to driver: nvme
NVMe namespace: nvme0
$ ls /dev/nvme0*
/dev/nvme0  /dev/nvme0n1  /dev/nvme0n1p1
$ mount /dev/nvme0n1p1 /mnt
$ umount /mnt
$ smartio_tool return 80002
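The borrow, use and return sequence above can be wrapped so that the device is returned even when the work in between fails. The helper name with_borrowed is hypothetical, and the sketch uses the default DMA window size by passing only the fdid to borrow.

```shell
# Borrow a fabric device, run a command, and always return the device.
# Usage: with_borrowed <fdid> <command> [args...]
with_borrowed() {
    local fdid=$1 status=0
    shift
    # Give up immediately if the device cannot be borrowed.
    smartio_tool borrow "$fdid" || return
    # Run the caller's command and remember its exit status.
    "$@" || status=$?
    smartio_tool return "$fdid" || echo "warning: could not return $fdid" >&2
    return "$status"
}
```

The exit status of the wrapped command is passed through, so a caller can still tell whether the work on the borrowed device succeeded.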

Running smartio_example SISCI demo program:

$ smartio_example -dev 80c00
Shared mem[0]:
0x00000000
BAR[0]:
0x0e7330a2
IO address of local segment seen from the device 80c00: 410e1000
IO address of remote shared device memory seen from the device 80c00: 48b8e1000

Exit code

Exits with standard errno values. Some values have a special meaning, see the table below. There can sometimes be useful error messages in the kernel log (dmesg) after a command fails.

Value     Meaning
EIO       Unable to communicate with other node
EBUSY     Device is busy or not available
ENODEV    Invalid device specified
ENOMEM    Failed to allocate a buffer, or to map BARs or DMA window
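In a script, the exit status can be translated back into these meanings. The sketch below assumes that the process exit status is the Linux errno number (EIO is 5, ENOMEM is 12, EBUSY is 16, ENODEV is 19); the helper name explain_status is hypothetical.

```shell
# Translate a smartio_tool exit status into the meanings listed above.
# Assumption: the status is the Linux errno number
# (EIO=5, ENOMEM=12, EBUSY=16, ENODEV=19).
explain_status() {
    case "$1" in
        0)  echo "success" ;;
        5)  echo "EIO: unable to communicate with other node" ;;
        12) echo "ENOMEM: failed to allocate a buffer, or to map BARs or DMA window" ;;
        16) echo "EBUSY: device is busy or not available" ;;
        19) echo "ENODEV: invalid device specified" ;;
        *)  echo "errno $1: check the kernel log (dmesg) for details" ;;
    esac
}
```

Calling explain_status "$?" directly after a failed smartio_tool command would print the matching line.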

See also

lspci(8)