12. Appendix: creating client mdtmconfig.json

mdtmconfig.json configures mdtmFTP client parameters. It is used by mdtmFTP client versions 1.1.1 and later, and should be located in the mdtmFTP client’s working directory.

The mdtmFTP client’s configuration is similar to that of the mdtmFTP server, except that the mdtmFTP client does not need to configure a virtual device and has no server section.

12.1. Topology section

The syntax is defined as:

"topology": [
     {
      "type" : Device_Type,
      "name" : Device_Name,
      "numa" : Numa_ID
     },
     ...
 ]

Device_Type refers to the MDTM device type. MDTM defines three types of devices: network, block, and virtual.

  • Network refers to a network I/O device.

  • Block refers to a storage/disk I/O device.

  • Virtual refers to a virtual device, which is defined specifically for mdtmFTP server.

Device_Name specifies a device name.

Numa_ID sets which NUMA node a device belongs to (i.e., its NUMA location).

MDTM middleware is typically able to detect physical I/O devices and their locations (i.e., which NUMA node an I/O device belongs to) on a NUMA system. However, there are two cases in which MDTM middleware cannot detect physical I/O devices or their locations correctly:

  1. In a fully virtualized environment, where information on physical I/O devices is not exposed to the guest OS.

  2. Some vendors’ I/O devices may not comply with OS rules to expose device information properly.

Under these conditions, the system administrator should manually configure I/O devices and their NUMA locations.
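Where the platform does expose device information, sysfs can be used to check what the OS reports before configuring a device manually. A minimal sketch, assuming a NIC named eth1 and an NVMe drive named nvme0n1 (on kernels that expose these attributes, a value of -1 means the NUMA node is unknown):

    # NUMA node of a PCI network device
    $ cat /sys/class/net/eth1/device/numa_node

    # NUMA node of an NVMe block device, via its controller
    $ cat /sys/block/nvme0n1/device/numa_node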

The virtual device is defined specifically for mdtmFTP server to monitor data transfer status. mdtmFTP server spawns a dedicated management thread to collect and record data transfer statistics. The management thread is associated with a virtual device, which will be pinned to a specified NUMA node.
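For example, a client topology section that manually configures a NIC and an NVMe drive (the client does not need a virtual device) could look as follows; the device names and NUMA IDs are illustrative:

"topology": [
     {
      "type" : "network",
      "name" : "eth1",
      "numa" : "0"
     },
     {
      "type" : "block",
      "name" : "nvme0n1",
      "numa" : "1"
     }
 ]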

12.2. Online section

The syntax is defined as:

"online": [
            Device_Name1,
            Device_Name2,
            ...
          ]

This section specifies the I/O devices that are assigned for data transfer.

For example, assume a DTN has the following I/O devices:

  • Ethernet NIC devices

    • eth0 – configured for management access

    • eth1 – configured for WAN data transfer

  • Block I/O devices

    • /dev/sda – system disk

    • /dev/sdb – data repository for WAN data transfer

In this case, the online section would be defined as:

"online": [
            "eth1",
            "sdb"
          ]

  • For network I/O devices, a user can run ifconfig to list network I/O devices available on the system.

  • For storage/disk I/O devices, a user can run lsblk to list storage/disk I/O devices available on the system, and then run df to find out which storage/disk I/O device a data transfer folder is located on.

    Assuming that a DTN system’s lsblk output is:

    $ lsblk
    NAME                            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sda                               8:0    0  1.8T  0 disk
    ├─sda1                            8:1    0  500M  0 part /boot
    └─sda2                            8:2    0  1.8T  0 part
      ├─scientific_bde1-root        253:0    0   50G  0 lvm  /
      ├─scientific_bde1-swap        253:1    0    4G  0 lvm  [SWAP]
      └─scientific_bde1-home        253:2    0  1.8T  0 lvm  /home
    loop0                             7:0    0  100G  0 loop
    └─docker-253:0-203522131-pool   253:3    0  100G  0 dm
    loop1                             7:1    0    2G  0 loop
    └─docker-253:0-203522131-pool   253:3    0  100G  0 dm
    nvme0n1                         259:0    0  1.1T  0 disk /data1

    And df output is:

    $ df
    Filesystem                       1K-blocks  Used       Available  Use% Mounted on
    /dev/mapper/scientific_bde1-root 52403200   15999428   36403772   31%  /
    devtmpfs                         65855232   0          65855232   0%   /dev
    /dev/nvme0n1                     1153584388 104952744  990009612  10%  /data1
    /dev/mapper/scientific_bde1-home 1895386900 23602284   1871784616 2%   /home
    /dev/sda1                        508588     376264     132324     74%  /boot
    

If /data1 is used as the data transfer folder, the corresponding storage/disk I/O device is nvme0n1.
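To check a specific folder directly, df also accepts a path argument. A minimal sketch using the /data1 folder from the output above:

    $ df /data1
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/nvme0n1  1153584388 104952744 990009612  10% /data1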

12.3. Thread section

The syntax is defined as:

"threads": [
               {
                    "type" : "Device_Type",
                    "name" : "Device_Name",
                    "threads" : Num
               },
               ...
 ]

This section defines the number of threads to be allocated for an I/O device. The number of threads allocated for an I/O device should be proportional to the device’s I/O bandwidth. The rule of thumb is that one thread can handle an I/O rate of 10 Gbps. For example, four threads should be allocated for a 40GE NIC, while one thread should be allocated for a 10GE NIC.

By default, each I/O device is allocated a preset number of threads. If a different number of threads should be allocated for a particular I/O device, a separate entry for that device should be specified here.
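For instance, following the 10 Gbps-per-thread rule of thumb, a host with a 40GE NIC and a 10GE NIC (here assumed to be named eth1 and eth2) might use:

"threads": [
     {
      "type" : "network",
      "name" : "eth1",
      "threads" : 4
     },
     {
      "type" : "network",
      "name" : "eth2",
      "threads" : 1
     }
 ]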

12.4. File section

The syntax is defined as:

"filesegment": File_Size_Threshold

MDTM splits a large file into segments, which are spread across different threads for disk and network operations to increase performance.

File_Size_Threshold sets a file size threshold. A file whose size exceeds the threshold will be split into multiple segments, which are spread across I/O threads to be transferred in parallel.
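For example, the sample configurations in 12.6 set a 2G threshold, so any file larger than 2 GB is segmented and transferred by multiple I/O threads in parallel:

"filesegment": "2G"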

12.5. Manually_configured_cpus section

The syntax is defined as:

"manually_configured_cpus" : {
         "storage" : [CPU_index,...],
         "network" : [CPU_index,...]
     }

This section allows users to manually specify the core(s) used by mdtmFTP I/O threads. It is optional. In some cases, experienced users may want to pin mdtmFTP I/O threads to particular cores to achieve optimum performance.

  • If manually_configured_cpus is not configured, mdtmFTP calls the MDTM middleware scheduling service to schedule cores for its threads. For each I/O thread, MDTM middleware first selects a core near the I/O device (e.g., NIC or disk) that the thread uses, and then pins the thread to the chosen core.

  • If manually_configured_cpus is configured, mdtmFTP bypasses its normal core scheduling mechanism. Instead, it assigns and binds its I/O threads to the cores specified in manually_configured_cpus, one by one.
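When picking cores manually, it helps to know which cores belong to which NUMA node. One way to list the layout on Linux (the output shown is illustrative, for a hypothetical two-node host):

    $ lscpu | grep NUMA
    NUMA node(s):          2
    NUMA node0 CPU(s):     0-7
    NUMA node1 CPU(s):     8-15

With a layout like this, pinning network I/O threads to cores 0-7 would keep them on NUMA node 0, near a NIC attached to that node.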

12.6. Example

  • A sample mdtmconfig.json without manually_configured_cpus:

{
     "topology": [
        {
            "type" : "block",
            "name" : "nvme0n1",
            "numa" : "0"
        }
     ],
     "online": [
         "enp4s0f0",
         "nvme0n1"
     ],
     "threads": [
         {
              "type" : "network",
              "name" : "enp4s0f0",
              "threads" : 2
         },
         {
              "type" : "block",
              "name" : "nvme0n1",
              "threads" : 2
         }
     ],
     "filesegment": "2G"
}

  • A sample mdtmconfig.json with manually_configured_cpus:

{
     "topology": [
        {
            "type" : "block",
            "name" : "nvme0n1",
            "numa" : "0"
        }
     ],
     "online": [
         "enp4s0f0",
         "nvme0n1"
     ],
     "threads": [
         {
              "type" : "network",
              "name" : "enp4s0f0",
              "threads" : 2
         },
         {
              "type" : "block",
              "name" : "nvme0n1",
              "threads" : 2
         }
     ],
     "filesegment": "2G",
     "cpus" : {
         "storage" : [0, 1, 2, 3],
         "network" : [4, 5, 6, 7]
     }
}

In general, you need not configure manually_configured_cpus.