mdtmFTP installation and configuration manual

About mdtmFTP

To address the performance challenges of data transfer in the big data era, the Fermilab network research group has developed mdtmFTP, a high-performance data transfer tool for big data.

mdtmFTP has a number of advanced features.

  • First, it adopts a pipelined I/O design. Data transfer tasks are carried out in a pipelined manner across multiple cores. Dedicated threads are spawned to perform network and disk I/O operations in parallel.

  • Second, mdtmFTP uses multicore-aware data transfer middleware (MDTM) to schedule an optimal core for each thread, based on system configuration, to optimize throughput across the underlying multicore platform.

  • Third, mdtmFTP implements a large virtual file mechanism to efficiently handle lots-of-small-files (LOSF) situations.

  • Finally, mdtmFTP utilizes optimization mechanisms such as zero copy, asynchronous I/O, batch processing, and pre-allocated buffer pools to maximize performance.
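
To make the pipelined design and the pre-allocated buffer pool more concrete, here is a minimal Python sketch. It is not mdtmFTP's implementation: the names run_pipeline, CHUNK_SIZE, and POOL_SIZE are hypothetical, an in-memory byte string stands in for the disk, and a byte array stands in for the network. It also does no core scheduling, and Python threads do not provide the parallelism of the native threads that mdtmFTP uses. It only illustrates how a dedicated reader thread and a dedicated sender thread can overlap their work while recycling a fixed set of buffers.

```python
import queue
import threading

CHUNK_SIZE = 4  # hypothetical, tiny so the example is easy to trace
POOL_SIZE = 2   # hypothetical number of pre-allocated buffers


def run_pipeline(source: bytes) -> bytes:
    """Copy `source` through a two-stage pipeline and return what the sink received."""
    free_buffers = queue.Queue()  # pre-allocated buffer pool
    filled = queue.Queue()        # (buffer, length) pairs passed from reader to sender
    for _ in range(POOL_SIZE):
        free_buffers.put(bytearray(CHUNK_SIZE))

    received = bytearray()        # stands in for the network peer

    def reader():
        # Stage 1: stands in for a disk I/O thread.
        offset = 0
        while offset < len(source):
            buf = free_buffers.get()  # blocks until a buffer is free
            chunk = source[offset:offset + CHUNK_SIZE]
            buf[:len(chunk)] = chunk
            filled.put((buf, len(chunk)))
            offset += len(chunk)
        filled.put(None)              # end-of-stream marker

    def sender():
        # Stage 2: stands in for a network I/O thread.
        while True:
            item = filled.get()
            if item is None:
                break
            buf, length = item
            received.extend(buf[:length])
            free_buffers.put(buf)     # return the buffer to the pool

    threads = [threading.Thread(target=reader), threading.Thread(target=sender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bytes(received)
```

Calling run_pipeline(b"hello, pipelined world") returns the same bytes it was given. Because the pool holds only POOL_SIZE buffers, the reader stalls when the sender falls behind, which bounds memory use without allocating during the transfer.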

This document describes how to install mdtmFTP from the source code release and covers its basic use. For manuals covering the Docker or Singularity releases of mdtmFTP, please visit the project website: http://mdtm.fnal.gov.

For technical details about mdtmFTP, please refer to the paper:

Liang Zhang, Wenji Wu, Phil DeMar, Eric Pouyoul: mdtmFTP and its evaluation on ESNET SDN testbed. Future Generation Comp. Syst. 79: 199-204 (2018)

Contacts

Intended audience

This manual is intended for users and system administrators responsible for installing, running, and managing data transfer nodes (DTNs). The manual assumes familiarity with multicore and DTN concepts.

Acknowledgements

mdtmFTP uses several Globus modules for rapid prototyping. We sincerely thank the Globus team at Argonne National Laboratory and the University of Chicago.

Here is a list of Globus modules that mdtmFTP uses:

  • GridFTP protocol module

  • Globus XIO module

  • Globus security module

  • Globus user interface

As of January 2018, the Globus Toolkit has been retired. It is succeeded by software maintained by the Grid Community Forum (GCF).