
Fusion Engineering and Design 89 (2014) 770–774


A TCP/IP-based constant-bit-rate file transfer protocol and its extension to multipoint data delivery

Kenjiro Yamanaka a,∗, Shigeo Urushidani a, Hideya Nakanishi b, Takashi Yamamoto b, Yoshio Nagayama b
a National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan
b National Institute of Fusion Science, 322-6 Oroshi-cho, Toki, Gifu, Japan

Article history: Received 25 May 2013; Received in revised form 4 February 2014; Accepted 6 February 2014; Available online 19 March 2014

Keywords: Network; Data transfer; LFN; Daisy-chain; Clock driven programming

Abstract

We present a new TCP/IP-based file transfer protocol that enables high-speed daisy-chain transfer. With this protocol, a file can be sent to a series of destination hosts simultaneously, because intermediate hosts relay received file fragments to the next host. Using a prototype, we achieved daisy-chain file transfer from Japan to Europe via the USA at about 800 Mbps; the experimental results are also reported. Daisy chaining reduces the total link length of a data delivery network, and so enables cost-effective international data sharing.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Leading-edge scientific projects require high-speed networks and an efficient data transfer method to share measured or calculated data among geographically distant locations. Examples of such projects include experimental analyses and simulations in scientific disciplines such as high-energy physics, climate modeling, earthquake engineering, and astronomy. In these projects, many research groups share data in global collaborations, since each facility tends to be expensive.

The ITER project also requires high-speed networks and an efficient data transfer method. ITER is a research and engineering project to design, build, and operate an experimental tokamak nuclear fusion reactor. It is being constructed in Cadarache, France, in international collaboration among the EU, Japan, Korea, China, India, Russia, and the USA. ITER will generate 100–1000 GB of data per shot [1], and these data must be delivered in a timely way to the participating countries. An efficient and effective method for multipoint data delivery is therefore required.

Since the participants are spread around the world, the data delivery method should support long-distance high-speed transfer. Furthermore, it should make efficient use of network bandwidth, because long-distance broadband lines are expensive. If ITER data were delivered to each country point to point, each participating country would have to prepare its own broadband network line to Cadarache, which wastes network bandwidth because the same data would be carried over every line. In contrast, daisy-chain transfer makes efficient use of network bandwidth: each country forwards the data in sequence until they reach the final destination from Cadarache, so the bandwidth is shared efficiently by the countries in the sequence. However, this scheme has a weak point: the transfer rate is restricted by the lowest-speed link in the sequence. Furthermore, it is well known that obtaining high speed over a long-distance line with TCP/IP is difficult. As a result, if we use a traditional TCP-based data transfer method for daisy-chain data delivery, we cannot establish high-speed data transfer, because the transfer speed over the longest network line limits the overall transfer speed.

We have developed a new TCP-based file transfer protocol, the Massively Multi-Channel File Transfer Protocol (MMCFTP). Using this protocol, we can transfer a file at a specified bit rate regardless of the transmission distance. We introduce MMCFTP briefly, and then present an extension of MMCFTP for daisy-chain multipoint data delivery. As this extension enables high-speed multipoint data delivery, it is useful for the international collaboration of leading-edge scientific projects, such as the ITER project.

∗ Corresponding author. Tel.: +81 342122696; fax: +81 342128430.
E-mail address: yamanaka@nii.ac.jp (K. Yamanaka).

http://dx.doi.org/10.1016/j.fusengdes.2014.02.028
0920-3796/© 2014 Elsevier B.V. All rights reserved.

2. MMCFTP and its extension to multipoint data delivery

MMCFTP is a new TCP/IP-based file transfer protocol developed at the National Institute of Informatics (NII). It has the following features:

(1) The user can specify a transfer rate.1
(2) Unless the specified rate exceeds the limits of the execution environment,2 the transfer is performed at this rate, regardless of the transmission distance.
(3) TCP tuning is not required.
The second feature is provided by changing the number of TCP connections (channels) automatically, in accordance with the specified rate and the achieved TCP transfer rate. This function also makes the third feature possible: if TCP tuning is not done, MMCFTP simply achieves the specified transfer rate by using more channels. Several hundred or several thousand channels are used in long-distance high-speed transfers. Therefore, we call it a "Massively Multi-Channel" FTP.
2.1. MMCFTP overview

The inputs of the sender program are as follows:

H: Receiver host name. The receiver program should be started at host H before the transfer.
F: Name of the file to be sent.
T: Timer period. Range: 31.2 ms–1 s.3
C: Number of chunks to be sent in a timer cycle. Range: 1–384.
S: Chunk size. Three sizes are selectable: 64 KiB, 256 KiB, and 1 MiB.

File F is divided into fixed-size chunks, and each chunk is assigned a sequence number. Before sending file F, the sender program makes a TCP connection to host H, sends all inputs and the file size of F to H, and then waits for the receiver's response. After receiving the accept response, the sender program prepares more than enough channels to be used for data transmission. Using a periodic timer, the sender program sends out C chunks with sequence numbers every T period.4 Therefore, the transmit rate Vs is

Vs = SC / T    (1)

For example,

(T, C, S) = (62.4 ms, 96, 64 KiB) ⇒ Vs = 806 Mbps,
(T, C, S) = (31.2 ms, 384, 1 MiB) ⇒ Vs ≈ 100 Gbps.
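As a quick check of Eq. (1), the short script below (our illustration, not part of MMCFTP; the function name transmit_rate_bps is ours) evaluates Vs for the two example parameter sets:

```python
def transmit_rate_bps(chunk_size_bytes: int, chunks_per_cycle: int, period_s: float) -> float:
    """Vs = S*C/T, expressed in bits per second."""
    return chunk_size_bytes * 8 * chunks_per_cycle / period_s

# (T, C, S) = (62.4 ms, 96, 64 KiB)  -> about 8.07e8 bps, i.e. ~806 Mbps
print(transmit_rate_bps(64 * 1024, 96, 0.0624))

# (T, C, S) = (31.2 ms, 384, 1 MiB)  -> about 1.03e11 bps, on the order of 100 Gbps
print(transmit_rate_bps(1024 * 1024, 384, 0.0312))
```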
When sending a chunk, the sender program looks for a channel that is not currently sending another chunk,5 and then sends the chunk over the channel it found. Through this mechanism, the number of TCP channels in use is automatically balanced against the TCP transfer rate; this is the key to the constant-bit-rate transfer. The receiver program receives chunks and keeps them in a buffer memory. Chunks are then written to disk in accordance with their sequence numbers. Because transmission uses many different channels, chunks can be received out of order; therefore, random writes are used.
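The sketch below illustrates the timer-driven dispatch described above. It is a simplified illustration under our own assumptions (a fixed, non-empty channel pool and blocking sendall instead of the non-blocking I/O used in MMCFTP), not the authors' implementation; the names send_file, S, C, and T are ours.

```python
import socket
import time

S = 64 * 1024   # chunk size in bytes
C = 96          # chunks per timer cycle
T = 0.0624      # timer period in seconds

def send_file(path: str, channels: list[socket.socket]) -> None:
    """Every T seconds, send C sequence-numbered chunks over the channel pool.

    `channels` must be a non-empty list of already-connected sockets; MMCFTP
    grows the pool automatically, but here it is fixed for simplicity.
    """
    seq = 0
    next_tick = time.monotonic()
    with open(path, "rb") as f:
        while True:
            chunks = [f.read(S) for _ in range(C)]
            chunks = [c for c in chunks if c]          # drop empty reads at end of file
            if not chunks:
                break
            for payload in chunks:
                ch = channels[seq % len(channels)]     # stand-in for "find an idle channel"
                ch.sendall(seq.to_bytes(8, "big") + payload)
                seq += 1
            next_tick += T                             # keep the timer cycle, CDP-style
            time.sleep(max(0.0, next_tick - time.monotonic()))
```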
2.2. MMCFTP extension for daisy-chain transfer

To support daisy-chain transfer, the input H of the sender program is extended as follows:

Hs: Receiver host sequence.

When the receiver program accepts the initial connection, it checks Hs for the next host. If a next host exists, the receiver program removes its own name from Hs, makes a TCP connection to the next host, sends the received data to the next host, and then waits for the next host's response. After receiving the accept response, the receiver sends the same response to the previous host, and then prepares more than enough channels. Before chunks are written to the disk, the receiver program sends the chunks to the next host in the same manner as the sender program. Fig. 1 shows a block diagram of the extended receiver program.

Fig. 1. Block diagram – receiver.
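A rough sketch of that relay step follows: the extended receiver pops its own name from the host sequence, connects onward if another host remains, and forwards each chunk before writing it at its sequence-number offset. This is our simplified illustration, not the MMCFTP receiver; the names start_relay, handle_chunk, and CHUNK are hypothetical, and the sketch assumes this receiver's name is at the head of Hs.

```python
import socket

CHUNK = 64 * 1024   # chunk size agreed with the sender (assumption for this sketch)

def start_relay(hs, port):
    """Remove this receiver's own name from Hs and connect to the next host, if any."""
    hs.pop(0)
    if not hs:
        return None                      # final destination: nothing to relay to
    return socket.create_connection((hs[0], port))

def handle_chunk(seq, payload, nxt, out_file):
    """Forward the chunk downstream, then write it at its sequence-number offset."""
    if nxt is not None:
        nxt.sendall(seq.to_bytes(8, "big") + payload)
    out_file.seek(seq * CHUNK)           # out-of-order arrivals, hence random writes
    out_file.write(payload)
```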
Footnotes

1 The rate range is from 525 Kbps to 100 Gbps, currently.
2 The limits of the execution environment include network bandwidth, CPU speed, and storage access speed.
3 The timer period is specified as a multiple of the timer resolution. In Windows OS, the timer resolution is 15.6 ms.
4 All tasks in MMCFTP programs are performed in a timer handler. This programming style is called clock driven programming (CDP) [2]. CDP is a software representation of synchronous circuit design. To keep the timer cycle correct, we do not use blocking I/O; asynchronous or non-blocking I/O is used in CDP programs.
5 To detect transmission completion correctly, non-buffered sockets [3] are used in the sender program.

3. Experimental result

We experimented with daisy-chain transfer in a real network environment using a prototype program of MMCFTP. Fig. 2 shows the experiment environment. The experiment was performed over NII's campus network, which is in daily use. Receiver hosts were rented from a public cloud service, Amazon EC2. Because MMCFTP does not require machine-dependent tuning, rental servers were sufficient. We used the OS defaults for the TCP configuration; consequently, the congestion control algorithms of the Tokyo machine and the other machines are NewReno and Compound TCP, respectively. The one special point of the machine specification is the storage: solid-state drives (SSDs) were selected because hard disk drives are too slow to sustain a transfer of 800 Mbps. In the experiment, we specified 806 Mbps as the transfer rate, using a 62.4 ms timer period, and used an 11.6 GB file as test data. The estimated transfer time was 1 min 55 s. To compare the transmission characteristics for different chunk sizes, we ran transfer experiments with the same transport rate for three chunk sizes: 64 KiB, 256 KiB, and 1 MiB.
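For reference, the quoted figures are mutually consistent: an 11.6 GB file at 806 Mbps needs roughly 115 s, i.e. about 1 min 55 s. A back-of-the-envelope check (ours, assuming decimal gigabytes):

```python
file_bytes = 11.6e9                 # 11.6 GB test file
rate_bps = 806e6                    # specified transfer rate
print(file_bytes * 8 / rate_bps)    # ~115.1 s, i.e. about 1 min 55 s
```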

Fig. 2. Experiment environment.

3.1. Chunk size: 64 KiB

The result is summarized in Table 1, and details are shown in Figs. 3 and 4.

Table 1
Result – 64 KiB.

                              Japan to USA    USA to Europe
Total transfer rate (Mbps)    777.2           771.7
TCP transfer rate (Mbps)      1.6             2.6
Channels in use (ch.)         480.7           294.4

Fig. 5. TCP throughput per channels – from Japan to USA (64 KiB).

Table 2
Result – 256 KiB.

                              Japan to USA    USA to Europe
Total transfer rate (Mbps)    770.3           767.3
TCP transfer rate (Mbps)      6.1             9.6
Channels in use (ch.)         127.0           80.0

Fig. 3. Throughput and number of channels – from Japan to USA (64 KiB).

The TCP transfer rates of the two hops are different, but the number of channels is automatically adjusted to keep the specified total rate. This property enables high-speed daisy-chain transfer. As shown in Fig. 3, the TCP transfer rate is stable after a slow start phase. Fig. 5 shows the TCP transfer rate and the number of sent chunks for each channel from Japan to the USA. The TCP transfer rates of the right-side channels are low because these channels were used only in the slow start phase. The TCP transfer rates of the center and left-side channels are almost identical because the congestion window is larger than the chunk. Because the TCP transfer rate of each channel is uniform, maintaining a constant total transfer rate is straightforward.

3.2. Chunk size: 256 KiB

The result is summarized in Table 2, and details are shown in Figs. 6 and 7.

As shown in Fig. 6, when the TCP transfer rate goes down, the number of channels goes up automatically, so the total transfer rate remains stable (see Fig. 8).
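This adaptation is also visible in the tables: the average number of channels in use is roughly the total transfer rate divided by the per-channel TCP transfer rate. A quick check against the published values of Tables 1 and 2 (our calculation, for illustration only):

```python
# (total rate in Mbps, per-channel TCP rate in Mbps, channels in use from the table)
rows = [
    (777.2, 1.6, 480.7),   # 64 KiB, Japan to USA
    (771.7, 2.6, 294.4),   # 64 KiB, USA to Europe
    (770.3, 6.1, 127.0),   # 256 KiB, Japan to USA
    (767.3, 9.6, 80.0),    # 256 KiB, USA to Europe
]
for total, per_channel, reported in rows:
    # ~486 vs 480.7, ~297 vs 294.4, ~126 vs 127.0, ~80 vs 80.0
    print(round(total / per_channel, 1), reported)
```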

Fig. 4. Throughput and number of channels – from USA to Europe (64 KiB).

Fig. 6. Throughput and number of channels – from Japan to USA (256 KiB).

Fig. 7. Throughput and number of channels – from USA to Europe (256 KiB).

Fig. 8. TCP throughput per channels – from Japan to USA (256 KiB).

Fig. 10. Throughput and number of channels – from USA to Europe (1 MiB).

Fig. 11. TCP throughput per channels – from Japan to USA (1 MiB).

3.3. Chunk size: 1 MiB

The result is summarized in Table 3, and shown in more detail in Figs. 9 and 10.

Table 3
Result – 1 MiB.

                              Japan to USA    USA to Europe
Total transfer rate (Mbps)    754.5           751.4
TCP transfer rate (Mbps)      15.5            32.9
Channels in use (ch.)         48.8            22.8

In this case, the TCP transfer rate of each channel varies with the congestion control, as shown in Fig. 11. As a result, the total transfer rate also varies. However, as the number of channels is balanced against the TCP transfer rate, the average transfer rate is kept at the specified rate.

4. Related works

GridFTP [4] and bbFTP [5] are well-known protocols that use multi-channel data transfer to increase their performance. In these protocols, the user should specify an adequate number of TCP channels in accordance with the network environment to maximize throughput, and determining an adequate number is difficult. Specifying too many channels decreases throughput, because the multiple TCP streams compete for bandwidth and cause congestion. Such competition does not happen in MMCFTP, because users can restrict the total transfer rate of the multiple TCP streams.
Tanida et al. [6] have transferred data between Tokyo and Cadarache at about 860 Mbps by using the Inter Packet Gap tuning technique. This technique can reduce the number of packet losses. However, such losses are unavoidable in usual networks, such as the Internet, and the effectiveness of the technique in these networks is unclear.

Fig. 9. Throughput and number of channels – from Japan to USA (1 MiB).

5. Conclusion

We presented the file transfer protocol MMCFTP and its extension to daisy-chain transfer. By using this protocol, we can send a file to a series of destination hosts simultaneously.

Daisy-chain transfer was used widely in the UUCP network because it enables efficient use of expensive and limited network resources. Even in modern TCP/IP networks, long-distance broadband lines are expensive and resources are limited. Therefore, the TCP/IP-based daisy-chain transfer method is useful for the international collaboration of leading-edge scientific projects, such as the ITER project.

References

[1] H. Nakanishi, M. Ohsuna, M. Kojima, S. Imazu, M. Nonomura, T. Yamamoto, et al., Data acquisition system for steady-state experiments at multiple sites, Nuclear Fusion 51 (2011), http://stacks.iop.org/0029-5515/51/i=11/a=113014
[2] K. Yamanaka, Clock driven programming: a programming paradigm which enables machine-independent performance design, in: Proceedings of the Third Joint WOSP/SIPEW International Conference on Performance Engineering, ICPE 2012, ACM, New York, NY, USA, 2012, pp. 267–270, http://dx.doi.org/10.1145/2188286.2188335
[3] Socket overlapped I/O versus blocking/nonblocking mode, http://support.microsoft.com/kb/181611/
[4] B. Allcock, J. Bester, J. Bresnahan, A.L. Chervenak, I. Foster, C. Kesselman, et al., Data management and transfer in high-performance computational grid environments, Parallel Computing 28 (5) (2002) 749–771.
[5] bbFTP home page, http://doc.in2p3.fr/bbftp/
[6] N. Tanida, K. Hiraki, M. Inaba, Efficient disk-to-disk copy through long-distance high-speed networks with background traffic, Fusion Engineering and Design 85 (3) (2010) 553–556.
