Fio fsync, end_fsync, fdatasync and sync

Intro to fsync, end_fsync, fdatasync and sync

  • fsync=int

If writing to a file, issue an fsync(2) (or its equivalent) of the dirty data for every number of blocks given. For example, if you give 32 as a parameter, fio will sync the file after every 32 writes issued. If fio is using non-buffered I/O, we may not sync the file. The exception is the sg I/O engine, which synchronizes the disk cache anyway. Defaults to 0, which means fio does not periodically issue and wait for a sync to complete. Also see end_fsync and fsync_on_close.

  • end_fsync=bool

If true, fsync(2) file contents when a write stage has completed. Default: false.

  • fsync_on_close=bool

If true, fio will fsync(2) a dirty file on close. This differs from end_fsync in that it will happen on every file close, not just at the end of the job. Default: false.

  • fdatasync=int

Like fsync but uses fdatasync(2) to only sync data and not metadata blocks. In Windows, DragonFlyBSD or OSX there is no fdatasync(2) so this falls back to using fsync(2). Defaults to 0, which means fio does not periodically issue and wait for a data-only sync to complete.

  • sync=str

Whether, and what type, of synchronous I/O to use for writes. The allowed values are:

  • none - Do not use synchronous IO, the default.
  • 0 - Same as none.
  • sync - Use synchronous file IO. For the majority of I/O engines, this means using O_SYNC.
  • 1 - Same as sync.
  • dsync - Use synchronous data IO. For the majority of I/O engines, this means using O_DSYNC.

Source: the fio documentation.
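The same options can also be set in a fio job file instead of on the command line. The following is a minimal sketch, assuming the /mnt/bench1 test directory and the libaio engine used in the runs below; the file name sync-options.fio and the job names are made up for illustration.

```shell
# Write a job file with one job per sync option (file and job names are hypothetical).
cat > sync-options.fio <<'EOF'
[global]
ioengine=libaio
blocksize=8k
readwrite=write
directory=/mnt/bench1
nrfiles=1
filesize=100m
direct=1

[fsync-every-write]
fsync=1

[fsync-at-end]
stonewall
end_fsync=1

[fdatasync-every-write]
stonewall
fdatasync=1

[open-with-o-sync]
stonewall
sync=1
EOF

# Then run the four jobs one after another (requires fio to be installed):
#   fio sync-options.fio
```

The stonewall lines make each job wait for the previous one to finish, so the runs do not compete for the disk.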

Create a 100MB file only

Here we only create the 100MB file, the step that fio reports as “Laying out IO file”. With --create_only=1 no I/O is issued after the file is created, even though I/O-related options such as blocksize=8k are specified.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --fsync=1 --numjobs=1 --direct=1 --group_reporting --create_only=1
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)

Run status group 0 (all jobs):

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

From the strace output, fio creates the file and allocates its disk space with a single fallocate call of 104857600 bytes (100MiB).

$ cat strace.out
[..]
15801 write(1, "Starting 1 process\n", 19) = 19
15801 stat("/mnt/bench1/test.0.0", 0x7ffc477fcc20) = -1 ENOENT (No such file or directory)
15801 write(1, "test: Laying out IO file (1 file"..., 43) = 43
15801 unlink("/mnt/bench1/test.0.0")    = -1 ENOENT (No such file or directory)
15801 open("/mnt/bench1/test.0.0", O_WRONLY|O_CREAT, 0644) = 3
15801 fallocate(3, 0, 0, 104857600)     = 0
15801 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
15801 close(3)
[..]
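The length passed to fallocate is simply the 100MiB file size in bytes. As a rough check after the run, the helper below prints a file's apparent size next to the space actually allocated to it. This is only a sketch: alloc_report is a hypothetical name, and the format string assumes GNU stat.

```shell
# 100 MiB expressed in bytes, the length seen in the fallocate call.
echo $(( 100 * 1024 * 1024 ))   # prints 104857600

# Hypothetical helper: apparent size versus allocated blocks (GNU stat format).
alloc_report() {
    stat -c 'size=%s bytes, allocated=%b blocks of %B bytes' "$1"
}

# Usage after the fio run:
#   alloc_report /mnt/bench1/test.0.0
```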

Create and write 100MB file with fsync=1

Here we write 100MB of data to a file in 8k blocks, with --fsync=1 so that a sync is requested after every write.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --fsync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=15988: Mon Mar 14 22:28:51 2022
  write: IOPS=6174, BW=48.2MiB/s (50.6MB/s)(100MiB/2073msec)
    slat (usec): min=29, max=163, avg=56.07, stdev=12.67
    clat (usec): min=19, max=118, avg=30.21, stdev= 7.39
     lat (usec): min=52, max=197, avg=86.72, stdev=18.43
    clat percentiles (usec):
     |  1.00th=[   21],  5.00th=[   24], 10.00th=[   25], 20.00th=[   26],
     | 30.00th=[   27], 40.00th=[   29], 50.00th=[   29], 60.00th=[   30],
     | 70.00th=[   31], 80.00th=[   32], 90.00th=[   38], 95.00th=[   47],
     | 99.00th=[   61], 99.50th=[   67], 99.90th=[   80], 99.95th=[   84],
     | 99.99th=[  102]
   bw (  KiB/s): min=45700, max=53392, per=99.44%, avg=49121.00, stdev=3324.04, samples=4
   iops        : min= 5712, max= 6674, avg=6140.00, stdev=415.68, samples=4
  lat (usec)   : 20=0.66%, 50=95.25%, 100=4.07%, 250=0.02%
  fsync/fdatasync/sync_file_range:
    sync (nsec): min=95, max=7603, avg=235.07, stdev=105.18
    sync percentiles (nsec):
     |  1.00th=[  107],  5.00th=[  165], 10.00th=[  193], 20.00th=[  211],
     | 30.00th=[  211], 40.00th=[  213], 50.00th=[  217], 60.00th=[  221],
     | 70.00th=[  225], 80.00th=[  274], 90.00th=[  338], 95.00th=[  342],
     | 99.00th=[  398], 99.50th=[  426], 99.90th=[  486], 99.95th=[  540],
     | 99.99th=[ 6880]
  cpu          : usr=7.63%, sys=29.92%, ctx=89626, majf=0, minf=13
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,12799 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=48.2MiB/s (50.6MB/s), 48.2MiB/s-48.2MiB/s (50.6MB/s-50.6MB/s), io=100MiB (105MB), run=2073-2073msec

Disk stats (read/write):
  nvme0n1: ios=0/34843, merge=0/11135, ticks=0/400, in_queue=399, util=95.01%
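The headline numbers in this report can be cross-checked by hand: 100MiB in 8KiB blocks is 12800 writes, and dividing by the 2073 msec runtime gives the reported IOPS and bandwidth. A small sketch of that arithmetic follows; fio_rates is a hypothetical helper, and it assumes fio truncates the IOPS figure to an integer when printing.

```shell
# Hypothetical helper: derive write count, IOPS and MiB/s for a run.
# $1 = file size in bytes, $2 = block size in bytes, $3 = runtime in msec
fio_rates() {
    awk -v size="$1" -v bs="$2" -v ms="$3" 'BEGIN {
        writes = size / bs
        printf "writes=%d iops=%d bw_mib_s=%.1f\n",
            writes, int(writes * 1000 / ms), (size / 1048576) * 1000 / ms
    }'
}

fio_rates 104857600 8192 2073   # prints: writes=12800 iops=6174 bw_mib_s=48.2
```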

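From the strace output, fsync is issued after each 8k block write. The exception in this trace is the final write, which is followed by getrusage and close rather than fsync; that matches the issued rwts line above, which reports 12800 writes and 12799 syncs.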

$ cat strace.out
[..]
15988 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 3
15988 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
15988 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
15988 io_submit(0x7f819d8b1000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
15988 io_getevents(0x7f819d8b1000, 1, 1, [{data=0, obj=0xc88de0, res=8192, res2=0}], NULL) = 1
15988 fsync(3)                          = 0
15988 io_submit(0x7f819d8b1000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=8192}]) = 1
15988 io_getevents(0x7f819d8b1000, 1, 1, [{data=0, obj=0xc88de0, res=8192, res2=0}], NULL) = 1
15988 fsync(3)                          = 0
[..]
15988 io_submit(0x7f819d8b1000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\340?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
15988 io_getevents(0x7f819d8b1000, 1, 1, [{data=0, obj=0xc88de0, res=8192, res2=0}], NULL) = 1
15988 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=158715}, ru_stime={tv_sec=0, tv_usec=620431}, ...}) = 0
15988 close(3)
[..]
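The excerpt only shows a few lines of the trace. To count how often a call appears in the whole strace.out, something like the following works. count_syscall is a hypothetical helper, and it assumes the "PID syscall(...)" line layout that strace -f -o produces above; calls that strace splits into unfinished/resumed pairs are not counted.

```shell
# Hypothetical helper: count the lines in a strace log that start a given syscall.
# $1 = syscall name, $2 = strace output file
count_syscall() {
    grep -Ec "^[0-9]+ +$1\(" "$2"
}

# Usage:
#   count_syscall fsync strace.out
#   count_syscall io_submit strace.out
```

For this run we would expect the fsync count to line up with the sync figure in the issued rwts line of the fio report.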

Create and write 100MB file with end_fsync=1

Here we write 100MB of data to a file, and fsync is issued once after the write phase completes.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --end_fsync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=16648: Mon Mar 14 22:50:22 2022
  write: IOPS=9631, BW=75.2MiB/s (78.9MB/s)(100MiB/1329msec)
    slat (usec): min=42, max=131, avg=61.18, stdev=11.19
    clat (usec): min=29, max=100, avg=39.34, stdev= 6.50
     lat (usec): min=77, max=193, avg=101.11, stdev=10.93
    clat percentiles (nsec):
     |  1.00th=[32128],  5.00th=[32640], 10.00th=[33024], 20.00th=[34048],
     | 30.00th=[35072], 40.00th=[35584], 50.00th=[36608], 60.00th=[38656],
     | 70.00th=[43776], 80.00th=[44800], 90.00th=[46848], 95.00th=[50944],
     | 99.00th=[60672], 99.50th=[62720], 99.90th=[70144], 99.95th=[72192],
     | 99.99th=[80384]
   bw (  KiB/s): min=77168, max=77408, per=100.00%, avg=77288.00, stdev=169.71, samples=2
   iops        : min= 9646, max= 9676, avg=9661.00, stdev=21.21, samples=2
  lat (usec)   : 50=94.53%, 100=5.46%, 250=0.01%
  cpu          : usr=7.91%, sys=42.70%, ctx=51220, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,1 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=75.2MiB/s (78.9MB/s), 75.2MiB/s-75.2MiB/s (78.9MB/s-78.9MB/s), io=100MiB (105MB), run=1329-1329msec

Disk stats (read/write):
  nvme0n1: ios=0/12803, merge=0/5, ticks=0/203, in_queue=203, util=91.40%

From the strace output, a single fsync is issued at the end of the fio job, just before the file is closed.

$ cat strace.out
[..]
16648 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 3
16648 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
16648 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
16648 io_submit(0x7fac95355000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
16648 io_getevents(0x7fac95355000, 1, 1, [{data=0, obj=0xa37de0, res=8192, res2=0}], NULL) = 1
16648 io_submit(0x7fac95355000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=8192}]) = 1
16648 io_getevents(0x7fac95355000, 1, 1, [{data=0, obj=0xa37de0, res=8192, res2=0}], NULL) = 1
[..]
16648 io_submit(0x7fac95355000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\340?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
16648 io_getevents(0x7fac95355000, 1, 1, [{data=0, obj=0xa37de0, res=8192, res2=0}], NULL) = 1
16648 fsync(3)                          = 0
16648 close(3)                          = 0
[..]
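One way to tell the fsync=1 and end_fsync=1 traces apart mechanically is to count how many writes precede each sync call. The sketch below assumes one write per io_submit, as in these iodepth=1 runs; sync_cadence is a hypothetical name.

```shell
# Hypothetical helper: group sync calls by the number of writes since the previous sync.
# $1 = strace output file
sync_cadence() {
    awk '
        $2 ~ /^io_submit\(/         { writes++ }
        $2 ~ /^(fsync|fdatasync)\(/ { hist[writes + 0]++; writes = 0 }
        END {
            for (n in hist)
                printf "%d sync call(s), each after %d write(s)\n", hist[n], n
            if (writes > 0)
                printf "%d trailing write(s) with no sync\n", writes
        }
    ' "$1"
}

# Usage:
#   sync_cadence strace.out
```

With end_fsync=1 all writes should fall into one group in front of a single sync, whereas with fsync=1 every sync should follow exactly one write.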

Create and write 100MB file with fdatasync=1

Here we write 100MB of data to a file in 8k blocks, and fdatasync is issued after each block write.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --fdatasync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=23011: Tue Mar 15 02:45:48 2022
  write: IOPS=4857, BW=37.0MiB/s (39.8MB/s)(100MiB/2635msec)
    slat (usec): min=30, max=183, avg=70.02, stdev=12.85
    clat (usec): min=20, max=280, avg=45.46, stdev= 9.98
     lat (usec): min=54, max=348, avg=116.11, stdev=20.49
    clat percentiles (usec):
     |  1.00th=[   25],  5.00th=[   26], 10.00th=[   31], 20.00th=[   41],
     | 30.00th=[   45], 40.00th=[   46], 50.00th=[   48], 60.00th=[   49],
     | 70.00th=[   49], 80.00th=[   50], 90.00th=[   52], 95.00th=[   58],
     | 99.00th=[   75], 99.50th=[   84], 99.90th=[  137], 99.95th=[  143],
     | 99.99th=[  172]
   bw (  KiB/s): min=37168, max=43632, per=100.00%, avg=38934.40, stdev=2648.49, samples=5
   iops        : min= 4646, max= 5454, avg=4866.80, stdev=331.06, samples=5
  lat (usec)   : 50=86.37%, 100=13.41%, 250=0.21%, 500=0.01%
  fsync/fdatasync/sync_file_range:
    sync (nsec): min=107, max=11447, avg=232.89, stdev=147.40
    sync percentiles (nsec):
     |  1.00th=[  127],  5.00th=[  161], 10.00th=[  185], 20.00th=[  211],
     | 30.00th=[  221], 40.00th=[  227], 50.00th=[  231], 60.00th=[  241],
     | 70.00th=[  247], 80.00th=[  258], 90.00th=[  266], 95.00th=[  282],
     | 99.00th=[  334], 99.50th=[  350], 99.90th=[  516], 99.95th=[  620],
     | 99.99th=[ 9408]
  cpu          : usr=6.19%, sys=31.13%, ctx=89613, majf=0, minf=13
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,0 short=12799,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=37.0MiB/s (39.8MB/s), 37.0MiB/s-37.0MiB/s (39.8MB/s-39.8MB/s), io=100MiB (105MB), run=2635-2635msec

Disk stats (read/write):
  nvme0n1: ios=0/35262, merge=0/11272, ticks=0/416, in_queue=416, util=96.06%

From the strace output, fdatasync is issued after each 8k block write.

$ cat strace.out
[..]
23011 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 3
23011 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
23011 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
23011 io_submit(0x7f1a71a81000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
23011 io_getevents(0x7f1a71a81000, 1, 1, [{data=0, obj=0x1207de0, res=8192, res2=0}], NULL) = 1
23011 fdatasync(3)
[..]
23011 io_submit(0x7f1a71a81000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0 ?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104841216}]) = 1
23011 io_getevents(0x7f1a71a81000, 1, 1, [{data=0, obj=0x1207de0, res=8192, res2=0}], NULL) = 1
23011 fdatasync(3)                      = 0
23011 io_submit(0x7f1a71a81000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0 ?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
23011 io_getevents(0x7f1a71a81000, 1, 1, [{data=0, obj=0x1207de0, res=8192, res2=0}], NULL) = 1
23011 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=164077}, ru_stime={tv_sec=0, tv_usec=820386}, ...}) = 0
23011 close(3)
[..]
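The aio_offset values in the trace also show that the writes are sequential: each one lands a block further into the file, and the last one is at 104849408, the file size minus one 8k block. A sketch of that check, assuming a grep that supports -o; check_sequential is a hypothetical helper.

```shell
# Offset of the last 8k block in a 100MiB file.
echo $(( 104857600 - 8192 ))   # prints 104849408

# Hypothetical helper: count writes and offset steps that differ from the block size.
# $1 = strace output file, $2 = block size in bytes
check_sequential() {
    grep -o 'aio_offset=[0-9]*' "$1" | cut -d= -f2 |
    awk -v bs="$2" '
        NR > 1 && $1 - prev != bs { gaps++ }
        { prev = $1 }
        END {
            printf "writes=%d non_sequential_steps=%d last_offset=%d\n",
                NR, gaps + 0, prev
        }
    '
}

# Usage on the full trace (not on an excerpt with elided lines):
#   check_sequential strace.out 8192
```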

Create and write 100MB file with sync=1

Here we write 100MB of data as sequential 8k writes with one job. The file is opened for synchronous I/O, so no separate sync calls are issued.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --sync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=16435: Mon Mar 14 22:43:59 2022
  write: IOPS=9377, BW=73.3MiB/s (76.8MB/s)(100MiB/1365msec)
    slat (usec): min=28, max=146, avg=47.45, stdev= 7.00
    clat (usec): min=33, max=581, avg=57.00, stdev=11.75
     lat (usec): min=79, max=619, avg=104.88, stdev=14.94
    clat percentiles (usec):
     |  1.00th=[   43],  5.00th=[   45], 10.00th=[   48], 20.00th=[   51],
     | 30.00th=[   52], 40.00th=[   53], 50.00th=[   57], 60.00th=[   59],
     | 70.00th=[   60], 80.00th=[   62], 90.00th=[   67], 95.00th=[   76],
     | 99.00th=[   95], 99.50th=[   99], 99.90th=[  113], 99.95th=[  131],
     | 99.99th=[  578]
   bw (  KiB/s): min=70896, max=77461, per=98.88%, avg=74178.50, stdev=4642.16, samples=2
   iops        : min= 8862, max= 9682, avg=9272.00, stdev=579.83, samples=2
  lat (usec)   : 50=18.82%, 100=80.74%, 250=0.42%, 750=0.02%
  cpu          : usr=4.62%, sys=28.52%, ctx=63963, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=73.3MiB/s (76.8MB/s), 73.3MiB/s-73.3MiB/s (76.8MB/s-76.8MB/s), io=100MiB (105MB), run=1365-1365msec

Disk stats (read/write):
  nvme0n1: ios=0/32201, merge=0/10294, ticks=0/356, in_queue=356, util=91.92%

From the strace output, the file is opened with the O_SYNC flag since we specified --sync=1 in the fio command. This means every write on this file descriptor is synchronous.

$ cat strace.out
[..]
16435 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_SYNC|O_DIRECT, 0600) = 3
16435 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
16435 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
16435 io_submit(0x7fe4c1fe3000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
16435 io_getevents(0x7fe4c1fe3000, 1, 1, [{data=0, obj=0x1980de0, res=8192, res2=0}], NULL) = 1
16435 io_submit(0x7fe4c1fe3000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0 \0\0\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=8192}]) = 1
16435 io_getevents(0x7fe4c1fe3000, 1, 1, [{data=0, obj=0x1980de0, res=8192, res2=0}], NULL) = 1
[..]
16435 io_submit(0x7fe4c1fe3000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\200?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
16435 io_getevents(0x7fe4c1fe3000, 1, 1, [{data=0, obj=0x1980de0, res=8192, res2=0}], NULL) = 1
16435 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=64153}, ru_stime={tv_sec=0, tv_usec=389856}, ...}) = 0
16435 close(3)
[..]
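To compare runs quickly, the open flags can be pulled out of each trace instead of being read by eye. The following is a sketch under the same assumptions about the strace line layout as above; open_flags is a hypothetical helper, and the second argument is just a fragment of the data file's path.

```shell
# Hypothetical helper: list the O_* flags used when opening a given file.
# $1 = strace output file, $2 = path fragment identifying the data file
open_flags() {
    grep -E "^[0-9]+ +open(at)?\(" "$1" | grep -F "$2" |
        grep -o 'O_[A-Z_|]*' | tr '|' '\n'
}

# Usage: the layout step opens the file too, so de-duplicate the list.
#   open_flags strace.out test.0.0 | sort -u
```

For the sync=1 run the list should contain O_SYNC; for the earlier runs it should not.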