Measuring Disk I/O Performance on Windows - diskspd
2020.03.11 17:39
Source : http://www.ischo.net -- 조인상 // System Engineer
Writer : http://www.ischo.net -- ischo // System Engineer in Republic of Korea
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Measuring Disk I/O Performance on Windows
This post shows how to measure disk I/O performance with the diskspd utility.
Checking the usage
>Diskspd.exe /?
Usage: Diskspd.exe [options] target1 [ target2 [ target3 ...] ]
version 2.0.21a (2018/9/21)
Available targets:
file_path
#<physical drive number>
<partition_drive_letter>:
Available options:
-? display usage information
-ag group affinity - affinitize threads round-robin to cores in Processor Groups 0 - n.
Group 0 is filled before Group 1, and so forth.
[default; use -n to disable default affinity]
-ag#,#[,#,...]> advanced CPU affinity - affinitize threads round-robin to the CPUs provided. The g# notation
specifies Processor Groups for the following CPU core #s. Multiple Processor Groups
may be specified, and groups/cores may be repeated. If no group is specified, 0 is assumed.
Additional groups/processors may be added, comma separated, or on separate parameters.
Examples: -a0,1,2 and -ag0,0,1,2 are equivalent.
-ag0,0,1,2,g1,0,1,2 specifies the first three cores in groups 0 and 1.
-ag0,0,1,2 -ag1,0,1,2 is equivalent.
-b<size>[K|M|G] block size in bytes or KiB/MiB/GiB [default=64K]
-B<offs>[K|M|G|b] base target offset in bytes or KiB/MiB/GiB/blocks [default=0]
(offset from the beginning of the file)
-c<size>[K|M|G|b] create files of the given size.
Size can be stated in bytes or KiB/MiB/GiB/blocks
-C<seconds> cool down time - duration of the test after measurements finished [default=0s].
-D<milliseconds> Capture IOPs statistics in intervals of <milliseconds>; these are per-thread
per-target: text output provides IOPs standard deviation, XML provides the full
IOPs time series in addition. [default=1000, 1 second].
-d<seconds> duration (in seconds) to run test [default=10s]
-f<size>[K|M|G|b] target size - use only the first <size> bytes or KiB/MiB/GiB/blocks of the file/disk/partition,
for example to test only the first sectors of a disk
-f<rst> open file with one or more additional access hints
r : the FILE_FLAG_RANDOM_ACCESS hint
s : the FILE_FLAG_SEQUENTIAL_SCAN hint
t : the FILE_ATTRIBUTE_TEMPORARY hint
[default: none]
-F<count> total number of threads (conflicts with -t)
-g<bytes per ms> throughput per-thread per-target throttled to given bytes per millisecond
note that this can not be specified when using completion routines
[default inactive]
-h deprecated, see -Sh
-i<count> number of IOs per burst; see -j [default: inactive]
-j<milliseconds> interval in <milliseconds> between issuing IO bursts; see -i [default: inactive]
-I<priority> Set IO priority to <priority>. Available values are: 1-very low, 2-low, 3-normal (default)
-l Use large pages for IO buffers
-L measure latency statistics
-n disable default affinity (-a)
-N<vni> specify the flush mode for memory mapped I/O
v : uses the FlushViewOfFile API
n : uses the RtlFlushNonVolatileMemory API
i : uses RtlFlushNonVolatileMemory without waiting for the flush to drain
[default: none]
-o<count> number of outstanding I/O requests per target per thread
(1=synchronous I/O, unless more than 1 thread is specified with -F)
[default=2]
-O<count> number of outstanding I/O requests per thread - for use with -F
(1=synchronous I/O)
-p start parallel sequential I/O operations with the same offset
(ignored if -r is specified, makes sense only with -o2 or greater)
-P<count> enable printing a progress dot after each <count> [default=65536]
completed I/O operations, counted separately by each thread
-r<align>[K|M|G|b] random I/O aligned to <align> in bytes/KiB/MiB/GiB/blocks (overrides -s)
-R<text|xml> output format. Default is text.
-s[i]<size>[K|M|G|b] sequential stride size, offset between subsequent I/O operations
[default access=non-interlocked sequential, default stride=block size]
In non-interlocked mode, threads do not coordinate, so the pattern of offsets
as seen by the target will not be truly sequential. Under -si the threads
manipulate a shared offset with InterlockedIncrement, which may reduce throughput,
but promotes a more sequential pattern.
(ignored if -r specified, -si conflicts with -T and -p)
-S[bhmruw] control caching behavior [default: caching is enabled, no writethrough]
non-conflicting flags may be combined in any order; ex: -Sbw, -Suw, -Swu
-S equivalent to -Su
-Sb enable caching (default, explicitly stated)
-Sh equivalent -Suw
-Sm enable memory mapped I/O
-Su disable software caching, equivalent to FILE_FLAG_NO_BUFFERING
-Sr disable local caching, with remote sw caching enabled; only valid for remote filesystems
-Sw enable writethrough (no hardware write caching), equivalent to FILE_FLAG_WRITE_THROUGH or
non-temporal writes for memory mapped I/O (-Sm)
-t<count> number of threads per target (conflicts with -F)
-T<offs>[K|M|G|b] starting stride between I/O operations performed on the same target by different threads
[default=0] (starting offset = base file offset + (thread number * <offs>)
makes sense only with #threads > 1
-v verbose mode
-w<percentage> percentage of write requests (-w and -w0 are equivalent and result in a read-only workload).
absence of this switch indicates 100% reads
IMPORTANT: a write test will destroy existing data without a warning
-W<seconds> warm up time - duration of the test before measurements start [default=5s]
-x use completion routines instead of I/O Completion Ports
-X<filepath> use an XML file for configuring the workload. Cannot be used with other parameters.
-z[seed] set random seed [with no -z, seed=0; with plain -z, seed is based on system run time]
Write buffers:
-Z zero buffers used for write tests
-Zr per IO random buffers used for write tests - this incurrs additional run-time
overhead to create random content and shouln't be compared to results run
without -Zr
-Z<size>[K|M|G|b] use a <size> buffer filled with random data as a source for write operations.
-Z<size>[K|M|G|b],<file> use a <size> buffer filled with data from <file> as a source for write operations.
By default, the write buffers are filled with a repeating pattern (0, 1, 2, ..., 255, 0, 1, ...)
Synchronization:
-ys<eventname> signals event <eventname> before starting the actual run (no warmup)
(creates a notification event if <eventname> does not exist)
-yf<eventname> signals event <eventname> after the actual run finishes (no cooldown)
(creates a notification event if <eventname> does not exist)
-yr<eventname> waits on event <eventname> before starting the run (including warmup)
(creates a notification event if <eventname> does not exist)
-yp<eventname> stops the run when event <eventname> is set; CTRL+C is bound to this event
(creates a notification event if <eventname> does not exist)
-ye<eventname> sets event <eventname> and quits
Event Tracing:
-e<q|c|s> Use query perf timer (qpc), cycle count, or system timer respectively.
[default = q, query perf timer (qpc)]
-ep use paged memory for the NT Kernel Logger [default=non-paged memory]
-ePROCESS process start & end
-eTHREAD thread start & end
-eIMAGE_LOAD image load
-eDISK_IO physical disk IO
-eMEMORY_PAGE_FAULTS all page faults
-eMEMORY_HARD_FAULTS hard faults only
-eNETWORK TCP/IP, UDP/IP send & receive
-eREGISTRY registry calls
Examples:
Create 8192KB file and run read test on it for 1 second:
Diskspd.exe -c8192K -d1 testfile.dat
Set block size to 4KB, create 2 threads per file, 32 overlapped (outstanding)
I/O operations per thread, disable all caching mechanisms and run block-aligned random
access read test lasting 10 seconds:
Diskspd.exe -b4K -t2 -r -o32 -d10 -Sh testfile.dat
Create two 1GB files, set block size to 4KB, create 2 threads per file, affinitize threads
to CPUs 0 and 1 (each file will have threads affinitized to both CPUs) and run read test
lasting 10 seconds:
Diskspd.exe -c1G -b4K -t2 -d10 -a0,1 testfile1.dat testfile2.dat
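The usage text above states that the K, M and G suffixes mean KiB, MiB and GiB, so the default block size of 64K is 65536 bytes. The following is a minimal Python sketch of that conversion. `parse_size` is a hypothetical helper written for this post, not part of diskspd, and it deliberately ignores the `b` (blocks) suffix because that one depends on the block size chosen with -b.

```python
# Binary multipliers for the K/M/G suffixes, as stated in the usage text
# ("bytes or KiB/MiB/GiB").
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a diskspd-style size such as '64K' or '1G' to bytes.

    Hypothetical helper for illustration only. The 'b' (blocks) suffix is
    not supported and raises ValueError.
    """
    text = text.strip()
    if not text:
        raise ValueError("empty size")
    suffix = text[-1]
    if suffix in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[suffix]
    return int(text)  # no suffix: a plain byte count


if __name__ == "__main__":
    # Sizes that appear in the usage examples above.
    for size in ("64K", "4K", "8192K", "1G"):
        print(size, "=", parse_size(size), "bytes")
```

This is useful when reading reports, because diskspd prints sizes in bytes even when they were given with a suffix on the command line.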
Example : a 60-second test with 4 threads, random I/O, 30% writes / 70% reads, creating a 1GB io.dat file
> Diskspd.exe -d60 -o2 -t4 -r -w30 -c1G io.dat
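To make the command above easier to read, here is a small Python sketch that splits it into flags and looks each one up in a table paraphrased from the usage text. The `FLAG_HELP` table and the `explain` function are hypothetical and only cover the flags used in this example. The parser assumes one-letter flags, so multi-letter options such as -Sh or -ag would not be split correctly.

```python
import re

# Short descriptions paraphrased from the usage text; only the flags used in
# this example are listed (an assumption made to keep the sketch small).
FLAG_HELP = {
    "d": "duration of the test in seconds",
    "o": "outstanding I/O requests per target per thread",
    "t": "threads per target",
    "r": "random I/O",
    "w": "percentage of write requests",
    "c": "create files of the given size",
}


def explain(command: str):
    """Split a diskspd command line into (flag, value, meaning) tuples and targets."""
    options, targets = [], []
    for token in command.split()[1:]:  # skip the executable name
        match = re.fullmatch(r"-([A-Za-z])(.*)", token)
        if match:
            flag, value = match.groups()
            options.append((flag, value, FLAG_HELP.get(flag, "not described here")))
        else:
            targets.append(token)  # anything that is not a flag is a target
    return options, targets


if __name__ == "__main__":
    opts, files = explain("Diskspd.exe -d60 -o2 -t4 -r -w30 -c1G io.dat")
    for flag, value, meaning in opts:
        print(f"-{flag}{value:<4} {meaning}")
    print("targets:", files)
```

Note that -b is not given here, so the block size falls back to the default of 64K listed in the usage text.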
Command Line: Diskspd.exe -d60 -o2 -t4 -r -w30 -c1G io.dat
Input parameters:
timespan: 1
-------------
duration: 60s
warm up time: 5s
cool down time: 0s
random seed: 0
path: 'io.dat'
think time: 0ms
burst size: 0
using software cache
using hardware write cache, writethrough off
performing mix test (read/write ratio: 70/30)
block size: 65536
using random I/O (alignment: 65536)
number of outstanding I/O operations: 2
thread stride size: 0
threads per file: 4
using I/O Completion Ports
IO priority: normal
System information:
computer name:
start time: 2020/03/11 08:30:24 UTC
Results for timespan 1:
*******************************************************************************
actual test time: 60.00s
thread count: 4
proc count: 8
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 99.61%| 6.25%| 93.36%| 0.39%
1| 99.45%| 5.47%| 93.98%| 0.55%
2| 99.32%| 5.86%| 93.46%| 0.68%
3| 99.38%| 4.32%| 95.05%| 0.63%
4| 31.46%| 17.60%| 13.85%| 68.54%
5| 22.24%| 11.67%| 10.57%| 77.76%
6| 26.22%| 13.23%| 12.99%| 73.78%
7| 22.76%| 12.66%| 10.10%| 77.24%
-------------------------------------------
avg.| 62.56%| 9.63%| 52.92%| 37.44%
Total IO
thread | bytes | I/Os | MiB/s | I/O per s | file
------------------------------------------------------------------------------
0 | 75493015552 | 1151932 | 1199.92 | 19198.75 | io.dat (1024MiB)
1 | 79343255552 | 1210682 | 1261.12 | 20177.91 | io.dat (1024MiB)
2 | 80271048704 | 1224839 | 1275.87 | 20413.86 | io.dat (1024MiB)
3 | 81315233792 | 1240772 | 1292.46 | 20679.40 | io.dat (1024MiB)
------------------------------------------------------------------------------
total: 316422553600 | 4828225 | 5029.37 | 80469.91
Read IO
thread | bytes | I/Os | MiB/s | I/O per s | file
------------------------------------------------------------------------------
0 | 52800192512 | 805667 | 839.23 | 13427.70 | io.dat (1024MiB)
1 | 55544578048 | 847543 | 882.85 | 14125.63 | io.dat (1024MiB)
2 | 56232050688 | 858033 | 893.78 | 14300.46 | io.dat (1024MiB)
3 | 56919195648 | 868518 | 904.70 | 14475.21 | io.dat (1024MiB)
------------------------------------------------------------------------------
total: 221496016896 | 3379761 | 3520.56 | 56329.00
Write IO
thread | bytes | I/Os | MiB/s | I/O per s | file
------------------------------------------------------------------------------
0 | 22692823040 | 346265 | 360.69 | 5771.05 | io.dat (1024MiB)
1 | 23798677504 | 363139 | 378.27 | 6052.28 | io.dat (1024MiB)
2 | 24038998016 | 366806 | 382.09 | 6113.40 | io.dat (1024MiB)
3 | 24396038144 | 372254 | 387.76 | 6204.19 | io.dat (1024MiB)
------------------------------------------------------------------------------
total: 94926536704 | 1448464 | 1508.81 | 24140.92
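The MiB/s and I/O per s columns can be re-derived from the raw byte and I/O counters, which is a quick sanity check when reading a report. The Python sketch below copies the three "total:" rows from the output above and recomputes the rates. It assumes the test time is exactly the printed 60.00s, so the recomputed values can differ from the report in the last digits because the printed time is rounded.

```python
MIB = 1024 ** 2
TEST_SECONDS = 60.00  # "actual test time" from the report, rounded to 2 decimals

# (bytes, I/Os, reported MiB/s, reported I/O per s) copied from the "total:" rows.
TOTALS = {
    "total": (316422553600, 4828225, 5029.37, 80469.91),
    "read": (221496016896, 3379761, 3520.56, 56329.00),
    "write": (94926536704, 1448464, 1508.81, 24140.92),
}


def derive(byte_count, io_count, seconds=TEST_SECONDS):
    """Return (MiB/s, I/O per s, average bytes per I/O) from raw counters."""
    return (byte_count / MIB / seconds, io_count / seconds, byte_count / io_count)


def write_share(totals=TOTALS):
    """Fraction of all I/Os that were writes."""
    return totals["write"][1] / totals["total"][1]


if __name__ == "__main__":
    for name, (byte_count, io_count, mib_s, iops) in TOTALS.items():
        calc_mib, calc_iops, avg_io = derive(byte_count, io_count)
        print(f"{name:5} reported {mib_s:8.2f} MiB/s, recomputed {calc_mib:8.2f} MiB/s, "
              f"reported {iops:9.2f} IOPS, recomputed {calc_iops:9.2f} IOPS, "
              f"{avg_io:.0f} bytes per I/O")
    print(f"write share of I/Os: {write_share() * 100:.1f}%")
```

The average size per I/O matches the 65536-byte block size shown under Input parameters, and the write share of the I/Os matches the requested -w30 mix.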