Notes on some of sysbench's built-in benchmark tests
Commands
Compiled-in tests:
fileio - File I/O test
cpu - CPU performance test
memory - Memory functions speed test
threads - Threads subsystem performance test
mutex - Mutex performance test
General options
General options:
--threads=N number of threads to use [1]
--events=N limit for total number of events [0]
--time=N limit for total execution time in seconds [10]
--warmup-time=N execute events for this many seconds with statistics disabled before the actual benchmark run with statistics enabled [0]
--forced-shutdown=STRING number of seconds to wait after the --time limit before forcing shutdown, or 'off' to disable [off]
--thread-stack-size=SIZE size of stack per thread [64K]
--thread-init-timeout=N wait time in seconds for worker threads to initialize [30]
--rate=N average transactions rate. 0 for unlimited rate [0]
--report-interval=N periodically report intermediate statistics with a specified interval in seconds. 0 disables intermediate reports [0]
--report-checkpoints=[LIST,...] dump full statistics and reset all counters at specified points in time. The argument is a list of comma-separated values representing the amount of time in seconds elapsed from start of test when report checkpoint(s) must be performed. Report checkpoints are off by default. []
--debug[=on|off] print more debugging info [off]
--validate[=on|off] perform validation checks where possible [off]
--help[=on|off] print help and exit [off]
--version[=on|off] print version and exit [off]
--config-file=FILENAME File containing command line options
--luajit-cmd=STRING perform LuaJIT control command. This option is equivalent to 'luajit -j'. See LuaJIT documentation for more information
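CPU test
--cpu-max-prime=N upper limit for primes generator [10000]
The sysbench CPU test runs repeated rounds of prime-number computation within the specified time limit.
A prime is a natural number greater than 1 that is divisible only by 1 and itself.
One event is one round of that computation: finding all the primes up to --cpu-max-prime.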
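To make "one event" concrete, here is a minimal Python sketch of a single round. It assumes plain trial division (each candidate is tested against divisors up to its square root). The function name `cpu_event` is made up for illustration, and sysbench itself does this work in C, so treat it as a picture of the workload rather than the actual implementation.

```python
import math


def cpu_event(max_prime: int) -> int:
    """One illustrative 'event': count the primes in [2, max_prime] by trial division."""
    count = 0
    for candidate in range(2, max_prime + 1):
        is_prime = True
        # A composite number always has a divisor no larger than its square root.
        for divisor in range(2, math.isqrt(candidate) + 1):
            if candidate % divisor == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    # Same upper limit as --cpu-max-prime=10000.
    print(cpu_event(10000))
```

A larger limit makes each event more expensive, so events per second are only comparable between runs that use the same --cpu-max-prime.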
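Metrics reported:
events completed per second
latency range for N% of events, e.g. 95% of events take no more than 0.5 ms
total elapsed time
total number of events completed
minimum, maximum, and average latency across all events
sum of execution time across all threads
average events completed per thread, with standard deviation
average execution time per thread, with standard deviation
Source: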
https://blog.csdn.net/squirrel100/article/details/120289743
CPU test command
sysbench --time=60 --threads=4 --report-interval=3 --cpu-max-prime=10000 cpu run
For example, the result on my machine:
CPU speed:
events per second: 13480.42
Throughput:
events/s (eps): 13480.4163
time elapsed: 60.0013s
total number of events: 808843
Latency (ms):
min: 0.30
avg: 0.30
max: 0.52
95th percentile: 0.30
sum: 239646.46
Threads fairness:
events (avg/stddev): 202210.7500/49.35
execution time (avg/stddev): 59.9116/0.01
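The figures in this report are tied together by simple arithmetic, which is a handy way to check that you are reading it correctly. The sketch below recomputes the derived values from the raw ones. The variable names are mine, and the numbers are copied from the report above.

```python
# Raw figures copied from the sample CPU report (run with --threads=4).
threads = 4
total_events = 808843
time_elapsed_s = 60.0013
latency_sum_ms = 239646.46

# Derived figures.
events_per_second = total_events / time_elapsed_s
avg_latency_ms = latency_sum_ms / total_events
events_per_thread = total_events / threads
exec_time_per_thread_s = latency_sum_ms / threads / 1000

print(f"events per second      : {events_per_second:.2f}")
print(f"average latency (ms)   : {avg_latency_ms:.2f}")
print(f"events per thread      : {events_per_thread:.4f}")
print(f"exec time per thread(s): {exec_time_per_thread_s:.4f}")
```

The recomputed values agree with the report to its printed precision. Small differences in the last digits are expected, because the report rounds the elapsed time to four decimals.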
Memory test
memory options:
--memory-block-size=SIZE # size of the memory block used in the test [1K]
--memory-total-size=SIZE # total size of data to transfer [100G]
--memory-scope=STRING # memory access scope {global,local} [global]
--memory-hugetlb[=on|off] # allocate memory from the HugeTLB pool [off]
--memory-oper=STRING # type of memory operation {read, write, none} [write]
--memory-access-mode=STRING # memory access mode {seq,rnd} [seq]
Memory test command
sysbench --threads=8 --time=60 --report-interval=10 --memory-block-size=8K --memory-total-size=4096G --memory-access-mode=seq memory run
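Note that --memory-total-size must be large enough.
If it is too small, the run stops as soon as that much data has been transferred, well before --time expires, and the result can be distorted.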
Total operations: 164485793 (2741367.30 per second)
1285045.26 MiB transferred (21416.93 MiB/sec)
Throughput:
events/s (eps): 2741367.2990
time elapsed: 60.0014s
total number of events: 164485793
Latency (ms):
min: 0.00
avg: 0.00
max: 0.29
95th percentile: 0.00
sum: 346268.33
Threads fairness:
events (avg/stddev): 20560724.1250/1260446.77
execution time (avg/stddev): 43.2835/0.96
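A quick calculation shows both how the transfer figures follow from the block size and why the default total size is too small here. This is a sketch using the numbers from the report above; the variable names are mine.

```python
KIB_PER_MIB = 1024

# Figures copied from the sample memory report (--memory-block-size=8K).
block_size_kib = 8
total_operations = 164485793
time_elapsed_s = 60.0014

# Each operation moves one block, so the data moved is operations * block size.
transferred_mib = total_operations * block_size_kib / KIB_PER_MIB
throughput_mib_s = transferred_mib / time_elapsed_s

# How long the default --memory-total-size=100G would last at this rate.
default_total_mib = 100 * 1024
seconds_to_finish_default = default_total_mib / throughput_mib_s

print(f"transferred : {transferred_mib:.2f} MiB")
print(f"throughput  : {throughput_mib_s:.2f} MiB/sec")
print(f"100G lasts  : {seconds_to_finish_default:.1f} s")
```

At roughly 21 GiB/s, the default 100G would be used up in under 5 seconds, so a 60-second run needs a much larger total such as the 4096G used above.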
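File I/O test
# fileio options ([] marks the default value):
--file-num=N # number of files to create [128]
--file-block-size=N # block size to use in all I/O operations [16384]
--file-total-size=SIZE # total size of the files to create [2G]
--file-test-mode=STRING # test mode {seqwr (sequential write), seqrewr (sequential rewrite), seqrd (sequential read), rndrd (random read), rndwr (random write), rndrw (random read/write)}
--file-io-mode=STRING # file operation mode {sync (synchronous), async (asynchronous), mmap} [sync]
--file-extra-flags=[LIST,...] # list of additional flags used to open files {sync,dsync,direct} []
--file-fsync-freq=N # call fsync() after every N requests (0 - don't use fsync()) [100]
--file-fsync-all[=on|off] # call fsync() after each write operation [off]
--file-fsync-end[=on|off] # call fsync() at the end of the test [on]
--file-fsync-mode=STRING # synchronization method {fsync, fdatasync} [fsync]
--file-merged-requests=N # merge at most this many I/O requests where possible (0 - don't merge) [0]
--file-rw-ratio=N # read/write ratio for the combined test [1.5]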
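File I/O test commands
# threads=8, intermediate report every 2 s, run time=10 s
# number of files=32, total file size=1G, test mode=random read/write
# block size: --file-block-size is not passed, so the default of 16384 bytes (16 KiB) applies; pass --file-block-size=8192 for 8 KiB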
sysbench fileio --threads=8 --report-interval=2 --time=10 --file-num=32 --file-total-size=1G --file-test-mode=rndrw prepare
sysbench fileio --threads=8 --report-interval=2 --time=10 --file-num=32 --file-total-size=1G --file-test-mode=rndrw run
sysbench fileio --threads=8 --report-interval=2 --time=10 --file-num=32 --file-total-size=1G --file-test-mode=rndrw cleanup
A typical result:
Throughput:
read: IOPS=8498.15 132.78 MiB/s (139.23 MB/s)
write: IOPS=5665.43 88.52 MiB/s (92.82 MB/s)
fsync: IOPS=4472.09
Latency (ms):
min: 0.00
avg: 0.43
max: 715.34
95th percentile: 0.17
sum: 80180.16
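The throughput lines can be cross-checked against the options: bandwidth is IOPS times block size, and the read/write split should follow --file-rw-ratio. The sketch below assumes the default block size of 16384 bytes; the helper names are mine.

```python
BYTES_PER_MIB = 2 ** 20
BYTES_PER_MB = 10 ** 6

# Figures copied from the sample fileio report.
read_iops = 8498.15
write_iops = 5665.43
block_size_bytes = 16384  # assumed: the default --file-block-size


def mib_per_s(iops: float, block_size: int) -> float:
    """Bandwidth in MiB/s for a given IOPS figure and block size in bytes."""
    return iops * block_size / BYTES_PER_MIB


def mb_per_s(iops: float, block_size: int) -> float:
    """Bandwidth in decimal MB/s for a given IOPS figure and block size in bytes."""
    return iops * block_size / BYTES_PER_MB


read_mib = mib_per_s(read_iops, block_size_bytes)
read_mb = mb_per_s(read_iops, block_size_bytes)
write_mib = mib_per_s(write_iops, block_size_bytes)
write_mb = mb_per_s(write_iops, block_size_bytes)
rw_ratio = read_iops / write_iops

print(f"read : {read_mib:.2f} MiB/s ({read_mb:.2f} MB/s)")
print(f"write: {write_mib:.2f} MiB/s ({write_mb:.2f} MB/s)")
print(f"read/write ratio: {rw_ratio:.2f}")
```

With 16384-byte blocks the recomputed bandwidth matches the report, and the read/write ratio comes out at the default 1.5. With 8 KiB blocks the bandwidth would be half of what is shown, so this run did use the default block size.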
Threads test
--thread-yields=N number of yields to do per request [1000]
--thread-locks=N number of locks per thread [8]
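Option details:
--thread-yields=N sets how much work each request does (the number of yields per request); the default is 1000
--thread-locks=N sets the number of locks per thread; the default is 8
Thread scheduling: measures the time that concurrently running threads spend cycling through lock, yield, and unlock requests (lower is better).
It tests the performance of the thread scheduler and is very useful for examining scheduler behavior under high load.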
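As a rough picture of the workload, the sketch below models one request as --thread-yields rounds of "take a lock, yield the CPU, release the lock", cycling over --thread-locks locks. This is my reading of the option descriptions, not sysbench's code: the function name is hypothetical, `time.sleep(0)` stands in for a scheduler yield, each thread runs a fixed number of requests instead of running until a time limit, and Python's interpreter lock means the timings are not comparable with sysbench's.

```python
import threading
import time


def run_threads_test(num_threads, requests_per_thread, thread_yields=1000, thread_locks=8):
    """Run a lock/yield/unlock workload; return (completed requests, elapsed seconds)."""
    locks = [threading.Lock() for _ in range(thread_locks)]
    completed = [0] * num_threads  # one slot per thread, so no shared counter

    def worker(thread_id):
        for _ in range(requests_per_thread):
            for i in range(thread_yields):
                # Only one lock is held at a time, so the threads cannot deadlock.
                with locks[i % thread_locks]:
                    time.sleep(0)  # give up the CPU while holding the lock
            completed[thread_id] += 1

    workers = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sum(completed), time.perf_counter() - start


if __name__ == "__main__":
    events, elapsed = run_threads_test(num_threads=4, requests_per_thread=5)
    print(f"{events} requests in {elapsed:.3f} s")
```

The more threads compete for the same few locks, the more time each request spends waiting rather than working, which is what the latency figures of this test capture.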
Threads test command
sysbench --threads=64 --report-interval=2 --time=10 threads run
Note the result with 64 threads:
Throughput:
events/s (eps): 3413.1149
time elapsed: 10.0492s
total number of events: 34299
Latency (ms):
min: 0.65
avg: 18.72
max: 298.47
95th percentile: 125.52
sum: 642034.54
Threads fairness:
events (avg/stddev): 535.9219/56.01
execution time (avg/stddev): 10.0318/0.01
The result with 1 thread:
Throughput:
events/s (eps): 1595.8521
time elapsed: 10.0016s
total number of events: 15961
Latency (ms):
min: 0.61
avg: 0.63
max: 0.75
95th percentile: 0.64
sum: 9996.12
Threads fairness:
events (avg/stddev): 15961.0000/0.00
execution time (avg/stddev): 9.9961/0.00
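The thread count increased 64-fold, but the number of events only roughly doubled (3413.11 vs 1595.85 events per second, about 2.1x).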
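The comparison between the two runs can be put into numbers. The sketch below computes the speedup, the scaling efficiency relative to ideal linear scaling, and the growth in average latency, using figures copied from the two reports above; the variable names are mine.

```python
# Figures copied from the two threads-test reports.
eps_1_thread = 1595.8521
eps_64_threads = 3413.1149
avg_latency_1_ms = 0.63
avg_latency_64_ms = 18.72
thread_ratio = 64

speedup = eps_64_threads / eps_1_thread
# Efficiency is 1.0 when throughput grows in step with the thread count.
scaling_efficiency = speedup / thread_ratio
latency_growth = avg_latency_64_ms / avg_latency_1_ms

print(f"speedup            : {speedup:.2f}x")
print(f"scaling efficiency : {scaling_efficiency:.1%}")
print(f"avg latency growth : {latency_growth:.1f}x")
```

Throughput grows by a little over 2x while average latency grows by almost 30x, so most of the extra threads spend their time waiting on locks and on the scheduler.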
Mutex test
mutex options:
--mutex-num=N total size of mutex array [4096]
--mutex-locks=N number of mutex locks to do per thread [50000]
--mutex-loops=N number of empty loops to do inside mutex lock [10000]
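Option details:
--mutex-num=N total size of the mutex array; the default is 4096
--mutex-locks=N number of mutex lock operations each thread performs; the default is 50000
--mutex-loops=N number of empty loop iterations to run for each mutex lock; the default is 10000
Mutex: measures the time concurrent threads take to acquire mutexes the configured number of times (lower is better).
It tests mutex performance by simulating all threads running at the same time, each holding a mutex only briefly.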
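The three options fit together roughly as in the sketch below: each thread repeats --mutex-locks times "spin through --mutex-loops empty iterations, then briefly take a randomly chosen mutex out of --mutex-num". This is an illustration built from the option descriptions, not sysbench's code. The function name is hypothetical, running the empty loop before taking the lock (rather than inside it) is an assumption, and Python timings are not comparable with sysbench's.

```python
import random
import threading
import time


def run_mutex_test(num_threads, mutex_num=4096, mutex_locks=50000, mutex_loops=10000):
    """Run a random-mutex workload; return (total lock acquisitions, elapsed seconds)."""
    mutexes = [threading.Lock() for _ in range(mutex_num)]
    acquired = [0] * num_threads  # one slot per thread, so no shared counter

    def worker(thread_id):
        rng = random.Random(thread_id)  # per-thread generator, seeded for repeatability
        for _ in range(mutex_locks):
            for _ in range(mutex_loops):
                pass  # empty loop: simulated work between lock requests (assumed placement)
            with mutexes[rng.randrange(mutex_num)]:
                acquired[thread_id] += 1  # hold the mutex only briefly

    workers = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sum(acquired), time.perf_counter() - start


if __name__ == "__main__":
    # Small values so the sketch finishes quickly; sysbench's defaults are much larger.
    total, elapsed = run_mutex_test(num_threads=8, mutex_num=64, mutex_locks=1000, mutex_loops=100)
    print(f"{total} lock acquisitions in {elapsed:.3f} s")
```

A smaller --mutex-num makes it more likely that two threads pick the same mutex, so contention rises. A larger --mutex-loops spreads the lock requests out and lowers contention.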