Notes on some of sysbench's benchmark tests


Commands

Compiled-in tests:
  fileio - File I/O test
  cpu - CPU performance test
  memory - Memory functions speed test
  threads - Threads subsystem performance test
  mutex - Mutex performance test

Common options
General options:
  --threads=N                     number of threads to use [1]
  --events=N                      limit for total number of events [0]
  --time=N                        limit for total execution time in seconds [10]
  --warmup-time=N                 execute events for this many seconds with statistics disabled before the actual benchmark run with statistics enabled [0]
  --forced-shutdown=STRING        number of seconds to wait after the --time limit before forcing shutdown, or 'off' to disable [off]
  --thread-stack-size=SIZE        size of stack per thread [64K]
  --thread-init-timeout=N         wait time in seconds for worker threads to initialize [30]
  --rate=N                        average transactions rate. 0 for unlimited rate [0]
  --report-interval=N             periodically report intermediate statistics with a specified interval in seconds. 0 disables intermediate reports [0]
  --report-checkpoints=[LIST,...] dump full statistics and reset all counters at specified points in time. The argument is a list of comma-separated values representing the amount of time in seconds elapsed from start of test when report checkpoint(s) must be performed. Report checkpoints are off by default. []
  --debug[=on|off]                print more debugging info [off]
  --validate[=on|off]             perform validation checks where possible [off]
  --help[=on|off]                 print help and exit [off]
  --version[=on|off]              print version and exit [off]
  --config-file=FILENAME          File containing command line options
  --luajit-cmd=STRING             perform LuaJIT control command. This option is equivalent to 'luajit -j'. See LuaJIT documentation for more information

CPU test

--cpu-max-prime=N

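The sysbench CPU test repeatedly computes prime numbers for a fixed amount of time.
A prime is a natural number greater than 1 that is divisible only by 1 and itself.
One event is one round of that computation: finding every prime up to --cpu-max-prime.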
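
To make "one event" concrete, here is a minimal sketch of the work described above: counting primes up to a limit by trial division. It is not sysbench's own code. The helper name `count_primes` is hypothetical, and the loop bounds (starting at 3, trying divisors up to the square root) are an assumption about how sysbench does it.

```shell
# Hypothetical helper: count the primes in 3..max by trial division,
# the amount of work assumed here for a single CPU-test event.
count_primes() {
  awk -v max="$1" 'BEGIN {
    n = 0
    for (c = 3; c <= max; c++) {
      is_prime = 1
      # Try every divisor l with l*l <= c.
      for (l = 2; l * l <= c; l++) {
        if (c % l == 0) { is_prime = 0; break }
      }
      n += is_prime
    }
    print n
  }'
}

count_primes 10000
```

A larger --cpu-max-prime means more divisions per event, so each event takes longer and events per second drop. This is why results are only comparable between runs that use the same value.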

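Metrics reported:

Events completed per second
Latency percentile for N% of events, e.g. 95% of events finish within 0.5 ms
Total elapsed time
Total number of events completed
Minimum, maximum, and average latency across all events
Sum of event latencies across all threads
Average events per thread, with standard deviation
Average execution time per thread, with standard deviation
Source: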
https://blog.csdn.net/squirrel100/article/details/120289743

Running the CPU test

sysbench --time=60 --threads=4 --report-interval=3  --cpu-max-prime=10000 cpu  run

For example, here are the results on my machine:
CPU speed:
    events per second: 13480.42

Throughput:
    events/s (eps):                      13480.4163
    time elapsed:                        60.0013s
    total number of events:              808843

Latency (ms):
         min:                                    0.30
         avg:                                    0.30
         max:                                    0.52
         95th percentile:                        0.30
         sum:                               239646.46

Threads fairness:
    events (avg/stddev):           202210.7500/49.35
    execution time (avg/stddev):   59.9116/0.01
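
The figures in this report are related to each other, and a quick cross-check helps when reading it. The sketch below pulls three lines out of the report text with awk and recomputes events per second, average latency, and per-thread execution time. `check_report` is a hypothetical helper. The sample text is copied from the output above, and the field positions assume this exact report layout.

```shell
# Sample lines copied from the CPU report; in practice this would be the
# saved output of the sysbench command.
report=$(cat <<'EOF'
    events/s (eps):                      13480.4163
    time elapsed:                        60.0013s
    total number of events:              808843
         sum:                               239646.46
EOF
)

# Hypothetical helper: $1 is the thread count used for the run.
check_report() {
  awk -v threads="$1" '
    /time elapsed:/           { sub(/s$/, "", $3); elapsed = $3 }
    /total number of events:/ { events = $5 }
    /sum:/                    { sum_ms = $2 }
    END {
      printf "events/s = %.2f\n", events / elapsed
      printf "avg ms   = %.2f\n", sum_ms / events
      printf "s/thread = %.4f\n", sum_ms / threads / 1000
    }'
}

printf '%s\n' "$report" | check_report 4
```

The three values line up with the reported 13480.42 events per second, the 0.30 ms average latency, and the 59.9116 s average execution time per thread. In other words, the latency "sum" is the total time all threads spent inside events.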

Memory test

memory options:
  --memory-block-size=SIZE    # size of each memory block [1K]
  --memory-total-size=SIZE    # total size of data to transfer [100G]
  --memory-scope=STRING       # memory access scope {global,local} [global]
  --memory-hugetlb[=on|off]   # allocate memory from the HugeTLB pool [off]
  --memory-oper=STRING        # type of memory operation {read, write, none} [write]
  --memory-access-mode=STRING # memory access mode {seq,rnd} [seq]

Running the memory test

sysbench --threads=8 --time=60 --report-interval=10  --memory-block-size=8K --memory-total-size=4096G --memory-access-mode=seq memory  run

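Note that --memory-total-size must be large enough.
The run ends once that much data has been transferred, so a value that is too small can stop the test well before --time expires and give unrepresentative results.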
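
A rough way to pick a safe value is to estimate how long a given total size lasts at the machine's transfer rate. The sketch below does that arithmetic. `seconds_to_exhaust` is a hypothetical helper, the rate of 21416.93 MiB/sec is the one reported for this run, and treating the G suffix as GiB (1024 MiB) is an assumption.

```shell
# Hypothetical helper: seconds until the total size is used up.
# Arguments: total size in GiB, transfer rate in MiB/sec.
seconds_to_exhaust() {
  awk -v gib="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", gib * 1024 / rate }'
}

seconds_to_exhaust 100  21416.93   # the default 100G
seconds_to_exhaust 4096 21416.93   # the 4096G used in this run
```

At this rate the default 100G would be used up in about 5 seconds, far short of a 60-second run, while 4096G lasts more than three minutes, so the --time limit is what ends the test.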

Total operations: 164485793 (2741367.30 per second)

1285045.26 MiB transferred (21416.93 MiB/sec)


Throughput:
    events/s (eps):                      2741367.2990
    time elapsed:                        60.0014s
    total number of events:              164485793

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.29
         95th percentile:                        0.00
         sum:                               346268.33

Threads fairness:
    events (avg/stddev):           20560724.1250/1260446.77
    execution time (avg/stddev):   43.2835/0.96
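
The "MiB transferred" figure can be recovered from the operation count, since each operation moves one block. A small sketch of that check, using the numbers from this report and the 8K block size from the command line (`memory_check` is a hypothetical helper):

```shell
# Hypothetical helper: recompute the transferred volume and rate.
# Arguments: total operations, block size in KiB, elapsed seconds.
memory_check() {
  awk -v ops="$1" -v block_kib="$2" -v elapsed="$3" 'BEGIN {
    mib = ops * block_kib / 1024
    printf "transferred = %.2f MiB\n", mib
    printf "rate        = %.2f MiB/sec\n", mib / elapsed
  }'
}

memory_check 164485793 8 60.0014
```

The transferred volume matches the report exactly. The rate agrees with the reported 21416.93 MiB/sec to within rounding of the elapsed time.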

I/O test

# fileio options (defaults in []):
  --file-num=N                  # number of files to create [128]
  --file-block-size=N           # block size used in all I/O operations [16384]
  --file-total-size=SIZE        # total size of the files to create [2G]
  --file-test-mode=STRING       # test mode {seqwr (sequential write), seqrewr (sequential rewrite), seqrd (sequential read), rndrd (random read), rndwr (random write), rndrw (random read/write)}
  --file-io-mode=STRING         # file operation mode {sync, async, mmap} [sync]
  --file-extra-flags=[LIST,...] # additional flags for opening files {sync,dsync,direct} []
  --file-fsync-freq=N           # call fsync() after this many requests (0 - don't use fsync()) [100]
  --file-fsync-all[=on|off]     # call fsync() after every write [off]
  --file-fsync-end[=on|off]     # call fsync() at the end of the test [on]
  --file-fsync-mode=STRING      # synchronization method {fsync, fdatasync} [fsync]
  --file-merged-requests=N      # merge at most this many I/O requests where possible (0 - don't merge) [0]
  --file-rw-ratio=N             # read/write ratio for the combined test [1.5]

Running the I/O test

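# threads = 8, report every 2 s, test duration = 10 s
# files = 32, total file size = 1G, test mode = random read/write
# block size = 16 KiB (the default, since --file-block-size is not set)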
sysbench fileio --threads=8 --report-interval=2 --time=10 --file-num=32 --file-total-size=1G --file-test-mode=rndrw prepare

sysbench fileio --threads=8 --report-interval=2 --time=10 --file-num=32 --file-total-size=1G --file-test-mode=rndrw run

sysbench fileio --threads=8 --report-interval=2 --time=10 --file-num=32 --file-total-size=1G --file-test-mode=rndrw cleanup
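
The fileio test runs in three phases: prepare creates the test files, run performs the benchmark, and cleanup removes the files. All three need the same file options, so it helps to define those options once. The sketch below prints the three commands as a dry run. `RUNNER` and `run_phases` are hypothetical names introduced for illustration.

```shell
# Options shared by every phase, taken from the commands above.
opts="--threads=8 --report-interval=2 --time=10 --file-num=32 --file-total-size=1G --file-test-mode=rndrw"

# RUNNER is a hypothetical switch: when unset, commands are only printed.
# Set RUNNER to an empty string (RUNNER= sh script.sh) to execute them.
run_phases() {
  for phase in prepare run cleanup; do
    # $opts is intentionally unquoted so it splits into separate arguments.
    ${RUNNER-echo} sysbench fileio $opts "$phase"
  done
}

run_phases
```

Keeping the options in one place avoids a mismatch between phases, for example a cleanup that looks for a different number of files than prepare created.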

Typical results look like this:
Throughput:
         read:  IOPS=8498.15 132.78 MiB/s (139.23 MB/s)
         write: IOPS=5665.43 88.52 MiB/s (92.82 MB/s)
         fsync: IOPS=4472.09

Latency (ms):
         min:                                  0.00
         avg:                                  0.43
         max:                                715.34
         95th percentile:                      0.17
         sum:                              80180.16
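
The throughput lines can be checked against the block size, since bandwidth is IOPS multiplied by the block size. The sketch below converts the reported IOPS to MiB/s for two candidate block sizes. `iops_to_mib` is a hypothetical helper and the IOPS values are copied from the report.

```shell
# Hypothetical helper: convert IOPS to MiB/s for a block size in bytes.
iops_to_mib() {
  awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.2f\n", iops * bs / 1048576 }'
}

iops_to_mib 8498.15 16384   # read, 16 KiB blocks
iops_to_mib 5665.43 16384   # write, 16 KiB blocks
iops_to_mib 8498.15 8192    # read, if the blocks were 8 KiB

# Ratio of read to write operations.
awk 'BEGIN { printf "%.2f\n", 8498.15 / 5665.43 }'
```

The first two results reproduce the reported 132.78 and 88.52 MiB/s, so this run used 16384-byte blocks, the default for --file-block-size, rather than 8 KiB. The read-to-write ratio comes out at 1.50, matching the default --file-rw-ratio.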

Threads test

  --thread-yields=N      number of yields to do per request [1000]
  --thread-locks=N       number of locks per thread [8]
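Option details:
  --thread-yields=N      amount of work per request, as a number of yields; default 1000
  --thread-locks=N       number of locks per thread; default 8

Thread scheduling: the time concurrently running threads take to repeatedly acquire and release locks (lower is better).
This tests the performance of the thread scheduler and is useful for seeing how the scheduler behaves under heavy load.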

Running the threads test

sysbench  --threads=64 --report-interval=2 --time=10 threads run 

Results with 64 threads:
Throughput:
    events/s (eps):                      3413.1149
    time elapsed:                        10.0492s
    total number of events:              34299

Latency (ms):
         min:                                    0.65
         avg:                                   18.72
         max:                                  298.47
         95th percentile:                      125.52
         sum:                               642034.54

Threads fairness:
    events (avg/stddev):           535.9219/56.01
    execution time (avg/stddev):   10.0318/0.01

Results with 1 thread:
Throughput:
    events/s (eps):                      1595.8521
    time elapsed:                        10.0016s
    total number of events:              15961

Latency (ms):
         min:                                    0.61
         avg:                                    0.63
         max:                                    0.75
         95th percentile:                        0.64
         sum:                                 9996.12

Threads fairness:
    events (avg/stddev):           15961.0000/0.00
    execution time (avg/stddev):   9.9961/0.00

The thread count went up 64-fold, but the number of events only roughly doubled (15961 to 34299).
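
To put a number on that scaling, the sketch below compares the events-per-second figures of the two runs and expresses the gain as a fraction of ideal linear scaling. `scaling` is a hypothetical helper and the inputs are the eps values from the two reports.

```shell
# Hypothetical helper: speedup and parallel efficiency between two runs.
# Arguments: threads and events/s of the baseline, then of the larger run.
scaling() {
  awk -v t1="$1" -v e1="$2" -v tn="$3" -v en="$4" 'BEGIN {
    speedup = en / e1
    printf "speedup    = %.2fx\n", speedup
    printf "efficiency = %.1f%%\n", 100 * speedup / (tn / t1)
  }'
}

scaling 1 1595.8521 64 3413.1149
```

A speedup of a little over 2x from 64 times as many threads is only a few percent of linear scaling. The average latency in the 64-thread report (18.72 ms against 0.63 ms) shows where the time goes: threads spend most of it waiting rather than doing work.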

Mutex test

mutex options:

  --mutex-num=N        total size of mutex array [4096]
  --mutex-locks=N      number of mutex locks to do per thread [50000]
  --mutex-loops=N      number of empty loops to do inside mutex lock [10000]

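Option details:

  --mutex-num=N      total size of the mutex array; default 4096
  --mutex-locks=N    number of mutex locks each thread performs; default 50000
  --mutex-loops=N    number of empty loop iterations done inside the mutex lock; default 10000

Mutex: the time concurrent threads take to request mutexes a set number of times (lower is better).
This tests mutex performance by having all threads run concurrently, each holding a mutex only briefly.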
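
No mutex command is shown here, so the following is only a sketch of what a run could look like, in the same style as the earlier tests. The thread count of 8 is a made-up example value, `RUNNER` is a hypothetical dry-run switch, and the total assumes every thread performs exactly --mutex-locks acquisitions.

```shell
threads=8           # hypothetical example value, not from a measured run
mutex_locks=50000   # the default for --mutex-locks

# Total lock acquisitions across all threads under the assumption above.
echo "total lock acquisitions: $((threads * mutex_locks))"

# Printed only, unless RUNNER is set to an empty string.
${RUNNER-echo} sysbench --threads="$threads" --mutex-locks="$mutex_locks" mutex run
```

Because the amount of work is fixed by --mutex-locks rather than by --time, the figure to compare between machines is the total elapsed time.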