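[HOWTO]: Common Linux/Android Debugging Tools

This article covers some commonly used Linux/Android debugging tools and how to use them. It serves as a personal cheat sheet and is updated from time to time.

Note: most of this material is not my own work. It was collected from various places and I did not trace every original author, so sources are not credited. If this causes offense, please leave a comment or send a private message and I will fix it as soon as possible.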


FIQ-Debugger

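The FIQ debugger is a system debugging facility built into the kernel.

On the ARM architecture, FIQ plays a role similar to an NMI. The FIQ debugger registers the serial port interrupt as an FIQ and implements a set of system debugging commands inside the serial FIQ handler.

Normally the serial port works as an ordinary console. In minicom, pressing "Ctrl + A" and then "F" (which sends a break) switches the serial port to FIQ debugger mode.

Because the kernel's ordinary interrupt masking does not block FIQ, it effectively behaves as a non-maskable interrupt, so this approach suits cases where a CPU is hung: while it is hung, the FIQ debugger can still print the CPU's state at the point of failure. The most commonly used command is sysrq.

To use the FIQ debugger, enable the following kernel config options: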

CONFIG_FIQ_DEBUGGER                         // enable the FIQ debugger
CONFIG_FIQ_DEBUGGER_CONSOLE                 // allow switching between the FIQ debugger and the console
CONFIG_FIQ_DEBUGGER_CONSOLE_DEFAULT_ENABLE  // serial port starts in console mode at boot
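
To check whether a given kernel was built with these options, something like the following sketch may help. The function name missing_fiq_options is hypothetical, and the usage line assumes the kernel exposes its configuration at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC); otherwise feed it the .config file from the build tree.

```shell
# Print every required FIQ debugger option that is NOT set to "y"
# in the kernel config text read from stdin. Prints nothing if all are set.
missing_fiq_options() {
    config=$(cat)
    for opt in CONFIG_FIQ_DEBUGGER \
               CONFIG_FIQ_DEBUGGER_CONSOLE \
               CONFIG_FIQ_DEBUGGER_CONSOLE_DEFAULT_ENABLE; do
        # Anchor both ends so CONFIG_FIQ_DEBUGGER does not match
        # CONFIG_FIQ_DEBUGGER_CONSOLE=y.
        printf '%s\n' "$config" | grep -q "^${opt}=y\$" || echo "$opt"
    done
}

# Usage on a running device (assumption: /proc/config.gz exists):
#   zcat /proc/config.gz | missing_fiq_options
# Usage against a build tree:
#   missing_fiq_options < .config
```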

FIQ debugger commands:

debug> help
FIQ Debugger commands:
 pc            PC status
 regs          Register dump
 allregs       Extended Register dump
 bt            Stack trace
 reboot [<c>]  Reboot with command <c>
 reset [<c>]   Hard reset with command <c>
 irqs          Interupt status
 sleep         Allow sleep while in FIQ
 nosleep       Disable sleep while in FIQ
 console       Switch terminal to console
 cpu           Current CPU
 cpu <number>  Switch to CPU<number>
 ps            Process list
 sysrq         sysrq options
 sysrq <param> Execute sysrq with <param>


SysRq

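When tracking down a system freeze, you sometimes hit this situation: the system hangs but does not reset. Without a reset, the fault stack information normally printed before the reset is never produced. In that case, as long as interrupts are still enabled, you can use the SysRq key combination to dump the system's stack information on demand.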

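To enable SysRq, set the kernel config option CONFIG_MAGIC_SYSRQ. On a kernel built with SysRq support, /proc/sys/kernel/sysrq controls whether SysRq is enabled. For more about sysrq, see the kernel document Documentation/sysrq.txt (Documentation/admin-guide/sysrq.rst in newer kernel trees).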
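
The value in /proc/sys/kernel/sysrq is 0 (disabled), 1 (everything enabled), or a bitmask of allowed function groups as described in the kernel's sysrq document. The sketch below decodes such a value; the function name sysrq_decode and the short group labels are made up for illustration. Note that, per the same kernel document, the mask only restricts keyboard-triggered SysRq; root can always write to /proc/sysrq-trigger.

```shell
# Decode a /proc/sys/kernel/sysrq value into the function groups it allows.
# Group labels are illustrative shorthand for the kernel documentation entries.
sysrq_decode() {
    value=$1
    if [ "$value" -eq 0 ]; then echo "disabled"; return; fi
    if [ "$value" -eq 1 ]; then echo "all"; return; fi
    out=""
    for entry in "2:loglevel" "4:keyboard" "8:dumps" "16:sync" \
                 "32:remount-ro" "64:signal" "128:reboot" "256:rt-nice"; do
        bit=${entry%%:*}     # numeric bit value before the colon
        name=${entry#*:}     # label after the colon
        if [ $(( $value & $bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

# Usage on a live system:
#   sysrq_decode "$(cat /proc/sys/kernel/sysrq)"
```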

The SysRq debugging commands are:

*  What are the 'command' keys?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'b'     - Will immediately reboot the system without syncing or unmounting your disks.
'c'    - Will perform a system crash by a NULL pointer dereference. A crashdump will be taken if configured.
'd'    - Shows all locks that are held.
'e'     - Send a SIGTERM to all processes, except for init.
'f'    - Will call oom_kill to kill a memory hog process.
'g'    - Used by kgdb (kernel debugger)
'h'     - Will display help (actually any other key than those listed here will display help. but 'h' is easy to remember :-)
'i'     - Send a SIGKILL to all processes, except for init.
'j'     - Forcibly "Just thaw it" - filesystems frozen by the FIFREEZE ioctl.
'k'     - Secure Access Key (SAK) Kills all programs on the current virtual console. NOTE: See important comments below in SAK section.
'l'     - Shows a stack backtrace for all active CPUs.
'm'     - Will dump current memory info to your console.
'n'    - Used to make RT tasks nice-able
'o'     - Will shut your system off (if configured and supported).
'p'     - Will dump the current registers and flags to your console.
'q'     - Will dump per CPU lists of all armed hrtimers (but NOT regular timer_list timers) and detailed information about all
          clockevent devices.
'r'     - Turns off keyboard raw mode and sets it to XLATE.
's'     - Will attempt to sync all mounted filesystems.
't'     - Will dump a list of current tasks and their information to your console.
'u'     - Will attempt to remount all mounted filesystems read-only.
'v'    - Forcefully restores framebuffer console
'v'    - Causes ETM buffer dump [ARM-specific]
'w'    - Dumps tasks that are in uninterruptable (blocked) state.
'x'    - Used by xmon interface on ppc/powerpc platforms.
'y'    - Show global CPU Registers [SPARC-64 specific]
'z'    - Dump the ftrace buffer
'0'-'9' - Sets the console log level, controlling which kernel messages will be printed to your console. ('0', for example would make
          it so that only emergency messages like PANICs or OOPSes would make it to your console.)
*  Okay, so what can I use them for?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
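For example, when debugging a hang (the system freezes and stops responding), we need to find the processes in state D. Such processes are in uninterruptible sleep and do not respond to any signal, so they cannot be killed with kill: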

echo w > /proc/sysrq-trigger

P.S. The process/thread states are:

"R (running)",  /*   0 */
"S (sleeping)",  /*   1 */
"D (disk sleep)", /*   2 */
"T (stopped)",  /*   4 */
"t (tracing stop)", /*   8 */
"Z (zombie)",  /*  16 */
"X (dead)",  /*  32 */
"x (dead)",  /*  64 */
"K (wakekill)",  /* 128 */
"W (waking)",  /* 256 */

Most processes normally sit in state S (sleeping). If you find one in a state such as D (disk sleep), T (stopped), or Z (zombie), examine it carefully.
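
When the shell is still responsive, the same information can be pulled from /proc without SysRq. The following is a rough sketch, not a replacement for `echo w`; the function names are hypothetical. It relies on the /proc/<pid>/stat layout, where the command name is wrapped in parentheses and the state letter follows the closing parenthesis.

```shell
# Print "PID STATE COMM" for one /proc/<pid>/stat line.
# The command name may itself contain spaces or ')', so split on the LAST ')'.
parse_stat_line() {
    line=$1
    pid=${line%% *}          # first field
    rest=${line#*\(}         # drop everything up to the first '('
    comm=${rest%\)*}         # keep everything before the last ')'
    after=${line##*\) }      # keep everything after the last ") "
    state=${after%% *}       # state letter is the first field after comm
    echo "$pid $state $comm"
}

# List tasks currently in state D, T or Z.
list_suspect_tasks() {
    for f in /proc/[0-9]*/stat; do
        line=$(cat "$f" 2>/dev/null) || continue   # process may have exited
        [ -n "$line" ] && parse_stat_line "$line"
    done | awk '$2 == "D" || $2 == "T" || $2 == "Z"'
}
```

Calling list_suspect_tasks a few times in a row helps separate tasks that are briefly in D during normal I/O from tasks that are stuck there.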

debuggerd

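debuggerd is an Android daemon that dumps a process's runtime information for analysis when the process crashes. The crash dump it generates is plain text and is stored under /data/tombstones/ (a fitting name for a crash record). Up to 10 files are kept; once there are more than 10, the oldest file is overwritten. Since Android 4.2, debuggerd also works as a utility: it can print the native stack of a running process without stopping it. Usage: debuggerd -b <pid>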

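This helps when analysing a process's behaviour, but its most useful application is pinpointing, with very little effort, the code location of a deadlock or of an infinite loop caused by faulty logic in a native process.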
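
One way to apply this is to take several backtraces a short time apart: frames that stay the same across samples point to where the process is stuck or spinning. The sketch below assumes `debuggerd -b <pid>` is available on the device; the function name sample_native_stacks and the DUMP_CMD override variable are hypothetical helpers introduced only for this example.

```shell
# Take COUNT native backtraces of PID, INTERVAL seconds apart.
#   $1 = pid, $2 = number of samples (default 3), $3 = interval in seconds (default 1)
# DUMP_CMD can override the dump command (assumption: defaults to "debuggerd -b").
sample_native_stacks() {
    pid=$1
    count=${2:-3}
    interval=${3:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i of $count for pid $pid ==="
        ${DUMP_CMD:-debuggerd -b} "$pid"
        # Do not sleep after the final sample.
        if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
        i=$((i + 1))
    done
}

# Usage on a device (run as root):
#   sample_native_stacks 1234 5 2 > /data/local/tmp/stacks.txt
```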


devmem

busybox includes devmem, a tool that reads and writes physical memory directly:

devmem is a small program that reads and writes from physical memory using /dev/mem.

Usage: devmem ADDRESS [WIDTH [VALUE]]

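For example, suppose we want to inspect the configuration of some GPIO pins. The GPIO configuration registers are mapped into a special memory region, the SFRs (Special Function Registers), so we only need to read the corresponding memory address. The following reads the value at 0x13470000 and then writes 0x0 to 0x13470000: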

# busybox devmem 0x13470000 32
0x00022222
# busybox devmem 0x13470000 32 0x0
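
Writing 0x0 overwrites the whole 32-bit register. To change the setting of a single pin while leaving the others alone, a read-modify-write is usually safer. Below is a minimal sketch: the helper name reg_set_field is hypothetical, and the bit offset and field width in the usage lines are made-up values, since the real layout depends on the SoC datasheet.

```shell
# Replace one bit field inside a register value and print the result as hex.
#   $1 = current register value (decimal or 0x-prefixed hex)
#   $2 = bit offset of the field
#   $3 = field width in bits
#   $4 = new field value
reg_set_field() {
    mask=$(( ((1 << $3) - 1) << $2 ))
    # Clear the old field, then OR in the new value limited to the field.
    printf '0x%08X\n' $(( ($1 & ~$mask) | (($4 << $2) & $mask) ))
}

# Usage (assumption: the pin of interest occupies 4 bits at offset 4):
#   old=$(busybox devmem 0x13470000 32)
#   busybox devmem 0x13470000 32 "$(reg_set_field "$old" 4 4 0x0)"
```

With the value read in the example above, reg_set_field 0x00022222 4 4 0x0 yields 0x00022202, so only the second nibble changes.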

--to be continued...
