Playing with ptrace, Part I

Reposted from: http://blog.csdn.net/silentvoid/article/details/1477439

by Pradeep Padala

Created 2002-11-01 02:00

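Translator's preface: While developing Hust Online Judge I read through a lot of material. Information on debugger techniques is scarce online, and even the classic "Advanced Programming in the UNIX Environment" says little on the subject. Then I found this article at http://www.linuxjournal.com and valued it enough to translate it. This is my first attempt at translating a technical document, so there are bound to be clumsy passages. Please bear with them; criticism is welcome.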

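Have you ever wondered how system calls can be intercepted? Have you ever tried fooling the kernel by changing system call arguments? Have you ever wondered how a debugger stops a running process and takes control of it?

If you are thinking of using complex kernel programming to accomplish these tasks, think again. Linux provides an elegant mechanism for all of them: the ptrace system call. ptrace lets a parent process observe and control the execution of another process. It can also examine and change that process's registers and core image, which is what makes breakpoint debugging and system call tracing possible.

With ptrace you can intercept and modify system calls from user space.

In this article we learn how to intercept a system call and change its arguments. In Part II we will study more advanced techniques: setting breakpoints and injecting code into a running program. We will peek into the child process's registers and data segment and modify their contents.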

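Basics

Operating systems offer their services through a standard mechanism called system calls, which give programs a standard API for reaching the underlying hardware and low-level services such as the file system. When a process wants to make a system call, it puts the arguments into registers and raises software interrupt 0x80. This interrupt acts as a gate into kernel mode: the process hands the system call number and arguments to the kernel, and the kernel executes the call.

On the i386 architecture (all the code in this article is i386-specific), the system call number goes into %eax, and the arguments go, in order, into %ebx, %ecx, %edx, %esi and %edi. For example, the call

write(2, "Hello", 5)

translates into roughly the following assembly:

    movl $4, %eax
    movl $2, %ebx
    movl $hello, %ecx
    movl $5, %edx
    int $0x80

Here $hello is the address of the string literal "Hello".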

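So where does ptrace come into the picture? Before executing a system call, the kernel checks whether the process is being traced. If it is, the kernel stops the process and hands control to the tracing process, which can then examine or modify the traced process's registers.

Let's look at an example that shows this tracing in action: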

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/reg.h>   /* For constants ORIG_EAX etc; the original
                          article included <linux/user.h> here */

int main()
{
    pid_t child;
    long orig_eax;

    child = fork();
    if(child == 0) {
        /* Child: ask to be traced, then run ls */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
        /* Parent: wait for the child to stop, then read the
           saved system call number */
        wait(NULL);
        orig_eax = ptrace(PTRACE_PEEKUSER,
                          child, 4 * ORIG_EAX,
                          NULL);
        printf("The child made a "
               "system call %ld\n", orig_eax);
        ptrace(PTRACE_CONT, child, NULL, NULL);
    }
    return 0;
}

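When run, this program prints the following along with the output of ls:

The child made a system call 11

Note: 11 is the system call number of execve, the first system call made by the child.

For the full list of system call numbers, see /usr/include/asm/unistd.h.

In the example above, the parent forks a child and then traces it. Before calling exec, the child calls ptrace with PTRACE_TRACEME as the first argument, which tells the kernel that the process is willing to be traced. When the child then calls execve(), the kernel stops it and hands control to the parent. The parent has been waiting in wait() for notification from the kernel; once notified, it can inspect what the child has been doing, for example by reading its register values.

When a system call occurs, the kernel saves the original contents of eax, which hold the system call number. We can read this saved value by calling ptrace with PTRACE_PEEKUSER as the first argument.

After examining the system call, we let the child continue by calling ptrace with PTRACE_CONT as the first argument.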

ptrace Parameters

ptrace takes four arguments:

long ptrace(enum __ptrace_request request,
            pid_t pid,
            void *addr,
            void *data);

The first argument determines the behaviour of ptrace and how the other arguments are used. Its value should be one of:

PTRACE_TRACEME
PTRACE_PEEKTEXT
PTRACE_PEEKDATA
PTRACE_PEEKUSER
PTRACE_POKETEXT
PTRACE_POKEDATA
PTRACE_POKEUSER
PTRACE_GETREGS
PTRACE_GETFPREGS
PTRACE_SETREGS
PTRACE_SETFPREGS
PTRACE_CONT
PTRACE_SYSCALL
PTRACE_SINGLESTEP
PTRACE_DETACH

The use of these requests is explained in the rest of the article.

Reading System Call Parameters

Calling ptrace with PTRACE_PEEKUSER as the first argument lets us read the child's register values.

Consider the following example:

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/reg.h>       /* ORIG_EAX, EBX etc; the original
                              article included <linux/user.h> */
#include <sys/syscall.h>   /* For SYS_write etc */

int main()
{
    pid_t child;
    long orig_eax, eax;
    long params[3];
    int status;
    int insyscall = 0;

    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
        while(1) {
            wait(&status);
            if(WIFEXITED(status))
                break;
            orig_eax = ptrace(PTRACE_PEEKUSER,
                              child, 4 * ORIG_EAX, NULL);
            if(orig_eax == SYS_write) {
                if(insyscall == 0) {
                    /* Syscall entry */
                    insyscall = 1;
                    params[0] = ptrace(PTRACE_PEEKUSER,
                                       child, 4 * EBX,
                                       NULL);
                    params[1] = ptrace(PTRACE_PEEKUSER,
                                       child, 4 * ECX,
                                       NULL);
                    params[2] = ptrace(PTRACE_PEEKUSER,
                                       child, 4 * EDX,
                                       NULL);
                    printf("Write called with "
                           "%ld, %ld, %ld\n",
                           params[0], params[1],
                           params[2]);
                }
                else { /* Syscall exit */
                    eax = ptrace(PTRACE_PEEKUSER,
                                 child, 4 * EAX, NULL);
                    printf("Write returned "
                           "with %ld\n", eax);
                    insyscall = 0;
                }
            }
            /* Resume the child until the next syscall entry or exit */
            ptrace(PTRACE_SYSCALL,
                   child, NULL, NULL);
        }
    }
    return 0;
}

The output of this program looks like this:

ppadala@linux:~/ptrace > ls

a.out        dummy.s      ptrace.txt

libgpm.html  registers.c  syscallparams.c

dummy        ptrace.html  simple.c

ppadala@linux:~/ptrace > ./a.out

Write called with 1, 1075154944, 48

a.out        dummy.s      ptrace.txt

Write returned with 48

Write called with 1, 1075154944, 59

libgpm.html  registers.c  syscallparams.c

Write returned with 59

Write called with 1, 1075154944, 30

dummy        ptrace.html  simple.c

Write returned with 30

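In the example above we trace the write system call, and ls makes three of them. Calling ptrace with PTRACE_SYSCALL as the first argument makes the kernel stop the child at the next system call entry or exit. It is equivalent to doing a PTRACE_CONT and then stopping at the next system call entry or exit.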

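In the previous example we used PTRACE_PEEKUSER to look at the arguments of the write system call. When a system call returns, its return value is placed in %eax, which can be read in the same way.

The status variable filled in by wait tells us whether the child has exited, which is how we distinguish a child stopped by ptrace from one that has finished running. A set of macros such as WIFEXITED decode the status value; see the wait(2) man page for details.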

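Reading Register Values

If you want to read the registers at a system call entry or exit, the approach shown above works but is cumbersome. Calling ptrace with PTRACE_GETREGS as the first argument fetches all the registers in a single call.

The code to fetch the register values looks like this: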

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/reg.h>    /* ORIG_EAX, EAX */
#include <sys/user.h>   /* struct user_regs_struct; the original
                           article included <linux/user.h> */
#include <sys/syscall.h>

int main()
{
    pid_t child;
    long orig_eax, eax;
    int status;
    int insyscall = 0;
    struct user_regs_struct regs;

    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
        while(1) {
            wait(&status);
            if(WIFEXITED(status))
                break;
            orig_eax = ptrace(PTRACE_PEEKUSER,
                              child, 4 * ORIG_EAX,
                              NULL);
            if(orig_eax == SYS_write) {
                if(insyscall == 0) {
                    /* Syscall entry: fetch every register at once */
                    insyscall = 1;
                    ptrace(PTRACE_GETREGS, child,
                           NULL, &regs);
                    printf("Write called with "
                           "%ld, %ld, %ld\n",
                           regs.ebx, regs.ecx,
                           regs.edx);
                }
                else { /* Syscall exit */
                    eax = ptrace(PTRACE_PEEKUSER,
                                 child, 4 * EAX,
                                 NULL);
                    printf("Write returned "
                           "with %ld\n", eax);
                    insyscall = 0;
                }
            }
            ptrace(PTRACE_SYSCALL, child,
                   NULL, NULL);
        }
    }
    return 0;
}

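This code is similar to the previous example, except that it uses PTRACE_GETREGS. The user_regs_struct structure is defined in <linux/user.h>; on current glibc systems it is provided by <sys/user.h>.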

Doing Something Funny

Now it's time for some fun. In the following example we reverse the string passed to the write system call.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/reg.h>   /* ORIG_EAX, EBX etc; the original
                          article included <linux/user.h> */
#include <sys/syscall.h>

/* A compile-time constant, so it can be used as an array size */
enum { long_size = sizeof(long) };

/* Reverse str in place, leaving its last character
   (the trailing newline of each line of ls output) where it is */
void reverse(char *str)
{
    int i, j;
    char temp;
    for(i = 0, j = strlen(str) - 2;
        i <= j; ++i, --j) {
        temp = str[i];
        str[i] = str[j];
        str[j] = temp;
    }
}

/* Copy len bytes from address addr in the child into str,
   one word at a time */
void getdata(pid_t child, long addr,
             char *str, int len)
{
    char *laddr;
    int i, j;
    union u {
        long val;
        char chars[long_size];
    } data;
    i = 0;
    j = len / long_size;
    laddr = str;
    while(i < j) {
        data.val = ptrace(PTRACE_PEEKDATA,
                          child, addr + i * long_size,
                          NULL);
        memcpy(laddr, data.chars, long_size);
        ++i;
        laddr += long_size;
    }
    j = len % long_size;
    if(j != 0) {
        data.val = ptrace(PTRACE_PEEKDATA,
                          child, addr + i * long_size,
                          NULL);
        memcpy(laddr, data.chars, j);
    }
    str[len] = '\0';
}

/* Copy len bytes from str to address addr in the child,
   one word at a time */
void putdata(pid_t child, long addr,
             char *str, int len)
{
    char *laddr;
    int i, j;
    union u {
        long val;
        char chars[long_size];
    } data;
    i = 0;
    j = len / long_size;
    laddr = str;
    while(i < j) {
        memcpy(data.chars, laddr, long_size);
        ptrace(PTRACE_POKEDATA, child,
               addr + i * long_size, data.val);
        ++i;
        laddr += long_size;
    }
    j = len % long_size;
    if(j != 0) {
        /* Read the last word first so that the bytes after
           the buffer are written back unchanged */
        data.val = ptrace(PTRACE_PEEKDATA, child,
                          addr + i * long_size, NULL);
        memcpy(data.chars, laddr, j);
        ptrace(PTRACE_POKEDATA, child,
               addr + i * long_size, data.val);
    }
}

int main()
{
    pid_t child;

    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
        long orig_eax;
        long params[3];
        int status;
        char *str;
        int toggle = 0;
        while(1) {
            wait(&status);
            if(WIFEXITED(status))
                break;
            orig_eax = ptrace(PTRACE_PEEKUSER,
                              child, 4 * ORIG_EAX,
                              NULL);
            if(orig_eax == SYS_write) {
                if(toggle == 0) {
                    /* Syscall entry: rewrite the buffer in place */
                    toggle = 1;
                    params[0] = ptrace(PTRACE_PEEKUSER,
                                       child, 4 * EBX,
                                       NULL);
                    params[1] = ptrace(PTRACE_PEEKUSER,
                                       child, 4 * ECX,
                                       NULL);
                    params[2] = ptrace(PTRACE_PEEKUSER,
                                       child, 4 * EDX,
                                       NULL);
                    str = (char *)calloc(params[2] + 1,
                                         sizeof(char));
                    getdata(child, params[1], str,
                            params[2]);
                    reverse(str);
                    putdata(child, params[1], str,
                            params[2]);
                    free(str);
                }
                else {
                    toggle = 0;
                }
            }
            ptrace(PTRACE_SYSCALL, child, NULL, NULL);
        }
    }
    return 0;
}

The output looks like this:

ppadala@linux:~/ptrace > ls

a.out        dummy.s      ptrace.txt

libgpm.html  registers.c  syscallparams.c

dummy        ptrace.html  simple.c

ppadala@linux:~/ptrace > ./a.out

txt.ecartp      s.ymmud      tuo.a

c.sretsiger     lmth.mpgbil  c.llacys_egnahc

c.elpmis        lmth.ecartp  ymmud

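This example uses all the concepts discussed so far, plus a few new ones. Here we call ptrace with PTRACE_POKEDATA as the first argument to change data in the child. It works much like PTRACE_PEEKDATA, except that instead of only reading a word from the child's memory it writes one.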

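Single-Stepping

ptrace provides a way to single-step through the child's code. The call ptrace(PTRACE_SINGLESTEP, ...) tells the kernel to stop the child at each instruction and hand control to the parent. The following example reads the instruction the child is about to execute each time it stops. To keep things easy to follow, I wrote the traced program in assembly, so you don't have to puzzle over which system calls the C library makes.

Here is the code of the traced program, dummy1.s. Compile it with gcc -o dummy1 dummy1.s.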

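.data
hello:
    .string "hello world\n"
.text                       # code goes in the text section
.globl  main
main:
    movl    $4, %eax        # system call number of write
    movl    $2, %ebx        # file descriptor 2 (stderr)
    movl    $hello, %ecx    # address of the string
    movl    $12, %edx       # number of bytes to write
    int     $0x80
    movl    $1, %eax        # system call number of exit
    xorl    %ebx, %ebx      # exit status 0
    int     $0x80
    ret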

The following program single-steps through it:

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>   /* struct user_regs_struct; the original
                           article included <linux/user.h> */
#include <sys/syscall.h>

int main()
{
    pid_t child;

    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("./dummy1", "dummy1", NULL);
    }
    else {
        int status;
        struct user_regs_struct regs;
        int start = 0;
        long ins;
        while(1) {
            wait(&status);
            if(WIFEXITED(status))
                break;
            ptrace(PTRACE_GETREGS,
                   child, NULL, &regs);
            if(start == 1) {
                /* Read the word at the current instruction pointer */
                ins = ptrace(PTRACE_PEEKTEXT,
                             child, regs.eip,
                             NULL);
                printf("EIP: %lx Instruction "
                       "executed: %lx\n",
                       regs.eip, ins);
            }
            if(regs.orig_eax == SYS_write) {
                /* From the write call onwards, advance one
                   instruction at a time */
                start = 1;
                ptrace(PTRACE_SINGLESTEP, child,
                       NULL, NULL);
            }
            else
                ptrace(PTRACE_SYSCALL, child,
                       NULL, NULL);
        }
    }
    return 0;
}

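The program prints one line for each instruction it steps through, showing the EIP and the word read from that address.

You may need to consult the Intel manuals to make sense of these instruction bytes.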

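More elaborate single-stepping, such as setting breakpoints, requires careful design and more complex code.

In Part II we will see how to insert breakpoints and how to inject code into a program that is already running.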
