Debugging Multithreaded Deadlocks in Linux C/C++ with gdb


I won't go into the causes of deadlock at length. In essence, some threads request a lock and can never acquire it.

 

First, here is the multithreaded code that contains the deadlock:


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t g_smutex ; 

void * func(void *arg)
{
	int i=0;

	//lock

	pthread_mutex_lock( &g_smutex);
	
	for(i = 0 ;i < 0x7fffffff; i++)
	{

	}

	// BUG (intentional): pthread_mutex_unlock(&g_smutex) is never called
	
	return NULL;
}

int main()
{
	pthread_t  thread_id_01;
	pthread_t  thread_id_02;
	pthread_t  thread_id_03;
	pthread_t  thread_id_04;
	pthread_t  thread_id_05;
	
	pthread_mutex_init( &g_smutex, NULL );

	pthread_create(&thread_id_01, NULL, func, NULL);
	pthread_create(&thread_id_02, NULL, func, NULL);
	pthread_create(&thread_id_03, NULL, func, NULL);
	pthread_create(&thread_id_04, NULL, func, NULL);
	pthread_create(&thread_id_05, NULL, func, NULL);

	while(1)
	{
		sleep(0xfff);
	}
	return 0;
}


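The first thread to run func takes the lock and never unlocks it, so none of the other threads can ever acquire it. This is a deliberately simple kind of deadlock.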

 

Compile:

gcc New0001.c -g -lpthread -o a.out

 

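The -g flag is needed here: it makes the compiler emit debugging information (symbols, line numbers). Do not run strip on the resulting a.out, because strip removes that information and gdb can then no longer show which line of code is at fault.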

 

 

Method 1:

1. Start gdb on the executable (gdb a.out) and enter the r command to run the program

 

gdb a.out

GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10

Copyright (C) 2015 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law.  Type "show copying"

and "show warranty" for details.

This GDB was configured as "i686-linux-gnu".

Type "show configuration" for configuration details.

For bug reporting instructions, please see:

<http://www.gnu.org/software/gdb/bugs/>.

Find the GDB manual and other documentation resources online at:

<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".

Type "apropos word" to search for commands related to "word"...

Reading symbols from a.out...done.

 

 

 

(gdb) r

Starting program: /share/a.out

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".

[New Thread 0xb7de6b40 (LWP 15436)]

[New Thread 0xb75e5b40 (LWP 15437)]

[New Thread 0xb6de4b40 (LWP 15438)]

[New Thread 0xb65e3b40 (LWP 15439)]

[New Thread 0xb5de2b40 (LWP 15440)]

[Thread 0xb7de6b40 (LWP 15436) exited]

 

2. Press Ctrl+C while the program is running

 

^C

Program received signal SIGINT, Interrupt.

0xb7fdbbe8 in __kernel_vsyscall ()


 

3. View the stack with info stack (an alias of bt). This command only shows the stack of the currently selected thread

 

(gdb) info stack

#0  0xb7fdbbe8 in __kernel_vsyscall ()

#1  0xb7e9c3e6 in nanosleep () at ../sysdeps/unix/syscall-template.S:81

#2  0xb7e9c1a9 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138

#3  0x08048679 in main () at New0001.c:46

 

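4. Use info threads to list all threads. The entry marked with * is the thread gdb currently has selected (the one that commands such as bt apply to); it does not mean that only this thread was running. To find out which threads are blocked on the lock, look at their backtraces in the next step.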

 

(gdb) info threads

  Id   Target Id         Frame

  6    Thread 0xb5de2b40 (LWP 15440) "a.out" 0xb7fdbbe8 in __kernel_vsyscall ()

  5    Thread 0xb65e3b40 (LWP 15439) "a.out" 0xb7fdbbe8 in __kernel_vsyscall ()

  4    Thread 0xb6de4b40 (LWP 15438) "a.out" 0xb7fdbbe8 in __kernel_vsyscall ()

  3    Thread 0xb75e5b40 (LWP 15437) "a.out" 0xb7fdbbe8 in __kernel_vsyscall ()

* 1    Thread 0xb7de7700 (LWP 15432) "a.out" 0xb7fdbbe8 in __kernel_vsyscall ()

 

 

5. thread apply all bt (thread apply all <command> makes gdb run the command for every thread; with bt as the command, it prints the full stack of each thread)

 

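Note: when the process has many threads, switching to each one with thread <id> (the values in the Id column above) is not practical. With 100 threads you would have to type the command 100 times.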

 

So it is better to use thread apply all bt directly.

 

 

(gdb) thread apply all bt

 

Thread 6 (Thread 0xb5de2b40 (LWP 15440)):

#0  0xb7fdbbe8 in __kernel_vsyscall ()

#1  0xb7fb2302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb7fac5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb7faa1aa in start_thread (arg=0xb5de2b40) at pthread_create.c:333

#5  0xb7ed2fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 5 (Thread 0xb65e3b40 (LWP 15439)):

#0  0xb7fdbbe8 in __kernel_vsyscall ()

#1  0xb7fb2302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb7fac5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb7faa1aa in start_thread (arg=0xb65e3b40) at pthread_create.c:333

#5  0xb7ed2fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 4 (Thread 0xb6de4b40 (LWP 15438)):

#0  0xb7fdbbe8 in __kernel_vsyscall ()

#1  0xb7fb2302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb7fac5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb7faa1aa in start_thread (arg=0xb6de4b40) at pthread_create.c:333

#5  0xb7ed2fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 3 (Thread 0xb75e5b40 (LWP 15437)):

#0  0xb7fdbbe8 in __kernel_vsyscall ()

#1  0xb7fb2302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb7fac5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb7faa1aa in start_thread (arg=0xb75e5b40) at pthread_create.c:333

#5  0xb7ed2fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 1 (Thread 0xb7de7700 (LWP 15432)):

#0  0xb7fdbbe8 in __kernel_vsyscall ()

---Type <return> to continue, or q <return> to quit---

#1  0xb7e9c3e6 in nanosleep () at ../sysdeps/unix/syscall-template.S:81

#2  0xb7e9c1a9 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138

#3  0x08048679 in main () at New0001.c:46

 

 

6. The threads sitting in __lll_lock_wait are the ones blocked waiting for the lock

 

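Repeat the steps above a few times (continue, interrupt, dump the stacks). Threads that are in __lll_lock_wait every time are very likely deadlocked.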

 

For example, thread 3:

Thread 3 (Thread 0xb75e5b40 (LWP 15437)):

#0  0xb7fdbbe8 in __kernel_vsyscall ()

#1  0xb7fb2302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb7fac5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb7faa1aa in start_thread (arg=0xb75e5b40) at pthread_create.c:333

#5  0xb7ed2fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

 

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

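is where the thread is stuck. Start from this line when reading the code and look for a path on which the lock may not have been released.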

 


Method 2

 

Start the program first, then open another session and find the program's process ID with ps -aux | grep <executable>.

 

ps -axu | grep a.out

root     15463  0.4  0.1  43320   732 pts/4    Sl+  19:29   0:03 ./a.out

root     15476  0.0  0.3   4540  1864 pts/6    S+   19:44   0:00 grep --color=auto a.out

 

From the output above, the process ID is 15463.

 

 

1. Use gdb attach <pid>

  or start gdb first and then run attach <pid>

  or use gdb <executable> <pid>, which also attaches automatically

 

 

root@ubuntu:/share# gdb

GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10

Copyright (C) 2015 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law.  Type "show copying"

and "show warranty" for details.

This GDB was configured as "i686-linux-gnu".

Type "show configuration" for configuration details.

For bug reporting instructions, please see:

<http://www.gnu.org/software/gdb/bugs/>.

Find the GDB manual and other documentation resources online at:

<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".

Type "apropos word" to search for commands related to "word".

(gdb)

 

attach <pid>

 

(gdb) attach 15463

Attaching to process 15463

Reading symbols from /share/a.out...done.

Reading symbols from /lib/i386-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/libpthread-2.21.so...done.

done.

[New LWP 15467]

[New LWP 15466]

[New LWP 15465]

[New LWP 15464]

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".

Reading symbols from /lib/i386-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/libc-2.21.so...done.

done.

Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/ld-2.21.so...done.

done.

0xb773abe8 in __kernel_vsyscall ()

 

2. View the thread list

 

(gdb) info threads

  Id   Target Id         Frame

  5    Thread 0xb7545b40 (LWP 15464) "a.out" 0xb773abe8 in __kernel_vsyscall ()

  4    Thread 0xb6d44b40 (LWP 15465) "a.out" 0xb773abe8 in __kernel_vsyscall ()

  3    Thread 0xb6543b40 (LWP 15466) "a.out" 0xb773abe8 in __kernel_vsyscall ()

  2    Thread 0xb5d42b40 (LWP 15467) "a.out" 0xb773abe8 in __kernel_vsyscall ()

* 1    Thread 0xb7546700 (LWP 15463) "a.out" 0xb773abe8 in __kernel_vsyscall ()

 

 

3. Run bt on every thread

 

(gdb) thread apply all bt

 

Thread 5 (Thread 0xb7545b40 (LWP 15464)):

#0  0xb773abe8 in __kernel_vsyscall ()

#1  0xb7711302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb770b5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb77091aa in start_thread (arg=0xb7545b40) at pthread_create.c:333

#5  0xb7631fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 4 (Thread 0xb6d44b40 (LWP 15465)):

#0  0xb773abe8 in __kernel_vsyscall ()

#1  0xb7711302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb770b5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb77091aa in start_thread (arg=0xb6d44b40) at pthread_create.c:333

#5  0xb7631fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 3 (Thread 0xb6543b40 (LWP 15466)):

#0  0xb773abe8 in __kernel_vsyscall ()

#1  0xb7711302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb770b5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb77091aa in start_thread (arg=0xb6543b40) at pthread_create.c:333

#5  0xb7631fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 2 (Thread 0xb5d42b40 (LWP 15467)):

#0  0xb773abe8 in __kernel_vsyscall ()

#1  0xb7711302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb770b5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb77091aa in start_thread (arg=0xb5d42b40) at pthread_create.c:333

#5  0xb7631fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Thread 1 (Thread 0xb7546700 (LWP 15463)):

#0  0xb773abe8 in __kernel_vsyscall ()

---Type <return> to continue, or q <return> to quit---

#1  0xb75fb3e6 in nanosleep () at ../sysdeps/unix/syscall-template.S:81

#2  0xb75fb1a9 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138

#3  0x08048679 in main () at New0001.c:46

 

 

 

4. Pick one of the threads in __lll_lock_wait and inspect it

For example, the thread with gdb id 4:

(gdb) thread 4

[Switching to thread 4 (Thread 0xb6d44b40 (LWP 15465))]

#0  0xb773abe8 in __kernel_vsyscall ()

(gdb) bt

#0  0xb773abe8 in __kernel_vsyscall ()

#1  0xb7711302 in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144

#2  0xb770b5fe in __GI___pthread_mutex_lock (mutex=0x804a030 <g_smutex>) at ../nptl/pthread_mutex_lock.c:80

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

#4  0xb77091aa in start_thread (arg=0xb6d44b40) at pthread_create.c:333

#5  0xb7631fde in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:122

 

Select frame #3 on the stack:

(gdb) frame 3

#3  0x080485b5 in func (arg=0x0) at New0001.c:16

16 pthread_mutex_lock( &g_smutex);

The thread is blocked in this call to pthread_mutex_lock.

 

(gdb) p  g_smutex

$1 = {__data = {__lock = 2, __count = 0, __owner = 15468, __kind = 0, __nusers = 1, {__elision_data = {__espins = 0,

        __elision = 0}, __list = {__next = 0x0}}},

  __size = "\002\000\000\000\000\000\000\000l<\000\000\000\000\000\000\001\000\000\000\000\000\000", __align = 2}

 

 

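The mutex's owner is the thread with ID (LWP) 15468, but no thread with that ID remains in the process. It has already exited, which means it terminated without unlocking the mutex.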

 

For reference, these are the only threads left, as seen in step 2:

(gdb) info threads

  Id   Target Id         Frame

  5    Thread 0xb7545b40 (LWP 15464) "a.out" 0xb773abe8 in __kernel_vsyscall ()

  4    Thread 0xb6d44b40 (LWP 15465) "a.out" 0xb773abe8 in __kernel_vsyscall ()

  3    Thread 0xb6543b40 (LWP 15466) "a.out" 0xb773abe8 in __kernel_vsyscall ()

  2    Thread 0xb5d42b40 (LWP 15467) "a.out" 0xb773abe8 in __kernel_vsyscall ()

* 1    Thread 0xb7546700 (LWP 15463) "a.out" 0xb773abe8 in __kernel_vsyscall ()

 



Method 3 does not use gdb; it uses the pstack tool

Usage: pstack <pid>

Note that pstack does not support 64-bit


Also, on my Ubuntu system pstack fails for no obvious reason, even though it is installed.


root@ubuntu:/share# pstack 15463


15463: ./a.out
(No symbols found in )
(No symbols found in /lib/i386-linux-gnu/libc.so.6)
(No symbols found in /lib/ld-linux.so.2)
0xb773abe8: _fini + 0x25f14 (0, 0, 0, 0, 0, 0) + 400d04fc
crawl: Input/output error
Error tracing through process 15463


Does anyone know what causes this?




