Notes on debugging a PyQt5 core dump

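1. First, configure the system to allow core dump files

Step 1: Enable core dump generation (this applies to the current shell session and the processes started from it)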

ulimit -c unlimited

Step 2: Set the core dump file location

vi /etc/sysctl.conf

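Modify (or add) the following two variables:

kernel.core_pattern = /var/core/core_%e_%p

kernel.core_uses_pid = 0

This writes core files to /var/core/ (the directory must already exist and be writable); %e expands to the executable name and %p to the process ID.

To have the core file written to the crashing process's current working directory instead (often, but not necessarily, the directory of the executable), give the pattern no directory prefix:

kernel.core_pattern = core_%e_%p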

Step 3: Apply the changes

sysctl -p /etc/sysctl.conf
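
As a quick check of the settings above, here is a minimal Python sketch that is not part of the original setup. It assumes a Linux system; `core_dumps_enabled` and `expand_core_pattern` are hypothetical helper names, and the expansion handles only the %e, %p and %% specifiers used here (the kernel supports more).

```python
import re
import resource


def core_dumps_enabled():
    """Return True if the soft core-size limit of this process is non-zero."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != 0


def expand_core_pattern(pattern, exe_name, pid):
    """Expand %e (executable name), %p (process ID) and %% (literal percent)."""
    values = {"e": exe_name, "p": str(pid), "%": "%"}
    return re.sub(r"%([ep%])", lambda m: values[m.group(1)], pattern)


# Illustrative values only: 1234 is a made-up process ID.
print(core_dumps_enabled())
print(expand_core_pattern("/var/core/core_%e_%p", "python2", 1234))
```

With the pattern from Step 2, the second line printed is /var/core/core_python2_1234, which is the kind of file name to look for after a crash.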

2. Preparation

Install gdb and python2.7-dbg:

sudo apt-get install gdb python2.7-dbg

Set /proc/sys/kernel/yama/ptrace_scope:

echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
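
For reference, here is a small sketch of what the value written above means. The four levels are those documented for the Yama security module; `describe_ptrace_scope` and `read_ptrace_scope` are hypothetical helper names, and the file exists only on kernels built with Yama. This setting governs attaching gdb to a running process; opening a core file does not depend on it.

```python
PTRACE_SCOPE_PATH = "/proc/sys/kernel/yama/ptrace_scope"

PTRACE_SCOPE_LEVELS = {
    0: "classic: a process may attach to any process running under the same uid",
    1: "restricted: a process may only attach to its own descendants",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(value):
    """Map a ptrace_scope level to a short description."""
    return PTRACE_SCOPE_LEVELS.get(value, "unknown value %r" % (value,))


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the current level, or None if the file cannot be read or parsed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        return None


level = read_ptrace_scope()
if level is None:
    print("ptrace_scope not available on this system")
else:
    print(level, describe_ptrace_scope(level))
```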

3. Debugging a Python core file with gdb

Open the core file in gdb and start debugging:

$ gdb python core
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/04/9b3068eb18127661de41257e012a54934fb0ee.debug...done.
done.


[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `python2 mantra_hmi_pro.py'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f04fdf36428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
54	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f047f7fe700 (LWP 31079))]

Available Python-related commands

Type py followed by the Tab key to list the available commands:

(gdb) py
py-bt               py-down             py-locals           py-up               python-interactive
py-bt-full          py-list             py-print            python

Use help cmd to see what each command does:

(gdb) help py-bt
Display the current python frame and all the frames within its call stack (if any)
(gdb) py-list 
 258                        else:
 259                            self.window.label_3.setText("Joint3 ( %.3f deg )" % float(curr_joints[2] / pi * 180.0))
 260                    if curr_joints[3] < 0:
 261                        if float(curr_joints[3] / pi * 180.0) <= -100:
 262                            self.window.label_4.setText("Joint4 (%.2f deg )" % float(curr_joints[3] / pi * 180.0))
>263                        else:
 264                            self.window.label_4.setText("Joint4 (%.3f deg )" % float(curr_joints[3] / pi * 180.0))
 265                    else:
 266                        if float(curr_joints[3] / pi * 180.0) >= 100:
 267                            self.window.label_4.setText("Joint4 ( %.2f deg )" % float(curr_joints[3] / pi * 180.0))
 268                        else:
(gdb) py-bt
Traceback (most recent call first):
  File "mantra_hmi_pro.py", line 264, in run
    self.window.label_4.setText("Joint4 (%.3f deg )" % float(curr_joints[3] / pi * 180.0))

As shown, the program died at line 264, in the call to self.window.label_4.setText.

(gdb) bt
#0  0x00007f04fdf36428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f04fdf3802a in __GI_abort () at abort.c:89
#2  0x00007f04fdf787ea in __libc_message (do_abort=do_abort@entry=2, 
    fmt=fmt@entry=0x7f04fe091ed8 "*** Error in `%s': %s: 0x%s ***\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f04fdf83651 in malloc_printerr (ar_ptr=0x7f047f7fd400, ptr=0x7f0470003bdf, 
    str=0x7f04fe0922e0 "malloc(): memory corruption (fast)", action=3) at malloc.c:5006
#4  _int_malloc (av=av@entry=0x7f0470000020, bytes=bytes@entry=68) at malloc.c:3386
#5  0x00007f04fdf85184 in __GI___libc_malloc (bytes=68) at malloc.c:2913
#6  0x00007f04e254ee28 in QArrayData::allocate(unsigned long, unsigned long, unsigned long, QFlags<QArrayData::AllocationOption>) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#7  0x00007f04e25dba63 in QString::QString(int, Qt::Initialization) ()
   from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#8  0x00007f04e278c5c7 in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#9  0x00007f04e25e2182 in QString::fromUtf8_helper(char const*, int) ()
   from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#10 0x00007f04e25e21f4 in QString::fromAscii_helper(char const*, int) ()
   from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#11 0x00007f04e2b93523 in ?? ()
   from /usr/lib/python2.7/dist-packages/PyQt5/QtCore.x86_64-linux-gnu.so
#12 0x00007f04df802c3c in ?? () from /usr/lib/python2.7/dist-packages/sip.x86_64-linux-gnu.so
#13 0x00007f04df804ab0 in ?? () from /usr/lib/python2.7/dist-packages/sip.x86_64-linux-gnu.so
#14 0x00007f04df8052a3 in ?? () from /usr/lib/python2.7/dist-packages/sip.x86_64-linux-gnu.so
#15 0x00007f04df805510 in ?? () from /usr/lib/python2.7/dist-packages/sip.x86_64-linux-gnu.so
#16 0x00007f04db03d8fc in ?? ()
   from /usr/lib/python2.7/dist-packages/PyQt5/QtWidgets.x86_64-linux-gnu.so
---Type <return> to continue, or q <return> to quit---
#17 0x00000000004bc9ba in call_function (oparg=<optimized out>, pp_stack=0x7f047f7fd920)
    at ../Python/ceval.c:4350
#18 PyEval_EvalFrameEx () at ../Python/ceval.c:2987
#19 0x00000000004ba036 in PyEval_EvalCodeEx () at ../Python/ceval.c:3582
#20 0x00000000004d5909 in function_call.lto_priv () at ../Objects/funcobject.c:523
#21 0x00000000004eec9e in PyObject_Call (kw=0x0, 
    arg=(<WindowThread(window=<MyWindow(fp=<file at remote 0x7f04da327d20>, xyz_step=<float at remote 0x237e5a0>, group='arm', mCmButton_1=<QPushButton at remote 0x7f04c40ada68>, label_18=<QLabel at remote 0x7f04c40aaa68>, label_step_1=<QLabel at remote 0x7f04d9f0bd60>, comboBox=<QComboBox at remote 0x7f04c40aa0e8>, label_14=<QLabel at remote 0x7f04c40ad8a0>, label_3=<QLabel at remote 0x7f04c40aa8a0>, label_12=<QLabel at remote 0x7f04c40ad640>, label_13=<QLabel at remote 0x7f04c40ad808>, label_10=<QLabel at remote 0x7f04c40ad770>, label_11=<QLabel at remote 0x7f04c40ad478>, setHomeButton=<QPushButton at remote 0x7f04d9f0b8a0>, label_5=<QLabel at remote 0x7f04d9f0bf28>, label_6=<QLabel at remote 0x7f04c40aa3e0>, label_7=<QLabel at remote 0x7f04d9f0bc30>, label_1=<QLabel at remote 0x7f04c40aa938>, label_2=<QLabel at remote 0x7f04c40aa808>, horizontalSlider=<QSlider at remote 0x7f04d9f0b770>, joint_step=<float at remote 0x237e588>, label_8=<QLabel at remote 0x7f04d9f0b808>, label_9=<QLabel at remote 0x7f04c40ad348>, pus...(truncated), func=<function at remote 0x7f04d8aee758>) at ../Objects/abstract.c:2546
#22 instancemethod_call.lto_priv () at ../Objects/classobject.c:2602
#23 0x00000000004a5a9e in PyObject_Call () at ../Objects/abstract.c:2546
#24 0x00000000004c6380 in PyEval_CallObjectWithKeywords () at ../Python/ceval.c:4219
#25 0x00007f04df801e84 in ?? () from /usr/lib/python2.7/dist-packages/sip.x86_64-linux-gnu.so
#26 0x00007f04e2afd700 in ?? ()
   from /usr/lib/python2.7/dist-packages/PyQt5/QtCore.x86_64-linux-gnu.so
#27 0x00007f04e2bbf1b3 in ?? ()
   from /usr/lib/python2.7/dist-packages/PyQt5/QtCore.x86_64-linux-gnu.so
#28 0x00007f04e254d7be in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
---Type <return> to continue, or q <return> to quit---
#29 0x00007f04fe2d26ba in start_thread (arg=0x7f047f7fe700) at pthread_create.c:333
#30 0x00007f04fe00841d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
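
The interactive session above can also be scripted. The sketch below is a convenience that the original workflow did not use: it relies on gdb's `-batch` and `-ex` options to run py-bt and bt non-interactively. `build_gdb_command` and `dump_backtraces` are hypothetical helper names, and the py-* commands are assumed to be available only because the python2.7-dbg package from the preparation step is installed.

```python
import subprocess


def build_gdb_command(executable, core_path, commands):
    """Build an argv list that runs each gdb command once and then exits."""
    argv = ["gdb", "-batch"]
    for command in commands:
        argv += ["-ex", command]
    return argv + [executable, core_path]


def dump_backtraces(executable="python", core_path="core"):
    """Run gdb and return the Python-level and C-level backtraces as bytes."""
    argv = build_gdb_command(executable, core_path, ["py-bt", "bt"])
    return subprocess.check_output(argv, stderr=subprocess.STDOUT)


# Show the command line without running gdb.
print(" ".join(build_gdb_command("python", "core", ["py-bt", "bt"])))
```

Calling dump_backtraces() in the directory that holds the core file returns both backtraces in one go, which is handy for attaching them to a bug report.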

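The call stack shows that the program aborted inside malloc, called from QArrayData::allocate: glibc detected heap corruption ("malloc(): memory corruption (fast)") while Qt was constructing a QString from the string that Python passed in.

The code passes a Python string to setText, which is legal in PyQt5. Why passing the string ends in a core dump is not yet known at this point.

4. Solving the problem

The debugging above pinpointed the code that triggered the core dump: PyQt5's label.setText(). That call merely sets the text of a label in the UI and involves no out-of-bounds array access or anything similar, yet the error finally surfaces in malloc as "malloc(): memory corruption", i.e. the heap has been corrupted.

No direct fix presented itself, because label.setText() did not appear to have a multithreading synchronization problem: only one thread ever executes this call.

Here is the key point: the thread that calls label.setText() is the UI refresh thread, i.e. the thread used to update the data shown in the UI. Any fix has to start from this thread. First, look at how it was written: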

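import time

import rospy
from PyQt5 import QtCore

# curr_joints and movej_rad_deg_flag are module-level globals defined elsewhere in the program.


class WindowThread(QtCore.QThread):
    def __init__(self, window_):
        super(WindowThread, self).__init__()
        self.window = window_

    def run(self):
        global movej_rad_deg_flag
        r = rospy.Rate(2)  # 2 Hz
        time.sleep(1)  # sleep for one second to let the UI finish initializing

        while not rospy.is_shutdown():
            # print(curr_joints)
            # refresh the joint angle display
            if movej_rad_deg_flag == 0:  # compare by value, not identity ("is 0")
                if curr_joints[0] < 0:
                    self.window.label_1.setText("Joint1 (%.3f rad )" % float(curr_joints[0]))
            ...
            ...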

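This thread does run. It inherits from QtCore.QThread and receives the main window instance; the key is that hand-off, self.window = window_, which appears to be what breaks the label text updates. Yet testing showed that self.window inside the thread and the window_ passed in have the same id, i.e. they are the same object at the same memory address, which is baffling at first sight. A likely explanation is Qt's rule that widgets may only be used from the main (GUI) thread: the object is the same, but run() calls setText() from the worker thread while the main thread may be using the same widget, and that can corrupt the heap.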
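
One way to see why identical id() values do not settle the question: a reference handed to a thread is the same object, but every call made through it runs on the calling thread, and Qt documents that widgets may only be used from the main thread. The Qt-free sketch below uses a hypothetical `FakeLabel` that merely records which thread called setText(); it illustrates the thread-affinity idea and does not show how Qt tracks ownership internally.

```python
import threading


class FakeLabel(object):
    """Stand-in for a widget: remembers its creating thread and who calls it."""

    def __init__(self):
        self.owner = threading.current_thread()
        self.calls = []

    def setText(self, text):
        called_from_owner = threading.current_thread() is self.owner
        self.calls.append((text, called_from_owner))


def call_from_worker(label):
    """Call label.setText() from a new thread; return the id() seen there."""
    seen = {}

    def work():
        seen["id"] = id(label)
        label.setText("from worker")

    thread = threading.Thread(target=work)
    thread.start()
    thread.join()
    return seen["id"]


label = FakeLabel()
print(call_from_worker(label) == id(label))  # same object in both threads
print(label.calls)  # the call was not made by the owning thread
```

The id() check passes while the recorded flag is False: the object is shared, but the call was made from the wrong thread.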

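Either way, writing the UI update thread like this is clearly problematic. Following the blog post "PyQt5多線程刷新界面防假死" (PyQt5 multithreaded UI refresh that avoids freezing), rewrite the UI update thread: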

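import rospy
from PyQt5 import QtCore
from PyQt5.QtCore import pyqtSignal


class UpdateThread(QtCore.QThread):
    """
    UI refresh thread: uses a signal to notify the refresh function implemented in the main window class.

    This matters: passing the main window instance into the refresh thread and updating widgets from there does not work.
    """
    update_signal = pyqtSignal()

    def __init__(self):
        super(UpdateThread, self).__init__()

    def __del__(self):
        self.wait()

    def run(self):
        r = rospy.Rate(5)  # 5 Hz
        while not rospy.is_shutdown():
            self.update_signal.emit()
            r.sleep()

    def stop(self):
        # terminate() stops the thread abruptly; treat it as a last resort
        self.terminate()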

Main window class:

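from PyQt5 import QtWidgets

# Ui_Form is presumably the class generated from the .ui file and imported elsewhere.


class MyWindow(QtWidgets.QWidget, Ui_Form):
    def __init__(self):
        super(MyWindow, self).__init__()
        self.setupUi(self)  # build the widgets defined in Ui_Form

        # start the UI refresh thread
        self.update_thread = UpdateThread()  # create the thread
        self.update_thread.update_signal.connect(self.update)  # connect the signal
        self.update_thread.start()

    def update(self):
        # runs in the main (GUI) thread; note that this name shadows QWidget.update()
        self.label_1.setText("test")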

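That is, the main window object creates the UI update thread object, and the thread uses a signal to tell the main window's update() function to refresh the UI. The main window no longer has to be passed into the thread object, and this did indeed solve the problem. With the default connection type, a signal emitted from the worker thread is queued and the slot runs in the thread that owns the receiver, here the main thread, so the widgets are only ever touched from the GUI thread.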
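
The same pattern can be illustrated without Qt. The sketch below is a plain-Python analogue, not PyQt5 code: the worker never touches the "label", it only posts messages to a queue.Queue, and the thread that owns the label applies them, which is roughly what a queued signal/slot connection achieves. `worker`, `drain` and the dict standing in for a label are hypothetical names, and the code targets Python 3 (the module is called Queue on Python 2.7).

```python
import queue
import threading


def worker(updates, stop_event, values):
    """Producer: post formatted text, never touch UI state directly."""
    for value in values:
        if stop_event.is_set():
            break
        updates.put("Joint1 (%.3f rad )" % value)
    updates.put(None)  # sentinel: no more updates


def drain(updates, label):
    """Consumer, run by the owning thread: apply updates until the sentinel."""
    while True:
        text = updates.get()
        if text is None:
            break
        label["text"] = text
        label["history"].append(text)


updates = queue.Queue()
stop_event = threading.Event()
label = {"text": "", "history": []}

thread = threading.Thread(target=worker, args=(updates, stop_event, [0.1, 0.25, -1.5]))
thread.start()
drain(updates, label)  # the main thread is the only one that writes to label
thread.join()
print(label["text"])
```

Setting stop_event from the main thread asks the worker to finish cooperatively, a gentler alternative to terminating a thread from outside.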
