Summary: system_server always restarts when a user build connects to Wi-Fi


Reposted from: https://www.jianshu.com/p/f9c3792fac7c

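Symptoms

  • Reproduction steps
  • Android 7.0 platform (bring-up just completed)
  • On the user build, connecting to one particular Wi-Fi network always triggers a native crash in the system_server process.
  • The userdebug build does not have this problem.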

Analysis

Initial analysis

  • The tombstone file is as follows:
ABI: 'x86_64'
pid: 5891, tid: 7173, name: Thread-8  >>> system_server <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x3400a6
  rax 000000006f528f00  rbx 00007fa0a04f7e30  rcx 00007fa0a04f7e01  rdx 0000000000000001
  rsi 00007fa0a04f7ed4  rdi 000000006f1ed630
  r8  0000000000000002  r9  00007fa0a04f7d38  r10 000000006f1c38d8  r11 0000000000000000
  r12 0000000000200015  r13 000000000034002e  r14 00007fa0a04f7ed8  r15 000000006f528f00
  cs  0000000000000033  ss  000000000000002b
  rip 00007fa0bde2338d  rbp 00007fa0a04f7d80  rsp 00007fa0a04f7be0  eflags 0000000000010206

backtrace:
  #00 pc 000000000057738d  /system/lib64/libart.so (_ZN3art12InvokeMethodERKNS_33ScopedObjectAccessAlreadyRunnableEP8_jobjectS4_S4_m+125)
  #01 pc 00000000004cfad8  /system/lib64/libart.so (_ZN3artL24Constructor_newInstance0EP7_JNIEnvP8_jobjectP13_jobjectArray+1432)
  #02 pc 00000000006e3d4d  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (java.io.UnixFileSystem.canonicalize0 [DEDUPED]+235)
  #03 pc 0000000000b22caf  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.x509.X500Name.asX500Principal+157)
  #04 pc 0000000000b327a7  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.x509.X509CertInfo.getX500Name+517)
  #05 pc 0000000000b34b45  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.x509.X509CertInfo.get+835)
  #06 pc 0000000000b2fe15  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.x509.X509CertImpl.getSubjectX500Principal+99)
  #07 pc 0000000000a93baf  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.provider.certpath.PolicyChecker.mergeExplicitPolicy+93)
  #08 pc 0000000000a9341e  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.provider.certpath.PolicyChecker.checkPolicy+2652)
  #09 pc 0000000000a98df1  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.provider.certpath.PolicyChecker.check+95)
  #10 pc 0000000000a91f9b  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.provider.certpath.PKIXMasterCertPathValidator.validate+2265)
  #11 pc 0000000000a907d1  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.provider.certpath.PKIXCertPathValidator.validate+2895)
  #12 pc 0000000000a911cc  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.provider.certpath.PKIXCertPathValidator.validate+1546)
  #13 pc 0000000000a9166b  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (sun.security.provider.certpath.PKIXCertPathValidator.engineValidate+233)
  #14 pc 0000000000819f52  /system/framework/x86_64/boot-core-oj.oat (offset 0x660000) (java.security.cert.CertPathValidator.validate+64)
  #15 pc 00000000000b6d14  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.TrustManagerImpl.verifyChain+1266)
  #16 pc 00000000000b5fab  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.TrustManagerImpl.checkTrustedRecursive+2697)
  #17 pc 00000000000b57f1  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.TrustManagerImpl.checkTrustedRecursive+719)
  #18 pc 00000000000b5bb0  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.TrustManagerImpl.checkTrustedRecursive+1678)
  #19 pc 00000000000b5207  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.TrustManagerImpl.checkTrusted+645)
  #20 pc 00000000000b54a7  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.TrustManagerImpl.checkTrusted+421)
  #21 pc 00000000000b7713  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.TrustManagerImpl.checkServerTrusted+273)
  #22 pc 00000000000acf15  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.Platform.checkServerTrusted+323)
  #23 pc 00000000000a4885  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.OpenSSLSocketImpl.verifyCertificateChain+787)
  #24 pc 00000000001a6564  /system/lib64/libart.so (art_quick_invoke_stub+756)
  #25 pc 00000000001b4727  /system/lib64/libart.so (_ZN3art9ArtMethod6InvokeEPNS_6ThreadEPjjPNS_6JValueEPKc+231)
  #26 pc 0000000000575967  /system/lib64/libart.so (_ZN3artL18InvokeWithArgArrayERKNS_33ScopedObjectAccessAlreadyRunnableEPNS_9ArtMethodEPNS_8ArgArrayEPNS_6JValueEPKc+87)
  #27 pc 00000000005771be  /system/lib64/libart.so (_ZN3art35InvokeVirtualOrInterfaceWithVarArgsERKNS_33ScopedObjectAccessAlreadyRunnableEP8_jobjectP10_jmethodIDP13__va_list_tag+382)
  #28 pc 000000000046507c  /system/lib64/libart.so (_ZN3art3JNI15CallVoidMethodVEP7_JNIEnvP8_jobjectP10_jmethodIDP13__va_list_tag+860)
  #29 pc 000000000001eb51  /system/lib64/libjavacrypto.so (_ZN7_JNIEnv14CallVoidMethodEP8_jobjectP10_jmethodIDz+161)
  #30 pc 000000000001f3e7  /system/lib64/libjavacrypto.so
  #31 pc 0000000000021468  /system/lib64/libssl.so
  #32 pc 0000000000015fd8  /system/lib64/libssl.so
  #33 pc 00000000000149ab  /system/lib64/libssl.so
  #34 pc 000000000001954b  /system/lib64/libjavacrypto.so
  #35 pc 000000000008061a  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.NativeCrypto.SSL_do_handshake+376)
  #36 pc 00000000000a33aa  /system/framework/x86_64/boot-conscrypt.oat (offset 0x70000) (com.android.org.conscrypt.OpenSSLSocketImpl.startHandshake+1944)
  #37 pc 00000000000986ba  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.Connection.connectTls+488)
  #38 pc 00000000000981d2  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.Connection.connectSocket+192)
  #39 pc 000000000009a17e  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.Connection.connect+860)
  #40 pc 000000000009a4ed  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.Connection.connectAndSetOwner+203)
  #41 pc 00000000000b0ffd  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.OkHttpClient$1.connectAndSetOwner+75)
  #42 pc 00000000000cede7  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.internal.http.HttpEngine.connect+501)
  #43 pc 00000000000d32d5  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.internal.http.HttpEngine.sendRequest+755)
  #44 pc 00000000000dba50  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.internal.huc.HttpURLConnectionImpl.execute+222)
  #45 pc 00000000000dbfe3  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponse+145)
  #46 pc 00000000000de079  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.internal.huc.HttpURLConnectionImpl.getInputStream+135)
  #47 pc 00000000000e0648  /system/framework/x86_64/boot-okhttp.oat (offset 0x8f000) (com.android.okhttp.internal.huc.HttpsURLConnectionImpl.getInputStream+54)
  #48 pc 00000000011d9af0  /system/framework/oat/x86_64/services.odex (offset 0xc82000)
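  • From the backtrace, the first hypothesis was that the memory the ArtMethod* points to had been overwritten.
Because this is a user build, only a single tombstone file is available, and it yields no further useful information.
  • After the build configuration was adjusted to capture a core dump, the memory near the fault turned out to contain data related to RSA encryption.

The ArtMethod is computed from the preceding abstract_method: m = 0x000000006f528f00.
The memory around that address appears to have been overwritten with cryptography-related strings.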

000000006f528f00 004900570034002e 002e003100480054  ..4.W.I.T.H.1...
000000006f528f10 00340038002e0032 00310031002e0030  2...8.4.0...1.1.
000000006f528f20 0039003400350033 0031002e0031002e  3.5.4.9...1...1.
000000006f528f30 000000340031002e 000000006f2139b8  ..1.4....9!o....
000000006f528f40 1a1723730000000d 0032004100480053  ....s#..S.H.A.2.
000000006f528f50 0049005700360035 0053005200480054  5.6.W.I.T.H.R.S.
000000006f528f60 0000000000000041 000000006f2139b8  A........9!o....
000000006f528f70 968ed33600000017 0032004100480053  ....6...S.H.A.2.
000000006f528f80 0069005700360035 0053005200680074  5.6.W.i.t.h.R.S.
000000006f528f90 0063006e00450041 0074007000790072  A.E.n.c.r.y.p.t.
000000006f528fa0 0000006e006f0069 000000006f2139b8  i.o.n....9!o....
000000006f528fb0 eeb292290000002e 00360031002e0032  ....)...2...1.6.
000000006f528fc0 003000340038002e 0031002e0031002e  ..8.4.0...1...1.
000000006f528fd0 0033002e00310030 0032002e0034002e  0.1...3...4...2.
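
The ASCII column of the dump suggests that the 64-bit words hold little-endian UTF-16 characters. As a minimal sketch under that assumption (the helper name `words_to_text` is hypothetical and not part of the original analysis), the four words starting at 0x6f528f48 can be decoded like this:

```python
import struct

# Words copied from the dump rows at 0x6f528f40, 0x6f528f50 and 0x6f528f60,
# starting with the second word of the first of those rows.
WORDS = [
    0x0032004100480053,
    0x0049005700360035,
    0x0053005200480054,
    0x0000000000000041,
]


def words_to_text(words):
    """Pack each word as little-endian 64-bit and decode the bytes as UTF-16LE."""
    raw = b"".join(struct.pack("<Q", word) for word in words)
    # Trailing NUL padding is not part of the string.
    return raw.decode("utf-16-le").rstrip("\x00")


print(words_to_text(WORDS))  # prints SHA256WITHRSA
```

The decoded text is a signature algorithm name, which is why the region looks like cryptography data rather than an ArtMethod.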

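Differences between the user and userdebug builds

  • From the ART runtime's point of view, the two builds differ in their dexpreopt setting: true on user, false on userdebug.
  • With dexpreopt set to false on the user build, the problem no longer occurs.
  • With dexpreopt set to true on the userdebug build, the problem always reproduces.
  • To make debugging easier, a new image was built from the userdebug build with dexpreopt set to true. The evidence so far indicates that the problem is related to dexpreopt.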

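Proving that this is not memory corruption

  • In the new userdebug image, the faulting ArtMethod* = 0x7042af00 lies inside the following *.art mapping: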
0x700cb000         0x705f1000   0x526000        0x0 /data/dalvik-cache/x86_64/system@[email protected]
  • Dump 52 bytes at the ArtMethod* address:
(gdb) x /52xb 0x7042af00
0x7042af00:     0x2e    0x00    0x38    0x00    0x34    0x00    0x30    0x00
0x7042af08:     0x2e    0x00    0x31    0x00    0x2e    0x00    0x31    0x00
0x7042af10:     0x30    0x00    0x31    0x00    0x2e    0x00    0x33    0x00
0x7042af18:     0x2e    0x00    0x34    0x00    0x2e    0x00    0x32    0x00
0x7042af20:     0x2e    0x00    0x34    0x00    0x57    0x00    0x49    0x00
0x7042af28:     0x54    0x00    0x48    0x00    0x31    0x00    0x2e    0x00
0x7042af30:     0x32    0x00    0x2e    0x00
  • Dump 52 bytes at the corresponding offset in the *.art file:
$ hexdump -C -s 3538688 -n 52 boot-core-oj.art 
0035ff00  2e 00 38 00 34 00 30 00  2e 00 31 00 2e 00 31 00  |..8.4.0...1...1.|
0035ff10  30 00 31 00 2e 00 33 00  2e 00 34 00 2e 00 32 00  |0.1...3...4...2.|
0035ff20  2e 00 34 00 57 00 49 00  54 00 48 00 31 00 2e 00  |..4.W.I.T.H.1...|
0035ff30  32 00 2e 00                                       |2...|
0035ff34
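  • The two dumps match, which shows that the RSA-related strings are already present in boot-core-oj.art itself. The memory was not overwritten after the file was loaded.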
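
The `-s 3538688` argument passed to hexdump follows from the mapping line: subtract the mapping's start address from the faulting address and add the mapping's file offset. A small sketch of that arithmetic (the function name `file_offset` is hypothetical) using only the numbers shown above:

```python
def file_offset(address, mapping_start, mapping_file_offset=0):
    """Translate an address inside a file-backed mapping into a file offset."""
    return address - mapping_start + mapping_file_offset


# Values from the mapping line: start 0x700cb000, file offset 0x0.
offset = file_offset(0x7042AF00, 0x700CB000, 0x0)
print(offset, hex(offset))  # 3538688 0x35ff00

# First row of the gdb dump versus the first eight bytes of the hexdump output.
gdb_row = bytes([0x2E, 0x00, 0x38, 0x00, 0x34, 0x00, 0x30, 0x00])
file_row = bytes.fromhex("2e 00 38 00 34 00 30 00")
print(gdb_row == file_row)  # True
```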

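At this point there are two possibilities:

  • The ArtMethod address is wrong.
  • The content of boot-core-oj.art was already wrong when it was generated, so any access to this memory faults.
    In addition, a colleague from the HTTPS module analyzed the Java stack and found the call sequence anomalous: logically, this call path cannot occur. That makes the second possibility the more likely one.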

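Comparing boards

  • Compare the images generated for other boards on the same branch

On both ARM and x86 platforms, the boot images generated for the failing board look odd.
Judging by file size:

  • the file named boot.oat should really be boot-radio_interactor_common.oat
  • the file named boot-core-oj.oat should really be boot.oat
    (Figure: board_image_vs.png)
  • Compare the environment variables
  • Failing board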
BOOTCLASSPATH=/system/framework/radio_interactor_common.jar:/system/framework/core-oj.jar:/system/framework/core-libart.jar:/system/framework/conscrypt.jar:/system/framework/okhttp.jar:/system/framework/core-junit.jar:/system/framework/bouncycastle.jar:/system/framework/ext.jar:/system/framework/framework.jar:/system/framework/telephony-common.jar:/system/framework/voip-common.jar:/system/framework/ims-common.jar:/system/framework/apache-xml.jar:/system/framework/org.apache.http.legacy.boot.jar

  • Normal board

BOOTCLASSPATH=/system/framework/core-oj.jar:/system/framework/core-libart.jar:/system/framework/conscrypt.jar:/system/framework/okhttp.jar:/system/framework/core-junit.jar:/system/framework/bouncycastle.jar:/system/framework/ext.jar:/system/framework/framework.jar:/system/framework/telephony-common.jar:/system/framework/voip-common.jar:/system/framework/ims-common.jar:/system/framework/apache-xml.jar:/system/framework/org.apache.http.legacy.boot.jar:/system/framework/radio_interactor_common.jar

The comparison shows that radio_interactor_common.jar sits at a different position in BOOTCLASSPATH on the two boards.
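
This kind of comparison is easy to do mechanically. The following is only a sketch: the helper `jar_names` is hypothetical, and the two sample strings are shortened versions of the values above that keep only the first two entries and the jar that moved.

```python
import posixpath


def jar_names(bootclasspath):
    """Return the jar names in order, without directory or .jar suffix."""
    names = []
    for path in bootclasspath.split(":"):
        if not path:
            continue
        name = posixpath.basename(path)
        if name.endswith(".jar"):
            name = name[: -len(".jar")]
        names.append(name)
    return names


# Shortened samples, not the full BOOTCLASSPATH values.
FAILING = (
    "/system/framework/radio_interactor_common.jar"
    ":/system/framework/core-oj.jar:/system/framework/core-libart.jar"
)
NORMAL = (
    "/system/framework/core-oj.jar:/system/framework/core-libart.jar"
    ":/system/framework/radio_interactor_common.jar"
)

for label, value in (("failing", FAILING), ("normal", NORMAL)):
    names = jar_names(value)
    print(label, "first entry:", names[0])
```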
Reading through the build code turns up the following description:

# dex preopt on the bootclasspath produces multiple files.  The first dex file
# is converted into to boot.art (to match the legacy assumption that boot.art
# exists), and the rest are converted to boot-<name>.art.
# In addition, each .art file has an associated .oat file.
LIBART_TARGET_BOOT_ART_EXTRA_FILES := $(foreach jar,$(wordlist 2,999,$(LIBART_TARGET_BOOT_JARS)),boot-$(jar).art boot-$(jar).oat)
LIBART_TARGET_BOOT_ART_EXTRA_FILES += boot.oat

and

# The order of PRODUCT_BOOT_JARS matters.
PRODUCT_BOOT_JARS := \
  core-oj \
  core-libart \
  conscrypt \
  okhttp \
  core-junit \
  bouncycastle \
  ext \
  framework \
  telephony-common \
  voip-common \
  ims-common \
  apache-xml \
  org.apache.http.legacy.boot

This shows that, by default, the build system compiles the first entry of PRODUCT_BOOT_JARS into boot.oat/boot.art and every other entry into boot-<jar>.oat/boot-<jar>.art.

So the board configuration is wrong: radio_interactor_common was mistakenly placed at the very front of PRODUCT_BOOT_JARS.
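
The naming rule can be modelled in a few lines. This is a simplified sketch of the rule described in the makefile comment, not the build system's actual logic; `boot_image_files` is a hypothetical helper and the jar lists are shortened.

```python
def boot_image_files(boot_jars):
    """Map each boot jar to its (.art, .oat) names: first jar gets boot.*."""
    files = {}
    for index, jar in enumerate(boot_jars):
        stem = "boot" if index == 0 else "boot-" + jar
        files[jar] = (stem + ".art", stem + ".oat")
    return files


# Shortened lists: only the jars relevant to this bug are kept.
MISORDERED = ["radio_interactor_common", "core-oj", "core-libart"]
EXPECTED = ["core-oj", "core-libart", "radio_interactor_common"]

print(boot_image_files(MISORDERED)["core-oj"])  # ('boot-core-oj.art', 'boot-core-oj.oat')
print(boot_image_files(EXPECTED)["core-oj"])    # ('boot.art', 'boot.oat')
```

With the misordered list, core-oj ends up in boot-core-oj.oat, which is the file name that appears in the tombstone backtrace.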

Root Cause

  1. Look at how radio_interactor_common is used:
PRODUCT_BOOT_JARS += radio_interactor_common
  2. Then trace the order of the inherit calls, which leads to the root cause:
$(call inherit-product, $(PLATDIR)/common/device.mk)
$(call inherit-product, $(SRC_TARGET_DIR)/product/core_64_bit.mk)
$(call inherit-product, $(SRC_TARGET_DIR)/product/aosp_base_telephony.mk)
$(call inherit-product, $(PLATDIR)/common/proprietories.mk)
- device.mk eventually includes the PRODUCT_BOOT_JARS += radio_interactor_common line shown above.
- $(SRC_TARGET_DIR)/product/aosp_base_telephony.mk eventually includes the system default boot class list:
build/target/product/core_minimal.mk
- On the normal boards, by contrast, the system default boot class list is included first and radio_interactor_common is appended afterwards.
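
The effect of the include order can be illustrated with a deliberately simplified model in which each makefile appends its jars in the order it is processed. The real inherit-product handling is more involved, so treat this as a sketch only; `accumulate` is a hypothetical helper, and which file appends the jar on a normal board is an assumption.

```python
# Default list as given in the PRODUCT_BOOT_JARS snippet above.
DEFAULT_BOOT_JARS = [
    "core-oj", "core-libart", "conscrypt", "okhttp", "core-junit",
    "bouncycastle", "ext", "framework", "telephony-common", "voip-common",
    "ims-common", "apache-xml", "org.apache.http.legacy.boot",
]


def accumulate(makefiles):
    """Append each makefile's PRODUCT_BOOT_JARS additions in include order."""
    jars = []
    for _name, additions in makefiles:
        jars.extend(additions)
    return jars


# Failing board: device.mk is inherited before the default list.
failing = accumulate([
    ("device.mk", ["radio_interactor_common"]),
    ("core_minimal.mk", DEFAULT_BOOT_JARS),
])
# Normal board (assumed layout): defaults first, vendor jar appended after.
normal = accumulate([
    ("core_minimal.mk", DEFAULT_BOOT_JARS),
    ("device.mk", ["radio_interactor_common"]),
])

print(failing[0])  # radio_interactor_common
print(normal[0])   # core-oj
```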

Solution

  • Verification
  • After adjusting the include order of the *.mk files, inspect the image files produced by the build:
$ tree system/framework/x86*
system/framework/x86
├── boot-apache-xml.art
├── boot-apache-xml.oat
├── boot.art
├── boot-bouncycastle.art
├── boot-bouncycastle.oat
├── boot-conscrypt.art
├── boot-conscrypt.oat
├── boot-core-junit.art
├── boot-core-junit.oat
├── boot-core-libart.art
├── boot-core-libart.oat
├── boot-ext.art
├── boot-ext.oat
├── boot-framework.art
├── boot-framework.oat
├── boot-ims-common.art
├── boot-ims-common.oat
├── boot.oat
├── boot-okhttp.art
├── boot-okhttp.oat
├── boot-org.apache.http.legacy.boot.art
├── boot-org.apache.http.legacy.boot.oat
├── boot-radio_interactor_common.art
├── boot-radio_interactor_common.oat
├── boot-telephony-common.art
├── boot-telephony-common.oat
├── boot-voip-common.art
└── boot-voip-common.oat
system/framework/x86_64
├── boot-apache-xml.art
├── boot-apache-xml.oat
├── boot.art
├── boot-bouncycastle.art
├── boot-bouncycastle.oat
├── boot-conscrypt.art
├── boot-conscrypt.oat
├── boot-core-junit.art
├── boot-core-junit.oat
├── boot-core-libart.art
├── boot-core-libart.oat
├── boot-ext.art
├── boot-ext.oat
├── boot-framework.art
├── boot-framework.oat
├── boot-ims-common.art
├── boot-ims-common.oat
├── boot.oat
├── boot-okhttp.art
├── boot-okhttp.oat
├── boot-org.apache.http.legacy.boot.art
├── boot-org.apache.http.legacy.boot.oat
├── boot-radio_interactor_common.art
├── boot-radio_interactor_common.oat
├── boot-telephony-common.art
├── boot-telephony-common.oat
├── boot-voip-common.art
└── boot-voip-common.oat
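  • With the newly built image, the problem no longer reproduces.
  • To avoid falling into the same trap again, the build system was modified to add a check that raises an error when the configuration is wrong, so the problem surfaces during bring-up.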
diff --git a/core/dex_preopt_libart.mk b/core/dex_preopt_libart.mk
index b469dc0..7145fed 100644
--- a/core/dex_preopt_libart.mk
+++ b/core/dex_preopt_libart.mk
@@ -87,6 +87,9 @@ LIBART_TARGET_BOOT_DEX_FILES := $(foreach jar,$(LIBART_TARGET_BOOT_JARS),$(call
 # is converted into to boot.art (to match the legacy assumption that boot.art
 # exists), and the rest are converted to boot-<name>.art.
 # In addition, each .art file has an associated .oat file.
+ifneq (core-oj,$(word 1,$(LIBART_TARGET_BOOT_JARS)))
+$(error "core-oj" must be the first file in <PRODUCT_BOOT_JARS> but now is "$(word 1,$(LIBART_TARGET_BOOT_JARS))")
+endif
 LIBART_TARGET_BOOT_ART_EXTRA_FILES := $(foreach jar,$(wordlist 2,999,$(LIBART_TARGET_BOOT_JARS)),boot-$(jar).art boot-$(jar).oat)
 LIBART_TARGET_BOOT_ART_EXTRA_FILES += boot.oat