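Before the National Day holiday, the company shipped an iteration release. It ran normally for the first few days with no problems found, and then database timeout exceptions started to appear intermittently. The exception was logged as follows: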
ERROR com.alibaba.druid.pool.DruidDataSource - create connection SQLException, url: jdbc:mysql://xxx.xxx.xxx.xxx:3306/openplatform?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&serverTimezone=PRC, errorCode 0, state 08001
java.sql.SQLNonTransientConnectionException: Could not create connection to database server. Attempted reconnect 3 times. Giving up.
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:73)
at com.mysql.cj.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:905)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:830)
at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:455)
at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:240)
at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:199)
at com.alibaba.druid.pool.DruidAbstractDataSource.createPhysicalConnection(DruidAbstractDataSource.java:1596)
at com.alibaba.druid.pool.DruidAbstractDataSource.createPhysicalConnection(DruidAbstractDataSource.java:1662)
at com.alibaba.druid.pool.DruidDataSource$CreateConnectionThread.run(DruidDataSource.java:2601)
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151)
at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167)
at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:91)
at com.mysql.cj.NativeSession.connect(NativeSession.java:152)
at com.mysql.cj.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:849)
... 7 common frames omitted
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at com.mysql.cj.protocol.StandardSocketFactory.connect(StandardSocketFactory.java:155)
at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:65)
... 9 common frames omitted
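At first the failures were rare and hard to reproduce. Because the database configuration had not changed, I assumed it was a database performance problem, asked ops to keep watching it, and did not follow up. From around the 4th, the intermittent failures became more frequent.
First I checked whether the SQL was at fault. The SQL is as follows: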
SELECT
    gf_cust_id AS cust_id,
    cu_type,
    product_code,
    is_active,
    is_stock,
    data_content
FROM t_gf_cust_re_info
WHERE gf_cust_id = #{_parameter}
ORDER BY import_date DESC
LIMIT 1
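It is a very simple single-table query, so a slow query was ruled out.
Database performance: according to ops' investigation, database load and concurrency were at normal levels at the times customers reported the problem, so database performance was ruled out as the cause.
Configuration: since Druid could not connect to the database, I judged that connections in the pool might have become invalid. Looking at Druid's configuration parameters, I found that by default Druid does not reconnect invalid connections in the pool or remove them from it. If the application happens to take an invalid connection from the pool and uses it to access the database, the access fails with a connection error.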
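The failure mode described above can be illustrated with a toy pool. This is only a minimal sketch under stated assumptions: `FakeConnection`, `ToyPool` and `runQuery` are hypothetical names invented for illustration, nothing here is Druid's code, and the "dead" flag merely stands in for a connection that has become invalid on the server side.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: FakeConnection stands in for a real JDBC connection.
final class BorrowValidationSketch {

    /** Hypothetical stand-in for a physical connection that may have gone stale. */
    static final class FakeConnection {
        final int id;
        boolean alive = true;

        FakeConnection(int id) { this.id = id; }

        /** Simulates running a query; fails if the underlying link is dead. */
        String query() {
            if (!alive) {
                throw new IllegalStateException("Communications link failure");
            }
            return "ok from #" + id;
        }
    }

    /** Hypothetical pool with an optional validate-on-borrow step. */
    static final class ToyPool {
        private final Deque<FakeConnection> idle = new ArrayDeque<>();
        private final boolean testOnBorrow;
        private int nextId = 1;

        ToyPool(int initialSize, boolean testOnBorrow) {
            this.testOnBorrow = testOnBorrow;
            for (int i = 0; i < initialSize; i++) {
                idle.addLast(new FakeConnection(nextId++));
            }
        }

        /** Simulates every idle connection becoming invalid behind the pool's back. */
        void killAllIdle() {
            for (FakeConnection c : idle) {
                c.alive = false;
            }
        }

        FakeConnection borrow() {
            while (!idle.isEmpty()) {
                FakeConnection c = idle.pollFirst();
                // Without validation the connection is handed out as-is, dead or not.
                if (!testOnBorrow || c.alive) {
                    return c;
                }
                // Validation failed: drop this connection and try the next one.
            }
            // No usable idle connection left: create a fresh one.
            return new FakeConnection(nextId++);
        }
    }

    /** Builds a pool of two connections, invalidates them, then borrows and queries. */
    static String runQuery(boolean testOnBorrow) {
        ToyPool pool = new ToyPool(2, testOnBorrow);
        pool.killAllIdle();
        try {
            return pool.borrow().query();
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runQuery(false)); // error: Communications link failure
        System.out.println(runQuery(true));  // ok from #3
    }
}
```

Without the check, the application receives the first stale connection and the query fails. With the check, both stale connections are discarded and a new one is created before the query runs.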
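The relevant settings are Druid's connection validation options, test-on-borrow and test-while-idle, which the configuration below enables explicitly.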
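With the cause identified, I changed the configuration, rebuilt, and redeployed. The timeout has not recurred.
The current database configuration is as follows: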
spring:
  datasource:
    type: com.alibaba.druid.pool.DruidDataSource
    url: jdbc:mysql://xxx.xxx.xxx.xxx:3306/openplatform?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&serverTimezone=PRC
    username: xxx
    password: xxx
    driver-class-name: com.mysql.cj.jdbc.Driver
    initialSize: 10
    minIdle: 10
    maxActive: 40
    maxWait: 60000
    test-on-borrow: true
    test-while-idle: true
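How the two validation switches interact at borrow time can be sketched roughly as below. This reflects my reading of Druid's configuration documentation (test-on-borrow validates on every borrow; test-while-idle validates on borrow only when the connection has been idle longer than the eviction interval) and is an assumption, not Druid's actual code. The class, method and parameter names are hypothetical, and the 60-second threshold is an arbitrary illustrative value.

```java
// Hypothetical helper, not part of Druid: models only the borrow-time decision.
final class ValidationDecisionSketch {

    /**
     * Decides whether a connection should be validated before being handed out.
     *
     * @param testOnBorrow        validate on every borrow
     * @param testWhileIdle       validate on borrow only if the connection has been idle long enough
     * @param idleMillis          how long the connection has been sitting idle in the pool
     * @param idleThresholdMillis idle time at or above which test-while-idle triggers a check
     */
    static boolean shouldValidate(boolean testOnBorrow, boolean testWhileIdle,
                                  long idleMillis, long idleThresholdMillis) {
        if (testOnBorrow) {
            return true; // always pay for a validation round trip
        }
        return testWhileIdle && idleMillis >= idleThresholdMillis;
    }

    public static void main(String[] args) {
        long threshold = 60_000L; // assumed threshold, for illustration only
        System.out.println(shouldValidate(true, true, 5L, threshold));         // true
        System.out.println(shouldValidate(false, true, 5L, threshold));        // false
        System.out.println(shouldValidate(false, true, 120_000L, threshold));  // true
        System.out.println(shouldValidate(false, false, 120_000L, threshold)); // false
    }
}
```

Under this model, with both switches on as in the configuration above, the first branch always applies, so every borrowed connection is validated.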
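Reflections:
- Why do Druid's defaults follow this design? Is it for performance, or for some other reason?
- Druid's connection pool has this problem. If we do not use Druid, which connection pool does Spring Boot use by default when integrated with MyBatis, and does its configuration have the same problem?
I leave these two questions to myself and to everyone.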