Connection reset by peer

While deploying the project, service A failed to start with the following error:

14-Aug-2019 12:52:49.860 SEVERE [main] org.springframework.web.context.ContextLoader.initWebApplicationContext Context initialization failed
        org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'adminService': Unsatisfied dependency expressed through field 'adminDao'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'adminDao': Unsatisfied dependency expressed through field 'jdbcTemplate'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'jdbcTemplate' defined in URL .
                at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:275)
                ... 80 more
        Caused by: com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: IO Error: Connection reset by peer, Authentication lapse 98879 ms.
                at com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:576)
                ... 82 more
        Caused by: java.sql.SQLRecoverableException: IO Error: Connection reset by peer, Authentication lapse 98879 ms.
                at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:794)
                at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:688)
                ... 89 more
        Caused by: java.io.IOException: Connection reset by peer, Authentication lapse 98879 ms.
                at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:790)
                ... 97 more
        Caused by: java.io.IOException: Connection reset by peer
                at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
                at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
                ... 97 more

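The error is intermittent: in this case most startup attempts failed, and only one succeeded.

Service A has to connect to the database on server B, but the database username and password in use were confirmed to be correct.

Furthermore, "connection reset by peer" means that when data was written to the peer, the peer signalled that it had already closed the connection. This error typically appears on a write to a socket that the other end has already closed.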
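
The behaviour described above can be reproduced locally. The following is a minimal sketch, not the failing application: the class name ResetByPeerDemo and its method provokeReset are hypothetical, and it assumes that closing a socket with SO_LINGER set to 0 makes the peer abort the connection with a TCP RST. The exact exception message varies by operating system and JDK version, so the sketch only reports whatever IOException the write raises.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class ResetByPeerDemo {

    /**
     * Writes to a connection that the peer has already aborted and returns
     * the resulting IOException, or null if no error showed up in time.
     */
    static IOException provokeReset() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            // SO_LINGER with a timeout of 0 makes close() abort the connection
            // (RST) instead of performing an orderly shutdown (FIN).
            accepted.setSoLinger(true, 0);
            accepted.close();

            OutputStream out = client.getOutputStream();
            try {
                // The first write may still succeed; keep writing until the
                // local TCP stack has processed the reset.
                for (int i = 0; i < 50; i++) {
                    out.write(new byte[1024]);
                    out.flush();
                    Thread.sleep(20);
                }
            } catch (IOException e) {
                return e; // the error the client sees for the dead connection
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return null;
        } catch (IOException e) {
            // Setting up the loopback sockets failed; not part of the demo.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        IOException error = provokeReset();
        System.out.println(error == null ? "no error observed" : error.toString());
    }
}
```

In the stack trace above the failing call is likewise a write (SocketDispatcher.write), which matches this pattern: the client only learns that the connection is gone when it next tries to send data.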

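So the first suspicion was that the database server had accepted too many connections and, after exceeding its maximum, actively closed some of them, causing the client to report "connection reset by peer". However, the database's maximum connection count was 1000 and only about 140 connections were in use, so the database was not the problem either.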

Solution (from the linked source)

Based on the symptom of "happening intermittently", this appears to be a known issue around /dev/random and /dev/urandom.

The workaround suggested below was tried and resolved the problem:

1. Open the $JAVA_HOME/jre/lib/security/java.security file in a text editor.

2. Change the line:
securerandom.source=file:/dev/random
to read:
securerandom.source=file:/dev/urandom

3. Save your change and exit the text editor.
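
To confirm that the edit is actually picked up, the setting can be read back from inside a JVM. This is a small sketch with hypothetical names (SecureRandomSourceCheck, configuredSource, egdOverride); it assumes it is run with the same Java installation as service A. It also prints the java.security.egd system property, which, if set on the command line, can override the value from the file.

```java
import java.security.Security;

class SecureRandomSourceCheck {

    /** Value of securerandom.source as the running JVM sees it. */
    static String configuredSource() {
        String value = Security.getProperty("securerandom.source");
        return value == null ? "(not set)" : value;
    }

    /** Value of the java.security.egd system property, if one was passed to the JVM. */
    static String egdOverride() {
        String value = System.getProperty("java.security.egd");
        return value == null ? "(not set)" : value;
    }

    public static void main(String[] args) {
        System.out.println("securerandom.source = " + configuredSource());
        System.out.println("java.security.egd   = " + egdOverride());
    }
}
```

If the first line still shows file:/dev/random after the change, the JVM is probably reading a different java.security file than the one that was edited.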

From the official Oracle documentation:

The library used for random number generation in Sun's JVM relies on /dev/random by default for UNIX platforms. 
This can potentially block the WebLogic SIP Server process because on some operating systems /dev/random waits for a certain amount of "noise" to be generated on the host machine before returning a result. 
Although /dev/random is more secure, BEA recommends using /dev/urandom if the default JVM configuration delays WebLogic SIP Server startup.

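Back to the current case: when the JVM running service A uses /dev/random, it blocks while waiting for enough noise (entropy) to be generated. Machine B then closes the socket, because of a timeout or for some other reason, and when service A next writes to that socket it gets the "connection reset by peer" error. The "Authentication lapse 98879 ms" in the stack trace, roughly 99 seconds, is consistent with such a stall.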
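
One way to check whether entropy gathering really stalls on the machine running service A is to time a seed request. This is a rough sketch under assumptions: SeedTiming and seedMillis are hypothetical names, and SecureRandom.generateSeed is used only as a stand-in for whatever random-number call the JDBC driver makes during logon, which the stack trace does not show.

```java
import java.security.SecureRandom;

class SeedTiming {

    /** Returns how many milliseconds the JVM needs to produce numBytes of seed material. */
    static long seedMillis(int numBytes) {
        long start = System.nanoTime();
        // generateSeed asks the underlying entropy source for fresh seed bytes;
        // on a host that is short of entropy this call is where a stall would show.
        byte[] seed = new SecureRandom().generateSeed(numBytes);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        if (seed.length != numBytes) {
            throw new IllegalStateException("unexpected seed length: " + seed.length);
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        System.out.println("seed generation took " + seedMillis(8) + " ms");
    }
}
```

A result of many seconds on the affected host would support the explanation above; a result of a few milliseconds after switching to /dev/urandom would suggest the workaround took effect.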
