Fixing StanfordCoreNLP's "java not found" error when running under PyCharm remote debugging

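I have been studying NLP recently and had set up remote debugging and running in PyCharm. When I used stanfordcorenlp, it failed with FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'. I assumed that, as in my previous post "Fixing pyhanlp's FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/jvm'", adding an environment variable would be enough, but it made no difference. I could not find a similar error reported online either, so perhaps my JDK installation is unusual.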

Error details:

ssh://yl@IP:PORT/home/USER/anaconda3/envs/tensorflow/bin/python -u /home/yl/python/nlp/learing/test01.py
Traceback (most recent call last):
  File "/home/yl/python/nlp/learing/test01.py", line 7, in <module>
    snlp = StanfordCoreNLP(os.sep + 'opt' + os.sep + "nlp" + os.sep + 'stanford-corenlp', lang='zh')
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/site-packages/stanfordcorenlp/corenlp.py", line 46, in __init__
    if not subprocess.call(['java', '-version'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) == 0:
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 323, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 774, in __init__
    restore_signals, start_new_session)
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'
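
The traceback shows that the failure comes from subprocess rather than from stanfordcorenlp itself, so it can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX system; the helper name try_run and the command name "no-such-command-for-demo" are made up for illustration and are not part of any library.

```python
import errno
import subprocess


def try_run(command):
    """Run `command` and return the FileNotFoundError it raises, or None."""
    try:
        subprocess.call(command, stdout=subprocess.DEVNULL,
                        stderr=subprocess.STDOUT)
    except FileNotFoundError as exc:
        return exc
    return None


# Hypothetical command name, assumed to be absent from every PATH directory.
error = try_run(["no-such-command-for-demo", "-version"])
if error is not None:
    # errno 2 is ENOENT; filename is the command that could not be located.
    print(error.errno == errno.ENOENT, error.filename)
```

Running the same helper with ["java", "-version"] in the PyCharm remote interpreter should show whether the problem is present without starting CoreNLP.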

Cause analysis:

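The usual approach: with no answer to be found online, read the source code to track down the cause. Open /home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py and find the function _execute_child(); around line 1522 there is the following code: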

                if issubclass(child_exception_type, OSError) and hex_errno:
                    errno_num = int(hex_errno, 16)
                    child_exec_never_called = (err_msg == "noexec")
                    if child_exec_never_called:
                        err_msg = ""
                        # The error must be from chdir(cwd).
                        err_filename = cwd
                    else:
                        err_filename = orig_executable
                    if errno_num != 0:
                        err_msg = os.strerror(errno_num)
                        if errno_num == errno.ENOENT:
                            err_msg += ': ' + repr(err_filename)
                    raise child_exception_type(errno_num, err_msg, err_filename)
                

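The exception is built from errno_num, which is non-zero here. Tracing it upwards, the value comes from errno_num -> hex_errno -> errpipe_data -> errpipe_read, and is produced when self.pid = _posixsubprocess.fork_exec() runs (around line 1452). Judging by the names of that function's arguments, executable_list and env_list should determine whether java can be located. So I added print calls near where executable_list is built to see how it changes, as follows (lines 1436 to 1442):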

                    executable = os.fsencode(executable)
                    print('executable: ', executable)    # print the initial value passed in from above
                    if os.path.dirname(executable):
                        executable_list = (executable,)
                    else:
                        # This matches the behavior of os._execvpe().
                        print('env: ', env)    # print env
                        print('get_exec_path of env: ', os.get_exec_path(env))  # print the executable search path derived from env, i.e. the PATH variable
                        executable_list = tuple(
                            os.path.join(os.fsencode(dir), executable)
                            for dir in os.get_exec_path(env))
                        print('executable_list: ', executable_list)    # print the final list of candidate paths

Output when StanfordCoreNLP is imported and run from PyCharm:

executable:  b'java'
env:  None
get_exec_path of env:  ['/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/games', '/usr/local/games']
executable_list:  (b'/usr/local/sbin/java', b'/usr/local/bin/java', b'/usr/sbin/java', b'/usr/bin/java', b'/sbin/java', b'/bin/java', b'/usr/games/java', b'/usr/local/games/java')
Traceback (most recent call last):

Output when run in a Linux terminal:

yl@ylhome [20:24:07] ~$ /home/yl/anaconda3/envs/tensorflow/bin/python
Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> from stanfordcorenlp import StanfordCoreNLP 
>>> stanford_nlp = StanfordCoreNLP(os.sep + '/opt' + os.sep + "/nlp" + os.sep + '/stanford-corenlp', lang='zh')
executable:  b'java'
env:  None
get_exec_path of env:  ['/home/yl/.local/bin', '/home/yl/bin', '/usr/local/cuda/bin', '/home/yl/anaconda3/bin', '/usr/local/java/latest/bin', '/usr/local/cuda/bin', '/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/games', '/usr/local/games', '/snap/bin']
executable_list:  (b'/home/yl/.local/bin/java', b'/home/yl/bin/java', b'/usr/local/cuda/bin/java', b'/home/yl/anaconda3/bin/java', b'/usr/local/java/latest/bin/java', b'/usr/local/cuda/bin/java', b'/usr/local/sbin/java', b'/usr/local/bin/java', b'/usr/sbin/java', b'/usr/bin/java', b'/sbin/java', b'/bin/java', b'/usr/games/java', b'/usr/local/games/java', b'/snap/bin/java')
executable:  b'/bin/sh'
>>> 
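
The same candidate list can be inspected without editing subprocess.py. The following is a rough sketch that mirrors the branch shown earlier using only the standard library; the helper names candidate_paths and first_existing are hypothetical, and the code works on str paths rather than the bytes paths subprocess uses internally.

```python
import os


def candidate_paths(executable, env=None):
    """Return the paths that would be tried for a command name.

    A name with a directory part is used as-is; a bare name is joined
    with every directory from os.get_exec_path(env), which reads PATH
    from `env` (or from os.environ when env is None).
    """
    if os.path.dirname(executable):
        return [executable]
    return [os.path.join(d, executable) for d in os.get_exec_path(env)]


def first_existing(executable, env=None):
    """Return the first candidate that is an executable file, or None."""
    for path in candidate_paths(executable, env):
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


print("PATH directories:", os.get_exec_path())
print("java resolves to:", first_existing("java"))
```

Running this script once from PyCharm and once from a terminal should make the difference between the two PATH values visible.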

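The PATH read on the command line is correct, whereas under PyCharm's remote invocation the PATH entries added by the user are not picked up. A convenient fix is therefore to link java into a location that is on the PATH PyCharm sees, such as /usr/local/bin.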
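
Because subprocess reads PATH from os.environ when env is None, another option is to extend PATH inside the script before StanfordCoreNLP is created. This is not the approach used in this post and was not tried in this setup; it is only a sketch. The helper name prepend_to_path is hypothetical, and the directory is an assumption taken from the terminal PATH shown above.

```python
import os


def prepend_to_path(directory, environ=os.environ):
    """Put `directory` at the front of PATH unless it is already listed."""
    entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.insert(0, directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]


# Assumption: the JDK's bin directory, as it appears in the terminal PATH.
JAVA_BIN = "/usr/local/java/latest/bin"
prepend_to_path(JAVA_BIN)
```

The call has to run before StanfordCoreNLP(...) is constructed, since that is when java -version is executed.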

I was short on time, so I did not dig into the underlying reason; the problem is solved for now and I will revisit it later.

Solution:

Link the java command into one of the system's default executable directories, such as /usr/bin or /usr/local/bin. My setup:

sudo ln -sf /usr/local/java/latest/bin/java /usr/local/bin/java
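
To confirm that the link is visible to the interpreter PyCharm uses, a quick check can be run there. The following is a minimal sketch using only the standard library; the helper name java_version is hypothetical.

```python
import shutil
import subprocess


def java_version(path=None):
    """Return (location, version text) for java, or None if it is not found.

    `path` is passed to shutil.which(); None means the current PATH.
    """
    location = shutil.which("java", path=path)
    if location is None:
        return None
    # `java -version` writes to stderr, so merge it into stdout.
    result = subprocess.run([location, "-version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    return location, result.stdout.strip()


print(java_version())
```

If this prints None in PyCharm, the interpreter still cannot see java and the StanfordCoreNLP constructor will fail in the same way.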

Result:

After that, running the stanfordcorenlp package in PyCharm works normally:

from stanfordcorenlp import StanfordCoreNLP
import os

snlp = StanfordCoreNLP(os.sep + 'opt' + os.sep + "nlp" + os.sep + 'stanford-corenlp', lang='zh')

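# Sample sentence in Chinese: "Let's have hotpot tonight!"
text = '今天晚上吃火鍋啊!'
print(snlp.ner(text))
snlp.close()    # shut down the backend CoreNLP server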

Output:
