Configuring pyspider
Environment: pyspider, CentOS 7.4, Python 3.6.5
The Problem
While starting pyspider, I ran into the following problem.
The output was:
[root@AY131203102210033c39Z ~]# pyspider
[W 180813 11:23:41 run:413] phantomjs not found, continue running without it.
[I 180813 11:23:44 result_worker:49] result_worker starting...
Process Process-4:
Traceback (most recent call last):
File "/root/.pyenv/versions/3.6.5/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/root/.pyenv/versions/3.6.5/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 236, in fetcher
Fetcher = load_cls(None, None, fetcher_cls)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 48, in load_cls
return utils.load_object(value)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/libs/utils.py", line 369, in load_object
module = __import__(module_name, globals(), locals(), [object_name])
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
from .tornado_fetcher import Fetcher
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/tornado_fetcher.py", line 30, in <module>
from tornado.curl_httpclient import CurlAsyncHTTPClient
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tornado/curl_httpclient.py", line 24, in <module>
import pycurl # type: ignore
ImportError: pycurl: libcurl link-time ssl backend (nss) is different from compile-time ssl backend (openssl)
[I 180813 11:23:44 processor:211] processor starting...
[I 180813 11:23:45 scheduler:647] scheduler starting...
Traceback (most recent call last):
File "/root/.pyenv/versions/3.6.5/bin/pyspider", line 11, in <module>
load_entry_point('pyspider==0.3.10', 'console_scripts', 'pyspider')()
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 754, in main
cli()
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 1043, in invoke
return Command.invoke(self, ctx)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 165, in cli
ctx.invoke(all)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 497, in all
ctx.invoke(webui, **webui_config)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 333, in webui
app = load_cls(None, None, webui_instance)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/run.py", line 48, in load_cls
return utils.load_object(value)
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/libs/utils.py", line 369, in load_object
module = __import__(module_name, globals(), locals(), [object_name])
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/webui/__init__.py", line 8, in <module>
from . import app, index, debug, task, result, login
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/webui/app.py", line 17, in <module>
from pyspider.fetcher import tornado_fetcher
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
from .tornado_fetcher import Fetcher
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/pyspider/fetcher/tornado_fetcher.py", line 30, in <module>
from tornado.curl_httpclient import CurlAsyncHTTPClient
File "/root/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tornado/curl_httpclient.py", line 24, in <module>
import pycurl # type: ignore
ImportError: pycurl: libcurl link-time ssl backend (nss) is different from compile-time ssl backend (openssl)
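Judging from the error output, this is a mismatch between nss and openssl: the libcurl that pycurl links against at runtime uses nss, while pycurl itself was compiled against openssl.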
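To make the two sides of the mismatch explicit, here is a minimal sketch that pulls both backend names out of the ImportError message shown above. The helper name `parse_backend_mismatch` is hypothetical and is not part of pycurl or pyspider; the regular expression simply mirrors the wording of the message in the traceback.

```python
import re

# Mirrors the wording of the ImportError message in the traceback above.
_MISMATCH = re.compile(
    r"link-time ssl backend \((?P<link>[^)]+)\) is different from "
    r"compile-time ssl backend \((?P<compile>[^)]+)\)"
)


def parse_backend_mismatch(message):
    """Return (link_time, compile_time) backend names, or None if no match.

    Hypothetical helper for illustration only.
    """
    match = _MISMATCH.search(message)
    if match is None:
        return None
    return match.group("link"), match.group("compile")


error = (
    "pycurl: libcurl link-time ssl backend (nss) is different from "
    "compile-time ssl backend (openssl)"
)
print(parse_backend_mismatch(error))  # ('nss', 'openssl')
```

The first value is what the system libcurl uses, and the second is what pycurl was built with. The fix is to make the second match the first.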
Analysis
Earlier, when installing openssl, I had set the pycurl environment variable so that pycurl was built against openssl.
So I took the following approach:
pip uninstall pycurl
pip install --no-cache-dir --compile --ignore-installed --install-option="--with-nss" pycurl
vim ~/.bashrc
Change the existing setting there to: export PYCURL_SSL_LIBRARY=nss
source ~/.bashrc
pip uninstall pyspider
pip install pyspider
After completing all of these steps, restart pyspider.
The correct output looks like this:
[root@xxx~]# pyspider
[W 180813 11:31:52 run:413] phantomjs not found, continue running without it.
[I 180813 11:31:54 result_worker:49] result_worker starting...
[I 180813 11:31:55 processor:211] processor starting...
[I 180813 11:31:55 tornado_fetcher:638] fetcher starting...
[I 180813 11:31:55 scheduler:647] scheduler starting...
[I 180813 11:31:55 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 180813 11:31:55 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 180813 11:31:55 app:76] webui running on 0.0.0.0:5000
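To confirm the rebuild independently of pyspider, a quick check along the following lines may help. This is a sketch under assumptions: it relies on `pycurl.version` naming the SSL backend in `name/version` form, and the helper `ssl_backend` is hypothetical and only looks for a few common backend names.

```python
import os


def ssl_backend(version_string):
    """Guess the SSL backend named in a pycurl.version string.

    Hypothetical helper; assumes the backend appears as ' name/version'.
    """
    lowered = version_string.lower()
    for name in ("openssl", "nss", "gnutls"):
        if f" {name}/" in lowered:
            return name
    return None


try:
    import pycurl
except ImportError as exc:
    # A backend mismatch still surfaces here as an ImportError.
    print("pycurl import failed:", exc)
else:
    print("pycurl version:", pycurl.version)
    print("linked SSL backend:", ssl_backend(pycurl.version))

# The variable set in ~/.bashrc; None means the current shell has not loaded it.
print("PYCURL_SSL_LIBRARY =", os.environ.get("PYCURL_SSL_LIBRARY"))
```

If the import succeeds and the reported backend is nss, the fetcher and webui should no longer fail on `import pycurl`.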
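Summary
Overall, pyspider's handling of its runtime environment still leaves room for improvement: I ran into quite a few problems during installation and startup, and these are all things that could be better. That said, pyspider is still an excellent project.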