Installing pyspider on CentOS 7

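  In my experience, PySpider is a very convenient and powerful crawler framework. It supports multi-threaded crawling and dynamic JavaScript rendering, and it provides a web UI, retry on failure, scheduled crawling, and more, which makes it pleasant to use.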

Reference articles found online:

http://www.jianshu.com/p/8eb248697475

http://cuiqingcai.com/2652.html

https://yq.aliyun.com/articles/75518

1. Environment:

    Python version: 3.6.3

    OS: CentOS 7.3


1.1. Setting up Python 3:

# Install build dependencies

yum install -y ncurses-devel openssl openssl-devel zlib-devel gcc make glibc-devel libffi-devel glibc-static glibc-utils sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel libcurl-devel


# Download Python

wget https://www.python.org/ftp/python/3.6.3/Python-3.6.3.tgz


# Extract the archive

tar -xf Python-3.6.3.tgz


# Build and install (run from inside the extracted source directory)

cd Python-3.6.3

./configure --prefix=/usr/local/python3.6 --enable-shared

make && make install


# Create a symlink and register the shared library directory

ln -s /usr/local/python3.6/bin/python3 /usr/bin/python3

echo "/usr/local/python3.6/lib" > /etc/ld.so.conf.d/python3.6.conf

ldconfig


# Verify python3

[root@ceph-host-01 local]# python3

Python 3.6.3 (default, Oct  9 2017, 04:01:24) 

[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux

Type "help", "copyright", "credits" or "license" for more information.

>>> 
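
A successful prompt does not show whether every optional standard-library module was compiled; a module is skipped during the build when its -devel package is missing. The following is a minimal sketch of a check that could be run with the new python3. The module list and the helper name missing_modules are illustrative assumptions, not part of the original steps.

```python
import importlib

# Standard-library modules that depend on system -devel packages at build
# time. This selection is an illustrative assumption, not an exhaustive list.
OPTIONAL_MODULES = ["ssl", "sqlite3", "zlib", "bz2", "lzma", "readline", "ctypes", "curses"]


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(OPTIONAL_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules are available.")
```

If a module is reported as missing, installing the matching -devel package and rebuilding Python is the usual remedy.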


# Upgrade pip and create a symlink for it

/usr/local/python3.6/bin/pip3 install --upgrade pip

ln -s /usr/local/python3.6/bin/pip /usr/bin/pip


1.2. Installing pyspider

pip install pyspider


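Importing the pycurl module in Python then fails with the following error:

ImportError: pycurl: libcurl link-time ssl backend (nss) is different from compile-time ssl backend (none/other)

Fix: reinstall pycurl with its SSL backend set to NSS, which is the backend libcurl is linked against according to the error message. The --no-cache-dir option keeps pip from reusing the wheel it built earlier without a backend.

pip uninstall pycurl
export PYCURL_SSL_LIBRARY=nss
pip install --no-cache-dir pycurl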
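
To confirm the reinstall worked, it is enough that pycurl imports without the error. The sketch below also prints the backend named in pycurl.version. The helper ssl_backend and its list of backend names are assumptions made for illustration; it simply scans the version string for a known name.

```python
def ssl_backend(version_string):
    """Guess the SSL backend named in a pycurl/libcurl version string."""
    # Backend names to look for; this list is an assumption for illustration.
    known = ("nss", "openssl", "gnutls", "libressl", "boringssl")
    for token in version_string.split():
        name = token.split("/", 1)[0].lower()
        if name in known:
            return name
    return None


if __name__ == "__main__":
    try:
        import pycurl
    except ImportError as exc:
        # The backend mismatch from the article shows up here as ImportError.
        print("pycurl still fails to import:", exc)
    else:
        print(pycurl.version)
        print("libcurl SSL backend:", ssl_backend(pycurl.version))
```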


1.3. Installing PhantomJS

Official download page: http://phantomjs.org/download.html

wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2

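Extract it (bzip2 is needed for the .bz2 archive; the symlink at the end assumes these commands are run in /usr/local):

yum -y install bzip2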

bzip2 -d phantomjs-2.1.1-linux-x86_64.tar.bz2 

tar -xf phantomjs-2.1.1-linux-x86_64.tar

mv phantomjs-2.1.1-linux-x86_64 phantomjs

ln -sv /usr/local/phantomjs/bin/phantomjs /usr/bin/phantomjs
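
A quick way to confirm the symlink works is to look the binary up on PATH and ask it for its version. This is a small sketch; the helper name missing_executables is hypothetical.

```python
import shutil
import subprocess


def missing_executables(names):
    """Return the names from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_executables(["phantomjs"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        # stdout=PIPE and universal_newlines keep this compatible with Python 3.6.
        result = subprocess.run(
            ["phantomjs", "--version"],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
        print("phantomjs", result.stdout.strip())
```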


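1.4. Starting pyspider

Because the server is reachable from the public internet, I created a configuration file, config.json, so that the web UI requires a login.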

[root@ceph-host-01 local]# vim config.json 


{

    "webui": {

        "port": "5000",

        "username": "abc",

        "password": "123456",

        "need-auth": true

    }

}
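
Hand-edited JSON is easy to break with a stray comma or quote. As an alternative, the same file could be generated with Python's json module. This is only a sketch: build_config is a hypothetical helper, and the credentials are the placeholder values from the example above, which should be replaced with stronger ones on a public server.

```python
import json


def build_config(username, password, port="5000"):
    """Build the same webui section as the hand-written config.json."""
    return {
        "webui": {
            "port": port,
            "username": username,
            "password": password,
            "need-auth": True,
        }
    }


if __name__ == "__main__":
    # Prints the JSON; redirect the output to config.json to save it.
    print(json.dumps(build_config("abc", "123456"), indent=4))
```

Assuming the script is saved as make_config.py (a hypothetical name), running `python3 make_config.py > config.json` would produce the file.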

Start the process in the background:

nohup pyspider --config config.json &
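
Once the process is running, the login can be tested from the server itself. The sketch below assumes the web UI uses HTTP Basic authentication when need-auth is enabled and listens on port 5000 as configured above. The helper names basic_auth_header and webui_status are hypothetical.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def webui_status(url, username, password, timeout=5):
    """Return the HTTP status code of the web UI, or None if unreachable."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # For example 401 when the credentials are rejected.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and similar failures.
        return None


if __name__ == "__main__":
    # URL and credentials mirror the example config.json.
    print(webui_status("http://127.0.0.1:5000/", "abc", "123456"))
```

A result of 200 would suggest the UI is up and the credentials are accepted, 401 that they were rejected, and None that nothing is listening on that port yet.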


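Open the web UI in a browser at http://<server-ip>:5000 (the port set in config.json) and log in with the username and password from the configuration file.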
