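Original link: http://www.jianshu.com/p/5265b76026d9
Copyright belongs to the author. To repost, contact the author for permission and credit the "Jianshu author".
Environment: Mac OS X Yosemite 10.10.3
Installing Scrapy
Scrapy is the framework you need for learning web crawling in Python, so let's get straight to it.
Open a terminal and run: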
sudo easy_install pip
pip and easy_install are both package management commands for Python; pip is the successor to easy_install.
Then install Scrapy from the terminal:
sudo pip install Scrapy
If the command succeeds, Scrapy is installed. Things often don't go that smoothly, though, and you may well hit an error like this:
/private/tmp/pip-build-9RYtLC/lxml/src/lxml/includes/etree_defs.h:14:10: fatal error: 'libxml/xmlversion.h' file not found
#include "libxml/xmlversion.h"
^
1 error generated.
error: command 'cc' failed with exit status 1
----------------------------------------
Command "/usr/bin/python -c "import setuptools, tokenize;__file__='/private/tmp/pip-build-9RYtLC/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-544HZx-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/tmp/pip-build-9RYtLC/lxml
There are several ways to fix this:
1. Install or update the command line developer tools from the terminal:
xcode-select --install
2. Set the C_INCLUDE_PATH environment variable so the compiler can find the libxml2 headers:
export C_INCLUDE_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/libxml2:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/libxml2/libxml:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include
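3. Following the official documentation, build lxml with statically linked dependencies, then run the Scrapy install again: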
STATIC_DEPS=true pip install lxml
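The C_INCLUDE_PATH value in fix 2 is just three directories under the SDK's usr/include, joined with colons. Here is a minimal Python sketch that assembles the same string from an SDK root, in case you need to adapt it to a different SDK version. The function name build_c_include_path is my own, and the SDK path is the one used in this article, so check that it exists on your machine.

```python
# SDK root used in this article (Xcode on Yosemite); adjust for your system.
SDK_ROOT = ("/Applications/Xcode.app/Contents/Developer/Platforms/"
            "MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk")


def build_c_include_path(sdk_root):
    """Join the three header directories from fix 2 into one colon-separated value."""
    include = sdk_root.rstrip("/") + "/usr/include"
    dirs = [
        include + "/libxml2",          # libxml2 headers
        include + "/libxml2/libxml",   # the libxml subdirectory itself
        include,                       # general system headers
    ]
    return ":".join(dirs)


if __name__ == "__main__":
    # Print a line that can be pasted into the terminal.
    print("export C_INCLUDE_PATH=" + build_c_include_path(SDK_ROOT))
```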
These three fixes are usually enough to get Scrapy installed. If it still fails, see a related StackOverflow post.
-------------------------------------------------divider---------------------
http://stackoverflow.com/questions/30964836/scrapy-throws-importerror-cannot-import-name-xmlrpc-client
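None of the three fixes above applied to my situation.
When I ran scrapy, it did not print the version number as expected, but showed the following instead: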
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 7, in <module>
    from scrapy.cmdline import execute
  File "/Library/Python/2.7/site-packages/scrapy/__init__.py", line 48, in <module>
    from scrapy.spiders import Spider
  File "/Library/Python/2.7/site-packages/scrapy/spiders/__init__.py", line 10, in <module>
    from scrapy.http import Request
  File "/Library/Python/2.7/site-packages/scrapy/http/__init__.py", line 12, in <module>
    from scrapy.http.request.rpc import XmlRpcRequest
  File "/Library/Python/2.7/site-packages/scrapy/http/request/rpc.py", line 7, in <module>
    from six.moves import xmlrpc_client as xmlrpclib
ImportError: cannot import name xmlrpc_client
Then I remembered that during the pip install, six did not seem to have installed successfully.
So I ran:
sudo rm -rf /Library/Python/2.7/site-packages/six*
sudo rm -rf /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six*
sudo pip install six
After removing the old copies and reinstalling six, it works.
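To see which copy of six Python actually picks up, and whether six.moves.xmlrpc_client now imports, a small check like the one below can help. It is only a sketch: describe_module is a name I made up, and it relies on nothing beyond the standard importlib module, so it should run on Python 2.7 as well as Python 3.

```python
import importlib


def describe_module(name):
    """Try to import `name` and report where it was loaded from."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        # The module is missing, or (as in the traceback above) the
        # installed copy does not provide this submodule.
        return {"name": name, "found": False, "path": None,
                "version": None, "error": str(exc)}
    return {
        "name": name,
        "found": True,
        "path": getattr(module, "__file__", None),        # file the import came from
        "version": getattr(module, "__version__", None),  # not every module sets this
        "error": None,
    }


if __name__ == "__main__":
    for name in ("six", "six.moves.xmlrpc_client"):
        print(describe_module(name))
```

If the reported path for six still points into one of the directories removed above, an old copy is still being imported.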
Installing PIL
PIL is Python's image processing library; when learning web crawling it can be used to handle CAPTCHAs.
In the terminal, run:
sudo pip install pil
And it fails:
/Library/Python/2.7/site-packages/pip-6.1.1-py2.7.egg/pip/_vendor/requests/packages/urllib3/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Could not find a version that satisfies the requirement pil (from versions: )
Some externally hosted files were ignored as access to them may be unreliable (use --allow-external pil to allow).
No matching distribution found for pil
The output does suggest adding the --allow-external pil option, though.
OK, adjust the command and run it again:
sudo pip install PIL --allow-external PIL
The install starts this time, but then it fails again:
_imagingft.c:73:10: fatal error: 'freetype/fterrors.h' file not found
#include <freetype/fterrors.h>
^
1 error generated.
error: Setup script exited with error: command 'cc' failed with exit status 1
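The error says that freetype/fterrors.h could not be found. Searching Baidu, many articles suggest running: ln -s /usr/local/include/freetype2 /usr/local/include/freetype
I tried that, and it did not work.
In Finder I went to /usr/local/include. There was a freetype2 directory but no freetype, so I made a copy of freetype2 and renamed the copy to freetype. Then I re-ran the PIL install command in the terminal: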
sudo pip install PIL --allow-external PIL
This time the install succeeded.
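Before re-running the install, it can save a round trip to check whether the header the compiler asked for, freetype/fterrors.h, is really in place. The following is a rough sketch; the function names are mine, and /usr/local/include is simply the directory used in this article.

```python
import os


def has_freetype_header(include_dir="/usr/local/include"):
    """True if <include_dir>/freetype/fterrors.h exists, the path named in the error."""
    return os.path.isfile(os.path.join(include_dir, "freetype", "fterrors.h"))


def find_header(include_dir="/usr/local/include", name="fterrors.h"):
    """List every copy of `name` under include_dir, relative to include_dir."""
    hits = []
    for root, _dirs, files in os.walk(include_dir):
        if name in files:
            hits.append(os.path.relpath(os.path.join(root, name), include_dir))
    return sorted(hits)


if __name__ == "__main__":
    print("freetype/fterrors.h present:", has_freetype_header())
    print("copies of fterrors.h found:", find_header())
```

If the first line prints False while the second lists a copy under freetype2, that matches the situation described above.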
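Installing BeautifulSoup
First, download the latest package, beautifulsoup4 4.3.2, from the official site, unpack it, and cd into the extracted directory in the terminal.
Then run: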
sudo python setup.py install
The install succeeds.
See the official BeautifulSoup documentation.
------------------split line -------
sudo pip install bs4
it also works.
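A quick way to confirm that the install is usable is to parse a small snippet of HTML. This is only a smoke test under my own assumptions: extract_links is a hypothetical helper name, and the built-in "html.parser" backend is used so that nothing depends on lxml.

```python
try:
    from bs4 import BeautifulSoup
except ImportError:
    # bs4 is not installed; the helper below will say so when called.
    BeautifulSoup = None


def extract_links(html):
    """Return (text, href) pairs for every <a> tag in the HTML string."""
    if BeautifulSoup is None:
        raise RuntimeError("bs4 is not installed")
    soup = BeautifulSoup(html, "html.parser")
    return [(a.get_text(), a.get("href")) for a in soup.find_all("a")]


if __name__ == "__main__":
    sample = '<p>See <a href="http://example.com/">Example</a></p>'
    print(extract_links(sample))
```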
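Addendum:
easy_install usage:
Install: easy_install PackageName
Remove: easy_install -m PackageName (this only deactivates the package in easy-install.pth; delete its files by hand afterwards)
Upgrade: easy_install -U PackageName
pip usage:
Install: pip install PackageName
Remove: pip uninstall PackageName
Upgrade: pip install -U PackageName
Search: pip search PackageName (PyPI has since disabled the search API this command relies on)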
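The pip commands above can also be driven from a script. Here is a minimal sketch that builds the argument list for each action and runs pip through the current interpreter with "python -m pip", so the package lands in the same Python you are running. The helper name pip_command and the action names are my own choices, not part of pip.

```python
import sys


def pip_command(action, package):
    """Build the argument list for one of the pip actions listed above."""
    actions = {
        "install": ["install", package],
        "uninstall": ["uninstall", package],
        "upgrade": ["install", "-U", package],
    }
    if action not in actions:
        raise ValueError("unsupported action: %s" % action)
    # sys.executable is the interpreter running this script.
    return [sys.executable, "-m", "pip"] + actions[action]


if __name__ == "__main__":
    # Only print the command; pass it to subprocess.check_call to run it.
    print(" ".join(pip_command("upgrade", "six")))
```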