Installing Scrapy

I installed Scrapy by following the site below. Apparently the name is read "sukurepī" (すくれぴー).

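Normally the install seems to go smoothly, but in my environment it took a while to get working. I tried a lot of different things along the way, so I am noting the steps here.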

Now that pip is installed, on to the main task.

$ pip install scrapy
(omitted)
OSError: [Errno 13] Permission denied: '/Library/Python/2.7/site-packages/scrapy'
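
The Permission denied error means the current user cannot write to the system site-packages directory. A minimal sketch for confirming that before reaching for sudo; the path is the one from the error message, and can_write is a hypothetical helper name:

```shell
# Report whether the current user can write to a directory.
# can_write is a hypothetical helper, not part of pip.
can_write() {
    if [ -w "$1" ]; then
        echo "writable: $1"
    else
        echo "not writable: $1"
    fi
}

# Path taken from the OSError above.
can_write /Library/Python/2.7/site-packages
```

If this reports "not writable", a plain pip install into that directory will fail the same way.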

Rerun pip with sudo.

$ sudo pip install scrapy
(omitted)
cc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -I/usr/include/libxml2 -I/private/tmp/pip_build_root/lxml/src/lxml/includes -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.macosx-10.9-intel-2.7/src/lxml/lxml.etree.o -w -flat_namespace

In file included from src/lxml/lxml.etree.c:232:

/private/tmp/pip_build_root/lxml/src/lxml/includes/etree_defs.h:14:10: fatal error: 'libxml/xmlversion.h' file not found

#include "libxml/xmlversion.h"

         ^

1 error generated.

error: command 'cc' failed with exit status 1

----------------------------------------
Cleaning up...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/private/tmp/pip_build_root/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-BXzrlK-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /private/tmp/pip_build_root/lxml
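
The fatal error says the compiler could not find libxml/xmlversion.h in any of the -I directories while building lxml. A rough sketch for checking which include directory, if any, actually contains the header; find_header is a hypothetical helper, and /usr/include/libxml2 is the directory from the -I flag in the log:

```shell
# Print every given include directory that contains the header.
# Exit status is 0 if at least one directory has it, 1 otherwise.
# find_header is a hypothetical helper name.
find_header() (
    header=$1
    shift
    found=1
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            echo "$dir"
            found=0
        fi
    done
    exit "$found"
)

if find_header libxml/xmlversion.h /usr/include/libxml2; then
    echo "header found"
else
    echo "header missing: the libxml2 development headers may not be installed"
fi
```

On OS X these headers usually come with the Xcode Command Line Tools, so a missing file may point there. That is an assumption about this environment; the log itself does not confirm it.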

Following the site below, I added environment variables and ran the install again, and Scrapy installed.

pip install pillow → If you get "error: command 'cc' failed with exit status 1", add environment variables

$ export CFLAGS=-Qunused-arguments
$ export CPPFLAGS=-Qunused-arguments
$ sudo pip install scrapy
Requirement already satisfied (use --upgrade to upgrade): scrapy in /Library/Python/2.7/site-packages
Cleaning up...
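
The two export lines change the flags for the whole shell session. As an alternative sketch, the same flags can be set for a single command only, using the standard env utility; with_build_flags is a hypothetical wrapper name:

```shell
# Run one command with the two flags set, leaving the session untouched.
# with_build_flags is a hypothetical wrapper, not a pip feature.
with_build_flags() {
    env CFLAGS=-Qunused-arguments CPPFLAGS=-Qunused-arguments "$@"
}

# Show what a child process would see.
# prints: CFLAGS=-Qunused-arguments CPPFLAGS=-Qunused-arguments
with_build_flags sh -c 'echo "CFLAGS=$CFLAGS CPPFLAGS=$CPPFLAGS"'
```

The wrapper would be used as `with_build_flags pip install scrapy`. Depending on the sudoers configuration, sudo may not pass exported variables through to the command it runs, so it is worth checking that the flags really reach the compiler.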

To create a Scrapy project I ran the command below, but got an error.

$ scrapy startproject helloscrapy 
/usr/local/lib/python2.7/site-packages/twisted/internet/_sslverify.py:184: UserWarning: You do not have the service_identity module installed. Please install it from <https://pypi.python.org/pypi/service_identity>. Without the service_identity module and a recent enough pyOpenSSL to support it, Twisted can perform only rudimentary TLS client hostname verification.  Many valid certificate/hostname mappings may be rejected.
  verifyHostname, VerificationError = _selectVerifyImplementation()
Error: directory 'helloscrapy' already exists
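
Looking at the output, the service_identity message is a UserWarning, while the line that actually stops the command is the final Error: the directory helloscrapy already exists. A minimal sketch of a guard for that case; start_project_once is a hypothetical helper name, not part of Scrapy:

```shell
# Create the project only if nothing with that name exists yet.
# start_project_once is a hypothetical helper; "scrapy startproject"
# is the command used in this post.
start_project_once() {
    name=$1
    if [ -e "$name" ]; then
        echo "skip: '$name' already exists"
        return 0
    fi
    scrapy startproject "$name"
}

# Usage (not run here):
#   start_project_once helloscrapy
```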

Apparently "service_identity" has to be installed, so I install it with pip.

$ sudo pip install service_identity
(omitted)
Successfully installed service-identity pyasn1 pyasn1-modules characteristic
Cleaning up...

I had a feeling tree was not installed yet, so I tried brew, but tree turned out to be installed already. Even so, running the tree command gives a "command not found" message.

$ brew install tree
Warning: tree-1.7.0 already installed

$ tree
-bash: tree: command not found
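
When brew says a formula is installed but the shell cannot find the command, a first step is to check whether the name resolves on PATH at all. A small sketch using the standard `command -v`; where_is is a hypothetical helper name:

```shell
# Report where a command resolves on PATH, or say that it does not.
# where_is is a hypothetical helper name.
where_is() {
    if path=$(command -v "$1"); then
        echo "$1 -> $path"
    else
        echo "$1 is not on PATH"
    fi
}

where_is tree
```

If the command does not resolve, the symlink under the Homebrew prefix may be missing. `brew link tree` is the usual way to recreate it, though whether that was the cause here is only a guess.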

The cause is unclear, so I decided to uninstall tree and install it again.

$ brew remove tree
Uninstalling /usr/local/Cellar/tree/1.7.0...

$ brew install tree
(omitted)

The tree command now works.

$ tree helloscrapy/
helloscrapy/
├── helloscrapy
│   ├── __init__.py
│   ├── items.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       └── __init__.py
└── scrapy.cfg

2 directories, 6 files

For the rest, I created the .py files as described on the reference site and ran them.

ponkiti's blog
