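Scrapy 1.2.2 has been released.
Scrapy is a web crawling framework written in pure Python and built on Twisted's asynchronous networking. Users only need to write a few custom modules to build a crawler that fetches web page content and images.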
Changes in this release:
Bug fixes
- Fix a cryptic traceback when a pipeline fails on open_spider() (issue 2011)
- Fix embedded IPython shell variables (fixing issue 396 that re-appeared in 1.2.0, fixed in issue 2418)
- A couple of patches when dealing with robots.txt:
  - handle (non-standard) relative sitemap URLs (issue 2390)
  - handle non-ASCII URLs and User-Agents in Python 2 (issue 2373)
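To illustrate the relative sitemap case above: robots.txt is expected to list absolute Sitemap URLs, but some sites give a path relative to the robots.txt location instead. The following is a minimal sketch of resolving such entries with the standard library. The function name extract_sitemap_urls and the example.com URLs are hypothetical, and this is not Scrapy's actual implementation.

```python
from urllib.parse import urljoin


def extract_sitemap_urls(robots_url, robots_text):
    """Yield absolute sitemap URLs found in a robots.txt body.

    Hypothetical helper for illustration; not Scrapy's own code.
    """
    for line in robots_text.splitlines():
        # Split on the first colon only, since the URL itself contains colons.
        name, sep, value = line.partition(":")
        # Directive names are matched case-insensitively.
        if sep and name.strip().lower() == "sitemap":
            value = value.strip()
            if value:
                # urljoin leaves absolute URLs untouched and resolves
                # relative ones against the robots.txt location.
                yield urljoin(robots_url, value)


robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: /sitemap.xml
Sitemap: https://example.com/news/sitemap.xml
"""

urls = list(extract_sitemap_urls("https://example.com/robots.txt", robots_txt))
```

Resolving against the robots.txt URL rather than rejecting the entry keeps such non-standard files usable without changing how absolute entries are handled.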
Documentation
- Document the "download_latency" key in Request’s meta dict (issue 2033)
- Remove page on (deprecated & unsupported) Ubuntu packages from ToC (issue 2335)
- A few fixed typos (issue 2346, issue 2369, issue 2369, issue 2380) and clarifications (issue 2354, issue 2325, issue 2414)
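As a rough illustration of the "download_latency" key mentioned above: Scrapy records it in the request's meta dict once the response has been downloaded, and it is reachable in a callback through response.meta. The sketch below reads it defensively. The helper name describe_latency and the stand-in response object are assumptions made so the example runs without Scrapy installed.

```python
from types import SimpleNamespace


def describe_latency(response):
    """Return a short message with the download latency, if recorded.

    Hypothetical helper; `response` is any object with `url` and `meta`
    attributes, as a Scrapy Response has inside a spider callback.
    """
    # The key only exists after the download finished, so use .get().
    latency = response.meta.get("download_latency")
    if latency is None:
        return "%s: latency not recorded" % response.url
    return "%s: fetched in %.3f s" % (response.url, latency)


# Stand-in for a real Response (an assumption for this sketch).
fake_response = SimpleNamespace(
    url="https://example.com/", meta={"download_latency": 0.25}
)
message = describe_latency(fake_response)
```

In a real spider the same function could be called from a parse callback with the response it receives; the key is meant to be read, not set.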
Other changes
- Advertise conda-forge as Scrapy’s official conda channel (issue 2387)
- More helpful error messages when trying to use .css() or .xpath() on non-Text Responses (issue 2264)
- startproject command now generates a sample middlewares.py file (issue 2335)
- Add more dependencies’ version info in scrapy version verbose output (issue 2404)
- Remove all *.pyc files from source distribution (issue 2386)
Download