masterpy/pyFetch


pyFetch

A distributed web crawler written in Python.

Install MongoDB

Download MongoDB from https://www.mongodb.org/downloads and run it on the default port (27017).
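
Before starting the crawler, it can help to confirm that MongoDB is actually listening. The following is a minimal sketch, not part of pyFetch: the helper name `port_is_open` and the host `127.0.0.1` are assumptions (a local install), and the script assumes Python 3. It only tests that a TCP connection succeeds, not that the listener is really MongoDB.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 27017 is MongoDB's default port; the host is an assumption.
    print("MongoDB reachable:", port_is_open("127.0.0.1", 27017))
```

If this prints `False`, check that the `mongod` process is running and that it was not started on a custom port.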

Install dependencies

Linux

# Ubuntu (python-dev is a system package, so install it with apt-get, not pip)
apt-get install python-dev
# CentOS
yum install python-devel

On Windows, gevent may require the Microsoft Visual C++ Compiler for Python 2.7: http://www.microsoft.com/en-us/download/confirmation.aspx?id=44266

pip install requests
pip install pymongo
pip install flask
pip install flask-compress
pip install gevent
pip install tld
pip install click
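
To check that every package above was installed, a short script can look each one up on the import path. This is a sketch under stated assumptions rather than anything shipped with pyFetch: the helper `missing_modules` is hypothetical, the import names are assumed from the package names (flask-compress is imported as `flask_compress`), and it assumes Python 3.

```python
import importlib.util

# Import names assumed for the packages listed above.
REQUIRED = ["requests", "pymongo", "flask", "flask_compress",
            "gevent", "tld", "click"]


def missing_modules(names):
    """Return the names that cannot be found on the import path."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can be installed with the corresponding `pip install` line.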

Run

Server

python service.py

Client

python client.py

Open in a browser

http://127.0.0.1
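
The address above can also be probed from a script, for example to wait until the server has finished starting. The sketch below is an illustration only: `web_ui_is_up` is a hypothetical helper, the default URL is the one given above, and Python 3 is assumed. Any HTTP response, including an error status, is treated as "up".

```python
import urllib.error
import urllib.request


def web_ui_is_up(url="http://127.0.0.1", timeout=2.0):
    """Return True if an HTTP server answers at url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Nothing is listening, or the request timed out.
        return False


if __name__ == "__main__":
    print("Web UI up:", web_ui_is_up())
```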

Todo list

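  • Make parameters configurable, including the MongoDB connection settings
  • Secure the slave execution environment
  • setup.py
  • Fix time-based sorting in lists
  • Allow each project to have multiple entry URLs for crawling
  • Display the crawl frequency of projects and crawlers
  • Add an image browsing mode to the results page
  • Fix caching: when a project is newly created and its code is then modified, the crawler keeps using the old code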
