CPyeah/crawler


crawler

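A multithreaded crawler written in Java that handles HTTP requests, simulated login, cookie persistence, and HTML parsing. Fetched data is stored in a database; once the data set grows large enough, Elasticsearch is used to process and analyze it, and a simple search engine is built on top.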
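As a rough illustration of the multithreaded part, the sketch below shares a set of already-seen URLs between worker threads and stops once no page is left to process. It is not taken from this repository: `ConcurrentCrawlSketch`, `LinkFetcher`, and `crawl` are hypothetical names, and `LinkFetcher` stands in for the real "HTTP request plus HTML parsing" step so the example runs with only the JDK.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

final class ConcurrentCrawlSketch {
    /** Hypothetical stand-in for fetching a page and extracting its links. */
    public interface LinkFetcher {
        List<String> fetchLinks(String url);
    }

    private final Set<String> seen = ConcurrentHashMap.newKeySet(); // thread-safe "visited" set
    private final AtomicInteger pending = new AtomicInteger();      // pages scheduled but not finished
    private final CountDownLatch finished = new CountDownLatch(1);  // released when pending hits 0
    private final ExecutorService pool;
    private final LinkFetcher fetcher;

    private ConcurrentCrawlSketch(LinkFetcher fetcher, int threads) {
        this.fetcher = fetcher;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Crawls everything reachable from seed and returns the set of URLs seen. */
    public static Set<String> crawl(String seed, LinkFetcher fetcher, int threads) {
        ConcurrentCrawlSketch crawler = new ConcurrentCrawlSketch(fetcher, threads);
        try {
            crawler.schedule(seed);
            crawler.finished.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // give up early, keep the interrupt flag
        } finally {
            crawler.pool.shutdown();
        }
        return new HashSet<>(crawler.seen);
    }

    private void schedule(String url) {
        if (!seen.add(url)) {
            return; // another thread already claimed this URL
        }
        pending.incrementAndGet();
        try {
            pool.execute(() -> process(url));
        } catch (RejectedExecutionException e) {
            taskDone(); // pool was shut down because the caller was interrupted
        }
    }

    private void process(String url) {
        try {
            for (String link : fetcher.fetchLinks(url)) {
                schedule(link); // children are counted before this task is marked done
            }
        } finally {
            taskDone();
        }
    }

    private void taskDone() {
        if (pending.decrementAndGet() == 0) {
            finished.countDown();
        }
    }
}
```

A call such as `ConcurrentCrawlSketch.crawl(seedUrl, fetcher, 4)` would run with four worker threads. In a real crawler the fetcher would issue the HTTP request and parse the response, and each processed page would also be written to the database.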

Before running, first execute sh init-data.sh

Technologies used

  • Java 8
  • Maven
  • CircleCI
  • Breadth-first search algorithm
  • Jsoup
  • SpotBugs
  • H2 database
  • Flyway (initialize the database with the mvn flyway:migrate command)
  • MyBatis
  • Elasticsearch
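
The breadth-first traversal named in the list can be sketched as follows. This is a minimal, single-threaded illustration and not the repository's code: `BfsCrawlSketch`, `visitOrder`, and the in-memory `linkGraph` map are assumptions that replace real page fetching and Jsoup parsing, so the example needs nothing beyond the JDK.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class BfsCrawlSketch {
    private BfsCrawlSketch() {
    }

    /**
     * Returns URLs in the order a breadth-first crawl would visit them.
     * linkGraph maps a URL to the links found on that page (hypothetical test data).
     */
    public static List<String> visitOrder(String seed, Map<String, List<String>> linkGraph) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();         // prevents revisiting a page
        Queue<String> frontier = new ArrayDeque<>(); // FIFO queue gives breadth-first order

        seen.add(seed);
        frontier.add(seed);
        while (!frontier.isEmpty()) {
            String url = frontier.remove();
            order.add(url);
            for (String link : linkGraph.getOrDefault(url, Collections.<String>emptyList())) {
                if (seen.add(link)) { // add() returns false if the link was already seen
                    frontier.add(link);
                }
            }
        }
        return order;
    }
}
```

Because the frontier is a first-in, first-out queue, every page linked directly from the seed is visited before any page two links away. Swapping the queue for a stack would turn the same loop into a depth-first traversal.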

About

Java multithreaded crawler
