@author jackzhenguo
@desc 
@date 2019/8/8

98 Scrape the Baidu homepage title

import re
from urllib import request

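# Fetch the Baidu homepage content
data = request.urlopen("http://www.baidu.com/").read().decode()

# Inspect the page to decide on a regular expression
pat = r'<title>(.*?)</title>'

result = re.search(pat, data)
print(result)  # <re.Match object; span=(1358, 1382), match='<title>百度一下,你就知道</title>'>

# group() would return the whole match, tags included; group(1) is the captured title text
result.group(1)  # 百度一下,你就知道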
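
The snippet above assumes the request succeeds, the page is UTF-8, and a `<title>` tag is always found. A slightly more defensive sketch is shown below. The names `TITLE_PATTERN`, `extract_title`, and `fetch_title` are hypothetical helpers introduced here for illustration, not part of the original example, and the 10-second timeout is an arbitrary assumption.

```python
import re
from urllib import request

# Hypothetical helper names; not part of the original example.
# IGNORECASE handles <TITLE>, DOTALL lets the title span several lines.
TITLE_PATTERN = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text inside the first <title> element, or None if absent."""
    match = TITLE_PATTERN.search(html)
    if match is None:
        return None
    return match.group(1).strip()


def fetch_title(url, timeout=10):
    """Download a page and return its title (requires network access)."""
    with request.urlopen(url, timeout=timeout) as response:
        # Use the charset from the Content-Type header when present;
        # falling back to UTF-8 is an assumption.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_title(html)


# Offline check with a literal string, so no request is made here.
sample = "<html><head><TITLE>\n百度一下,你就知道 </TITLE></head></html>"
print(extract_title(sample))
print(extract_title("<html></html>"))
```

Calling `fetch_title("http://www.baidu.com/")` would perform the actual download. Keeping the parsing in `extract_title` separate from the network call makes the regular expression easy to try against saved HTML.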
[Previous example](97.md) [Next example](99.md)