Scraping Headlines and Links from the Tencent News Home Page with Python
Oct 26 2016
Environment

1. Python 3

2. PyCharm

Code
import requests
import pandas
from bs4 import BeautifulSoup

# Download the Tencent News home page and parse the HTML
res = requests.get('http://news.qq.com/')
soup = BeautifulSoup(res.text, 'html.parser')

# Collect the title and link of every headline block
newsary = []
for news in soup.select('.Q-tpWrap .text'):
    link = news.select('a')[0]
    newsary.append({'Title': link.text, 'Url': link['href']})

# Put the results in a DataFrame and print them
newsdf = pandas.DataFrame(newsary)
print(newsdf)
Output
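The script above assumes that the request succeeds and that every matched block contains a link. As a rough sketch only, the same steps can be split into two small functions so that the parsing can be tried without touching the network. The names `fetch_html`, `parse_news` and `SAMPLE_HTML` are made up for this illustration, and the sample markup is a guess at the structure the `.Q-tpWrap .text` selector expects, not a copy of the real page.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the structure the selector expects;
# the real page may look different.
SAMPLE_HTML = """
<div class="Q-tpWrap">
  <div class="text"><a href="http://news.qq.com/a/1.htm">First headline</a></div>
</div>
<div class="Q-tpWrap">
  <div class="text"><span>No link here</span></div>
</div>
<div class="Q-tpWrap">
  <div class="text"><a href="http://news.qq.com/a/2.htm"> Second headline </a></div>
</div>
"""


def fetch_html(url, timeout=10):
    """Download a page and raise an exception on an HTTP error status."""
    res = requests.get(url, timeout=timeout)
    res.raise_for_status()
    return res.text


def parse_news(html):
    """Return a list of {'Title', 'Url'} dicts, one per headline with a link."""
    soup = BeautifulSoup(html, 'html.parser')
    items = []
    for news in soup.select('.Q-tpWrap .text'):
        link = news.select_one('a[href]')
        if link is None:
            continue  # skip blocks without a link instead of raising IndexError
        items.append({'Title': link.get_text(strip=True), 'Url': link['href']})
    return items


# Try the parser on the sample markup; no network access is needed here.
sample_items = parse_news(SAMPLE_HTML)
print(sample_items)
```

Against the live site the two pieces would be combined as `parse_news(fetch_html('http://news.qq.com/'))`. Whether the selector still matches anything depends on the current page layout.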
Saving the Output to a Local Excel File
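import requests
import pandas
from bs4 import BeautifulSoup

# Download the Tencent News home page and parse the HTML
res = requests.get('http://news.qq.com/')
soup = BeautifulSoup(res.text, 'html.parser')

# Collect the title and link of every headline block
newsary = []
for news in soup.select('.Q-tpWrap .text'):
    link = news.select('a')[0]
    newsary.append({'Title': link.text, 'Url': link['href']})

# Write the results to news.xlsx in the working directory
# (pandas needs an Excel writer such as openpyxl installed for .xlsx files)
newsdf = pandas.DataFrame(newsary)
newsdf.to_excel('news.xlsx')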
Result

Opening the file
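By default `to_excel` also writes the DataFrame's row index as the first column of the sheet. The sketch below is one possible way to tidy the table before saving: it fixes the column order, drops rows that repeat a URL, and leaves the index out of the file. The helper names `build_news_frame` and `save_news` and the sample records are invented for this example, and the de-duplication step is an optional extra that the original script does not perform.

```python
import pandas


def build_news_frame(items):
    """Build a Title/Url table and drop rows that repeat a URL."""
    df = pandas.DataFrame(items, columns=['Title', 'Url'])
    return df.drop_duplicates(subset='Url').reset_index(drop=True)


def save_news(df, path='news.xlsx'):
    """Write the table to an Excel file without the row-index column."""
    # Writing .xlsx files requires an Excel writer such as openpyxl.
    df.to_excel(path, index=False)


# Hypothetical records standing in for the scraped list (newsary).
sample = [
    {'Title': 'First headline', 'Url': 'http://news.qq.com/a/1.htm'},
    {'Title': 'First headline (repeat)', 'Url': 'http://news.qq.com/a/1.htm'},
    {'Title': 'Second headline', 'Url': 'http://news.qq.com/a/2.htm'},
]

newsdf = build_news_frame(sample)
print(newsdf)
```

With the real data, `save_news(build_news_frame(newsary))` would produce `news.xlsx` in the working directory.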