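Python project in practice: scraping youku video playback links
Preface
You have probably written plenty of crawlers already: scraping all kinds of images, scraping videos, simulating logins, brute-forcing encrypted files with a dictionary, and so on. In short, Python has a very wide range of applications.
Next, I'll show how to use Python to scrape the playback link of a youku video. The link can be dropped straight into a web page and played. Without further ado, here is the code.
First, import the libraries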
import random
import re
import requests
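Send the request
def get_request(url, user_agent):
    '''Fetch the page and return its HTML text, or -1 if the request fails.'''
    # Fall back to a default User-Agent when the supplied one is too short to be real
    if len(user_agent) < 10:
        user_agent = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0'
    # Modify the header fields here
    headers = {
        'Host': 'v.youku.com',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        # requests decodes gzip and deflate transparently; sdch is left out because it would not be decoded
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.8',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
        'User-Agent': user_agent,
        'Referer': 'http://www.youku.com/',
    }
    try:
        html = requests.get(url, headers=headers, timeout=20).text
        return html
    except requests.RequestException as e:
        # Network errors, timeouts and invalid URLs all end up here
        print('Request failed:', e)
        return -1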
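The User-Agent fallback inside get_request can be tried on its own, without touching the network. The following is only a minimal sketch: the helper name build_headers, the min_length parameter and the constant DEFAULT_USER_AGENT are hypothetical names introduced for illustration, and only three of the header fields are reproduced.

```python
# Default taken from the article's get_request; the constant name is hypothetical.
DEFAULT_USER_AGENT = (
    'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:39.0) '
    'Gecko/20100101 Firefox/39.0'
)


def build_headers(user_agent, min_length=10):
    """Return request headers, swapping in a default for an implausibly short User-Agent.

    build_headers and min_length are illustrative names, not part of the article's code.
    """
    cleaned = (user_agent or '').strip()
    if len(cleaned) < min_length:
        cleaned = DEFAULT_USER_AGENT
    return {
        'Host': 'v.youku.com',
        'User-Agent': cleaned,
        'Referer': 'http://www.youku.com/',
    }


# An empty string falls back to the default; a realistic value is kept (minus the newline).
print(build_headers('')['User-Agent'])
print(build_headers('Mozilla/5.0 (Windows NT 10.0; Win64; x64)\n')['User-Agent'])
```

Keeping the header construction in a separate function like this makes the fallback rule easy to check before any request is sent.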
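Run the main function
if __name__ == '__main__':
    # Any page that contains a video player window will do
    url = "http://v.youku.com/v_show/id_XMTgzNDI0MjkzNg==.html?from=y1.3-movie-grid-1095-9921.86985-107667.1-1&spm=a2hmv.20009921.yk-slide-107667.5~5~5~5!2~A#paction"
    # Load the User-Agent list (one per line) and pick one at random
    with open('user_agent.txt', 'r') as f:
        user_agent_list = [line.strip() for line in f if line.strip()]
    # An empty string makes get_request fall back to its default User-Agent
    user_agent = random.choice(user_agent_list) if user_agent_list else ''
    # Send the request
    html_body = get_request(url, user_agent)
    # get_request returns -1 on failure, so only search when text came back
    if isinstance(html_body, str):
        print(re.findall(r'http://player\.youku\.com/player\.php/sid/[A-Za-z0-9=]*/v\.swf', html_body))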
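The extraction step at the end of the main block can also be checked offline against a small string. This is a sketch under stated assumptions: extract_player_links and PLAYER_LINK_RE are hypothetical names, and sample_html is made-up markup (reusing the video id from the URL above), not a copy of a real youku page.

```python
import re

# Same pattern as in the main block, with the dots escaped so they match literally.
PLAYER_LINK_RE = re.compile(
    r'http://player\.youku\.com/player\.php/sid/[A-Za-z0-9=]+/v\.swf'
)


def extract_player_links(html):
    """Return the player links found in html, without duplicates, in first-seen order."""
    links = []
    for link in PLAYER_LINK_RE.findall(html):
        if link not in links:
            links.append(link)
    return links


# Made-up markup for illustration only: the same link appears twice,
# followed by an unrelated URL that must not match.
sample_html = (
    '<input value="http://player.youku.com/player.php/sid/XMTgzNDI0MjkzNg==/v.swf">'
    '<embed src="http://player.youku.com/player.php/sid/XMTgzNDI0MjkzNg==/v.swf">'
    '<a href="http://player-youku.com/other">unrelated</a>'
)

links = extract_player_links(sample_html)
print(links)
```

Removing duplicates is a design choice for this sketch, since a page may repeat the same player link in several attributes.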
Paste this link into a browser and the video plays directly. There are ads, but it works. Everyone is welcome to learn together and share ideas.