Python Web Scraping: Basic Usage of the urllib Library
Requesting a URL and fetching the page source
import urllib.request

url = "http://www.baidu.com"
response = urllib.request.urlopen(url)
data = response.read()
# print(data)
# Convert the fetched bytes into a string
str_data = data.decode("utf-8")
print(str_data)
# Save the result to a file
with open("baidu.html", "w", encoding="utf-8") as f:
    f.write(str_data)
GET requests with parameters
import urllib.request


def get_method_params(wd):
    url = "http://www.baidu.com/s?wd="
    # Concatenate the base URL and the keyword
    final_url = url + wd
    # Send the network request
    response = urllib.request.urlopen(final_url)
    print(response.read().decode("utf-8"))


get_method_params("美女")
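Written directly like this, the call raises an error (in Python 3, a UnicodeEncodeError from the 'ascii' codec).
The reason is that the URL contains Chinese characters, which ASCII cannot represent, so the URL has to be percent-encoded first: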
import urllib.request
import urllib.parse
import string


def get_method_params(wd):
    url = "http://www.baidu.com/s?wd="
    # Concatenate the base URL and the keyword
    final_url = url + wd
    # Percent-encode the non-ASCII characters in the URL;
    # every printable ASCII character is left unchanged
    encode_new_url = urllib.parse.quote(final_url, safe=string.printable)
    # Send the network request
    response = urllib.request.urlopen(encode_new_url)
    print(response.read().decode("utf-8"))


get_method_params("美女")
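To see what the escaping step actually produces, here is a small sketch that builds the URL without sending any request. The helper names `quote_whole_url` and `build_search_url` are made up for this illustration and are not part of the original post. The first helper repeats the `quote(..., safe=string.printable)` approach used above; the second is an alternative that encodes only the parameter value with `urllib.parse.urlencode`.

```python
import string
import urllib.parse


def quote_whole_url(url):
    # Same approach as above: escape only the non-ASCII characters,
    # leaving every printable ASCII character untouched.
    return urllib.parse.quote(url, safe=string.printable)


def build_search_url(wd):
    # Alternative (hypothetical helper): let urlencode escape the
    # parameter value, so characters such as "&" or spaces inside the
    # keyword are escaped as well.
    base = "http://www.baidu.com/s"
    return base + "?" + urllib.parse.urlencode({"wd": wd})


print(quote_whole_url("http://www.baidu.com/s?wd=美女"))
print(build_search_url("美女"))
```

Both calls print the same URL for this keyword. They differ once the keyword itself contains ASCII characters that have a special meaning in a query string: with `safe=string.printable` a keyword such as `a b&c` passes through unescaped, while `urlencode` escapes it.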