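Python grequests: use cases and code examples
Author: Yi_warmth  Date: 2021-08-05 18:30:41
Use cases:
1) Checking whether the IPs in a crawler's proxy pool are still usable
2) Sending requests in bulk, for example during load testing
grequests is a thin wrapper around the requests and gevent libraries that makes it easy to send many HTTP requests concurrently. Requests are sent with map():
grequests.map(requests, stream=False, size=None, exception_handler=None, gtimeout=None)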
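As an illustration of the first use case, here is a minimal sketch of a proxy check. The function names, the test URL and the "HTTP 200 means usable" rule are assumptions made for this example, not part of grequests. It relies on map() returning results in the same order as the requests, with None for requests that failed.

```python
def working_proxies(proxies, responses):
    """Return the proxies whose check request came back with HTTP 200.

    `responses` is the list returned by grequests.map(): same order as the
    requests, with None in place of any request that failed.
    """
    return [
        proxy
        for proxy, resp in zip(proxies, responses)
        if resp is not None and resp.status_code == 200
    ]


def check_proxies(proxies, test_url="http://httpbin.org/ip", timeout=5):
    """Send one request through each proxy concurrently (needs network access).

    test_url is a placeholder; any endpoint you trust to answer quickly works.
    """
    # Imported here because importing grequests applies gevent monkey-patching.
    import grequests

    reqs = [
        grequests.get(test_url, proxies={"http": p, "https": p}, timeout=timeout)
        for p in proxies
    ]
    return working_proxies(proxies, grequests.map(reqs))
```

Keeping the filtering step in its own function means it can be tried with stand-in response objects before pointing it at real proxies.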
Because grequests uses requests underneath, it supports the same HTTP methods, including GET, OPTIONS, HEAD, POST, PUT and DELETE, so requests such as the following are all valid:
grequests.post(url, json={"name": "zhangsan"})
grequests.delete(url)
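For the bulk-request use case, a rough sketch might combine grequests.post with the size argument of map(), which limits how many requests are in flight at once. The helper names, the default numbers and the idea of summarising by status code are assumptions for illustration only.

```python
from collections import Counter


def status_counts(responses):
    """Count results by status code; failed requests (None) count as 'failed'."""
    return Counter(
        "failed" if resp is None else resp.status_code for resp in responses
    )


def run_batch(url, payload, total=100, concurrency=10, timeout=5):
    """POST the same JSON payload `total` times (needs network access)."""
    # Imported here because importing grequests applies gevent monkey-patching.
    import grequests

    reqs = (grequests.post(url, json=payload, timeout=timeout) for _ in range(total))
    # size caps the number of concurrent requests
    return status_counts(grequests.map(reqs, size=concurrency))
```

A call such as run_batch("http://httpbin.org/post", {"name": "zhangsan"}) would return a Counter keyed by status code. The URL is only a placeholder.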
The code is as follows:
import grequests
urls = [
'http://www.baidu.com',
'http://www.qq.com',
'http://www.163.com',
'http://www.zhihu.com',
'http://www.toutiao.com',
'http://www.douban.com'
]
rs = (grequests.get(u) for u in urls)
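# Example output (varies with network conditions); None marks a failed request:
# [<Response [200]>, None, <Response [200]>, None, None, <Response [418]>]
print(grequests.map(rs))

# Called for every request that raised an exception
def exception_handler(request, exception):
    print("Request failed")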
reqs = [
grequests.get('http://httpbin.org/delay/1', timeout=0.001),
grequests.get('http://fakedomain/'),
grequests.get('http://httpbin.org/status/500')
]
print(grequests.map(reqs, exception_handler=exception_handler))
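The handler above only prints a fixed message. A small variation, sketched here with a hypothetical make_exception_handler helper, records which URL failed and why. Whatever the handler returns takes the place of that request's entry in the list returned by map().

```python
def make_exception_handler(failures):
    """Build a handler that appends (url, exception type name) to `failures`."""

    def handler(request, exception):
        failures.append((request.url, type(exception).__name__))
        # The return value takes this request's place in the result list.
        return None

    return handler


# Usage with the reqs list from the example above:
# failures = []
# results = grequests.map(reqs, exception_handler=make_exception_handler(failures))
# print(failures)
```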
In practice, you can also customize what map() returns by modifying the grequests source file.
For example, add an extract_item() function and modify map():
def extract_item(request):
    """
    Extract the fields of interest from a completed request.
    :param request: an AsyncRequest whose response has been set
    :return: dict with the url, response text and status code
    """
    item = dict()
    item["url"] = request.url
    item["text"] = request.response.text or ""
    item["status_code"] = request.response.status_code or 0
    return item

# Pool, send and gevent are already available inside the grequests source file
def map(requests, stream=False, size=None, exception_handler=None, gtimeout=None):
    """Concurrently converts a list of Requests to Responses.
    :param requests: a collection of Request objects.
    :param stream: If True, the content will not be downloaded immediately.
    :param size: Specifies the number of requests to make at a time. If None, no throttling occurs.
    :param exception_handler: Callback function, called when exception occurred. Params: Request, Exception
    :param gtimeout: Gevent joinall timeout in seconds. (Note: unrelated to requests timeout)
    """
    requests = list(requests)
    pool = Pool(size) if size else None
    jobs = [send(r, pool, stream=stream) for r in requests]
    gevent.joinall(jobs, timeout=gtimeout)
    ret = []
    for request in requests:
        if request.response is not None:
            # Return the extracted dict instead of the Response object
            ret.append(extract_item(request))
        elif exception_handler and hasattr(request, 'exception'):
            ret.append(exception_handler(request, request.exception))
        else:
            ret.append(None)
    # yield makes map() a generator that produces the whole list once,
    # so callers retrieve it with next()
    yield ret
The modified map() can then be called directly:
import grequests
urls = [
'http://www.baidu.com',
'http://www.qq.com',
'http://www.163.com',
'http://www.zhihu.com',
'http://www.toutiao.com',
'http://www.douban.com'
]
rs = (grequests.get(u) for u in urls)
response_list = grequests.map(rs, gtimeout=10)
# The modified map() is a generator, so next() retrieves the list of extracted items
for response in next(response_list):
    print(response)
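Edits to an installed package are lost when the package is upgraded, so one possible alternative is to leave grequests untouched and post-process the list that the stock map() returns. The sketch below assumes that approach. map_items is a hypothetical helper, and it reads the URL from the Response object, which may differ from the requested URL after a redirect.

```python
def extract_item(response):
    """Turn a Response into a plain dict; None (a failed request) stays None."""
    if response is None:
        return None
    return {
        "url": response.url,
        "text": response.text or "",
        "status_code": response.status_code or 0,
    }


def map_items(reqs, **map_kwargs):
    """Run the unmodified grequests.map() and convert each result (needs network)."""
    # Imported here because importing grequests applies gevent monkey-patching.
    import grequests

    return [extract_item(resp) for resp in grequests.map(reqs, **map_kwargs)]
```

With this version, map_items(rs, gtimeout=10) returns an ordinary list, so no next() call is needed.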
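Event hooks are supported
# grequests applies gevent monkey-patching on import, so it is imported first
import grequests
import requests

def print_url(r, *args, **kwargs):
    # Response hook: called with the Response object once it arrives
    print(r.url)

url = "http://www.baidu.com"
# With plain requests, the hook is registered through the hooks argument
res = requests.get(url, hooks={"response": print_url})
# With grequests, the callback argument registers the same response hook
tasks = []
req = grequests.get(url, callback=print_url)
tasks.append(req)
ress = grequests.map(tasks)
print(ress)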
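A hook does not have to print. As a rough sketch, the hypothetical make_url_recorder helper below builds a hook that collects (url, status code) pairs into a list supplied by the caller. Because the hook returns None, the response is passed along unchanged.

```python
def make_url_recorder(seen):
    """Build a response hook that appends (url, status_code) to `seen`."""

    def record(response, *args, **kwargs):
        seen.append((response.url, response.status_code))
        # Returning None leaves the response as it is.

    return record


# Usage (needs network access):
# seen = []
# grequests.map([grequests.get(url, callback=make_url_recorder(seen))])
# print(seen)
```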
Source: https://www.cnblogs.com/zhouzetian/p/13380537.html
Tags: Python, grequests, module