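Python crawler: batch-downloading the Zabbix documentation (code example)
Author: NAVYSUMMER  Date: 2022-11-07 11:10:29
This article presents a Python crawler that batch-downloads the Zabbix documentation as PDF files. The complete script is listed below and may be a useful reference for study or work.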
# -*- coding: UTF-8 -*-
import re
import time

import requests

# Index page of the Chinese Zabbix 3.4 manual, and the prefix that page routes are appended to
url = 'https://www.zabbix.com/documentation/3.4/zh/manual'
base_url = 'https://www.zabbix.com/documentation/3.4/'
seconds = 1    # pause between two downloads, in seconds
err_url = []   # export URLs that could not be downloaded


def get_urls():
    """Collect the URL of every manual page listed in the index menu."""
    res = requests.get(url)
    content = res.text
    # The menu id in this pattern is copied from the page's JavaScript and is specific to that page
    pattern = re.compile(r"indexmenu_4848130395ca30b274d8bd.add[(]'(zh/manual.*?)[']", re.S)
    routes = pattern.findall(content)
    urls = [base_url + item for item in routes]
    return urls


def download(url):
    """Save one manual page as a PDF named after the page title."""
    download_url = url + "?do=export_pdf"
    print("Current download URL:")
    print(download_url)
    # Fetch the HTML page first, only to read its <title>
    res = requests.get(url)
    titles = re.findall(r"<title>(.*?)</title>", res.text, re.S) if res.status_code == 200 else []
    if titles:
        # Replace characters that are not allowed in file names with '-'
        filename = re.sub(r'[\\/"*?:<>|]', '-', titles[0].strip())
        file = filename + '.pdf'
        res = requests.get(download_url)
        if res.status_code == 200:
            with open(file, "wb") as f:
                f.write(res.content)
            print('Download succeeded')
        else:
            print('Download failed')
            err_url.append(download_url)
    else:
        print('Could not get the file name; skipping this download')
        err_url.append(download_url)


def downloads(urls):
    """Download every page in turn, pausing between requests."""
    for url in urls:
        download(url)
        time.sleep(seconds)
    if len(err_url):
        print("URLs that failed to download:")
        print(err_url)


def main():
    print("Download started")
    urls = get_urls()
    downloads(urls)
    print("Download finished")


if __name__ == '__main__':
    main()
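
The two text-processing steps in the script (pulling page routes out of the index menu with a regular expression, and replacing characters that are not allowed in file names) can be tried offline without touching the network. The following is a minimal sketch under stated assumptions: the `sample` string is a made-up fragment that only imitates the menu JavaScript, and the helper names `extract_urls` and `safe_filename` are hypothetical and do not appear in the original script.

```python
import re

base_url = 'https://www.zabbix.com/documentation/3.4/'

# Same pattern as in the script; the menu id is specific to the page it was copied from.
pattern = re.compile(r"indexmenu_4848130395ca30b274d8bd.add[(]'(zh/manual.*?)[']", re.S)

# Assumption: an invented fragment shaped like the menu calls the pattern expects.
# The real page content may look different.
sample = (
    "indexmenu_4848130395ca30b274d8bd.add('zh/manual/introduction', 1, 0);\n"
    "indexmenu_4848130395ca30b274d8bd.add('zh/manual/installation', 2, 0);\n"
)


def extract_urls(content):
    """Return the full URL for every route captured by the pattern (hypothetical helper)."""
    return [base_url + route for route in pattern.findall(content)]


def safe_filename(title):
    """Replace characters that are invalid in file names with '-' (hypothetical helper)."""
    return re.sub(r'[\\/"*?:<>|]', '-', title)


if __name__ == '__main__':
    for page_url in extract_urls(sample):
        print(page_url)
    print(safe_filename('1 Introduction [Zabbix Documentation 3.4]: a/b?') + '.pdf')
```

Running the pattern against a saved copy of the index page in the same way is a quick check of whether the menu id in the expression still matches before a full download is started.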
Source: https://www.cnblogs.com/navysummer/p/11051036.html
Tags: python, crawler, batch, download, zabbix, documentation