Sharing Python code that downloads 8,000 children's songs
Author: junjie  Date: 2021-02-03 05:41:51
Python code (written for Python 2) that downloads 8,000 children's songs:
#-*- coding: UTF-8 -*-
from pyquery import PyQuery as py
from lxml import etree
import urllib
import re
import os
import sys
import logging
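def format(filename):
    # Replace characters that are awkward in file names with underscores.
    # NOTE: the second entry was garbled in the source; '&#39;' (the HTML
    # entity for an apostrophe) is an assumed reconstruction.
    bad_chars = (' ', '&#39;', '\'')
    for char in bad_chars:
        filename = filename.replace(char, "_")
    return filename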
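def download_mp3(mp3_url, filename, dir):
    # Save one MP3 into dir, skipping files that were already downloaded.
    # Relies on the module-level logger created in the main block.
    f = os.path.join(dir, filename)
    if os.path.exists(f):
        logger.debug(f + " already exists.")
        return
    try:
        # Fetch first, so a failed request does not leave an empty file behind.
        data = urllib.urlopen(mp3_url).read()
        with open(f, 'wb') as out:
            out.write(data)
        logger.debug(filename + ' is downloaded.')
    except Exception:
        logger.debug(filename + ' is not downloaded.')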
def download_all_mp3(start, end, dir, logger):
    # Group 1 is the <h1> title; group 2 is the first quoted link after
    # the "downloadboxlist" marker.
    regex = re.compile(ur".*<h1>(.*)</h1>.*downloadboxlist.*?<a.*?\"(.*?)\"", re.UNICODE | re.S)
    # Note: range() excludes end, so page number end itself is not fetched.
    for x in range(start, end):
        url = "http://www.youban.com/mp3-d" + str(x) + ".html"
        try:
            logger.debug(str(x) + ": " + url)
            doc = py(url=url)
            e = doc('.mp3downloadbox')
            if not e:
                # No download box on this page: skip it and move on.
                logger.debug(url + " does not exist.")
                continue
            e = unicode(e)
            m = regex.search(e)
            if m is not None:
                title = m.group(1).strip()
                link = m.group(2)
                if link == '' or title == '':
                    logger.debug(url + " is not useful")
                    continue
                title2 = format(str(x) + "_" + title + ".mp3")
                logger.debug(str(x) + ": " + link)
                download_mp3(link, title2, dir)
        except Exception:
            logger.debug(url + " met exception.")
            continue
if __name__ == "__main__":
    # Optional arguments: start, end, and then the root download directory.
    dir_root = "e:\\song"
    start, end = 1, 8000
    if len(sys.argv) > 2:
        start, end = int(sys.argv[1]), int(sys.argv[2])
    if len(sys.argv) > 3 and sys.argv[3] != '':
        dir_root = sys.argv[3]
    print("Download from %s to %s.\n" % (start, end))
    dir = os.path.join(dir_root, str(start) + "-" + str(end))
    if not os.path.exists(dir):
        os.makedirs(dir)
    print("Download to " + dir + ".\n")
    # Log plain messages to both the console and download.log.
    logger = logging.getLogger("simple")
    logger.setLevel(logging.DEBUG)
    fh = logging.FileHandler(os.path.join(dir, "download.log"))
    ch = logging.StreamHandler()
    formatter = logging.Formatter("%(message)s")
    ch.setFormatter(formatter)
    fh.setFormatter(formatter)
    logger.addHandler(ch)
    logger.addHandler(fh)
    download_all_mp3(start, end, dir, logger)
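The title and link extraction in the script can be tried without touching the network. The following is only a rough Python 3 sketch of that step, not the author's code: `sanitize`, `parse_song`, and the `SAMPLE` markup are hypothetical names and data made up for illustration, and treating `&#39;` as one of the replaced strings is an assumption. The regular expression itself is the one used in the script.

```python
import re

# Same pattern as the script, written as a Python 3 raw string.
PATTERN = re.compile(r'.*<h1>(.*)</h1>.*downloadboxlist.*?<a.*?"(.*?)"', re.S)


def sanitize(name):
    """Replace spaces and apostrophes (raw or HTML-escaped) with underscores."""
    for bad in ("&#39;", " ", "'"):
        name = name.replace(bad, "_")
    return name


def parse_song(html, index):
    """Return (file_name, link), or None when the title or link is missing."""
    m = PATTERN.search(html)
    if m is None:
        return None
    title, link = m.group(1).strip(), m.group(2)
    if not title or not link:
        return None
    return sanitize("%d_%s.mp3" % (index, title)), link


# Hypothetical markup, shaped only to satisfy the pattern above;
# the real page layout is not shown in this article.
SAMPLE = '''<div class="mp3downloadbox">
<h1>Two Tigers</h1>
<ul class="downloadboxlist"><li><a href="http://example.com/a.mp3">download</a></li></ul>
</div>'''

print(parse_song(SAMPLE, 12))
# ('12_Two_Tigers.mp3', 'http://example.com/a.mp3')
```

Keeping the parsing in a function that takes a plain string makes it easy to check against a saved page before starting a long download run.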
Feel free to use it as a reference and adapt it as needed.
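As one possible starting point for such changes, here is a minimal Python 3 sketch of the "skip if the file exists, otherwise fetch and save" step. It is an illustration under assumptions, not a port of the whole script: `fetch_bytes`, `download_file`, the `fetch` parameter, and the returned status strings are hypothetical names chosen here. Only standard-library calls (`urllib.request.urlopen`, `os.makedirs`) are used.

```python
import os
import urllib.request


def fetch_bytes(url, timeout=30):
    """Read the whole response body for url."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_file(url, dest_dir, file_name, fetch=fetch_bytes):
    """Save url as dest_dir/file_name and return 'exists', 'failed' or 'saved'."""
    path = os.path.join(dest_dir, file_name)
    if os.path.exists(path):
        return "exists"  # already downloaded on an earlier run
    try:
        data = fetch(url)  # fetch first, so a failure leaves no empty file
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        return "failed"
    os.makedirs(dest_dir, exist_ok=True)
    with open(path, "wb") as out:
        out.write(data)
    return "saved"
```

Passing the fetch function as a parameter is a design choice made for this sketch: it lets the save logic be exercised with a fake fetcher, for example `download_file("u", "songs", "a.mp3", fetch=lambda u: b"data")`, before pointing it at a real site.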