Implementing Large File Downloads in Django: A Walkthrough

Author: 再见紫罗兰 | Date: 2021-12-18 20:48:32

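When Django serves a file download and the file is small, the usual approach is to build the entire content in memory and pass it to the response object in one go: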

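from django.http import HttpResponse
def simple_file_download(request):
    # do something...
    with open("simplefile", "rb") as f:
        content = f.read()  # the whole file is read into memory
    return HttpResponse(content)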

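When the file is very large, the simplest solution is to let a static file server such as Apache or Nginx handle the download. Sometimes, however, we need to restrict access by user permission, or we do not want to expose the file's real location, or the large content is generated on the fly (for example, by merging several files). In those cases a static file server alone cannot be used.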

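The Django documentation notes that a response can be given an iterator, so that data is streamed to the client. Since Django 1.5 a plain HttpResponse consumes the iterator immediately, so StreamingHttpResponse is the class that actually streams.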

To write the iterator yourself, you can use yield:

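from django.http import StreamingHttpResponse

def read_file(filename, buf_size=8192):
    """Yield the file's content in chunks of at most buf_size bytes."""
    with open(filename, "rb") as f:
        while True:
            content = f.read(buf_size)
            if content:
                yield content
            else:
                break

def big_file_download(request):
    filename = "filename"
    # StreamingHttpResponse sends each chunk as it is produced; since Django 1.5
    # a plain HttpResponse would read the whole iterator into memory first.
    response = StreamingHttpResponse(read_file(filename))
    return response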
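The chunked generator can be checked without a running Django project. The following is a minimal sketch, assuming a throwaway temporary file; `iter_file_chunks` is a hypothetical name that mirrors `read_file`, and the 20000-byte payload is an arbitrary choice.

```python
import os
import tempfile


def iter_file_chunks(filename, buf_size=8192):
    """Yield the file's bytes in pieces of at most buf_size bytes."""
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(buf_size)
            if not chunk:
                break
            yield chunk


# Write an arbitrary 20000-byte payload to a temporary file.
payload = os.urandom(20000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    # Only one chunk is held by the generator at a time; list() is used
    # here purely so the chunk sizes can be inspected afterwards.
    chunks = list(iter_file_chunks(tmp_path))
finally:
    os.remove(tmp_path)

chunk_sizes = [len(c) for c in chunks]
print(chunk_sizes)  # [8192, 8192, 3616]
```

In a view, the generator itself (not a list built from it) would be handed to the response, so that memory use stays bounded by `buf_size`.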

Alternatively, use a generator expression. The following example of a large CSV download comes from the Django documentation:

import csv

from django.http import StreamingHttpResponse

class Echo(object):
    """An object that implements just the write method of the file-like
    interface.
    """
    def write(self, value):
        """Write the value by returning it, instead of storing in a buffer."""
        return value

def some_streaming_csv_view(request):
    """A view that streams a large CSV file."""
    # Generate a sequence of rows. The range is based on the maximum number of
    # rows that can be handled by a single sheet in most spreadsheet
    # applications.
    rows = (["Row {0}".format(idx), str(idx)] for idx in range(65536))
    pseudo_buffer = Echo()
    writer = csv.writer(pseudo_buffer)
    response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                     content_type="text/csv")
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response
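The trick in the `Echo` class is that `csv.writer.writerow()` returns whatever the underlying `write()` call returned, so each row comes back as a formatted string instead of being stored. A small sketch outside Django, assuming only three rows for readability:

```python
import csv


class Echo:
    """File-like object whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


writer = csv.writer(Echo())
rows = (["Row {0}".format(idx), str(idx)] for idx in range(3))

# Each writerow() call hands back one formatted CSV line.
lines = [writer.writerow(row) for row in rows]
print(lines)  # ['Row 0,0\r\n', 'Row 1,1\r\n', 'Row 2,2\r\n']
```

In the streaming view the list is replaced by a generator expression, so lines are formatted only as the response is sent.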

Python also provides a file wrapper (wsgiref.util.FileWrapper) that turns a file-like object into an iterator:

class FileWrapper:
    """Wrapper to convert file-like objects to iterables"""
    def __init__(self, filelike, blksize=8192):
        self.filelike = filelike
        self.blksize = blksize
        if hasattr(filelike, 'close'):
            self.close = filelike.close

    def __getitem__(self, key):
        # Legacy sequence-style iteration.
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise IndexError

    def __iter__(self):
        return self

    def __next__(self):  # named next() on Python 2
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise StopIteration

Usage:

import os
from wsgiref.util import FileWrapper  # newer Django no longer provides django.core.servers.basehttp.FileWrapper
from django.http import StreamingHttpResponse

def file_download(request, filename):
    wrapper = FileWrapper(open(filename, 'rb'))
    # Stream the wrapped file; since Django 1.5 a plain HttpResponse would load it all into memory.
    response = StreamingHttpResponse(wrapper, content_type='application/octet-stream')
    response['Content-Length'] = os.path.getsize(filename)
    response['Content-Disposition'] = 'attachment; filename=%s' % filename
    return response
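The wrapper's behaviour can be observed with the standard library's `wsgiref.util.FileWrapper` and an in-memory file. This is a sketch under the assumption of a 20000-byte source, which is an arbitrary size chosen so that the last block is partial:

```python
import io
from wsgiref.util import FileWrapper

source = io.BytesIO(b"x" * 20000)
wrapper = FileWrapper(source, blksize=8192)

# Iterating the wrapper reads the file-like object block by block.
sizes = [len(block) for block in wrapper]
print(sizes)  # [8192, 8192, 3616]

# The wrapper exposes the underlying object's close() method.
wrapper.close()
closed = source.closed
```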

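For streamed data, Django provides the StreamingHttpResponse class as the replacement for HttpResponse; since Django 1.5 a plain HttpResponse reads an iterator fully into memory before sending anything.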

Downloading files compressed into a ZIP archive:

import tempfile
import zipfile
from wsgiref.util import FileWrapper  # newer Django no longer provides django.core.servers.basehttp.FileWrapper
from django.http import StreamingHttpResponse

def send_zipfile(request):
    """
    Create a ZIP file on disk and transmit it in chunks of 8KB,
    without loading the whole file into memory. A similar approach can
    be used for large dynamic PDF files.
    """
    temp = tempfile.TemporaryFile()
    archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
    for index in range(10):
        filename = __file__  # Select your files here.
        archive.write(filename, 'file%d.txt' % index)
    archive.close()
    wrapper = FileWrapper(temp)
    # Stream the archive; since Django 1.5 a plain HttpResponse would load it all into memory.
    response = StreamingHttpResponse(wrapper, content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=test.zip'
    # After close() the file position sits at the end of the archive, so tell() gives its size.
    response['Content-Length'] = temp.tell()
    temp.seek(0)
    return response
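The size bookkeeping in this view (take `tell()` after closing the archive, then `seek(0)` before reading) can be tried in isolation. The sketch below assumes three small in-memory members written with `writestr`; the member names and contents are made up for illustration.

```python
import tempfile
import zipfile

temp = tempfile.TemporaryFile()
with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
    for index in range(3):
        # Hypothetical member contents, used only for this demonstration.
        archive.writestr("file%d.txt" % index, "contents of file %d\n" % index)

# Closing the archive leaves the position at the end of the written data.
archive_size = temp.tell()
temp.seek(0)

# Read the archive back in 8 KB blocks, as a streaming response would.
blocks = list(iter(lambda: temp.read(8192), b""))
total = sum(len(block) for block in blocks)

# Reopen the same temporary file to confirm the archive is valid.
temp.seek(0)
with zipfile.ZipFile(temp) as check:
    names = check.namelist()
temp.close()

print(archive_size == total, names)
```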

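Either way, handling large file downloads inside Django itself is not a good idea. The best approach is to let Django do the permission check and then have a static file server handle the download.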

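This relies on the sendfile mechanism: "A traditional web server handling a file download first reads the file content into application memory and then sends that in-memory content to the client browser. Under the heavy load of today's sites, this approach consumes more server resources. sendfile is a high-performance network I/O method supported by modern operating systems: the kernel's sendfile call can push file content directly into the network card's buffer, which avoids the web server's cost of reading and writing the file and achieves 'zero copy'."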

On Apache this requires the mod_xsendfile module, while Nginx implements it through a feature called X-Accel-Redirect.

Nginx configuration file:

# Will serve /var/www/files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
    internal;
    alias /var/www/files;
}

Or:

# Will serve /var/www/protected_files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
    internal;
    root /var/www;
}

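Note the difference between alias and root: with alias, the matched location prefix is replaced by the alias path, whereas with root, the full URI is appended to the root path.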
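The two mappings can be modelled in a few lines of Python. This is a deliberately simplified sketch of prefix locations only, not a reimplementation of nginx; `resolve_with_alias` and `resolve_with_root` are hypothetical helper names.

```python
def resolve_with_alias(location, alias, uri):
    """alias: the matched location prefix is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]


def resolve_with_root(root, uri):
    """root: the full URI is appended to the root path."""
    return root + uri


uri = "/protected_files/myfile.tar.gz"
alias_path = resolve_with_alias("/protected_files", "/var/www/files", uri)
root_path = resolve_with_root("/var/www", uri)

print(alias_path)  # /var/www/files/myfile.tar.gz
print(root_path)   # /var/www/protected_files/myfile.tar.gz
```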

In Django:

response['X-Accel-Redirect'] = '/protected_files/%s' % filename

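With this in place, when a request reaches the Django view, Django checks the user's permissions or does whatever else is needed, then hands the request over to nginx through an internal redirect to the URL /protected_files/filename, and nginx serves the file /var/www/protected_files/filename: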

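import os

from django.contrib.auth.decorators import login_required
from django.http import HttpResponse

# Book is assumed to be the application's own model, with a file field named myBook.

@login_required
def document_view(request, document_id):
    book = Book.objects.get(id=document_id)
    response = HttpResponse()
    name = book.myBook.name.split('/')[-1]
    response['Content-Type'] = 'application/octet-stream'
    # On Python 3, format the str directly; encoding it first would embed b'...' in the header.
    response["Content-Disposition"] = "attachment; filename={0}".format(name)
    response['Content-Length'] = os.path.getsize(book.myBook.path)
    # The prefix must match the internal location defined in the nginx configuration.
    response['X-Accel-Redirect'] = "/protected/{0}".format(book.myBook.name)
    return response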
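Because the redirect target is built from a stored file name, it can be worth normalizing the path before it goes into the header. The following is a hedged sketch, not part of the original view: `accel_redirect_headers` is a hypothetical helper, and the `/protected_files/` prefix is assumed to match the internal nginx location shown earlier.

```python
import posixpath


def accel_redirect_headers(relative_path, internal_prefix="/protected_files/"):
    """Build download headers for a file below the internal nginx location."""
    # Normalizing against "/" collapses any ".." segments, so the result
    # cannot point outside the internal location.
    clean = posixpath.normpath("/" + relative_path).lstrip("/")
    if not clean:
        raise ValueError("empty file path")
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": 'attachment; filename="{0}"'.format(
            posixpath.basename(clean)),
        "X-Accel-Redirect": internal_prefix + clean,
    }


headers = accel_redirect_headers("books/guide.pdf")
escaped = accel_redirect_headers("../../etc/passwd")

print(headers["X-Accel-Redirect"])  # /protected_files/books/guide.pdf
print(escaped["X-Accel-Redirect"])  # /protected_files/etc/passwd
```

In a view, each key and value of the returned dictionary would be assigned onto an empty HttpResponse, for example with `response[key] = value`.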

Source: https://www.cnblogs.com/linxiyue/p/4187484.html
