
Handling large file uploads with Flask

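What is the best way to handle very large file uploads (1 GB+) with Flask?

My application essentially takes multiple files, assigns them one unique file number, and then saves them on the server in a location the user selects.

How can we run file uploads as a background task so that the user doesn't watch the browser spin for an hour, but can proceed to the next page right away?

  • The Flask development server can take massive files (50 GB took 1.5 hours; the upload was quick, but writing the data into a blank file was painfully slow)
  • If I wrap the app with Twisted, the app crashes on large files
  • I've tried using Celery with Redis, but this doesn't seem to be an option for POSTed uploads
  • I'm on Windows and have fewer options for web servers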

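I think the simplest way around this is to send the file in lots of small parts (chunks). There are two parts to making this work: the front end (website) and the back end (server). For the front end, you can use something like Dropzone.js, which has no additional dependencies and ships with decent CSS. All you have to do is add the class dropzone to a form and it automatically turns it into one of its special drag-and-drop fields (you can also click and select).

However, by default Dropzone does not chunk files. Luckily, it is really easy to enable. Here is a sample file upload form with DropzoneJS and chunking enabled: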

<html lang="en">
<head>

    <meta charset="UTF-8">

    <link rel="stylesheet" 
     href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.css"/>

    <link rel="stylesheet" 
     href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/basic.min.css"/>

    <script type="application/javascript" 
     src="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.js">
    </script>

    <title>File Dropper</title>
</head>
<body>

<form method="POST" action='/upload' class="dropzone dz-clickable" 
      id="dropper" enctype="multipart/form-data">
</form>

<script type="application/javascript">
    Dropzone.options.dropper = {
        paramName: 'file',
        chunking: true,
        forceChunking: true,
        url: '/upload',
        maxFilesize: 1025, // megabytes
        chunkSize: 1000000 // bytes
    }
</script>
</body>
</html>

And here is the back-end part using Flask:

import logging
import os

from flask import render_template, Blueprint, request, make_response
from werkzeug.utils import secure_filename

from pydrop.config import config

blueprint = Blueprint('templated', __name__, template_folder='templates')

log = logging.getLogger('pydrop')


@blueprint.route('/')
@blueprint.route('/index')
def index():
    # Route to serve the upload form
    return render_template('index.html',
                           page_name='Main',
                           project_name="pydrop")


@blueprint.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']

    save_path = os.path.join(config.data_dir, secure_filename(file.filename))
    current_chunk = int(request.form['dzchunkindex'])

    # If the file already exists it's ok if we are appending to it,
    # but not if it's new file that would overwrite the existing one
    if os.path.exists(save_path) and current_chunk == 0:
        # 400 and 500s will tell dropzone that an error occurred and show an error
        return make_response(('File already exists', 400))

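    try:
        # 'ab' would ignore seek() and always append, so open for in-place
        # writing instead ('wb' creates the file for the first chunk)
        mode = 'r+b' if os.path.exists(save_path) else 'wb'
        with open(save_path, mode) as f:
            f.seek(int(request.form['dzchunkbyteoffset']))
            f.write(file.stream.read())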
    except OSError:
        # log.exception will include the traceback so we can see what's wrong 
        log.exception('Could not write to file')
        return make_response(("Not sure why,"
                              " but we couldn't write the file to disk", 500))

    total_chunks = int(request.form['dztotalchunkcount'])

    if current_chunk + 1 == total_chunks:
        # This was the last chunk, the file should be complete and the size we expect
        if os.path.getsize(save_path) != int(request.form['dztotalfilesize']):
            log.error(f"File {file.filename} was completed, "
                      f"but has a size mismatch. "
                      f"Was {os.path.getsize(save_path)} but we"
                      f" expected {request.form['dztotalfilesize']}")
            return make_response(('Size mismatch', 500))
        else:
            log.info(f'File {file.filename} has been uploaded successfully')
    else:
        log.debug(f'Chunk {current_chunk + 1} of {total_chunks} '
                  f'for file {file.filename} complete')

    return make_response(("Chunk upload successful", 200))
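
To make the form fields read by the handler above more concrete, here is a small sketch of the arithmetic behind them. The field names (dzchunkindex, dzchunkbyteoffset, dztotalchunkcount, dztotalfilesize) are taken from the handler; the function name chunk_fields and the extra 'length' key are hypothetical and only for illustration. It assumes the client splits the file into fixed-size chunks in order, with a shorter final chunk.

```python
import math


def chunk_fields(total_size, chunk_size):
    """Yield one dict per chunk with the fields the upload handler reads.

    Assumption: chunks are fixed-size and the last one may be shorter.
    """
    total_chunks = max(1, math.ceil(total_size / chunk_size))
    for index in range(total_chunks):
        offset = index * chunk_size
        yield {
            'dzchunkindex': index,
            'dzchunkbyteoffset': offset,
            'dztotalchunkcount': total_chunks,
            'dztotalfilesize': total_size,
            # hypothetical helper key: number of bytes in this chunk
            'length': min(chunk_size, total_size - offset),
        }


# A 2.5 MB file with the 1,000,000-byte chunkSize used in the form
fields = list(chunk_fields(2_500_000, 1_000_000))
```

The handler treats a chunk as the last one when dzchunkindex + 1 equals dztotalchunkcount, which is why the size check only runs once per file.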

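Use copy_current_request_context. It copies the request context, so you can use a thread or anything else to run your task in the background.

Maybe an example will make it clear. I tested it with a 3.37 GB file, debian-9.5.0-amd64-DVD-1.iso.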

# coding:utf-8

import datetime
import os
import threading

from flask import Flask, render_template, request, redirect, url_for
from flask import copy_current_request_context
from werkzeug.utils import secure_filename

app = Flask(__name__)


@app.route('/upload', methods=['POST', 'GET'])
def upload():
    @copy_current_request_context
    def save_file(closeAfterWrite):
        # Runs in a background thread with a copy of the request context
        print(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S') + " i am doing")
        f = request.files['file']
        basepath = os.path.dirname(__file__)
        upload_path = os.path.join(basepath, '', secure_filename(f.filename))
        f.save(upload_path)
        # Close the upload stream only after the file has been written
        closeAfterWrite()
        print(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S') + " write done")

    def passExit():
        # No-op stand-in so the stream is not closed when the response is sent
        pass

    if request.method == 'POST':
        f = request.files['file']
        # Keep the real close() for the thread and disable it for now
        normalExit = f.stream.close
        f.stream.close = passExit
        t = threading.Thread(target=save_file, args=(normalExit,))
        t.start()
        return redirect(url_for('upload'))
    return render_template('upload.html')


if __name__ == '__main__':
    app.run(debug=True)

This is the template; it should be saved as templates\upload.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <h1>example</h1>
    <form action="" enctype='multipart/form-data' method='POST'>
        <input type="file" name="file">
        <input type="submit" value="upload">
    </form>
</body>
</html>

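When uploading a file, you can't just leave the page and have it continue. The page has to stay open for the upload to continue.

What you could do is open a new tab just for handling the upload, and warn the user if they inadvertently close that tab before the upload finishes. That way the upload is separate from whatever the user is doing on the original page, so they can still navigate without cancelling the upload. The upload tab can also close itself when it finishes.

index.js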

    // get value from <input id="upload" type="file"> on page
    var upload = document.getElementById('upload');
    upload.addEventListener('input', function () {
        // open new tab and stick the selected file in it
        var file = upload.files[0];
        var uploadTab = window.open('/upload-page', '_blank');
        if (uploadTab) {
            uploadTab.file = file;
        } else {
            alert('Failed to open new tab');
        }
    });

upload-page.js

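    function warnBeforeLeaving(evt) {
        // ask the browser to show its "leave this page?" confirmation
        evt.preventDefault();
        evt.returnValue = 'The upload will cancel if you leave the page, continue?';
        return evt.returnValue;
    }
    window.addEventListener('beforeunload', warnBeforeLeaving);
    window.addEventListener('load', function () {
        var req = new XMLHttpRequest();
        // upload progress events fire on req.upload, not on req itself
        req.upload.addEventListener('progress', function (evt) {
            var percentage = '' + (evt.loaded / evt.total * 100) + '%';
            // use percentage to update progress bar or something
        });
        req.addEventListener('load', function () {
            alert('Upload Finished');
            // removeEventListener needs the same function that was registered
            window.removeEventListener('beforeunload', warnBeforeLeaving);
            window.close();
        });
        // open() must be called before setRequestHeader()
        req.open('POST', '/upload/' + encodeURIComponent(window.file.name));
        req.setRequestHeader('Content-Type', 'application/octet-stream');
        req.send(window.file);
    });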

On the server, you can use request.stream to read the uploaded file in chunks, which avoids waiting for the whole thing to be loaded into memory first.

server.py

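import os
import urllib.parse

from flask import Flask, make_response, request
from werkzeug.utils import secure_filename

app = Flask(__name__)


@app.route('/upload/<filename>', methods=['POST'])
def upload(filename):
    # sanitise the client-supplied name before using it in a path
    filename = secure_filename(urllib.parse.unquote(filename))
    bytes_left = int(request.headers.get('content-length'))
    with open(os.path.join('uploads', filename), 'wb') as upload:
        chunk_size = 5120
        while bytes_left > 0:
            chunk = request.stream.read(chunk_size)
            if not chunk:
                # stream ended early; stop instead of looping forever
                break
            upload.write(chunk)
            bytes_left -= len(chunk)
    return make_response('Upload Complete', 200)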
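
The chunked read loop can also be factored into a small helper that works on any file-like object, which makes it easy to try outside Flask. This is only a sketch: the name copy_stream and its parameters are hypothetical, and io.BytesIO stands in for request.stream. It stops on an empty read, so a connection that drops before content-length bytes arrive cannot leave the loop spinning.

```python
import io


def copy_stream(src, dst, expected_bytes, chunk_size=5120):
    """Copy up to expected_bytes from src to dst in chunks.

    Returns the number of bytes actually written, which can be
    compared with expected_bytes to detect a truncated upload.
    """
    written = 0
    while written < expected_bytes:
        chunk = src.read(min(chunk_size, expected_bytes - written))
        if not chunk:
            # source ended early (for example, the client disconnected)
            break
        dst.write(chunk)
        written += len(chunk)
    return written


# In-memory stand-ins for request.stream and the destination file
source = io.BytesIO(b'x' * 12000)
target = io.BytesIO()
copied = copy_stream(source, target, expected_bytes=12000)
```

Inside the view, the same helper could be called with request.stream and the open file, returning an error response when the result is smaller than the content-length header.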

You may be able to use the FormData API instead of an octet-stream, but I'm not sure whether you can stream those in Flask.
