
Upload a CSV file in FastAPI and convert it into JSON format

I am trying to convert my CSV file into JSON by uploading it to FastAPI first, but when I try to process it directly (without storing it somewhere), I get this error:

FileNotFoundError: [Errno 2] No such file or directory: "testdata.csv"

Code:

async def upload(file: UploadFile = File(...)):
    data = {}    
    with open(file.filename,encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        for rows in csvReader:             
            key = rows['No']
            data[key] = rows    
    return {data}

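UploadFile uses Python's SpooledTemporaryFile, a "file stored in memory" that "is destroyed as soon as it is closed". file.filename is only the name the client sent with the upload, not a path on the server, so open(file.filename) raises FileNotFoundError. For more info on that, please have a look at this answer.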
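To see why there is no path to open, here is a small standard-library sketch of how a SpooledTemporaryFile behaves on its own, outside FastAPI. The max_size value and the sample bytes are arbitrary choices for the demo, and the name reported after the rollover depends on the platform.

```python
import tempfile

# max_size is an arbitrary threshold chosen for this demo.
with tempfile.SpooledTemporaryFile(max_size=1024) as spooled:
    spooled.write(b"No,Name\n1,Ann\n")
    name_in_memory = spooled.name    # None: the data lives in an in-memory buffer

    spooled.rollover()               # force the switch to a real temporary file
    name_on_disk = spooled.name      # platform dependent; often a bare file descriptor number

    spooled.seek(0)
    first_line = spooled.readline()  # the open handle is the way to reach the data

print(name_in_memory, name_on_disk, first_line)
```

Either way, the data is reached through the open file object rather than through a filename.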

Option 1

To approach the problem your way (i.e., reading from a CSV file on disk rather than using the contents you get from contents = await file.read()), you can copy the file contents into a NamedTemporaryFile (again, check this answer out for more info on that), and then iterate over the CSV contents from it. Below is a working example:

import uvicorn
from fastapi import FastAPI, File, UploadFile
from tempfile import NamedTemporaryFile
import os
import csv

app = FastAPI()
    

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    contents = await file.read()
    data = {}
    # delete=False keeps the file on disk after it is closed, so it can be reopened by name
    file_copy = NamedTemporaryFile(delete=False)
    
    try:
        with file_copy as f:  # The 'with' block closes the file, flushing the data to disk
            f.write(contents)
        
        with open(file_copy.name, 'r', encoding='utf-8') as csvf:
            csvReader = csv.DictReader(csvf)
            for rows in csvReader:
                key = rows['No']
                data[key] = rows
    finally:
        os.unlink(file_copy.name)  # delete the temp file (already closed by the 'with' block above)
    
    return data
    

Option 2

Alternatively, a more elegant solution is to use, as mentioned earlier, the byte data of the uploaded file, which saves you from copying it into a new temporary file. Convert the bytes into a string, then load the string into an in-memory text buffer (i.e., StringIO), as mentioned here, which you can pass to the csv reader. Example below:

from fastapi import FastAPI, File, UploadFile
import csv
from io import StringIO

app = FastAPI()

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    data = {}
    contents = await file.read()
    decoded = contents.decode()
    buffer = StringIO(decoded)
    csvReader = csv.DictReader(buffer)
    for rows in csvReader:             
        key = rows['No']
        data[key] = rows  
        
    buffer.close()
    return data
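One thing that can trip up the rows['No'] lookup in this approach: some spreadsheet tools write a byte-order mark at the start of the file, and a plain decode() leaves it attached to the first header name. The sketch below is one way to guard against that, assuming the upload is UTF-8. The helper name csv_bytes_to_dict and the sample bytes are made up for illustration.

```python
import csv
from io import StringIO


def csv_bytes_to_dict(contents: bytes, key: str = "No") -> dict:
    """Hypothetical helper: index CSV rows by the value in the `key` column."""
    # 'utf-8-sig' decodes plain UTF-8 and also drops a leading BOM if present.
    text = contents.decode("utf-8-sig")
    # newline="" leaves line endings untouched so the csv module can handle them.
    with StringIO(text, newline="") as buffer:
        return {row[key]: row for row in csv.DictReader(buffer)}


# Sample upload bytes with a BOM and Windows line endings (invented for the demo).
sample = b"\xef\xbb\xbfNo,Name\r\n1,Ann\r\n2,Bob\r\n"
result = csv_bytes_to_dict(sample)
print(result)
```

Inside the endpoint, this would be called as csv_bytes_to_dict(await file.read()).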

Option 3

You could also write the bytes from the uploaded file to a BytesIO stream, which you could then convert into a pandas dataframe. Next, using the to_dict() method (as described in this answer), you could convert the dataframe into a dictionary and return it (which, by default, FastAPI will convert into JSON using the jsonable_encoder and return as a JSONResponse).

from fastapi import FastAPI, File, UploadFile
from io import BytesIO
import pandas as pd

app = FastAPI()
    
@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    contents = await file.read()
    buffer = BytesIO(contents)
    df = pd.read_csv(buffer)
    buffer.close()
    return df.to_dict(orient='records')

You are getting FileNotFoundError: [Errno 2] No such file or directory: "testdata.csv" because you are trying to read a file that is not stored locally.

If you want to read the file this way, you should save the uploaded file first:

import csv
import os

from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/upload")
async def upload(uploaded_file: UploadFile = File(...)):
    # save csv to local dir (the directory must already exist)
    csv_name = uploaded_file.filename
    csv_path = 'path_to/csv_dir/'
    file_path = os.path.join(csv_path, csv_name)
    with open(file_path, mode='wb+') as f:
        f.write(uploaded_file.file.read())

    # read csv and convert to json
    data = {}
    with open(file_path, mode='r', encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        for rows in csvReader:
            key = rows['No']
            data[key] = rows
    return data  # return the dict itself; {data} would try to build a set and raise TypeError
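If you do save the upload to disk, the copy step can be written so that it streams in chunks and ignores any directory parts in the client-supplied name. The following is a minimal sketch using only the standard library. The helper name save_upload is hypothetical, BytesIO stands in for uploaded_file.file, and os.path.basename is a basic precaution rather than complete filename sanitisation.

```python
import os
import shutil
import tempfile
from io import BytesIO


def save_upload(src, client_filename: str, dest_dir: str) -> str:
    """Hypothetical helper: copy an open binary file object into dest_dir."""
    # Keep only the final path component so a name like '../x.csv' stays inside dest_dir.
    safe_name = os.path.basename(client_filename)
    file_path = os.path.join(dest_dir, safe_name)
    with open(file_path, "wb") as out:
        shutil.copyfileobj(src, out)  # copies in chunks rather than one big read()
    return file_path


# Stand-ins for uploaded_file.file and uploaded_file.filename
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_upload(BytesIO(b"No,Name\n1,Ann\n"), "../testdata.csv", tmp_dir)
    saved_name = os.path.basename(saved_path)
    saved_in_tmp_dir = os.path.dirname(saved_path) == tmp_dir
    with open(saved_path, "rb") as f:
        saved_bytes = f.read()
```

In the endpoint, the call would be save_upload(uploaded_file.file, uploaded_file.filename, csv_path).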

The file in the async function upload() is already open, and you can read from it directly; there is no need to open it again. Also, in FastAPI the UploadFile class wraps a standard library tempfile.SpooledTemporaryFile (exposed as file.file), which cannot be accessed by specifying the path of a temporary file.

For example, if you use CPython on a Unix-like system and read file.file.name inside upload(), you get a number (or None while the data is still held in memory) instead of a well-formed path. A SpooledTemporaryFile only creates a real file once the stored data exceeds max_size, and on Unix its name is then simply the file descriptor. file.filename, by contrast, is just the name supplied by the client.
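Reading from the already-open file, as suggested above, could look like the following sketch. It assumes the upload is UTF-8 encoded and uses codecs.iterdecode to turn the binary file's lines into text for the csv module. The helper name rows_by_key is hypothetical, and BytesIO stands in for file.file so the snippet runs without a server.

```python
import codecs
import csv
from io import BytesIO


def rows_by_key(binary_file, key: str = "No") -> dict:
    """Hypothetical helper: read CSV rows straight from an open binary file object."""
    # iterdecode lazily turns the file's byte lines into str lines for csv.
    lines = codecs.iterdecode(binary_file, "utf-8")
    return {row[key]: row for row in csv.DictReader(lines)}


# BytesIO stands in for the already-open file.file of an upload.
upload_stream = BytesIO(b"No,Name\n1,Ann\n2,Bob\n")
data = rows_by_key(upload_stream)
print(data)
```

In the endpoint, this would be called as rows_by_key(file.file), with no temporary copy and no second open().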
