How to import CSV data into Django models

I have some CSV data that I want to import into Django models. Example CSV data:

1;"02-01-101101";"Worm Gear HRF 50";"Ratio 1 : 10";"input shaft, output shaft, direction A, color dark green";
2;"02-01-101102";"Worm Gear HRF 50";"Ratio 1 : 20";"input shaft, output shaft, direction A, color dark green";
3;"02-01-101103";"Worm Gear HRF 50";"Ratio 1 : 30";"input shaft, output shaft, direction A, color dark green";
4;"02-01-101104";"Worm Gear HRF 50";"Ratio 1 : 40";"input shaft, output shaft, direction A, color dark green";
5;"02-01-101105";"Worm Gear HRF 50";"Ratio 1 : 50";"input shaft, output shaft, direction A, color dark green";

I have a Django model named Product. It has fields such as name, description and price. I want something like this:

product=Product()
product.name = "Worm Gear HRF 70(02-01-101116)"
product.description = "input shaft, output shaft, direction A, color dark green"
product.price = 100

You want to use the csv module that is part of the Python standard library, and you should use Django's get_or_create method:

import csv

with open(path) as f:
    reader = csv.reader(f)
    for row in reader:
        # get_or_create returns a tuple: the new or existing object,
        # and a boolean telling whether it was created
        _, created = Teacher.objects.get_or_create(
            first_name=row[0],
            last_name=row[1],
            middle_name=row[2],
        )

In my example, the model Teacher has three attributes: first_name, last_name and middle_name.

Django documentation of the get_or_create method
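Applied to the semicolon-delimited sample from the question, the parsing half of this approach can be sketched with the standard library alone. The helper name row_to_product_fields and the column-to-field mapping are assumptions made for illustration; the sample rows carry no price, so that field is left out.

```python
import csv
import io

# Two rows copied from the question's sample data.
SAMPLE = (
    '1;"02-01-101101";"Worm Gear HRF 50";"Ratio 1 : 10";'
    '"input shaft, output shaft, direction A, color dark green";\n'
    '2;"02-01-101102";"Worm Gear HRF 50";"Ratio 1 : 20";'
    '"input shaft, output shaft, direction A, color dark green";\n'
)


def row_to_product_fields(row):
    """Map one parsed CSV row to keyword arguments (hypothetical mapping)."""
    # The trailing ';' yields an extra empty field, so only take the first five.
    _id, product_id, name, _ratio, description = row[:5]
    return {
        "name": "{}({})".format(name, product_id),
        "description": description,
    }


reader = csv.reader(io.StringIO(SAMPLE), delimiter=";")
parsed = [row_to_product_fields(row) for row in reader]

for fields in parsed:
    print(fields["name"])
    # Inside a Django project each dict could be passed on, e.g.
    # Product.objects.get_or_create(**fields)
```

Keeping the row-to-fields mapping in a plain function means it can be checked without a database before any model is touched.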

If you want to use a library, a quick Google search for csv and django reveals two libraries: django-csvimport and django-adaptors. Let's read what they have to say about themselves...

  • django-adaptors:

Django adaptor is a tool which allow you to transform easily a CSV/XML file into a python object or a django model instance.

  • django-csvimport:

django-csvimport is a generic importer tool to allow the upload of CSV files for populating data.

The first requires you to write a model to match the csv file, while the second is more of a command-line importer, which is a huge difference in the way you work with them, and each is good for a different type of project.

So which one to use? That depends on which of those will be better suited for your project in the long run.

However, you can also avoid a library altogether by writing your own Django script to import your csv file, something along the lines of the following rough sketch (adapt the path, model and field names to your project):

import csv

# import the relevant model
from myproject.models import Foo

# open the file and create a csv reader
with open("path/to/file.csv", newline="") as f:
    reader = csv.reader(f)
    for i, line in enumerate(reader, start=1):
        # line is already parsed into a list of fields;
        # add custom validation/parsing for some of the fields here
        foo = Foo(fieldname1=line[1], fieldname2=line[2])  # ... etc.
        try:
            foo.save()
        except Exception:
            # if there's a problem anywhere, you want to know about it
            print("there was a problem with line", i)

It's super easy. Hell, you can do it interactively through the django shell if it's a one-time import. Just figure out what you want to do with your project and how many files you need to handle, and then, if you decide to use a library, try figuring out which one better suits your needs.

Use the Pandas library to create a dataframe of the csv data.
Name the fields either by including them in the csv file's first line or in code by setting the dataframe's columns attribute.
Then create a list of model instances.
Finally use the Django method .bulk_create() to send your list of model instances to the database table.

The read_csv function in pandas is great for reading csv files and gives you lots of parameters to skip lines, omit fields, etc.

import pandas as pd
from app.models import Product

# ensure fields are named ID, Product_ID, Name, Ratio, Description
# (a price column must also exist for the lookup below to work)
tmp_data = pd.read_csv('file.csv', sep=';')
# concatenate Name and Product_ID to make a new field, a la Dr.Dee's answer
products = [
    Product(
        name=row['Name'],
        description=row['Description'],
        price=row['price'],
    )
    # iterrows() replaces the removed .ix indexer and walks the rows in order
    for _, row in tmp_data.iterrows()
]
Product.objects.bulk_create(products)

I was using the answer by mmrs151, but saving each row (instance) was very slow, and any fields containing the delimiting character (even inside of quotes) were not handled by the open() / line.split(';') method.

Pandas has so many useful features that it is worth getting to know.
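The quoting problem mentioned above is easy to reproduce with the standard library. The line below is a made-up example (it is not in the question's data) whose quoted last field contains the ';' delimiter; it compares a plain split with csv.reader.

```python
import csv

# A hypothetical line whose quoted description contains the ';' delimiter.
line = '7;"02-01-101107";"Worm Gear HRF 50";"shaft A; shaft B"'

naive = line.split(";")
parsed = next(csv.reader([line], delimiter=";"))

print(len(naive))   # the quoted field is cut in two
print(len(parsed))  # the quoted field stays whole
print(parsed[3])
```

The plain split also leaves the surrounding quote characters in every field, while csv.reader removes them.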

You can also use django-adaptors:

>>> from adaptor.model import CsvModel
>>> class MyCsvModel(CsvModel):
...     name = CharField()
...     age = IntegerField()
...     length = FloatField()
...
...     class Meta:
...         delimiter = ";"

You declare a MyCsvModel which will match a CSV file like this:

Anthony;27;1.75

To import the file or any iterable object, just do:

>>> my_csv_list = MyCsvModel.import_data(data = open("my_csv_file_name.csv"))
>>> first_line = my_csv_list[0]
>>> first_line.age
    27

Without an explicit declaration, data and columns are matched in the same order:

Anthony --> Column 0 --> Field 0 --> name
27      --> Column 1 --> Field 1 --> age
1.75    --> Column 2 --> Field 2 --> length
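The positional matching shown above can be illustrated in plain Python. This is only a sketch of the idea, not django-adaptors' implementation; FIELDS and match_columns are hypothetical names standing in for the field declarations on the model.

```python
# Field declarations in order: (name, converter). Hypothetical stand-in
# for the CharField / IntegerField / FloatField declarations.
FIELDS = [("name", str), ("age", int), ("length", float)]


def match_columns(line, delimiter=";"):
    """Pair column N with field N and convert each value."""
    columns = line.strip().split(delimiter)
    return {
        name: convert(value)
        for (name, convert), value in zip(FIELDS, columns)
    }


record = match_columns("Anthony;27;1.75")
print(record)
```

Because the pairing is purely by position, reordering the declarations changes which column feeds which field.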

For Django 1.8, which I'm using:

I made a command that creates objects dynamically, so you only pass the file path of the CSV, the model name and the app name of the relevant Django application, and it populates the model without you having to specify the field names. For example, take the following CSV:

field1,field2,field3
value1,value2,value3
value11,value22,value33

It will create the objects [{field1:value1,field2:value2,field3:value3}, {field1:value11,field2:value22,field3:value33}] for the model name you pass to the command.
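The header-to-value pairing can be previewed without Django. The sketch below uses csv.DictReader, a standard-library alternative to pairing the header and each row by hand, on the same three-column sample; CSV_TEXT is just the sample inlined as a string.

```python
import csv
import io

CSV_TEXT = (
    "field1,field2,field3\n"
    "value1,value2,value3\n"
    "value11,value22,value33\n"
)

# DictReader uses the first line as keys for every following row.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
rows = [dict(row) for row in reader]

for kwargs in rows:
    print(kwargs)
    # In a management command each dict could be passed on as
    # model.objects.create(**kwargs)
```

All values arrive as strings, so any conversion to numbers or dates is left to the model fields or to your own code.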

The command code:

from django.core.management.base import BaseCommand
from django.db.models.loading import get_model
import csv


class Command(BaseCommand):
    help = 'Creating model objects according the file path specified'

    def add_arguments(self, parser):
        parser.add_argument('--path', type=str, help="file path")
        parser.add_argument('--model_name', type=str, help="model name")
        parser.add_argument('--app_name', type=str, help="django app name that the model is connected to")

    def handle(self, *args, **options):
        file_path = options['path']
        _model = get_model(options['app_name'], options['model_name'])
        # text mode with newline='' is what the csv module expects on Python 3
        with open(file_path, newline='') as csv_file:
            reader = csv.reader(csv_file, delimiter=',', quotechar='|')
            header = next(reader)
            for row in reader:
                _object_dict = {key: value for key, value in zip(header, row)}
                _model.objects.create(**_object_dict)

Note that in later versions (Django 1.9 and up)

from django.db.models.loading import get_model

has been removed and needs to be changed to

from django.apps import apps

with the model then looked up as apps.get_model(app_name, model_name).

The Python csv library can do the parsing, and your code can translate the rows into Products().

Something like this:

with open('data.txt', 'r') as f:
    for line in f:
        # naive split: breaks if a quoted field itself contains ';'
        line = [field.strip('"') for field in line.strip().split(';')]
        product = Product()
        product.name = line[2] + '(' + line[1] + ')'
        product.description = line[4]
        product.price = ''  # data is missing from file
        product.save()

If you're working with a recent version of Django (>1.10) and don't want to spend time writing the model definition, you can use the ogrinspect tool, which is part of django.contrib.gis.

This will create a code definition for the model.

python manage.py ogrinspect [/path/to/thecsv] Product

The output will be the class (model) definition. In this case the model will be called Product. You need to copy this code into your models.py file.

Afterwards you need to migrate (in the shell) the new Product table with:

python manage.py makemigrations
python manage.py migrate

More information here: https://docs.djangoproject.com/en/1.11/ref/contrib/gis/tutorial/

Do note that the example has been done for ESRI Shapefiles, but it works pretty well with standard CSV files too.

For ingesting your data (in CSV format) you can use pandas.

import pandas as pd
your_dataframe = pd.read_csv(path_to_csv)
# Make a row iterator (this will go row by row)
iter_data = your_dataframe.iterrows()

Now every row needs to be transformed into a dictionary, and that dict is used to instantiate your model (in this case, Product()):

# iterrows() yields (index, row) pairs; each row is a pandas Series
for i, data in iter_data:
    Product.objects.create(**data.to_dict())

Done, check your database now.

Write a management command in your Django app that opens the CSV file, loops over it and creates a model instance for every row.

your_app_folder/management/commands/ProcessCsv.py

import csv
import os

from django.core.management.base import BaseCommand
from django.conf import settings
from your_app_name.models import Product


class Command(BaseCommand):
    def handle(self, *args, **options):
        csv_path = os.path.join(settings.BASE_DIR, 'your_csv_file.csv')
        with open(csv_path, 'r', newline='') as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=';')
            for row in csv_reader:
                # adjust the column indexes to match your file
                Product.objects.create(name=row[2], description=row[3], price=row[4])

At the end, just run the command to process your CSV file and insert it into the Product model.

Terminal: python manage.py ProcessCsv

That's it.

You can use the django-csv-importer package. http://pypi.python.org/pypi/django-csv-importer/0.1.1

It works like a Django model:

class MyCsvModel(CsvModel):
    field1 = IntegerField()
    field2 = CharField()
    # etc.

    class Meta:
        delimiter = ";"
        dbModel = Product

And you just have to: CsvModel.import_from_file("my file")

That will automatically create your products.

You can give django-import-export a try. It has nice admin integration and a preview of changes, and it can create, update and delete objects.

This is based off of Erik's answer from earlier, but I've found it easiest to read in the .csv file using pandas and then create a new instance of the class for every row in the data frame.

This example is updated to use iloc, as pandas no longer supports ix in the most recent version. I don't know about Erik's situation, but you need to create the list outside of the for loop, otherwise it will not append to your list but simply overwrite it.

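import pandas as pd
from app.models import Product  # adjust to your app's module path

# 'path_to_file' and 'delimiter' are placeholders for your own values
df = pd.read_csv('path_to_file', sep='delimiter')
products = []
for i in range(len(df)):
    products.append(
        Product(
            name=df.iloc[i, 0],
            description=df.iloc[i, 1],
            price=df.iloc[i, 2],
        )
    )
Product.objects.bulk_create(products)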

This just walks the DataFrame row by row and selects each column of that row by its zero-based position (i.e. name is the first column, description the second, etc.).

Hope that helps.

Here's a Django egg for it:

django-csvimport

Consider using Django's built-in deserializers. Django's docs are well-written and can help you get started. Consider converting your data from csv to XML or JSON and using a deserializer to import the data. If you're doing this from the command line (rather than through a web request), the loaddata manage.py command will be especially helpful.
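As a rough sketch of the CSV-to-JSON step, the snippet below turns one sample row from the question into Django's JSON fixture layout (a list of objects with model, pk and fields keys). The model label "app.product", the helper name csv_to_fixture and the column mapping are assumptions; price is omitted because the sample data has none, so the model would need a default or a nullable price for such a fixture to load.

```python
import csv
import io
import json

# One row copied from the question's sample data.
SAMPLE = (
    '1;"02-01-101101";"Worm Gear HRF 50";"Ratio 1 : 10";'
    '"input shaft, output shaft, direction A, color dark green";\n'
)


def csv_to_fixture(text, model_label="app.product"):
    """Return a list in Django's JSON fixture layout (model / pk / fields)."""
    fixture = []
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        fixture.append({
            "model": model_label,  # hypothetical "<app_label>.<model_name>"
            "pk": int(row[0]),
            "fields": {
                "name": "{}({})".format(row[2], row[1]),
                "description": row[4],
            },
        })
    return fixture


fixture = csv_to_fixture(SAMPLE)
fixture_json = json.dumps(fixture, indent=2)
print(fixture_json)
```

Written to a file such as products.json (a hypothetical name) inside an app's fixtures directory, the output could then be loaded with python manage.py loaddata products.json.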

Define a class in models.py with a function in it that reads the CSV file.

import csv

from django.db import models


class all_products(models.Model):
    @staticmethod
    def get_all_products():
        items = []
        with open('EXACT FILE PATH OF YOUR CSV FILE', 'r') as fp:
            # You can also put the relative path of csv file
            # with respect to the manage.py file
            reader1 = csv.reader(fp, delimiter=';')
            for value in reader1:
                items.append(value)
        return items

You can access the ith element in the list as items[i].
