
How do I convert a .csv file to a .db file using Python?

I want to convert a CSV file to a db (database) file using Python. How should I do it?

  1. You need to either find a library that parses the CSV file for you, or read the file line by line and parse it with standard Python. In the simplest case that means splitting each line on commas, although a plain split breaks on quoted fields that themselves contain commas.

  2. Insert the rows into the SQLite database. The Python documentation on SQLite (the sqlite3 module) covers this. You could also use SQLAlchemy or another ORM.
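
The two steps above can be sketched with only the standard library. This is a minimal illustration, not a definitive implementation: the function name `csv_to_sqlite_text`, the default table name `data`, and the choice to store every column as TEXT are assumptions made for the example. It also assumes the first line is a header and that every data row has as many cells as the header.

```python
import csv
import sqlite3


def csv_to_sqlite_text(csv_path, db_path, table="data"):
    """Load a CSV file into an SQLite table, storing every column as TEXT.

    Hypothetical helper for illustration; returns the number of rows inserted.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)                 # handles quoted fields with commas
        header = next(reader)                  # first line holds the column names
        rows = [row for row in reader if row]  # skip completely blank lines

    def quote(identifier):
        # Quote identifiers so names with spaces still work in SQL.
        return '"{}"'.format(identifier.replace('"', '""'))

    columns = ", ".join("{} TEXT".format(quote(name)) for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(db_path)            # creates the .db file if missing
    try:
        with conn:                             # commit on success, roll back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS {} ({})".format(quote(table), columns)
            )
            conn.executemany(
                "INSERT INTO {} VALUES ({})".format(quote(table), placeholders),
                rows,
            )
    finally:
        conn.close()
    return len(rows)
```

Calling `csv_to_sqlite_text("people.csv", "people.db", "people")` would create `people.db` with one table whose columns are named after the CSV header. The values are passed as `?` parameters rather than pasted into the SQL string, so cell contents cannot break the statement.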

Another way could be to use the sqlite shell itself.

I don't think this can be done in full generality without out-of-band information, unless you just treat everything as strings/text. That is, the information contained in the CSV file won't, in general, be sufficient to create a semantically "satisfying" solution. Inferring what the types probably are might be good enough for some cases, but it'll be far from bulletproof.

I would use Python's csv and sqlite3 modules, and try to:

  • convert the cells in the first CSV line into names for the SQL columns (strip "oddball" characters)
  • infer the types of the columns by going through the cells in the second CSV file line (the first line of data), attempting to convert each one first to an int; if that fails, try a float; and if that fails too, fall back to strings
  • this would give you a list of names and a list of corresponding probable types, from which you can roll a CREATE TABLE statement and execute it
  • try to INSERT the first and subsequent data lines from the CSV file
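
A rough sketch of those four steps follows. The helper names `clean_name`, `infer_type`, and `csv_to_sqlite` are invented for the example. So is the mapping to the SQLite type names INTEGER, REAL, and TEXT, and the rule for what counts as an "oddball" character (anything that is not a letter, digit, or underscore). Duplicate column names and rows of uneven length are not handled.

```python
import csv
import re
import sqlite3


def clean_name(cell, index):
    """Turn a header cell into a plain SQL column name (hypothetical rule)."""
    name = re.sub(r"\W+", "_", cell.strip()).strip("_")
    if not name:
        name = "col{}".format(index)   # header cell was empty or all symbols
    elif name[0].isdigit():
        name = "c_" + name             # avoid names that start with a digit
    return name


def infer_type(cell):
    """Guess a type from one cell: try int, then float, else fall back to text."""
    for convert, sql_type in ((int, "INTEGER"), (float, "REAL")):
        try:
            convert(cell)
            return sql_type
        except ValueError:
            pass
    return "TEXT"


def csv_to_sqlite(csv_path, db_path, table="data"):
    """Create `table` in db_path from a CSV file; returns [(name, type), ...]."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        names = [clean_name(c, i) for i, c in enumerate(next(reader))]
        rows = [row for row in reader if row]
    if not rows:
        raise ValueError("no data rows to infer types from")

    types = [infer_type(cell) for cell in rows[0]]   # first data line only
    columns = ", ".join('"{}" {}'.format(n, t) for n, t in zip(names, types))
    placeholders = ", ".join("?" for _ in names)

    conn = sqlite3.connect(db_path)
    try:
        with conn:                                   # one transaction for all rows
            # `table` is assumed to be a trusted, simple identifier.
            conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
            conn.executemany(
                'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows
            )
    finally:
        conn.close()
    return list(zip(names, types))
```

The cells are inserted as the strings that `csv.reader` produced. SQLite's column type affinity converts well-formed numeric text to numbers in INTEGER and REAL columns, so no per-row conversion is done in Python here.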

There are many things to criticize in such an approach (e.g. no keys or indexes, and it fails if a field is a string in general but just so happens to contain a value that's Python-convertible to an int or float in the first data line), but it'll probably work passably for the majority of CSV files.
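
The first-data-line weakness can be softened by looking at every row instead of only the first. This goes beyond what the answer proposes: the function `infer_column_types` and its widening order INTEGER, REAL, TEXT are assumptions made for illustration. Note that an empty cell counts as TEXT under this simple rule.

```python
def infer_type(cell):
    """Guess a type from one cell: try int, then float, else fall back to text."""
    for convert, sql_type in ((int, "INTEGER"), (float, "REAL")):
        try:
            convert(cell)
            return sql_type
        except ValueError:
            pass
    return "TEXT"


def infer_column_types(rows):
    """Pick, per column, the narrowest type that fits the cells of every row."""
    order = {"INTEGER": 0, "REAL": 1, "TEXT": 2}   # narrow to wide
    types = None
    for row in rows:
        guesses = [infer_type(cell) for cell in row]
        if types is None:
            types = guesses
        else:
            # Keep whichever of the two guesses is wider for each column.
            types = [max(a, b, key=order.__getitem__) for a, b in zip(types, guesses)]
    return types or []


rows = [["1001", "3"], ["A-17", "2.5"], ["1003", "4"]]
print([infer_type(cell) for cell in rows[0]])  # ['INTEGER', 'INTEGER']
print(infer_column_types(rows))                # ['TEXT', 'REAL']
```

Judging by the first data line alone, both columns look like integers. Scanning all rows shows that the first column is really text and the second needs a floating-point type.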

