

Creating an Excel data sheet from a big CSV file using Python (or other)

At my job we work with a huge data set of real estate properties, stored in a CSV file of around 200,000 lines (and constantly growing).

This CSV file includes columns with info such as pricing, surface area, year built, street, street number, post code, etc.

Part of our work involves creating an Excel sheet of properties that are comparable to a given object within certain limits (e.g. surface area +/- 20%).

I want to automate generating such an Excel list, and I was thinking about using Python for this. Here is what I want the program to do:

1) Read in the CSV file

2) Take in all the parameters that are to be compared for the Excel sheet

3) Create an Excel sheet from the CSV data with the properties that fit these parameters

4) Rewrite abstract parameter values as descriptions (e.g. if the value of column 'dishwasher' is '0', write 'No dishwasher available') and append the value in the house_number column to the street_name column value

Is Python a good way of handling this, or would you have other suggestions?

Python is a good language for data parsing like this. The pandas library might be helpful: it has functions for importing CSV files and for operating on the resulting data. pandas can also export directly to the Excel format.
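As a rough illustration of the four steps from the question, here is a minimal sketch with pandas. The column names `price`, `surface_area` and `year_built`, the sample rows, the helper function names, and the text used for a dishwasher value of 1 are all assumptions made for the example; the real CSV headers and codes are not shown in the question, so adjust them to match the file.

```python
import io

import pandas as pd

# Hypothetical sample data; the real file and its headers are not shown in the question.
SAMPLE_CSV = """price,surface_area,year_built,street_name,house_number,dishwasher
250000,100,1990,Main Street,12,1
310000,130,2005,Oak Avenue,7,0
275000,95,1985,Elm Road,3,0
"""


def load_properties(source):
    """Step 1: read the CSV (a path or file-like object) into a DataFrame."""
    return pd.read_csv(source)


def find_comparables(df, surface_area, tolerance=0.20):
    """Steps 2 and 3: keep rows whose surface area is within +/- tolerance of the target."""
    lower = surface_area * (1 - tolerance)
    upper = surface_area * (1 + tolerance)
    return df[df["surface_area"].between(lower, upper)].copy()


def make_readable(df):
    """Step 4: replace coded values with text and append the house number to the street."""
    out = df.copy()
    # The wording for 1 is an assumption; the question only gives the text for 0.
    out["dishwasher"] = out["dishwasher"].map(
        {0: "No dishwasher available", 1: "Dishwasher available"}
    )
    out["street_name"] = out["street_name"] + " " + out["house_number"].astype(str)
    return out


def export_to_excel(df, path):
    """Write the result to an .xlsx file; pandas needs an Excel engine such as openpyxl."""
    df.to_excel(path, index=False)


# In-memory sample instead of the real file, so the sketch runs on its own.
properties = load_properties(io.StringIO(SAMPLE_CSV))
comparables = make_readable(find_comparables(properties, surface_area=100))
print(comparables)
```

With the real data you would pass the CSV path to `load_properties` and finish with something like `export_to_excel(comparables, "comparables.xlsx")`. Further limits (price range, year built, post code) can be added as extra conditions in `find_comparables` in the same way.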


 