
Read Pandas dataframe from csv file and convert to Python types

I want to read a Pandas dataframe from a csv file whose elements have particular Python types, such as lists and dictionaries, and numpy arrays. I want to read it such that I can work with them immediately (right now they are read as strings). How do I do that?

I want functionality similar to ast.literal_eval, but hopefully there is a way to do it without looping over the whole dataframe.

Edit: as requested, a minimal reproducible example.

import pandas as pd
import numpy as np

# One row holding an int, a list, a dict, and a numpy array
data = {'integer': 1, 'list': [1, 2, 3], 'dictionary': {}, 'np_array': np.array([1, 2, 3])}
output = pd.DataFrame([data])
    
filename = 'data.csv'
output.to_csv(filename)

input_data = pd.read_csv(filename, ???) # What to do here?

Ideally, I want a way where I don't have to input the datatypes manually (not sure if there is such an approach).

For people of the future: for simple data types it is possible to use the dtype parameter, like so:

input_data = pd.read_csv(filename, dtype = {'integer':'int'})
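To see the effect of dtype in isolation, here is a minimal sketch that reads from an in-memory string instead of a file. The csv text and the column name label are made up for illustration and are not part of the original example.

```python
import io

import pandas as pd

# Hypothetical csv content; 'label' holds codes with leading zeros.
csv_text = "integer,label\n1,007\n2,042\n"

# Without dtype, pandas infers the types and parses 'label' as integers.
inferred = pd.read_csv(io.StringIO(csv_text))

# With dtype, each listed column is read as the requested simple type.
typed = pd.read_csv(io.StringIO(csv_text), dtype={'integer': 'float64', 'label': str})
```

In the typed frame the label values stay as text, so the leading zeros survive.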

However, for objects, this does not work properly. Then you can use the converters parameter instead. This is a dictionary that maps a column name to the function used to convert each value in that column. One can use the function ast.literal_eval from the ast module:

import ast
input_data = pd.read_csv(filename, converters={'integer': ast.literal_eval, 'dictionary': ast.literal_eval, 'list': ast.literal_eval})
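As a rough, self-contained check of the converters approach, the sketch below writes a one-row frame to an in-memory buffer and reads it back. The use of io.StringIO in place of a file and the non-empty dictionary {'a': 1} are assumptions made for illustration.

```python
import ast
import io

import pandas as pd

# One row containing a list and a dictionary.
original = pd.DataFrame([{'integer': 1, 'list': [1, 2, 3], 'dictionary': {'a': 1}}])

# Write to an in-memory buffer (stands in for the csv file).
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)

# literal_eval turns the stored text back into Python objects.
converters = {'list': ast.literal_eval, 'dictionary': ast.literal_eval}
restored = pd.read_csv(buffer, converters=converters)
```

The integer column is left out of the converters here because pandas already infers integers on its own. Note that ast.literal_eval raises an error on an empty string, so columns with missing values would need a small wrapper function.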

Be careful though, this does not work with numpy arrays: you will encounter SyntaxError: invalid syntax, because numpy arrays are written to the csv file without commas (for example [1 2 3]), which is not valid Python syntax. Instead you can define your own function

def string_to_numpyArray(x):
    return np.fromstring(x[1:-1],dtype = float, sep = ' ')

and then use this as follows

input_data = pd.read_csv(filename, converters={'integer': ast.literal_eval, 'dictionary': ast.literal_eval, 'list': ast.literal_eval, 'np_array': string_to_numpyArray})
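Putting the pieces together, a full round trip might look like the sketch below. It is only an illustration under some assumptions: the buffer replaces the csv file, the column is named np_array, and the helper is renamed string_to_numpy_array (the same logic as the function above).

```python
import ast
import io

import numpy as np
import pandas as pd


def string_to_numpy_array(text):
    # Drop the surrounding brackets, then parse whitespace-separated numbers.
    return np.fromstring(text[1:-1], dtype=float, sep=' ')


# A one-row frame with a list, a dictionary, and a numpy array.
original = pd.DataFrame({
    'integer': [1],
    'list': [[1, 2, 3]],
    'dictionary': [{'a': 1}],
    'np_array': [np.array([1, 2, 3])],
})

buffer = io.StringIO()
original.to_csv(buffer, index=False)

# The array cell is written without commas, so literal_eval rejects it.
array_text = str(original.loc[0, 'np_array'])
try:
    ast.literal_eval(array_text)
    literal_eval_failed = False
except SyntaxError:
    literal_eval_failed = True

# Read back, using a dedicated converter for the numpy column.
buffer.seek(0)
restored = pd.read_csv(buffer, converters={
    'list': ast.literal_eval,
    'dictionary': ast.literal_eval,
    'np_array': string_to_numpy_array,
})
```

Because the helper parses with dtype=float, an integer array comes back as floats. Also keep in mind that numpy abbreviates the text form of arrays with more than 1000 elements by default, so this round trip is only suitable for short arrays.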

Hope this is helpful for someone.

Cheers
