How to read data from a CSV file if all the values are in the same column?
I have a CSV file in the following format:
"age","job","marital","education","default","balance","housing","loan"
58,"management","married","tertiary","no",2143,"yes","no"
44,"technician","single","secondary","no",29,"yes","no"
However, instead of being split into separate columns, the values all lie in the first column. When I try reading this using pandas, the output gives all the values in the same list instead of a list of lists.
My code:
import pandas as pd
dataframe = pd.read_csv("marketing-data.csv", header=0, sep=",")
dataset = dataframe.values
print(dataset)
Output:
[[58 'management' 'married' ..., 2143 'yes' 'no']
[44 'technician' 'single' ..., 29 'yes' 'no']]
What I need:
[[58, 'management', 'married', ..., 2143, 'yes', 'no']
[44 ,'technician', 'single', ..., 29, 'yes', 'no']]
What am I missing?
I think you are confused by the print() output, which doesn't show commas.
Demo:
In [1]: df = pd.read_csv(filename)
Pandas representation:
In [2]: df
Out[2]:
age job marital education default balance housing loan
0 58 management married tertiary no 2143 yes no
1 44 technician single secondary no 29 yes no
NumPy representation:
In [3]: df.values
Out[3]:
array([[58, 'management', 'married', 'tertiary', 'no', 2143, 'yes', 'no'],
[44, 'technician', 'single', 'secondary', 'no', 29, 'yes', 'no']], dtype=object)
NumPy string representation (the result of print(numpy_array)):
In [4]: print(df.values)
[[58 'management' 'married' 'tertiary' 'no' 2143 'yes' 'no']
[44 'technician' 'single' 'secondary' 'no' 29 'yes' 'no']]
Conclusion: your CSV file has been parsed correctly.
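If a plain Python list of lists is what is wanted (so that printing shows commas), one option is to convert the array with tolist(). The following is a minimal sketch, assuming the sample rows from the question; the csv_text variable and the use of io.StringIO in place of the real file are illustrative assumptions, not part of the original code.

```python
import io
import pandas as pd

# Sample rows copied from the question; StringIO stands in for the real file (assumption).
csv_text = '''"age","job","marital","education","default","balance","housing","loan"
58,"management","married","tertiary","no",2143,"yes","no"
44,"technician","single","secondary","no",29,"yes","no"
'''

df = pd.read_csv(io.StringIO(csv_text))

# .values is a NumPy object array; .tolist() turns it into nested Python lists,
# and a Python list prints with commas between its items.
rows = df.values.tolist()
print(rows)
```

The data is the same in both cases; only the printed form differs, because print() on a list uses the list's own formatting rather than NumPy's.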
I don't really see a difference between what you want and what you get, but parsing the CSV file with the built-in csv module gives your desired result:
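import csv
# newline='' lets the csv module handle line endings itself (Python 3)
with open('file.csv', newline='') as csvfile:
    # the sample file quotes fields with ", so that is the quotechar
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    print(list(spamreader))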
[
['age', 'job', 'marital', 'education', 'default', 'balance', 'housing', 'loan'],
['58', 'management', 'married', 'tertiary', 'no', '2143', 'yes', 'no'],
['44', 'technician', 'single', 'secondary', 'no', '29', 'yes', 'no']
]
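Note that the csv module returns every field as a string, so 58 comes back as '58'. If the numbers should be integers, as in the desired output, the fields can be converted after reading. This is only a sketch under assumptions: to_int_if_possible is a hypothetical helper name, and io.StringIO with the sample rows stands in for the real file.

```python
import csv
import io

# Sample rows copied from the question; StringIO stands in for the real file (assumption).
csv_text = '''"age","job","marital","education","default","balance","housing","loan"
58,"management","married","tertiary","no",2143,"yes","no"
44,"technician","single","secondary","no",29,"yes","no"
'''

def to_int_if_possible(value):
    """Hypothetical helper: return int(value) if it parses, otherwise the original string."""
    try:
        return int(value)
    except ValueError:
        return value

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # first row holds the column names
rows = [[to_int_if_possible(v) for v in row] for row in reader]
print(rows)
```

Separating the header from the data rows keeps the column names out of the converted values.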