Read file using Python/Pandas
I have a tab-delimited file with data like:
id Name address dept sal
1 abc "bangalore,
Karnataka,
Inida" 10 500
2 xyz "Hyderabad
Inida" 20 500
Here the columns are id, Name, address, dept, and sal.
The issue is with the address column, which can contain newline characters. I tried different methods to read the file using Pandas and Python, but instead of two rows I get multiple rows as output.
Here are the commands I tried:
# Read the file line by line; this splits quoted multi-line fields apart
with open('C:/dummy/dummy.csv', 'r') as file1:
    lines = file1.readlines()
for i in lines:
    print(i)
and
import pandas as pd
df = pd.read_csv("C:/dummy/dummy.csv",sep='\t',quotechar='"')
Can anyone please help?
df = pd.read_csv("C:/dummy/dummy.csv",sep='\t',quotechar='"')
If the columns in the CSV file are tab-delimited, as you say, the corresponding output is:
id Name address dept sal
0 1 abc bangalore,\r\nKarnataka,\r\nInida 10 500
1 2 xyz Hyderabad\r\nInida 20 500
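For the plain-Python attempt, readlines() splits on every line break, including those inside the quoted address, which is why more than two rows show up. A minimal sketch of an alternative is the standard-library csv module, which keeps quoted multi-line fields together. The in-memory `sample` string below is a hypothetical stand-in for the file and assumes the address values are wrapped in double quotes:

```python
import csv
import io

# Hypothetical in-memory copy of the sample data (tab-delimited, quoted address).
sample = (
    'id\tName\taddress\tdept\tsal\n'
    '1\tabc\t"bangalore,\nKarnataka,\nInida"\t10\t500\n'
    '2\txyz\t"Hyderabad\nInida"\t20\t500\n'
)

# newline='' leaves line breaks untouched so the csv module can handle
# the ones inside quoted fields itself.
with io.StringIO(sample, newline='') as handle:
    rows = list(csv.reader(handle, delimiter='\t', quotechar='"'))

header, records = rows[0], rows[1:]
print(len(records))  # number of data rows, not physical lines
for record in records:
    print(record)
```

With the real file, the same reader should work on `open(path, newline='')` in place of the StringIO object.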
If you would like to remove the CR-LF sequences within the strings, you can do so in post-processing. Additionally, you could define the index column via:
df = pd.read_csv("C:/dummy/dummy.csv",sep='\t',quotechar='"',index_col=0)
What is your desired/expected output?
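If the goal is a single-line address, the post-processing step could look roughly like the sketch below. It assumes the line breaks should become single spaces, which is only one possible choice, and it uses a hypothetical in-memory `sample` string in place of the file:

```python
import io

import pandas as pd

# Hypothetical in-memory copy of the sample data with CR-LF line endings.
sample = (
    'id\tName\taddress\tdept\tsal\r\n'
    '1\tabc\t"bangalore,\r\nKarnataka,\r\nInida"\t10\t500\r\n'
    '2\txyz\t"Hyderabad\r\nInida"\t20\t500\r\n'
)

df = pd.read_csv(io.StringIO(sample), sep='\t', quotechar='"', index_col=0)

# Replace each run of CR/LF characters inside the address with one space.
df['address'] = df['address'].str.replace(r'[\r\n]+', ' ', regex=True)
print(df)
```

The pattern `[\r\n]+` covers both CR-LF and bare LF line endings, so the same line should apply whichever way the file was saved.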