
Read file using Python/Pandas

I have a tab-delimited file with data like:

id  Name    address dept    sal
1   abc "bangalore,
        Karnataka,
        Inida"  10  500
2   xyz "Hyderabad
         Inida" 20  500

Here the columns are id, Name, address, dept, and sal.

The issue is with the address column, which can contain newline characters. I tried different methods to read the file using Pandas and Python, but instead of two rows I get multiple rows as output.

Here are the commands I tried:

file1 = open('C:/dummy/dummy.csv', 'r')
lines = file1.readlines()
for i in lines:
    print(i)

and

import pandas as pd
df = pd.read_csv("C:/dummy/dummy.csv",sep='\t',quotechar='"')

Can anyone please help?

df = pd.read_csv("C:/dummy/dummy.csv",sep='\t',quotechar='"')

If the columns in the CSV file are tab-delimited, as you say, the corresponding output is:

   id Name                            address  dept  sal
0   1  abc  bangalore,\r\nKarnataka,\r\nInida    10  500
1   2  xyz                 Hyderabad\r\nInida    20  500
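To reproduce this without the original file, here is a minimal sketch that feeds an in-memory copy of the sample data to `read_csv`. The `sample` string is a hypothetical reconstruction: it assumes tab separators, CR-LF line endings, and no leading whitespace on the continuation lines of the address.

```python
import io

import pandas as pd

# Hypothetical in-memory copy of the sample file (tab-delimited, CR-LF line endings).
sample = (
    'id\tName\taddress\tdept\tsal\r\n'
    '1\tabc\t"bangalore,\r\nKarnataka,\r\nInida"\t10\t500\r\n'
    '2\txyz\t"Hyderabad\r\nInida"\t20\t500\r\n'
)

# Line breaks inside the double-quoted address field stay part of the field,
# so the parser yields two data rows rather than five.
df = pd.read_csv(io.StringIO(sample), sep='\t', quotechar='"')

print(df.shape)
print(repr(df.loc[0, 'address']))
```

If the real file yields more than two rows, it may be worth checking that the address values are actually enclosed in double quotes and that the separators are real tab characters.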

If you would like to remove the CR-LF sequences within the strings, you can do so in post-processing. Additionally, you could define the index column via:

df = pd.read_csv("C:/dummy/dummy.csv",sep='\t',quotechar='"',index_col=0)
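The post-processing mentioned above could look like the following sketch. It assumes the embedded line breaks should become single spaces (that choice, like the in-memory `sample` string standing in for the file, is an assumption, not something stated in the question).

```python
import io

import pandas as pd

# Hypothetical in-memory copy of the sample file (tab-delimited, CR-LF line endings).
sample = (
    'id\tName\taddress\tdept\tsal\r\n'
    '1\tabc\t"bangalore,\r\nKarnataka,\r\nInida"\t10\t500\r\n'
    '2\txyz\t"Hyderabad\r\nInida"\t20\t500\r\n'
)

df = pd.read_csv(io.StringIO(sample), sep='\t', quotechar='"', index_col=0)

# Replace each embedded line break (CR-LF, or a bare CR or LF) with one space.
df['address'] = df['address'].str.replace(r'\r\n|\r|\n', ' ', regex=True)

print(df['address'].tolist())
```

Replacing with an empty string instead of a space would join the address parts directly, which may or may not be what is wanted.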

What is your desired/expected output?
