
Reading one 'cell' of a fixed-width table (.txt file) that is split over two lines in python/pandas

How can I read one 'cell' of a fixed-width column when its text is split over two lines? The input data is a fixed-width table like this:

ID   Description                 QTY
1    Description split over      1
     two lines
2    Description on one line     2

I would like the resulting DataFrame to look like this:

ID   Description                           QTY
1    Description split over two lines      1       
2    Description on one line               2

My current code is:

import pandas as pd

df = pd.read_fwf('test.txt', names = ['ID', 'Description', 'QTY'])
df

But this gives me:

ID   Description                 QTY
1    Description split over      1
NaN  two lines                   NaN 
2    Description on one line     2

Any ideas?

# If the next row's ID is NaN, it is a continuation line: append its description to the current row.
# pd.isna is used so that no separate numpy import is needed; x.name assumes the default 0..n-1 index.
df['Description'] = df.apply(
    lambda x: x.Description + ' ' + df.iloc[x.name + 1]['Description']
    if x.name < len(df) - 1 and pd.isna(df.iloc[x.name + 1]['ID'])
    else x.Description, axis=1)

# Drop the continuation rows (their ID and QTY are NaN).
df = df.dropna()
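
The approach above only merges one continuation line per record. As an alternative, here is a minimal sketch that groups each row with any number of continuation rows that follow it. It is not part of the original answer. The column boundaries in `colspecs`, the helper name `record`, and the in-memory sample text are assumptions made for illustration; adjust them for a real file.

```python
import io
import pandas as pd

# Rebuild the sample table from the question with known column widths
# (5 characters for ID, 28 for Description). These widths are assumptions.
rows = [
    ("ID", "Description", "QTY"),
    ("1", "Description split over", "1"),
    ("", "two lines", ""),
    ("2", "Description on one line", "2"),
]
raw = "\n".join(f"{a:<5}{b:<28}{c}".rstrip() for a, b, c in rows)

# Explicit colspecs keep the Description column in one piece even though
# it contains spaces; the first line is used as the header.
df = pd.read_fwf(io.StringIO(raw), colspecs=[(0, 5), (5, 33), (33, 36)])

# A row with an ID starts a new record; continuation rows have NaN there,
# so the running count of non-null IDs labels each record.
record = df["ID"].notna().cumsum().rename("record")

# Per record: keep the first non-null ID and QTY, join all description parts.
result = (
    df.groupby(record)
    .agg({"ID": "first", "Description": " ".join, "QTY": "first"})
    .reset_index(drop=True)
)

# ID and QTY were read as floats because of the NaN rows; restore integers.
result[["ID", "QTY"]] = result[["ID", "QTY"]].astype(int)
print(result)
```

This sketch assumes every Description cell is non-empty, since `" ".join` fails on missing values, and that the first data row always carries an ID.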

