Reading rpt files with pandas
I read rpt data into pandas using:
import pandas as pd
df = pd.read_fwf("2014-1.rpt", skiprows=[1], nrows=150)
I am following the answer here. However, for some columns the separation is not accurate. Here is a sample of what I get:
Country Order Date Device Category
UK 2014-01-03 Desktop Shoes
IT 2014-01-03 Desktop Shoes
FR 2014-01-04 Desktop Dress
FR 2014-01-04 Tablet Dress
US 2014-01-05 Desktop Bags
US 2014-01-06 Desktop Bags
UK 2014-01-07 Tablet Dress
For instance, it reads the Order Date and Device columns as a single column. This is just one example; many columns are affected in the same way. How can I solve this? The problem columns might have fixed widths.
This question is old, but here is an answer. You can read the file as a CSV using pandas. I have used this for a variety of rpt files and it has worked.
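If the columns really are fixed-width, one option is to give `read_fwf` explicit column boundaries through `colspecs` instead of letting it infer them. The sketch below assumes the second line of the file (the one skipped with `skiprows=[1]`) is a row of dashes marking each column's width, as in many SQL Server style exports. The sample text is made up for illustration and is not the real file.

```python
import io
import re

import pandas as pd

# Hypothetical sample: a header line, a line of dashes marking each
# column's width, then data rows. The real file may be laid out differently.
sample = (
    "Country Order Date Device  Category\n"
    "------- ---------- ------- --------\n"
    "UK      2014-01-03 Desktop Shoes   \n"
    "FR      2014-01-04 Tablet  Dress   \n"
)

lines = sample.splitlines()

# Each run of dashes gives the (start, end) character span of one column.
colspecs = [m.span() for m in re.finditer(r"-+", lines[1])]

# Pass the spans explicitly so a header containing a space ("Order Date")
# is not merged with, or split from, its neighbours.
df = pd.read_fwf(io.StringIO(sample), colspecs=colspecs, skiprows=[1])
print(df)
```

For a real file, the dashed line could be read with `open(...)` first and the same `colspecs` passed to `pd.read_fwf` along with the file path.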
import pandas as pd
df = pd.read_csv("2014-1.rpt", skiprows=[1], nrows=150)
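Note that `read_csv` splits on commas by default, so the call above presumably relies on the file having been exported with comma delimiters. If the export used a different delimiter, `sep` can be set explicitly. The following is a minimal sketch with a made-up tab-delimited sample; the delimiter and the data are assumptions, not taken from the original file.

```python
import io

import pandas as pd

# Hypothetical tab-delimited sample; the actual delimiter depends on how
# the .rpt file was exported.
sample = (
    "Country\tOrder Date\tDevice\tCategory\n"
    "-------\t----------\t------\t--------\n"
    "UK\t2014-01-03\tDesktop\tShoes\n"
    "FR\t2014-01-04\tTablet\tDress\n"
)

# skiprows=[1] drops the dashed separator line under the header.
df = pd.read_csv(io.StringIO(sample), sep="\t", skiprows=[1])
print(df)
```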