How to extract a specific part of a CSV file using a row key with pandas

Question

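I have a large CSV file with 10,000 rows and 500 columns. I want to extract the data from the header down to the row containing device_boot, and discard every row after device_boot.

Example: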

Name,Time,status,..
start,05:06:2018 10:10:23,good,..
start,05:06:2018 10:11:23,good,..
failure,05:06:2018 11:10:25,critical,..
device_boot,05:06:2018 13:11:25,reboot,..
start,05:06:2018 13:13:23,good,..
start,05:06:2018 13:16:23,good,..

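So I need to use pandas to keep the rows of the CSV file up to and including the device_boot row. I was able to delete specific rows matching that keyword, but I could not extract everything up to that point using pd.drop(...).

Thanks for any suggestions.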

Answer 1

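Use:

print(df.loc[:df['Name'].eq('device_boot').idxmax(), :])

This slices from the first row through the first row whose Name is device_boot (.loc includes the end label), which gives the expected output.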
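
For reference, here is a minimal self-contained sketch of that idea on the sample rows from the question. It assumes the default integer index and matches the keyword with an equality test; the elided columns are left out, and the names `csv_text`, `is_boot`, `first_boot` and `head` are illustrative only.

```python
import io

import pandas as pd

# Sample rows from the question, with the elided columns left out.
csv_text = """Name,Time,status
start,05:06:2018 10:10:23,good
start,05:06:2018 10:11:23,good
failure,05:06:2018 11:10:25,critical
device_boot,05:06:2018 13:11:25,reboot
start,05:06:2018 13:13:23,good
start,05:06:2018 13:16:23,good
"""

df = pd.read_csv(io.StringIO(csv_text))

# Boolean mask that is True on rows whose Name equals the keyword.
is_boot = df["Name"].eq("device_boot")

# idxmax() returns the index label of the first True value.
first_boot = is_boot.idxmax()

# .loc slicing is label-based and includes the end label.
head = df.loc[:first_boot]
print(head)
```

Note that if the keyword never appears, the mask is all False and `idxmax()` returns the first label, so it may be worth checking `is_boot.any()` first.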

Update, slicing through the last matching row:

print(df.loc[:df.index[df['Name']=='device_boot'].tolist()[-1],:])

If you want to drop the 'device_boot' row itself as well:

print(df.loc[:df.index[df['Name']=='device_boot'].tolist()[-1]-1,:])
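
The two lines above assume that the keyword is present and that the index is the default 0..n-1 range (the `-1` is applied to a label). A possible way to drop both assumptions is to work with integer positions instead. The helper `rows_up_to` and its parameters below are hypothetical names for this sketch, not part of pandas.

```python
import numpy as np
import pandas as pd


def rows_up_to(df, key, column="Name", include_key=True):
    """Return the rows from the top of df up to the last row where column == key."""
    # Integer positions (not labels) of every matching row.
    positions = np.flatnonzero(df[column].eq(key).to_numpy())
    if positions.size == 0:
        raise ValueError(f"{key!r} not found in column {column!r}")
    stop = positions[-1] + 1 if include_key else positions[-1]
    # iloc slicing is positional and excludes the stop position.
    return df.iloc[:stop]


df = pd.DataFrame(
    {
        "Name": ["start", "failure", "device_boot", "start"],
        "status": ["good", "critical", "reboot", "good"],
    },
    index=[10, 20, 30, 40],  # deliberately not 0..n-1
)

with_boot = rows_up_to(df, "device_boot")
without_boot = rows_up_to(df, "device_boot", include_key=False)
print(with_boot)
print(without_boot)
```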

Answer 2

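I find the index of the keyword row, for example:

val = df.index[df['Name'] == 'device_boot']
print(val)

Then I use that row index to retrieve only the rows up to it:

# val[0] is the first match; this assumes the default 0..n-1 index, so labels equal positions
rowretrive_index = val[0] + 1  # increase this to keep extra rows after the match
print(rowretrive_index)

df1 = df.iloc[:rowretrive_index]
df1.to_csv('/out.csv', sep=',')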
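
As an alternative sketch that avoids index arithmetic altogether, a running count of matches can serve as the row filter. This is a different technique from the one in this answer, shown under the assumption that everything up to and including the first device_boot row should be kept. The temporary output path and the names `keep` and `round_trip` are illustrative.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["start", "failure", "device_boot", "start", "start"],
        "status": ["good", "critical", "reboot", "good", "good"],
    }
)

is_boot = df["Name"].eq("device_boot")
# Shift the mask down one row so the keyword row itself still counts as
# "before"; the running sum becomes non-zero only after the first match.
keep = is_boot.shift(fill_value=False).cumsum().eq(0)
df1 = df[keep]

# Write to a temporary directory here instead of '/out.csv'.
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")
df1.to_csv(out_path, sep=",", index=False)

# Read the file back to confirm what was written.
round_trip = pd.read_csv(out_path)
print(round_trip)
```

If the keyword never appears, `keep` is True everywhere and the whole frame is written out unchanged.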

Hope it helps. Thanks, Sundar

Solution 1
1 Accepted 2018-12-12 11:55:34

Solution 2
0 2018-12-17 13:21:04
