
How to retrieve rows from a CSV file using a tag string

I have a CSV file that contains the data shown under "Data" below.

I have written code that retrieves the rows containing "Active" in the second column, "Outcome":

Data:

No,Outcome,target,result
1,Active,PGS2,positive
2,inactive,IM2,negative
3,inactive,IGI,positive
4,Active,IIL,positive
5,Active,P53,negative

Code:

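# my_file is the path to the CSV file shown above
with open(my_file) as new_file:
    for line in new_file:
        # Substring match anywhere in the line, not only in the Outcome column
        if "Active" in line:
            print(line, end="")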

Outcome:

No,Outcome,target,result
1,Active,PGS2,positive
4,Active,IIL,positive
5,Active,P53,negative

How can I write this code using the pandas library, so that it becomes shorter, given that I will be using pandas functionality after retrieving the rows?

Also, this code is not suitable when the keyword "Active" appears somewhere else in a row, because it can then retrieve a wrong row. After looking through some posts, I found that pandas is a very suitable library for CSV handling.

Why not just load the file into a DataFrame and filter it afterwards? That will be faster than parsing line by line. Just do this:

In [172]: df[df['Outcome']=='Active']
Out[172]:
   No Outcome target    result
0   1  Active   PGS2  positive
3   4  Active    IIL  positive
4   5  Active    P53  negative
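
The snippet above assumes `df` already exists. A minimal end-to-end sketch might look like the following. It assumes the file has the header row shown in the question and, to stay self-contained, reads the sample data from an in-memory string (`csv_text` is a name made up for this example); with a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Sample data copied from the question. csv_text is a hypothetical name;
# with a real file, use pd.read_csv(my_file) instead of the StringIO wrapper.
csv_text = """No,Outcome,target,result
1,Active,PGS2,positive
2,inactive,IM2,negative
3,inactive,IGI,positive
4,Active,IIL,positive
5,Active,P53,negative
"""

df = pd.read_csv(io.StringIO(csv_text))

# Boolean mask: True only where the Outcome column is exactly "Active".
# Other columns are not inspected, so "Active" elsewhere in a row is ignored.
active = df[df["Outcome"] == "Active"]
print(active)
```

Because the comparison is made against the `Outcome` column only, a row that happens to contain "Active" in `target` or `result` would not be selected, which addresses the false-match concern raised in the question.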
