
Split a Pandas dataframe, keep both parts

I'm creating a dataframe by importing a .csv file. I then need to delete rows based on certain conditions. Because the number of rows deleted is quite small, it is easier to validate the conditions by checking what has been deleted instead of what remains. I end up doing something like this:

    dfcd=df.loc[(~df.Course_Code.str.contains('MG')) & (~df.Course_Code.str.contains('DE'))]
    df=df.loc[(df.Course_Code.str.contains('MG')) | (df.Course_Code.str.contains('DE'))]

But this feels very clumsy, and as the conditions get more complex I worry that I will write the inverse condition incorrectly. (Reading another thread on SO, I realise I could have simplified the above by wrapping the condition in another set of parentheses with the ~ outside them, but anyway.)

Is there a command that will create two dataframes, one where the condition is true and the other where it is false? Something like:

    df,dfcd=df.<another_command>[(df.Course_Code.str.contains('MG')) | (df.Course_Code.str.contains('DE'))]

Or is there another, better way to do this?

You can use | as the regex "or" inside the pattern, so both checks fit in a single str.contains call. Store the result as a boolean mask, filter with the mask to get the rows where the condition is True, and invert it with ~ to get the rows where the condition is False:

    m = df.Course_Code.str.contains('MG|DE')
    # same as:
    # m = (df.Course_Code.str.contains('MG')) | (df.Course_Code.str.contains('DE'))

    df1, df2 = df[m], df[~m]
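
As a minimal, self-contained sketch of the mask approach: the sample data and the helper name split_by_mask below are made up for illustration and are not from the original post; only the Course_Code column name comes from the question. The sketch also passes na=False to str.contains, on the assumption that some course codes may be missing. Without it, a missing value can leave a non-boolean entry in the mask, and the mask can then no longer be used for filtering or inverted cleanly.

```python
import pandas as pd

# Hypothetical sample data; only the Course_Code column name comes from the question.
df = pd.DataFrame({
    "Course_Code": ["MG101", "DE202", "CS303", None, "MGDE404"],
    "Students": [30, 25, 40, 10, 15],
})


def split_by_mask(frame, mask):
    """Return (rows where mask is True, rows where mask is False)."""
    return frame[mask], frame[~mask]


# na=False treats a missing course code as "no match", so the mask is purely boolean.
m = df["Course_Code"].str.contains("MG|DE", na=False)

matched, rest = split_by_mask(df, m)

# Every row lands in exactly one of the two parts.
assert len(matched) + len(rest) == len(df)

print(matched)
print(rest)
```

Because the condition is written only once, there is no separately typed inverse to get wrong, which seems to be the main worry in the question.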
