
Check for sequence in column of Pandas DataFrame

My DataFrame looks like this:

    Category       Date
81    Monate 2020-01-01
88    Monate 2020-01-02
58    Monate 2020-01-03
3     Monate 2020-01-04
23    Monate 2020-01-05
..       ...        ...
134   Wochen 2020-05-24
145     Tage 2020-05-25
147     Tage 2020-05-26
146     Tage 2020-05-27
148     Tage 2020-05-28

It is ordered by Date. I need to check, row by row, that Monate follows Monate, Wochen follows Wochen, and so on. Wochen may also follow Monate, and Tage may follow Wochen. I hope it is clear what I mean. Something like this should cause an error, since the sequence is invalid:

    Category       Date
81    Monate 2020-01-01
88    Monate 2020-01-02
58    Tage   2020-01-03
3     Monate 2020-01-04
23    Monate 2020-01-05
..       ...        ...
134   Wochen 2020-05-24
145     Tage 2020-05-25
147     Tage 2020-05-26
146   Wochen 2020-05-27
148     Tage 2020-05-28

I could try to write a pretty complicated and probably slow iteration over each row:

for index, row in result_df.iterrows():
    ...  # compare row['Category'] with the previous row's value

Is there a better and quicker way to check for an ongoing sequence in a Series, or maybe in a list, dictionary, etc.?

I believe you can create a dictionary that maps each category to a number giving its order, replace the values of the Category column with those numbers, and then check that series.diff is never negative using series.all:

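def check(dataframe):
    # Rank each category: Monate < Wochen < Tage
    d = {'Monate':1,'Wochen':2,'Tage':3}
    # Valid if the rank never drops from one row to the next
    return dataframe['Category'].replace(d).diff().fillna(0).ge(0).all()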
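
To make the chain easier to follow, here is a small sketch of the intermediate values on a made-up five-row Series. The sample values and variable names are mine, not from the question, and I use `map` instead of `replace` here, which is a substitution on my part: `map` turns any category missing from the dictionary into NaN rather than leaving the string in place.

```python
import pandas as pd

# Hypothetical sample column; the last row (Wochen after Tage) breaks the order.
categories = pd.Series(["Monate", "Monate", "Wochen", "Tage", "Wochen"])
order = {"Monate": 1, "Wochen": 2, "Tage": 3}

ranks = categories.map(order)   # category names -> numeric ranks
steps = ranks.diff()            # change in rank from the previous row (first is NaN)

# The column is valid only if the rank never decreases.
is_valid = bool(steps.fillna(0).ge(0).all())

# With no missing values, the built-in monotonicity check should agree.
monotonic = bool(ranks.is_monotonic_increasing)

print(ranks.tolist())
print(steps.tolist())
print(is_valid, monotonic)
```

The single negative step in the last row is what makes the check fail. Assuming every category appears in the dictionary, `is_monotonic_increasing` looks like a shorter way to express the same test.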

Test runs:

print(df,'\n\n',f"Valid? : {check(df)}",'\n\n',df1,'\n\n',f"Valid? : {check(df1)}")

     Category        Date
81    Monate  2020-01-01
88    Monate  2020-01-02
58    Monate  2020-01-03
3     Monate  2020-01-04
23    Monate  2020-01-05
134   Wochen  2020-05-24
145     Tage  2020-05-25
147     Tage  2020-05-26
146     Tage  2020-05-27
148     Tage  2020-05-28 

 Valid? : True 

     Category        Date
81    Monate  2020-01-01
88    Monate  2020-01-02
58      Tage  2020-01-03
3     Monate  2020-01-04
23    Monate  2020-01-05
134   Wochen  2020-05-24
145     Tage  2020-05-25
147     Tage  2020-05-26
146   Wochen  2020-05-27
148     Tage  2020-05-28 

 Valid? : False
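
The test runs above do not show how `df` and `df1` were built, so here is a self-contained sketch that rebuilds the two ten-row frames from the printed output and also reports which rows break the order, instead of only returning True or False. The helper names `build_frame` and `invalid_rows` are hypothetical, and `map` is again my substitution for `replace`.

```python
import pandas as pd

# Rank of each category; a valid column never moves to a lower rank.
ORDER = {"Monate": 1, "Wochen": 2, "Tage": 3}

def build_frame(categories):
    """Rebuild the 10-row sample frame shown in the test runs."""
    index = [81, 88, 58, 3, 23, 134, 145, 147, 146, 148]
    dates = pd.to_datetime([
        "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05",
        "2020-05-24", "2020-05-25", "2020-05-26", "2020-05-27", "2020-05-28",
    ])
    return pd.DataFrame({"Category": categories, "Date": dates}, index=index)

def invalid_rows(dataframe):
    """Return the rows whose category ranks lower than the row before."""
    ranks = dataframe["Category"].map(ORDER)
    # diff() is NaN for the first row, and NaN < 0 is False, so it never flags.
    return dataframe[ranks.diff().lt(0)]

df = build_frame(["Monate"] * 5 + ["Wochen"] + ["Tage"] * 4)
df1 = build_frame(["Monate", "Monate", "Tage", "Monate", "Monate",
                   "Wochen", "Tage", "Tage", "Wochen", "Tage"])

print(invalid_rows(df).empty)            # True when nothing is out of order
print(invalid_rows(df1).index.tolist())  # index labels of the offending rows
```

Note that this flags the row where the rank drops (the Monate row after the stray Tage), not the stray row itself, because the check only compares each row with the one before it.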
