
How would I find the longest string per row in a data frame and print the row number if it exceeds a certain amount?

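I want to write a program that searches a data frame and, if any item in it is longer than 50 characters, prints the row number and asks whether to continue searching the data frame.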

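threshold = 50
exclude = []  # columns to skip (assumed empty here)

mask = (df.drop(columns=exclude, errors='ignore')
          .apply(lambda s: s.str.len().ge(threshold))
        )

# keeps only the rows with no over-long string, i.e. drops the flagged rows
out = df.loc[~mask.any(axis=1)]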

I tried using this, but I don't want to drop the rows, only print the row numbers where a string is longer than 50 characters.

Input:

0 "Robert","20221019161921","London"
1 "Edward","20221019161921","London"
2 "Johnny","20221019161921","London"
3 "Insane string which is way too longggggggggggg","20221019161921","London"

Output:

Row 3 is above the 50-character limit.

I would also like the program to print the specific value or string that is too long.

You can use:

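exclude = []    # columns to leave out of the check
threshold = 30

# True where a string is at least `threshold` characters long (.ge is >=)
mask = (df.drop(columns=exclude, errors='ignore')
          .apply(lambda s: s.str.len().ge(threshold))
        )

# True for each row that contains at least one flagged value
s = mask.any(axis=1)

for idx in s[s].index:
    print(f'row {idx} is above the {threshold}-character limit.')
    s2 = mask.loc[idx]
    # reindex restores any excluded columns as False so the selector matches df
    for string in df.loc[idx, s2.reindex(df.columns, fill_value=False)]:
        print(string)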

Output:

row 3 is above the 30-character limit.
"Insane string which is way too longggggggggggg","20221019161921","London"

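For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The column names (name, timestamp, city) and the helper find_long_values are assumptions made for illustration; the question does not say how the data frame is laid out. It casts every column to str first, so numeric columns do not break the .str accessor (note that missing values then count as the 3-character string "nan").

```python
import pandas as pd

# Sample data modeled on the question; the column names are assumptions.
df = pd.DataFrame(
    {
        "name": ["Robert", "Edward", "Johnny",
                 "Insane string which is way too longggggggggggg"],
        "timestamp": ["20221019161921"] * 4,
        "city": ["London"] * 4,
    }
)


def find_long_values(frame, threshold, exclude=()):
    """Return (row label, column, value) for each string at or above threshold."""
    # Only inspect the columns that are not excluded.
    cols = frame.drop(columns=list(exclude), errors="ignore")
    # Cast to str so non-string columns still work with the .str accessor.
    lengths = cols.astype(str).apply(lambda s: s.str.len())
    hits = []
    for idx, row in lengths.iterrows():
        # Column labels whose value in this row reaches the threshold.
        for col in row[row >= threshold].index:
            hits.append((idx, col, frame.at[idx, col]))
    return hits


for idx, col, value in find_long_values(df, threshold=30):
    print(f"row {idx}, column {col!r}: {value}")
```

Returning the hits as a list instead of printing inside the loop makes it easy to reuse them, for example to write a report or to count offending rows.
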
Intermediate s:

0    False
1    False
2    False
3     True
dtype: bool
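
The question also asks for a prompt on whether to keep searching, which the code above does not cover. One possible sketch follows, under a few assumptions: the function name scan_with_prompt and the ask parameter are hypothetical, ask defaults to the built-in input, and any reply other than "y" stops the scan. It uses .gt (strictly greater) rather than .ge so that only strings longer than the limit are reported.

```python
import pandas as pd


def scan_with_prompt(frame, threshold, ask=input):
    """Report rows with over-long strings one by one; return the rows reported."""
    # String length of every cell, after casting to str.
    lengths = frame.astype(str).apply(lambda s: s.str.len())
    # True for each row with at least one value longer than the threshold.
    flagged = lengths.gt(threshold).any(axis=1)
    reported = []
    for idx in flagged[flagged].index:
        print(f"Row {idx} is above the {threshold}-character limit.")
        reported.append(idx)
        # Any reply other than "y" ends the search early.
        if ask("Continue searching? [y/n] ").strip().lower() != "y":
            break
    return reported


# Demo with a scripted reply instead of keyboard input.
df = pd.DataFrame({"text": ["short", "x" * 60, "y" * 55, "ok"]})
replies = iter(["n"])
rows = scan_with_prompt(df, threshold=50, ask=lambda prompt: next(replies))
print(rows)
```

Passing the prompt function as an argument keeps the loop testable; in an interactive session, calling scan_with_prompt(df, 50) falls back to input.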

