How to find the longest string in each row of a DataFrame and print the row number when it exceeds a given length

Question

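I want to write a program that searches a DataFrame and, if any item in it is longer than 50 characters, prints the row number and asks whether to continue searching the DataFrame.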

threshold = 50 

mask = (df.drop(columns=exclude, errors='ignore')
          .apply(lambda s: s.str.len().ge(threshold))
        )

out = df.loc[~mask.any(axis=1)]

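I tried using this, but I don't want to drop the rows; I only want to print the row numbers where a string is longer than 50 characters.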

Input:

0 "Robert","20221019161921","London"
1 "Edward","20221019161921","London"
2 "Johnny","20221019161921","London"
3 "Insane string which is way too longggggggggggg","20221019161921","London"

Output:

Row 3 is above the 50-character limit.

I would also like the program to print the specific value or string that is too long.

Answer 1

You can use:

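exclude = []    # columns to skip when checking lengths
threshold = 30

# True wherever a cell's string length is at least the threshold
mask = (df.drop(columns=exclude, errors='ignore')
          .apply(lambda s: s.str.len().ge(threshold))
        )

# True for each row that contains at least one flagged cell
s = mask.any(axis=1)

for idx in s[s].index:
    print(f'row {idx} is above the {threshold}-character limit.')
    s2 = mask.loc[idx]
    # print only the values in the flagged columns of this row
    for string in df.loc[idx, s2.reindex(df.columns, fill_value=False)]:
        print(string)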

Output:

row 3 is above the 30-character limit.
"Insane string which is way too longggggggggggg","20221019161921","London"
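The output above prints the offending string but not which column it came from. As a minimal sketch, assuming the sample data is split into three string columns (the names name, timestamp and city are hypothetical, not from the question), the longest string in each row can be located with idxmax and reported together with its column and length:

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "name": ["Robert", "Edward", "Johnny",
             "Insane string which is way too longggggggggggg"],
    "timestamp": ["20221019161921"] * 4,
    "city": ["London"] * 4,
})
threshold = 30

# Character count of every cell (assumes every column holds strings).
lengths = df.apply(lambda s: s.str.len())

# Per row: the column holding the longest string, and that string's length.
longest = pd.DataFrame({
    "column": lengths.idxmax(axis=1),
    "length": lengths.max(axis=1),
})

# Report only the rows whose longest string reaches the threshold.
for idx, rec in longest[longest["length"].ge(threshold)].iterrows():
    value = df.at[idx, rec["column"]]
    print(f"row {idx}: column {rec['column']!r} has {rec['length']} characters: {value}")
```

Like the code above, this uses ge, so a string of exactly threshold characters is also reported; swapping in gt would make the check strictly "above". Columns that do not hold strings would need to be excluded first, since the .str accessor only works on text data.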

The intermediate Series s:

0    False
1    False
2    False
3     True
dtype: bool
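The question also asks for a prompt on whether to keep searching after each hit, which the code above does not cover. One possible sketch follows; the function name scan_rows and its ask parameter are hypothetical, and it assumes a unique row index and string-only columns:

```python
import pandas as pd


def scan_rows(df, threshold=50, ask=input):
    """Print each row holding a value of at least `threshold` characters,
    then ask whether to keep scanning. Returns the flagged row labels."""
    # Boolean frame: True where a cell's string length reaches the threshold.
    mask = df.apply(lambda s: s.str.len()).ge(threshold)
    flagged = []
    for idx in mask.index[mask.any(axis=1)]:
        flagged.append(idx)
        print(f"Row {idx} is above the {threshold}-character limit.")
        # Show only the offending values, labelled by column.
        for col, value in df.loc[idx, mask.loc[idx]].items():
            print(f"  {col}: {value}")
        # `ask` defaults to input(); anything other than "y" stops the scan.
        if ask("Continue searching? [y/n] ").strip().lower() != "y":
            break
    return flagged
```

Calling scan_rows(df, threshold=30) would pause at each flagged row and wait for keyboard input. Passing a different callable as ask, for example ask=lambda _: "y", lets the same loop run without a prompt, which is convenient for scripts or tests.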

Solution 1
0 Accepted 2022-11-28 16:18:28
