Create two new pandas columns based on a partial string match

Question

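I have a dataframe of structured titles and names in random order (though a person's name is always in the cell immediately to the right of their title), like this: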

   contact_1_title contact_1_name contact_2_title contact_2_name contact_3_title contact_3_name      contact_4_title contact_4_name
0  owner_architect            joe    other_string   other_string    other_string   other_string         other_string   other_string
1     other_string   other_string       architect           jack    other_string   other_string         other_string   other_string
2     other_string   other_string    other_string   other_string    other_string   other_string  self_cert_architect           mary
3     other_string   other_string    other_string   other_string           owner           phil         other_string   other_string
4       contractor          sarah    other_string   other_string    other_string   other_string         other_string   other_string
5     other_string   other_string       expeditor           kate    other_string   other_string         other_string   other_string

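I want to extract every title that contains the word "architect" and put it in its own new column. I also want to pull the name from the cell immediately to its right and put that in its own column. My desired output: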

        arch_title_col arch_name_col
0      owner_architect           joe
1            architect          jack
2  self_cert_architect          mary

I don't know how to go about this. I tried using itertuples(), but I didn't get very far.

Answer 1

What you need is pd.wide_to_long, but I couldn't get the syntax right for the way these columns are named. So here it is done manually:

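import pandas as pd

# Stack every title column, and every name column, into one long Series each
title = pd.concat([df[col] for col in df.filter(like='title')], axis=0)
name = pd.concat([df[col] for col in df.filter(like='name')], axis=0)
# Both Series carry the same index in the same order, so they line up side by side
df = pd.concat([title, name], axis=1)
df.columns = ['title', 'name']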

Now that the data is in a good format, it's a simple check:

out = df[df.title.str.contains('architect')]
print(out)

Output:

                 title  name
0      owner_architect   joe
1            architect  jack
2  self_cert_architect  mary
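
For reference, here is one way the pd.wide_to_long route mentioned above might look. This is only a sketch: it rebuilds the question's sample frame, and the names `flipped`, `long_df`, `arch` and the `contact` level are made up for illustration. It assumes every column follows the `contact_<n>_<field>` pattern, which is flipped to `<field>_<n>` because wide_to_long expects the stub name first.

```python
import pandas as pd

# Sample data rebuilt from the question
other = "other_string"
rows = [
    ["owner_architect", "joe", other, other, other, other, other, other],
    [other, other, "architect", "jack", other, other, other, other],
    [other, other, other, other, other, other, "self_cert_architect", "mary"],
    [other, other, other, other, "owner", "phil", other, other],
    ["contractor", "sarah", other, other, other, other, other, other],
    [other, other, "expeditor", "kate", other, other, other, other],
]
cols = [f"contact_{n}_{kind}" for n in range(1, 5) for kind in ("title", "name")]
df = pd.DataFrame(rows, columns=cols)

# wide_to_long looks for "<stub><sep><number>", so turn contact_1_title into title_1
flipped = df.rename(columns=lambda c: "_".join(reversed(c.split("_")[1:])))

long_df = pd.wide_to_long(
    flipped.reset_index(),        # the original row label becomes the id column "index"
    stubnames=["title", "name"],
    i="index",
    j="contact",                  # hypothetical name for the contact-number level
    sep="_",
)

# Keep the architect rows, then drop the contact level to get the original row labels back
arch = long_df.loc[long_df["title"].str.contains("architect"), ["title", "name"]]
arch = arch.reset_index(level="contact", drop=True).sort_index()
print(arch)
```

The renaming step is the awkward part; once the columns are in `<field>_<n>` form the reshape itself is a single call.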

I assure you that in 99% of cases, iter... is not what you want, and there is a better pandas-specific way to do whatever you are trying to do.
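
As one more illustration of doing this without iterating over rows, the sketch below applies a boolean mask to the title columns and reuses it on the name columns. It is not the approach from the answer above, just an alternative under two assumptions: the title and name columns appear in the same relative order, and there are no missing values. The output column names are the ones requested in the question; `titles`, `names`, `mask` and `result` are illustrative names.

```python
import pandas as pd

# Sample data rebuilt from the question
other = "other_string"
rows = [
    ["owner_architect", "joe", other, other, other, other, other, other],
    [other, other, "architect", "jack", other, other, other, other],
    [other, other, other, other, other, other, "self_cert_architect", "mary"],
    [other, other, other, other, "owner", "phil", other, other],
    ["contractor", "sarah", other, other, other, other, other, other],
    [other, other, "expeditor", "kate", other, other, other, other],
]
cols = [f"contact_{n}_{kind}" for n in range(1, 5) for kind in ("title", "name")]
df = pd.DataFrame(rows, columns=cols)

titles = df.filter(like="title")
names = df.filter(like="name")

# True wherever a title cell contains "architect"; same shape as both sub-frames
mask = titles.apply(lambda col: col.str.contains("architect")).to_numpy()

# Indexing the underlying arrays with the mask picks matching cells in row order
result = pd.DataFrame({
    "arch_title_col": titles.to_numpy()[mask],
    "arch_name_col": names.to_numpy()[mask],
})
print(result)
```

Note that this version discards the original row labels, so if you need to trace a match back to its source row, the reshaping approach is the safer choice.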

Solution 1
0 2022-07-18 23:37:41
