Iterate through each row of a column and perform an operation

Question

I have my data:

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns = ['names', 'category'])
df
names   category
abc - a   A
def - b   B
ghi - c   C
jkl - d   D

My desired output is:

names     division    category
    abc      a          A
    def      b          B
    ghi      c          C
    jkl      d          D

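There are many ways to do this, but I want to do it with the following logic:

Iterate through each row of the column `names`, store each value in 'st1', and then: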

first, middle, last = st1.partition(' - ')
df['names'] = first
df['division'] = last

and assign the results to the DataFrame one row at a time. Please help me get my desired output in Python.

Answer 1

You can do it like this:

df[['names','division']] = df.names.str.split(' - ',expand=True)
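
If you would rather stay close to the `partition` logic from the question, pandas also has a vectorized `Series.str.partition`. The following is only a minimal sketch, assuming the separator is always ' - ' and reusing the sample data from the question; `parts` is a name introduced here for illustration.

```python
import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns=['names', 'category'])

# str.partition returns three columns: text before the separator,
# the separator itself, and text after it
parts = df['names'].str.partition(' - ')

df['names'] = parts[0]
# insert 'division' at position 1 so it sits between 'names' and 'category'
df.insert(1, 'division', parts[2])

print(df)
```

Using `insert` rather than plain assignment is what puts `division` in the middle, matching the column order asked for in the question.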

Answer 2

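Create the DataFrame as before, then iterate over all rows of names and category, split each name on "-", append the stripped pieces to a new list, and finally convert that list into another DataFrame, as follows: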

import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns = ['names', 'category'])

newdata = []
for names, category in zip(df.names, df.category):
    name, division = names.split("-")
    newdata.append([name.strip(), division.strip(), category])

new_df = pd.DataFrame(newdata, columns = ['names', 'division', 'category'])

Printing the new DataFrame gives:

>>> new_df
  names division category
0   abc        a        A
1   def        b        B
2   ghi        c        C
3   jkl        d        D
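
One caveat: `name, division = names.split("-")` raises a `ValueError` as soon as a name contains more than one hyphen. A possible variant, sketched here with the `partition` call from the question, splits on the first ' - ' only. The helper name `split_name` and the sample value 'mary-ann - b' are made up for illustration and are not part of the original data.

```python
def split_name(value, sep=' - '):
    """Split on the first occurrence of sep and strip surrounding whitespace."""
    first, _, last = value.partition(sep)
    return first.strip(), last.strip()


# works on the question's data
print(split_name('abc - a'))

# a hyphen inside the name no longer breaks the unpacking
print(split_name('mary-ann - b'))
```

Inside the loop above, `name, division = split_name(names)` could then replace the `split("-")` line and the two `strip()` calls.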

Answer 3

I am testing GitHub Copilot to see how it handles Stack Overflow questions.

# Solution 1
import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns=['names', 'category'])


# Iterate through each rows of column-names, and store each value in 'st1' and then ->
# first, middle, last = st1.partition(' - ')
# df['names'] = first
# df['division'] = last
# and also assigning it to dataframe one by one, please help me to get my desired output in python.


for index, row in df.iterrows():
    st1 = row['names']
    first, middle, last = st1.partition(' - ')
    df.loc[index, 'names'] = first
    df.loc[index, 'division'] = last

# Explain what is df.loc
# df.loc[row index, column index]
# df.loc[0, 'names'] = first
# df.loc[0, 'division'] = last

print(df)

Output:

  names category division
0   abc        A        a
1   def        B        b
2   ghi        C        c
3   jkl        D        d
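
In this output `division` ends up as the last column, because it is created during the loop, whereas the question asks for it between `names` and `category`. A small sketch of reordering afterwards; the frame is rebuilt by hand here so the snippet runs on its own.

```python
import pandas as pd

# the frame as it looks after the loop above ('division' was appended last)
df = pd.DataFrame({
    'names': ['abc', 'def', 'ghi', 'jkl'],
    'category': ['A', 'B', 'C', 'D'],
    'division': ['a', 'b', 'c', 'd'],
})

# selecting with a list of column names returns the columns in that order
df = df[['names', 'division', 'category']]

print(df)
```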

Answer 4

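Since you want to iterate over each row of the DataFrame and work with the values individually, this answer uses a little NumPy alongside pandas. For the split itself, Python's `str.partition()` behaves much like `str.split()` in this case, so the code below uses `.split()`. NumPy's `np.partition()` is something different: it partially sorts an array and does not split strings.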

Here are the packages you need:

import pandas as pd
import numpy as np

Before iterating over the rows, create a new column named 'division' with `.insert()` (I used `np.nan` as the filler, but you can use any value you like):

df.insert(1, 'division', np.nan)

Now you can iterate over the rows with pandas' `iterrows()` method.

# iterrows() yields (index label, row) pairs; each row is a pandas Series
for index, row in df.iterrows():

    # convert the row Series to a plain list of values
    row = list(row)

    # remove the placeholder np.nan value of the 'division' column created above
    row.pop(1)

    # split the 'names' value; gives the new 'names' and 'division' values
    new_row = row[0].split(' - ')

    # append the value from the 'category' column
    new_row = np.append(new_row, row[1])

    # write the new row back; iloc matches here because the index is the default 0..n-1
    df.iloc[index] = new_row

Here is the output:

|    | names   | division   | category   |
|---:|:--------|:-----------|:-----------|
|  0 | abc     | a          | A          |
|  1 | def     | b          | B          |
|  2 | ghi     | c          | C          |
|  3 | jkl     | d          | D          |
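
The same row-by-row idea can also be written without NumPy and without a placeholder column. This is only a sketch under the assumption that every value contains ' - '; the list names `names` and `divisions` are introduced here for illustration.

```python
import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns=['names', 'category'])

names, divisions = [], []
for value in df['names']:
    # partition splits on the first ' - ' and returns (before, separator, after)
    first, _, last = value.partition(' - ')
    names.append(first)
    divisions.append(last)

# assign whole columns at once instead of writing row by row
df['names'] = names
df.insert(1, 'division', divisions)

print(df)
```

Collecting the values in lists and assigning them once avoids modifying the DataFrame while iterating over it, and `division` never has to hold a temporary `np.nan`.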

4 Answers

Answer 1
2 2021-07-16 13:04:35

Answer 2
1 2021-07-16 13:00:34

Answer 3
1 accepted 2021-07-16 13:06:52

Answer 4
0 2021-07-16 13:54:23
