
Iterate through each row of a column and perform an operation

I have my data -

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns = ['names', 'category'])
df
names   category
abc - a   A
def - b   B
ghi - c   C
jkl - d   D

What I want as my output is -

names     division    category
    abc      a          A
    def      b          B
    ghi      c          C
    jkl      d          D

There are many ways to do this, but I want to use the following logic:

Iterate through each row of the 'names' column, store each value in 'st1', and then:

first, middle, last = st1.partition(' - ')
df['names'] = first
df['division'] = last

The values should also be assigned to the DataFrame one row at a time. Please help me get my desired output in Python.

You can do it like this:

df[['names','division']] = df.names.str.split(' - ',expand=True)
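For completeness, here is a minimal runnable sketch of the one-liner above applied to the question's data. The `n=1` argument and the final column reordering are additions of mine, not part of the original answer: `n=1` limits the split to the first separator, and the reordering matches the column order shown in the desired output.

```python
import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns=['names', 'category'])

# Split each value once on ' - '; expand=True returns a DataFrame
# with one column per piece (columns 0 and 1 here).
parts = df['names'].str.split(' - ', n=1, expand=True)
df['names'] = parts[0]
df['division'] = parts[1]

# Reorder the columns to match the desired output.
df = df[['names', 'division', 'category']]
print(df)
```

This does not iterate row by row as the question asks, but it is usually the simplest way to get the same result.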

Create the DataFrame as you did before, then iterate over the names and categories together, split each name on the "-" separator, and append the pieces to a new list of rows, which is then converted into another DataFrame like this:

import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns = ['names', 'category'])

newdata = []
for names, category in zip(df.names, df.category):
    name, division = names.split("-")
    newdata.append([name.strip(), division.strip(), category])

new_df = pd.DataFrame(newdata, columns = ['names', 'division', 'category'])

Printing the new DataFrame results in:

>>> new_df
  names division category
0   abc        a        A
1   def        b        B
2   ghi        c        C
3   jkl        d        D

I'm testing out GitHub Copilot to see how it can solve Stack Overflow issues.

# Solution 1
import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns=['names', 'category'])


# Iterate through each row of the 'names' column, store each value in 'st1', and then:
# first, middle, last = st1.partition(' - ')
# df['names'] = first
# df['division'] = last
# assigning the results to the DataFrame one row at a time.


for index, row in df.iterrows():
    st1 = row['names']
    first, middle, last = st1.partition(' - ')
    df.loc[index, 'names'] = first
    df.loc[index, 'division'] = last

# What df.loc does:
# df.loc[row label, column label] selects or assigns a single cell, e.g.
# df.loc[0, 'names'] = first
# df.loc[0, 'division'] = last

print(df)

Output:

  names category division
0   abc        A        a
1   def        B        b
2   ghi        C        c
3   jkl        D        d
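One detail worth spelling out: in the question's pseudocode, `df['names'] = first` assigns a single string to the whole column, so every row would end up with the value from the last iteration. Assigning to one cell at a time avoids that. The sketch below is a variation on the loop above, not the original answer's code: it swaps in `df.at` (the pandas scalar accessor) for `df.loc` and creates the 'division' column up front with an empty-string placeholder, both of which are my own choices.

```python
import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns=['names', 'category'])

# Create the new column first so every cell starts with a defined string value.
df['division'] = ''

for index in df.index:
    # partition returns (text before separator, separator, text after separator)
    first, _, last = df.at[index, 'names'].partition(' - ')
    # df.at reads or writes exactly one cell, identified by row and column label
    df.at[index, 'names'] = first
    df.at[index, 'division'] = last

print(df)
```

The column order here is names, category, division, the same as in the output above.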

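Since you want to iterate through the rows of your DataFrame and handle each one individually, this approach combines pandas' iterrows() with a little NumPy (np.nan as a placeholder and np.append to rebuild each row). NumPy is not strictly required for this. Note that .partition() and .split() are both plain Python string methods: .split(' - ') returns a list of the pieces, while .partition(' - ') returns a 3-tuple that also includes the separator. The loop below uses .split().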

Here are the packages you'll need:

import pandas as pd
import numpy as np

Before you can iterate through your rows, you'll need to use .insert() to create a new column named "division" at position 1 (I use np.nan as a placeholder, but you can use any value you want):

df.insert(1, 'division', np.nan)

Now you can iterate through the rows using Pandas' iterrows() method.

# index is the row label; row is a pandas Series holding that row's values
for index, row in df.iterrows():

    # convert the row Series to a plain list: [names, division, category]
    row = list(row)

    # remove the np.nan placeholder from the 'division' column created above
    row.pop(1)

    # split the 'names' value on ' - ' into the 'names' and 'division' values
    new_row = row[0].split(' - ')

    # append the value from the 'category' column
    new_row = np.append(new_row, row[1])

    # write the new row back; iloc is positional, which matches the label here
    # only because the DataFrame uses the default 0..n-1 index
    df.iloc[index] = new_row

This is the output:

|    | names   | division   | category   |
|---:|:--------|:-----------|:-----------|
|  0 | abc     | a          | A          |
|  1 | def     | b          | B          |
|  2 | ghi     | c          | C          |
|  3 | jkl     | d          | D          |
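As an alternative sketch that keeps the same `.insert()` idea but skips the placeholder and the NumPy calls, the pieces can be collected first and the 'division' column inserted at position 1 with its real values. The intermediate name `pairs` is purely illustrative and not from the original answer.

```python
import pandas as pd

data = [['abc - a', 'A'], ['def - b', 'B'], ['ghi - c', 'C'], ['jkl - d', 'D']]
df = pd.DataFrame(data, columns=['names', 'category'])

# Iterate over the 'names' values; each partition call yields
# (text before ' - ', the separator itself, text after ' - ').
pairs = [st1.partition(' - ') for st1 in df['names']]

# Overwrite 'names' with the first pieces.
df['names'] = [first for first, _, _ in pairs]

# Insert 'division' as the second column, filled with the last pieces.
df.insert(1, 'division', [last for _, _, last in pairs])

print(df)
```

Because the column is inserted with its final values, there is no placeholder to remove afterwards and the column order already matches the desired output.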

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 