I have a data frame which has 10 columns. You can use this code to generate an example frame called df.
import numpy as np
import pandas as pd

cols = []
for i in range(1, 11):
    cols.append(f'x{i}')
df = pd.DataFrame(np.random.randint(10, 99, size=(10, 10)), columns=cols)
The data frame will look something like this. It is randomly generated, so your figures will be different.
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
0 91 30 82 10 92 62 43 66 96 88
1 61 95 77 16 19 67 88 44 72 52
2 44 21 68 93 29 40 25 78 96 94
3 80 11 50 55 14 56 21 78 36 41
4 84 52 97 29 92 44 89 78 27 62
5 11 82 83 84 34 90 56 74 68 76
6 31 92 13 89 95 80 75 59 81 74
7 14 25 47 98 67 18 78 10 64 40
8 52 75 60 44 36 18 33 79 65 18
9 19 69 12 61 60 92 61 21 43 72
I want to apply a function which returns a tuple. I want to use the tuples to create 2 columns in my data frame.
def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2
When I run this,
df['c1'], df['c2'] = df.apply(lambda row: some_func(row['x9'],row['x10']), axis=1)
I get this error,
ValueError: too many values to unpack (expected 2)
The output should look like this,
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 c1 c2
0 91 30 82 10 92 62 43 66 96 88 0.458333 242.000000
1 61 95 77 16 19 67 88 44 72 52 0.361111 112.666667
2 44 21 68 93 29 40 25 78 96 94 0.489583 276.125000
3 80 11 50 55 14 56 21 78 36 41 0.569444 140.083333
4 84 52 97 29 92 44 89 78 27 62 1.148148 427.111111
5 11 82 83 84 34 90 56 74 68 76 0.558824 254.823529
6 31 92 13 89 95 80 75 59 81 74 0.456790 202.814815
7 14 25 47 98 67 18 78 10 64 40 0.312500 75.000000
8 52 75 60 44 36 18 33 79 65 18 0.138462 14.953846
9 19 69 12 61 60 92 61 21 43 72 0.837209 361.674419
If I only return 1 output, and create 1 column it works fine. How do I output 2 items (tuple or list of 2 items) and create 2 new columns using this?
There are several ways to do this.
I tried to do what you want with only a few changes:
- Removed the lambda, because you have already defined your own function.
- Passed result_type="expand" to apply() so that the return value is split into multiple columns.
- Assigned the result to a DataFrame (a two-column selection) rather than to two Series, so that the returned values can be split across the data frame's columns.
import pandas as pd
df = pd.DataFrame({
    'inputcol1': [1, 2, 3, 4],
    'inputcol2': [1, 2, 3, 4]
})
def some_func(x):
    output1 = x['inputcol1'] + x['inputcol2']
    output2 = x['inputcol2'] - x['inputcol2']
    return output1, output2
print(df)
# inputcol1 inputcol2
#0 1 1
#1 2 2
#2 3 3
#3 4 4
df[['outputcol1', 'outputcol2']] = df[['inputcol1', 'inputcol2']].apply(some_func, axis=1, result_type="expand")
print(df)
# inputcol1 inputcol2 outputcol1 outputcol2
#0 1 1 2 0
#1 2 2 4 0
#2 3 3 6 0
#3 4 4 8 0
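Applied to the columns from the question, the same idea might look like the sketch below. The three-row frame is a cut-down stand-in that keeps only x9 and x10 (the first three rows of the question's sample), and some_func_row is a hypothetical wrapper name, not something from the original code.

```python
import pandas as pd

# First three rows of the x9 and x10 columns from the question's sample frame.
df = pd.DataFrame({'x9': [96, 72, 96], 'x10': [88, 52, 94]})

def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2

def some_func_row(row):
    # Hypothetical adapter: pull the two inputs out of a row and call some_func.
    return some_func(row['x9'], row['x10'])

# result_type="expand" turns each returned tuple into one row of a DataFrame,
# which is then assigned to the two new column names.
df[['c1', 'c2']] = df.apply(some_func_row, axis=1, result_type='expand')
print(df)
```

The first row should come out close to c1 = 0.458333 and c2 = 242.0, matching the expected output shown in the question.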
Since you need to loop over multiple columns row by row, a more efficient approach is to use zip with a list comprehension to build a list of tuples, which you can assign directly to a list of columns in the original data frame:
df[['c1', 'c2']] = [some_func(x, y) for x, y in zip(df.x9, df.x10)]
df
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 c1 c2
0 20 67 76 95 28 60 82 81 90 93 0.516667 288.300000
1 94 30 97 82 51 10 54 43 36 41 0.569444 140.083333
2 50 57 85 48 67 65 41 91 48 46 0.479167 132.250000
3 61 36 44 59 18 71 42 18 56 77 0.687500 317.625000
4 11 85 34 66 45 55 21 42 77 27 0.175325 28.402597
5 20 19 86 46 97 21 84 12 86 98 0.569767 335.023256
6 24 87 65 62 22 43 26 80 15 64 2.133333 819.200000
7 38 15 23 22 89 89 19 32 21 33 0.785714 155.571429
8 82 88 64 89 92 88 15 30 85 83 0.488235 243.141176
9 96 24 91 70 96 54 57 81 59 32 0.271186 52.067797
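A variant of the same idea, sketched on a three-row stand-in frame (x9 and x10 from the first rows of the question's sample): zip(*pairs) transposes the list of per-row tuples into one tuple per output column, which is the shape the two-target unpacking in the question was expecting. The names pairs, c1 and c2 are only illustrative.

```python
import pandas as pd

# First three rows of the x9 and x10 columns from the question's sample frame.
df = pd.DataFrame({'x9': [96, 72, 96], 'x10': [88, 52, 94]})

def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2

# One (o1, o2) tuple per row.
pairs = [some_func(x, y) for x, y in zip(df['x9'], df['x10'])]

# Transpose: one tuple holding every o1, another holding every o2.
c1, c2 = zip(*pairs)
df['c1'] = list(c1)
df['c2'] = list(c2)
print(df)
```

This also hints at why the original attempt failed: without the transpose, the right-hand side holds one item per row (10 in the question's frame), not one item per new column.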
First generate a temporary DataFrame:
wrk = df.apply(lambda row: some_func(row['x9'],row['x10']), axis=1)\
.apply(pd.Series, index=['c1', 'c2'])
Details:
- df.apply(…) (your code) creates a tuple from each row, and these tuples are collected in a Series.
- apply(pd.Series, index=['c1', 'c2']) creates, from each element of that Series (a tuple), a Series whose index contains the new column names. These Series objects are then collected into a DataFrame, where the source index values become the column names.
Print wrk to see the result generated so far.
Then join it to df and save the result back to df:
df = df.join(wrk)
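Put together, the two steps might look like the following self-contained sketch. The three-row frame is a stand-in using x9 and x10 from the first rows of the question's sample, and the intermediate name tuples is introduced here only to make the Series of tuples visible.

```python
import pandas as pd

# First three rows of the x9 and x10 columns from the question's sample frame.
df = pd.DataFrame({'x9': [96, 72, 96], 'x10': [88, 52, 94]})

def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2

# Step 1: a Series with one (o1, o2) tuple per row.
tuples = df.apply(lambda row: some_func(row['x9'], row['x10']), axis=1)

# Step 2: expand each tuple into a Series indexed by the new column names;
# the results are collected into a DataFrame with columns c1 and c2.
wrk = tuples.apply(pd.Series, index=['c1', 'c2'])

# Step 3: join on the shared row index.
df = df.join(wrk)
print(df)
```

Because wrk keeps the row index of df, the join lines the new columns up with the rows they were computed from.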