
How to create new columns by iterating a function through original columns in pandas

Here are more details, using a simplified dataset.

import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10], "C": [30, 40, 50]})
print(df)
    A   B   C
0  10  20  30
1  20  30  40
2  30  10  50

I want to create a new column from C with this formula:

df['perc_C']= (df['C']/(df['A']*df['B']))*100
df

to obtain this final output:

    A   B   C     perc_C
0  10  20  30  15.000000
1  20  30  40   6.666667
2  30  10  50  16.666667

But my real DataFrame has 658 rows × 2144 columns: the first 2 columns correspond to 'A' and 'B' in the simplified example above, and the other 2142 columns each play the role of 'C'.

I aim to create a new 'perc_C' column for each of those columns using the transformation above. The problem is that I do not know how to iterate over the column labels.

So I tried this:

def perc_fluxes(x):
    for i in df[x]:
        df[perc_x]=(df[x]/(df['A']*df['B']))*100

for key, value in df.iteritems():
    df.apply(perc_fluxes)

I got this error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-36-44a716f6c584> in <module>
      6 
      7 for key,value in aho2_n2.iteritems():
----> 8     aho2_n2.apply(perc_fluxes)

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   7766             kwds=kwds,
   7767         )
-> 7768         return op.get_result()
   7769 
   7770     def applymap(self, func, na_action: Optional[str] = None) -> DataFrame:

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/apply.py in get_result(self)
    183             return self.apply_raw()
    184 
--> 185         return self.apply_standard()
    186 
    187     def apply_empty_result(self):

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/apply.py in apply_standard(self)
    274 
    275     def apply_standard(self):
--> 276         results, res_index = self.apply_series_generator()
    277 
    278         # wrap results

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/apply.py in apply_series_generator(self)
    288             for i, v in enumerate(series_gen):
    289                 # ignore SettingWithCopy here in case the user mutates
--> 290                 results[i] = self.f(v)
    291                 if isinstance(results[i], ABCSeries):
    292                     # If we have a view on v, we need to make a copy because

<ipython-input-36-44a716f6c584> in perc_fluxes(x)
      2 
      3 def perc_fluxes(x):
----> 4     for i in aho2_n2[x]:
      5         aho2_n2[x_perc]=(aho2_n2[x]/(ahno2_n2[3]*aho2_n2['time_s']))*100
      6 

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py in __getitem__(self, key)
   3028             if is_iterator(key):
   3029                 key = list(key)
-> 3030             indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
   3031 
   3032         # take() does not accept boolean indexers

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1264             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
   1265 
-> 1266         self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
   1267         return keyarr, indexer
   1268 

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1314             if raise_missing:
   1315                 not_found = list(set(key) - set(ax))
-> 1316                 raise KeyError(f"{not_found} not in index")
   1317 
   1318             not_found = key[missing_mask]

KeyError: '[0.0, 0.5, 2.16667, 2.33333, 1.5, 2.5, 2.66667, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 14.0833, 15.0833, 16.0833, 10.5, 18.0833, 19.0833, 20.0833, 21.0833, 13.25, 14.25, 15.25, 16.25, 26.0833, 18.25, 19.25, 29.0833, 30.0833, 31.0833, 23.5, 33.0833, 34.0833, 26.25, 27.25, 28.25, 38.0833, 39.0833, 40.0833, 41.0833, 42.9167, 34.25, 35.5, 45.0833, 46.0833, 38.25, 48.1667, 49.1667, 50.0833, 10.75, 50.25, 51.0833, 54.0833, 55.0833, 11.5, 56.0833, 49.25, 50.75, 51.25, 12.5, 53.25, 54.25, 64.0833, 65.0833, 13.75, 66.0833, 59.5, 69.0833, 70.0833, 14.75, 63.25, 64.25, 74.0833, 75.0833, 15.75, 76.0833, 69.25, 79.0833, 80.0833, 16.75, 81.0833, 74.25, 75.25, 76.25, 17.25, 17.75, 86.0833, 89.0833, 90.0833, 18.75, 91.0833, 84.5, 94.0833, 86.25, 19.75, 88.25, 89.25, 99.0833, 100.083, 20.25, 20.75, 101.083, 104.083, 105.083, 21.25, 21.75, 106.083, 100.25, 101.25, 22.25, 22.75, 104.25, 105.25, 106.25, 107.25, 24.5, 25.25, 25.75, 26.75, 27.75, 28.75, 29.25, 29.75, 30.25, 30.75, 31.25, 31.75, 32.25, 32.75, 33.25, 33.75, 34.75, 36.5, 37.25, 37.75, 38.75, 39.25, 39.75, 40.25, 40.75, 41.25, 41.75, 42.25, 42.75, 43.25, 43.75, 44.25, 44.75, 45.25, 45.75, 46.25, 46.75, 47.5, 48.5, 49.75, 51.75, 52.25, 52.75, 53.75, 54.75, 55.25, 55.75, 56.25, 56.75, 57.25, 57.75, 58.25, 58.75, 60.5, 61.25, 61.75, 62.25, 62.75, 63.75, 64.75, 65.25, 65.75, 66.25, 66.75, 67.25, 67.75, 68.25, 68.75, 69.75, 70.25, 70.75, 71.5, 72.5, 73.25, 73.75, 74.75, 75.75, 76.75, 77.25, 77.75, 78.25, 78.75, 79.25, 79.75, 80.25, 80.75, 81.25, 81.75, 82.25, 82.75, 83.5, 85.25, 85.75, 86.75, 87.25, 87.75, 88.75, 89.75, 90.25, 90.75, 91.25, 91.75, 92.25, 92.75, 93.25, 93.75, 94.25, 94.75, 95.5, 96.5, 97.25, 97.75, 98.25, 98.75, 99.25, 99.75, 100.917, 100.583, 100.75, 100.417, 101.917, 101.583, 101.75, 101.417, 1.16667, 102.417, 102.917, 1.66667, 102.25, 102.75, 103.25, 103.75, 103.417, 103.917, 102.083, 104.417, 104.917, 104.75, 103.083, 103.583, 105.917, 105.417, 104.583, 105.583, 105.75, 106.75, 106.417, 106.917, 
106.583, 107.75, 107.417, 107.917, 107.083, 107.583, 16.4167, 16.9167, 17.4167, 17.9167, 18.4167, 18.9167, 19.4167, 19.9167, 20.4167, 20.9167, 21.4167, 21.9167, 22.4167, 22.9167, 23.1667, 23.6667, 24.1667, 24.6667, 25.1667, 25.4167, 25.9167, 26.4167, 26.9167, 27.4167, 27.9167, 28.4167, 28.9167, 29.4167, 29.9167, 30.4167, 30.9167, 31.4167, 31.9167, 32.4167, 32.9167, 33.4167, 33.9167, 34.4167, 34.9167, 35.1667, 35.6667, 10.1667, 10.6667, 10.9167, 36.1667, 36.6667, 11.1667, 11.6667, 37.1667, 37.4167, 12.1667, 12.6667, 37.9167, 38.4167, 38.9167, 13.1667, 13.4167, 13.9167, 39.4167, 39.9167, 14.4167, 14.9167, 40.4167, 40.9167, 15.4167, 15.9167, 41.4167, 41.9167, 42.4167, 43.9167, 43.4167, 44.4167, 44.9167, 45.4167, 45.9167, 46.4167, 46.9167, 47.1667, 47.6667, 48.6667, 49.4167, 49.9167, 50.4167, 50.9167, 0.333333, 0.833333, 51.4167, 51.9167, 52.4167, 52.9167, 2.83333, 53.4167, 53.9167, 3.33333, 3.83333, 54.4167, 54.9167, 4.33333, 4.83333, 55.4167, 55.9167, 5.33333, 5.83333, 56.4167, 56.9167, 6.33333, 6.83333, 57.4167, 57.9167, 7.33333, 7.83333, 58.4167, 58.9167, 8.33333, 8.83333, 59.1667, 59.6667, 9.33333, 9.83333, 60.1667, 60.6667, 61.1667, 61.4167, 61.9167, 62.4167, 62.9167, 63.4167, 63.9167, 0.166667, 0.666667, 3.16667, 3.66667, 4.16667, 4.66667, 5.16667, 5.66667, 6.16667, 6.66667, 7.16667, 7.66667, 8.16667, 8.66667, 9.16667, 9.66667, 64.4167, 64.5833, 64.9167, 65.4167, 65.5833, 65.9167, 102.583, 66.4167, 66.5833, 66.9167, 67.0833, 67.4167, 67.5833, 67.9167, 68.0833, 68.4167, 68.5833, 68.9167, 69.4167, 69.5833, 69.9167, 70.4167, 70.5833, 70.9167, 71.1667, 71.3333, 71.6667, 71.8333, 72.1667, 72.3333, 72.6667, 72.8333, 73.1667, 73.4167, 73.5833, 73.9167, 74.4167, 74.5833, 74.9167, 75.4167, 75.5833, 75.9167, 76.4167, 76.5833, 76.9167, 77.0833, 77.4167, 77.5833, 77.9167, 78.0833, 78.4167, 78.5833, 78.9167, 79.4167, 79.5833, 79.9167, 80.4167, 80.5833, 80.9167, 81.4167, 81.5833, 81.9167, 82.0833, 82.4167, 82.5833, 82.9167, 83.1667, 83.3333, 83.6667, 83.8333, 84.1667, 
84.3333, 84.6667, 84.8333, 85.1667, 85.4167, 85.5833, 85.9167, 86.4167, 86.5833, 10.3333, 86.9167, 87.0833, 87.4167, 11.3333, 11.8333, 87.5833, 87.9167, 88.0833, 12.3333, 12.8333, 88.4167, 88.5833, 88.9167, 13.5833, 89.4167, 89.5833, 89.9167, 90.4167, 14.5833, 90.5833, 90.9167, 91.4167, 91.5833, 15.5833, 91.9167, 92.0833, 92.4167, 92.5833, 92.9167, 93.0833, 93.4167, 93.5833, 93.9167, 94.4167, 94.5833, 94.9167, 95.1667, 95.3333, 95.6667, 95.8333, 96.1667, 96.3333, 96.6667, 96.8333, 97.1667, 97.4167, 97.5833, 97.9167, 98.0833, 98.4167, 98.5833, 98.9167, 99.4167, 99.5833, 99.9167, 1.33333, 1.83333, 16.5833, 17.0833, 17.5833, 18.5833, 19.5833, 20.5833, 21.5833, 22.0833, 22.5833, 23.3333, 23.8333, 24.3333, 24.8333, 25.5833, 26.5833, 27.0833, 27.5833, 28.0833, 28.5833, 29.5833, 30.5833, 31.5833, 32.0833, 32.5833, 33.5833, 34.5833, 35.3333, 35.8333, 36.3333, 36.8333, 37.5833, 38.5833, 39.5833, 40.5833, 41.5833, 42.0833, 42.5833, 43.0833, 43.5833, 44.0833, 44.5833, 45.5833, 46.5833, 47.3333, 47.8333, 48.3333, 48.8333, 49.5833, 50.5833, 51.5833, 52.0833, 52.5833, 53.0833, 53.5833, 54.5833, 55.5833, 56.5833, 57.0833, 57.5833, 58.0833, 58.5833, 59.3333, 59.8333, 60.3333, 60.8333, 61.5833, 62.0833, 62.5833, 63.0833, 63.5833] not in index'

So then I tried as you suggested by changing the function:

def perc_fluxes1(x,y):
    x= df.columns[2:]  #to not consider the column 'A' and 'B'
    for i in x:
        y= (i/(df['A']*df['B']))*100

for column in df.columns[2:]:
    # Then you can call your function to create your columns
    new_column = "perc_"+column
    df[new_column] = df[column, new_column].apply(perc_fluxes1)
    print (df)

But I got this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-61-953155850397> in <module>
      1 for column in aho2_n2.columns[2:2142] :
      2     # Then you can call your function to create your columns
----> 3     new_column = "perc_"+column
      4     aho2_n2[new_column] = aho2_n2[column, new_column].apply(perc_fluxes2)

TypeError: can only concatenate str (not "int") to str

Thank you again in advance for the patience.

To iterate over the column labels and create your columns, you can do:

for column in df.columns:
    # Then you can call your function to create your columns
    new_column = "perc_"+column
    df[new_column] = df[column].apply(perc_fluxes)

Of course, you need to adapt your function to your desired output, but without a concrete example of what you need to produce, we can't help more.
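As a rough illustration of that adaptation, here is a minimal sketch on the simplified frame from the question. It assumes every column after 'A' and 'B' should be transformed, and it adds a hypothetical integer-labelled column 3 to mimic labels that are not strings. Formatting the label with an f-string avoids the "can only concatenate str (not "int") to str" error, and the formula is applied to the whole column at once, so no apply call is needed.

```python
import pandas as pd

# Simplified frame from the question, plus a hypothetical integer-labelled column
df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10],
                   "C": [30, 40, 50], 3: [60, 60, 60]})

# Fix the list of source columns up front, so the columns added
# inside the loop are not part of the iteration.
source_columns = list(df.columns[2:])
denominator = df["A"] * df["B"]

for column in source_columns:
    new_column = f"perc_{column}"   # works for both str and int labels
    df[new_column] = df[column] / denominator * 100

print(df)
```

The loop only builds the label and assigns the result; the arithmetic itself runs on whole columns.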

OK, in the end I was able to write a correct version, and now it works. Thanks for your suggestion on how to create the new label :)

So the right code is this:

def perc_fluxes(y,z,w):
    if w !=0:
        return (y/(z*w))*100
    else:
        return 0
        
for column in df.columns[2:2143]:  # the range of columns to iterate over
    # Build the new label (str() also covers integer column labels),
    # then compute the new column row by row
    new_column = "perc_" + str(column)
    df[new_column] = df.apply(lambda x: perc_fluxes(x[column], x['A'], x['B']), axis=1)
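With 658 rows and over two thousand columns, the row-wise apply calls the Python function once per row for every column, which can be slow. A vectorized variant may be worth trying. The sketch below is one possible way under the same formula, not the only one. The small frame and the names values, denominator and perc are illustrative assumptions, and rows where 'B' is 0 are set to 0 to mirror perc_fluxes.

```python
import pandas as pd

# Illustrative frame; the second row has B == 0 to exercise the zero case
df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 0, 10],
                   "C": [30, 40, 50], "D": [5, 6, 9]})

values = df.iloc[:, 2:]              # every column after 'A' and 'B'
denominator = df["A"] * df["B"]

# Divide each column by the row-wise denominator, then scale to percent
perc = values.div(denominator, axis=0).mul(100)

# Mirror perc_fluxes: rows where B == 0 get 0 instead of inf/NaN
perc.loc[df["B"] == 0] = 0

# Prefix the labels (str() covers integer labels) and append all columns at once
perc.columns = ["perc_" + str(c) for c in perc.columns]
df = pd.concat([df, perc], axis=1)

print(df)
```

Appending all new columns with a single concat also avoids inserting thousands of columns one at a time.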


 