Fill up a column in a loop

I have a dataset like this:

import pandas as pd
df = pd.DataFrame([[0, 0], [2, 2]], columns=('feature1', 'feature2'))

Now I would like to add an extra column

df['c'] = ""

And then loop through the DataFrame to fill column c with the contents of both feature1 and feature2:

for index, row in df.iterrows():
    subject = row["feature1"]
    content = row["feature2"]
    row["C"] = subject, content

However, when I print the DataFrame afterwards, something seems to have gone wrong, because column c is still empty.

If you want to build a tuple out of two columns, be explicit and keep it simple:

df['c'] = df[['feature1', 'feature2']].apply(tuple, axis=1)

df
Out[7]: 
   feature1  feature2       c
0         0         0  (0, 0)
1         2         2  (2, 2)

EdChum has you covered in the comments on how to fix your approach: you should be using .loc for indexing. However, you can achieve the same result much more simply, and without resorting to row iteration, by using zip.

In [43]: df['c'] = list(zip(df.feature1, df.feature2))
In [44]: df
Out[44]: 
   feature1  feature2       c
0         0         0  (0, 0)
1         2         2  (2, 2)
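
Another option is to build a MultiIndex from the two columns and take its values. Note that assign returns a new DataFrame rather than modifying df in place:

df.assign(c=df.set_index(['feature1', 'feature2']).index.to_series().values)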

You never updated the original column; you only updated a variable named row. A version that is easy to remember (though obviously not the most efficient):

# list() is needed on Python 3, where zip returns an iterator
df['C'] = list(zip(df.feature1, df.feature2))
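
To make the point about row concrete, here is a minimal sketch that keeps the explicit loop but writes the result back to df instead of to the temporary row. The function name fill_with_loop is made up for illustration, and collecting the tuples in a list before assigning is only one of several ways to do this.

```python
import pandas as pd

def fill_with_loop(df):
    """Hypothetical helper: build column 'c' from feature1 and feature2 with an explicit loop."""
    values = []
    for _, row in df.iterrows():
        # row is a separate Series built for this iteration, so assigning to it
        # would not change df; collect the result instead.
        values.append((row["feature1"], row["feature2"]))
    # Assign the whole column on df itself, in one step.
    df["c"] = values
    return df

df = pd.DataFrame([[0, 0], [2, 2]], columns=["feature1", "feature2"])
df = fill_with_loop(df)
print(df)
```

For anything beyond a small frame, the zip version is likely the better choice, since iterrows builds a new Series for every row.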
