I want to replace the first n elements of a column in my data frame with another pd.Series I have saved. For example:
        category   price    store testscore
0       Cleaning   11.42  Walmart       NaN
1       Cleaning   23.50      Dia       NaN
2  Entertainment   19.99  Walmart       NaN
3  Entertainment   15.95     Fnac       NaN
4           Tech   55.75      Dia       NaN
5           Tech  111.55  Walmart       NaN
Here I want to replace the first three NaNs in testscore with a new set of strings.
Imagine I have a variable:
cats = pd.Series(df['category'][0:3])  # first three values
Can I place this in the testscore column, like so?
        category   price    store      testscore
0       Cleaning   11.42  Walmart       Cleaning
1       Cleaning   23.50      Dia       Cleaning
2  Entertainment   19.99  Walmart  Entertainment
3  Entertainment   15.95     Fnac            NaN
4           Tech   55.75      Dia            NaN
5           Tech  111.55  Walmart            NaN
But whenever I try this, it doesn't work.
Code to create this fake dataset:
import pandas as pd
import numpy as np
df = pd.DataFrame({'category': ['Cleaning', 'Cleaning', 'Entertainment', 'Entertainment', 'Tech', 'Tech'],
'store': ['Walmart', 'Dia', 'Walmart', 'Fnac', 'Dia','Walmart'],
'price':[11.42, 23.50, 19.99, 15.95, 55.75, 111.55],
'testscore': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})
print(df)
df2 = pd.DataFrame({'category': ['Cleaning', 'Cleaning', 'Entertainment', 'Entertainment', 'Tech', 'Tech'],
'store': ['Walmart', 'Dia', 'Walmart', 'Fnac', 'Dia','Walmart'],
'price':[11.42, 23.50, 19.99, 15.95, 55.75, 111.55],
'testscore': ['Cleaning', 'Cleaning', 'Entertainment', np.nan, np.nan, np.nan]})
print(df2)
Simply use df.loc:
import pandas as pd
import numpy as np
df = pd.DataFrame({'category': ['Cleaning', 'Cleaning', 'Entertainment', 'Entertainment', 'Tech', 'Tech'],
'store': ['Walmart', 'Dia', 'Walmart', 'Fnac', 'Dia','Walmart'],
'price':[11.42, 23.50, 19.99, 15.95, 55.75, 111.55],
'testscore': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})
cats = df['category'].iloc[:3]  # first 3 elements (positional, end-exclusive)
df.loc[:2, 'testscore'] = cats  # .loc slices by label and includes the end, so :2 covers rows 0-2
print(df)
And you get:
        category   price    store      testscore
0       Cleaning   11.42  Walmart       Cleaning
1       Cleaning   23.50      Dia       Cleaning
2  Entertainment   19.99  Walmart  Entertainment
3  Entertainment   15.95     Fnac            NaN
4           Tech   55.75      Dia            NaN
5           Tech  111.55  Walmart            NaN
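One thing to watch with this approach: .loc slices by label and includes the end label, and assigning a Series aligns on index labels. If the saved Series has a different index from the data frame, a positional assignment may be closer to what is wanted. The following is only a sketch under that assumption; new_vals and its index are made up for illustration, and the column is created with dtype=object because pandas 2.1 and later warn when strings are written into a float64 column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'category': ['Cleaning', 'Cleaning', 'Entertainment', 'Entertainment', 'Tech', 'Tech'],
    # dtype=object so the column can hold strings without upcasting from float64
    'testscore': pd.Series([np.nan] * 6, dtype=object),
})

# .loc slices by label and includes the end label; .iloc is positional and end-exclusive
rows_loc = len(df.loc[:3])     # labels 0, 1, 2, 3
rows_iloc = len(df.iloc[:3])   # positions 0, 1, 2
print(rows_loc, rows_iloc)

# Hypothetical replacement values whose index does not match df's index
new_vals = pd.Series(['a', 'b', 'c'], index=[10, 11, 12])

n = len(new_vals)
col = df.columns.get_loc('testscore')    # integer position of the column
df.iloc[:n, col] = new_vals.to_numpy()   # to_numpy() drops the index, so nothing is aligned
print(df)
```

Had the last assignment been df.loc[:n - 1, 'testscore'] = new_vals, pandas would have aligned on the labels 10 to 12, found no match in rows 0 to 2, and left those rows as NaN.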
Use fillna with the limit parameter:
df['testscore'] = df.testscore.fillna(df.category, limit=3)
df
Output:
        category   price    store      testscore
0       Cleaning   11.42  Walmart       Cleaning
1       Cleaning   23.50      Dia       Cleaning
2  Entertainment   19.99  Walmart  Entertainment
3  Entertainment   15.95     Fnac            NaN
4           Tech   55.75      Dia            NaN
5           Tech  111.55  Walmart            NaN
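Note that limit caps the number of NaNs that get filled, not the number of rows touched, so this matches "the first n rows" only when those rows are all NaN, as they are in the question. A small sketch of the difference, using made-up values (s and fill are hypothetical, not from the question):

```python
import numpy as np
import pandas as pd

# Row 0 already holds a value, so it is not one of the NaNs that fillna counts
s = pd.Series(['x', np.nan, np.nan, np.nan, np.nan], dtype=object)
fill = pd.Series(['a', 'b', 'c', 'd', 'e'], dtype=object)

# Fills at most 3 NaNs, taking each value from the same index label in `fill`
out = s.fillna(fill, limit=3)
print(out.tolist())
```

Here rows 1 to 3 are filled and row 0 keeps 'x', whereas a positional assignment of the first three rows would have overwritten rows 0 to 2.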