
Fill NaNs of pandas.DataFrame based on condition over another column

I want to replace NaNs in one column of a DataFrame based on a condition on another column. If column [0] contains "Passenger-Kilometers", I want to fill the NaN in column [1] of that row with the value "Total passenger transport", as at index 14 of df below (other NaNs get a different fill value; see the mapping totals_dict below). I tried the loop below, which works in each case, but I'd like to find a more elegant solution.

totals_dict = {"Passenger-Kilometers": "Total passenger transport",
               "Freight Ton-Kilometers": "Total freight transport",}
for key, value in totals_dict.items():
    df[df[0] == key] = df[df[0] == key].fillna(value)

Is there a cleaner way to solve this?

Alternatively, I tried:

df = df.groupby(0).assign(target_col=lambda group: group["target_col"].fillna(totals_dict.get(group[0])))

Unfortunately, groupby objects don't have an assign method.
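The per-group lookup that this attempt is aiming for may not need groupby at all. Below is a minimal sketch, assuming the target column is column 1 and using a small stand-in DataFrame rather than the real data: Series.map turns column 0 into a Series of candidate fill values, and Series.fillna accepts that Series and aligns it on the index. The name fill_values is only illustrative.

```python
import pandas as pd

# Small stand-in for the question's df (column labels are the integers 0 and 1).
df = pd.DataFrame({
    0: ["Vehicle Stock", "Passenger-Kilometers", "Passenger-Kilometers",
        "Freight Ton-Kilometers", "Vehicle Stock"],
    1: ["Trucks(10000 units)", "Railways(100 million passenger-km)",
        None, None, None],
})

totals_dict = {"Passenger-Kilometers": "Total passenger transport",
               "Freight Ton-Kilometers": "Total freight transport"}

# Look up a candidate fill value for every row; categories that are not
# keys of totals_dict (here "Vehicle Stock") map to NaN.
fill_values = df[0].map(totals_dict)

# fillna with a Series aligns on the index, so only the missing entries of
# column 1 are replaced and existing values are left untouched.
df[1] = df[1].fillna(fill_values)

print(df)
```

Rows whose category has no entry in totals_dict keep their missing value, because the mapped fill value for them is itself NaN.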

df is as follows:

                       0                                         1
1          Vehicle Stock                Medium Trucks(10000 units)
2          Vehicle Stock                 Heavy Trucks(10000 units)
3          Vehicle Stock                       Trucks(10000 units)
4          Vehicle Stock      Mini Passenger Vehicles(10000 units)
5          Vehicle Stock     Small Passenger Vehicles(10000 units)
6          Vehicle Stock    Medium Passenger Vehicles(10000 units)
7          Vehicle Stock                 Light Trucks(10000 units)
8          Vehicle Stock     Large Passenger Vehicles(10000 units)
9          Vehicle Stock               Civil Vehicles(10000 units)
10  Passenger-Kilometers  Civil Aviation(100 million passenger-km)
11  Passenger-Kilometers       Waterways(100 million passenger-km)
12  Passenger-Kilometers        Highways(100 million passenger-km)
13  Passenger-Kilometers        Railways(100 million passenger-km)
14  Passenger-Kilometers                                      None
15         Vehicle Stock           Passenger Vehicles(10000 units)

Thank you!

Let's say I have this dataframe:

>>> a
                      0                                         1
0  Passenger-Kilometers  Civil Aviation(100 million passenger-km)
1  Passenger-Kilometers       Waterways(100 million passenger-km)
2  Passenger-Kilometers                                      None
3  Passenger-Kilometers                                      None
4  Passenger-Kilometers                                      None

Then I can run the following:

def b(x):
    # Overwrite column 1 of the row with the replacement value.
    x[1] = "hello"
    return x
# Select rows where column 0 matches and column 1 is missing.
mask = (a[0] == "Passenger-Kilometers") & (a[1].isnull())
a[mask] = a[mask].apply(b, axis=1)

And now if I look:

>>> a
                      0                                         1
0  Passenger-Kilometers  Civil Aviation(100 million passenger-km)
1  Passenger-Kilometers       Waterways(100 million passenger-km)
2  Passenger-Kilometers                                     hello
3  Passenger-Kilometers                                     hello
4  Passenger-Kilometers                                     hello

So you can just replace "hello" with whatever value you need.
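The same boolean condition can probably also be used for a direct assignment through .loc, which avoids the row-wise apply. This is a sketch under the assumption that only column 1 needs filling; it rebuilds the example frame a and borrows the totals_dict mapping from the question so that each category gets its own fill value.

```python
import pandas as pd

# Rebuild the example frame from this answer.
a = pd.DataFrame({
    0: ["Passenger-Kilometers"] * 5,
    1: ["Civil Aviation(100 million passenger-km)",
        "Waterways(100 million passenger-km)",
        None, None, None],
})

# Mapping taken from the question.
totals_dict = {"Passenger-Kilometers": "Total passenger transport",
               "Freight Ton-Kilometers": "Total freight transport"}

for key, value in totals_dict.items():
    # Rows of this category whose column 1 is missing.
    mask = (a[0] == key) & a[1].isna()
    # Assign only to column 1 of those rows; an all-False mask changes nothing.
    a.loc[mask, 1] = value

print(a)
```

Because the assignment targets column 1 explicitly, missing values in any other column of the matching rows would be left alone.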
