I'm having an issue combining two similar columns that have dtype object. Because the two columns hold the same kind of data, they never both have values in the same row. Everything in the columns is an integer, but there are some nan values and "$0" strings, which none of the solutions I have tried seem to handle. The data looks like this:
Actual  MTD Actual
nan     3
nan     $0
nan     nan
3       nan
2       nan
1       nan
I have tried changing the columns to string type and then to integer type. I have also tried filling in the nan values with 0, but this does not seem to work.
What I've tried:
1. df[["Actual", "MTD Actual"]].sum(axis=1)
2. df['Actual'].add(df['MTD Actual'], fill_value=0)
3. pd.to_numeric(df['MTD Actual'])
Corresponding results:
1. Sums without raising an error, but the whole column is NaN
2. Raises "unsupported operand type(s) for +: 'int' and 'str'"
3. Raises "Unable to parse string "$0" at position 3266"
I would like the output to be:
Actual
3
0
nan
3
2
1
You have two different issues. First, you want to convert your non-numeric columns to numeric values. Second, you want to sum across the columns, keeping nan where every value in a row is nan but treating nan as 0 otherwise.
Here's a solution which should work:
df.loc[df.any(axis=1)] = df.replace(r'[\$,]', '', regex=True).astype(float).fillna(0)
df = df.sum(axis=1, min_count=1)  # min_count=1 keeps nan for rows that are entirely nan
The regular expression removes dollar signs and commas. .astype(float) casts the data to be numeric, and .fillna(0) replaces the nans. df.loc[df.any(axis=1)] means we're only changing the values of rows where there's at least one non-nan value.
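For reference, here is a minimal self-contained sketch of the same idea on the sample data from the question. It is one possible variant rather than the only way to do it: the helper name to_number and the cols list are made up for illustration, and it assumes that any string which still cannot be parsed after stripping "$" and "," should become nan (errors="coerce"). It also limits the cleanup to the two columns being combined, in case the real frame has other columns.

```python
import numpy as np
import pandas as pd

# Sample data from the question, stored as dtype object
df = pd.DataFrame(
    {
        "Actual": [np.nan, np.nan, np.nan, 3, 2, 1],
        "MTD Actual": [3, "$0", np.nan, np.nan, np.nan, np.nan],
    },
    dtype=object,
)

cols = ["Actual", "MTD Actual"]  # hypothetical: the columns to combine


def to_number(column):
    """Strip "$" and "," from a column, then convert it to float."""
    # Cast to str first so .str.replace works on mixed int/str/nan values
    cleaned = column.astype(str).str.replace(r"[\$,]", "", regex=True)
    # Assumption: anything still unparseable becomes nan instead of raising
    return pd.to_numeric(cleaned, errors="coerce")


numeric = df[cols].apply(to_number)

# min_count=1 gives nan when a row has no valid values at all,
# while nan is skipped (treated as 0) in rows that have at least one value
combined = numeric.sum(axis=1, min_count=1)

# Because the two columns never overlap, combine_first gives the same result
alternative = numeric["Actual"].combine_first(numeric["MTD Actual"])

print(combined.tolist())
```

On the sample data this should give 3, 0, nan, 3, 2, 1 (as floats), matching the desired output. If the columns could ever both hold a value in the same row, sum would add them while combine_first would keep the value from "Actual", so pick whichever matches what you want.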