Python Pandas - replace all values in dataframe where the value meets certain condition

Question

I have a dataframe that contains numbers stored as strings with a comma as the thousands separator (e.g. 150,000). Some values are also represented by "-".

I'm trying to convert all the numbers that are stored as strings into floats. The "-" should remain as it is.

My current code uses a for loop to iterate over each column and row to see if each cell has a comma. If so, it removes the comma and then converts the value to a number.

This works fine most of the time except some of the dataframes have duplicated column names and that's when it falls apart.

Is there a more efficient way of doing this update (i.e. without loops) that also avoids the problem with duplicated column names?

Current code:

    for col in statement_df.columns:
        row = 0
        while row < len(statement_df.index):

            row_name = statement_df.index[row]

            if statement_df[col][row] == "-":
                # do nothing
                print(statement_df[col][row])

            elif statement_df[col][row].find(",") >= 0:
                #statement_df.loc[col][row] = float(statement_df[col][row].replace(",",""))
                x = float(statement_df[col][row].replace(",",""))
                statement_df.at[row_name, col] = x
                print(statement_df[col][row])

            else:

                x = float(statement_df[col][row])
                statement_df.at[row_name, col] = x
                print(statement_df[col][row])

            row = row + 1

Answer 1

Use str.replace(',', '') directly on the column instead of looping over individual cells.

For a dataframe like the one below:

Name  Count
Josh  12,33
Eric  24,57
Dany  9,678

apply it like this:

df['Count'] = df['Count'].str.replace(',', '')
df

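This gives the following output. Note that the commas are gone but the values are still strings; nothing has been converted to a number yet: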

   Name Count
0  Josh  1233
1  Eric  2457
2  Dany  9678
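
The snippet above stops at removing the commas, so here is a minimal sketch of one way to finish the conversion while leaving "-" untouched. It assumes "-" is the only non-numeric placeholder in the column; the sample data (including the "-" row) is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical sample data; "-" is the placeholder described in the question
df = pd.DataFrame({"Name": ["Josh", "Eric", "Dany"],
                   "Count": ["12,33", "-", "9,678"]})

# Remove the separators; regex=False treats "," as a literal character
cleaned = df["Count"].str.replace(",", "", regex=False)

# Convert whatever parses as a number; anything else (here "-") becomes NaN
numbers = pd.to_numeric(cleaned, errors="coerce")

# Keep the floats where conversion worked and put the original "-" back elsewhere.
# The column ends up with object dtype because it mixes floats and strings.
df["Count"] = numbers.astype(object).where(numbers.notna(), df["Count"])

print(df)
```

Because the column still holds both floats and the "-" string, arithmetic on it will not be vectorized. If that matters, keeping the NaN from pd.to_numeric instead of restoring "-" may be the better trade-off.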

Answer 2

You can use iloc for that. It selects columns by position rather than by name, so it also works when column names are duplicated:

for idx in range(len(df.columns)):
    df.iloc[:, idx] = df.iloc[:, idx].apply(your_function)

your_function is called once per cell, so it should handle a single value. For example:

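def your_function(x):
    # Leave the "-" placeholder untouched, as the question asks
    if x == '-':
        return x
    # Strip the thousands separators before converting
    return float(x.replace(',', ''))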
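
If you want to avoid the per-cell function call as well, the same positional idea can be combined with vectorized string methods. This is only a sketch under some assumptions: "-" is the only non-numeric value, the helper name to_number_keep_dash and the sample dataframe are invented for illustration, and the result is built as a new dataframe rather than assigned back in place.

```python
import pandas as pd


def to_number_keep_dash(col: pd.Series) -> pd.Series:
    """Convert one column to floats, leaving unparseable cells (e.g. "-") as they were."""
    cleaned = col.astype(str).str.replace(",", "", regex=False)
    numbers = pd.to_numeric(cleaned, errors="coerce")
    # object dtype lets floats and the original "-" strings share a column
    return numbers.astype(object).where(numbers.notna(), col)


# Hypothetical data with a duplicated column name, as described in the question
statement_df = pd.DataFrame(
    [["150,000", "-", "1,200.5"], ["-", "75", "3"]],
    columns=["2019", "2019", "2020"],
    index=["Revenue", "Costs"],
)

# Select each column by position so duplicated names are never looked up
n_cols = statement_df.shape[1]
converted = [to_number_keep_dash(statement_df.iloc[:, i]) for i in range(n_cols)]

# Glue the columns together under temporary positional labels,
# then restore the original (possibly duplicated) names
result = pd.concat(converted, axis=1, keys=list(range(n_cols)))
result.columns = statement_df.columns

print(result)
```

The loop here runs once per column rather than once per cell, so the expensive work (string replacement and numeric parsing) stays inside pandas.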

2 answers

solution1
1 2020-04-27 14:47:30

solution2
0 2020-04-27 14:45:42
