Removing dash string from mixed dtype column in pandas DataFrame

Question

I have a DataFrame column that may contain strings (object dtype) mixed with numerical values.

My goal is to convert every value to a plain integer; however, some of the values have - between the digits.

A minimal working example looks like:

import pandas as pd

d = {'API':[float(4433), float(3344), 6666, '6-9-11', '8-0-11', 9990]}
df = pd.DataFrame(d)

I try:

df['API'] = df['API'].str.replace('-','')

But this leaves me with NaN for the numeric values, because the .str accessor only operates on string elements and returns NaN for everything else.

The output is:

    API
0   NaN
1   NaN
2   NaN
3   6911
4   8011
5   NaN

I'd like an output where all values are of type int.

Is there an easy way to handle just the string values in the Series while leaving the actual numeric values intact? I'm using this technique on large data sets (300,000+ rows), so something like a lambda or vectorized Series operations would be preferred over an explicit loop.

Answer 1

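Use df.replace with regex=True. It substitutes the dash inside string cells only and leaves numeric cells untouched, so the result can then be cast to int: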

df = df.replace('-', '', regex=True).astype(int)

    API
0   4433
1   3344
2   6666
3   6911
4   8011
5   9990
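Note that df.replace as written is applied to every column of the frame. If only one column needs cleaning, the same idea can be limited to that column. The following is a minimal sketch under that assumption, reusing the example data from the question; the intermediate name cleaned is illustrative only.

```python
import pandas as pd

# Example data from the question: floats, ints, and dash-separated strings
d = {'API': [float(4433), float(3344), 6666, '6-9-11', '8-0-11', 9990]}
df = pd.DataFrame(d)

# Series.replace with regex=True rewrites string cells only;
# numeric cells pass through unchanged
cleaned = df['API'].replace('-', '', regex=True)

# Cast the mixed object column to integers
df['API'] = cleaned.astype(int)

print(df)
```

Restricting the call to df['API'] avoids touching dashes in other columns (dates or free text, for instance) that should keep them.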

Answer 2

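Also, you can convert everything to strings first. Because astype(str) renders the float 4433.0 as '4433.0', go through float before casting to int:

df['API'] = df['API'].astype(str).apply(lambda x: x.replace('-', '')).astype(float).astype(int)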
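Another option, sketched here as an assumption rather than taken from either answer, is to keep the original .str.replace attempt from the question and fill the NaN results back in from the untouched column. The name stripped is illustrative only.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({'API': [float(4433), float(3344), 6666, '6-9-11', '8-0-11', 9990]})

# .str.replace yields NaN for every non-string cell;
# regex=False treats '-' as a literal character
stripped = df['API'].str.replace('-', '', regex=False)

# Where the result is NaN, fall back to the original numeric value,
# then cast the whole column to integers
df['API'] = stripped.fillna(df['API']).astype(int)

print(df)
```

This stays vectorized and avoids the round trip through strings for values that were already numeric.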


2 answers

solution1
4 ACCEPTED 2019-03-21 17:41:48

solution2
1 2019-03-21 18:13:14
