Pandas Dataframe automatic typecasting

Question

I am working with a pandas DataFrame and need several columns (x and y in the example below) to be integers and one column (l) to be a float. It appears that assigning a new row that contains a float recasts the whole DataFrame to float. Why does this happen, and how do I prevent it?

import pandas as pd

data = pd.DataFrame(data=[[3103, 1189, 1]], columns=['y', 'x', 'l'], dtype=int)
print(data.y)
data.loc[1] = (3, 3, 3.4)  # add a new row that contains a float
print(data.y)

Which produces:

0    3103
Name: y, dtype: int32
0    3103
1       3
Name: y, dtype: float64

Answer 1

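As far as I can tell, the new row is first gathered into a single sequence with one common dtype (float64 here, because of the 3.4), so adding it upcasts every column. You can recast each of the other columns after each addition using: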

data['y'] = data['y'].astype(int)
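
If several columns need restoring at once, `astype` also accepts a mapping of column name to dtype. A minimal sketch that reuses the frame from the question; the explicit "int64" target is my assumption, since the question only asks for integers:

```python
import pandas as pd

data = pd.DataFrame([[3103, 1189, 1]], columns=["y", "x", "l"], dtype=int)
data.loc[1] = (3, 3, 3.4)  # the mixed row leaves the columns as floats, as in the question

# Cast only the columns that must stay integral; 'l' keeps its float dtype.
data = data.astype({"y": "int64", "x": "int64"})
print(data.dtypes)
```

This still has to be repeated after every row that is added, so it only suits a small number of additions.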

This is not the most efficient solution if you need to add a lot of rows on the fly. Alternatively, if that's an option, you could build the entire DataFrame from Series in advance and set the types at creation time.
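
A rough sketch of the build-it-up-front approach, assuming all rows are known before the frame is created. The `rows` list and the chosen dtypes are hypothetical, made up for illustration:

```python
import pandas as pd

# Hypothetical rows collected in a plain list instead of being added one at a time.
rows = [(3103, 1189, 1.0), (3, 3, 3.4)]
ys, xs, ls = zip(*rows)

# Each column gets its own typed Series, so no column-wide upcast occurs.
data = pd.DataFrame({
    "y": pd.Series(ys, dtype="int64"),
    "x": pd.Series(xs, dtype="int64"),
    "l": pd.Series(ls, dtype="float64"),
})
print(data.dtypes)
```

Because each column is typed independently, the float in l never touches x or y.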

solution1
1 2016-09-21 16:49:37
