I have a DataFrame that looks like this:
      n_trigger      device_name  Collected charge (V s)  accepted
0             0  Speedy Gonzalez            2.913136e-12      True
1             0               #6            5.530943e-12      True
2             1  Speedy Gonzalez            1.530740e-11      True
3             1               #6            4.784455e-11      True
4             2  Speedy Gonzalez            6.736956e-12      True
...         ...              ...                     ...       ...
9507       5552               #6            1.155196e-11      True
9508       5553  Speedy Gonzalez            3.378050e-12      True
9509       5553               #6            9.158863e-12      True
9510       5554  Speedy Gonzalez            3.723929e-12      True
9511       5554               #6            1.401557e-11      True
and I also have this function:
def resample_measured_data(measured_data_df):
    resampled_df = measured_data_df.pivot(
        index = 'n_trigger',
        columns = 'device_name',
        # a list rather than a bare set, since pandas 2.x rejects a set as a column indexer
        values = list(set(measured_data_df.columns) - {'n_trigger','device_name'}),
    )
    resampled_df = resampled_df.sample(frac=1, replace=True)
    resampled_df = resampled_df.stack()
    resampled_df = resampled_df.reset_index()
    return resampled_df
For some reason the Collected charge (V s) column is being changed from float64 to object. I found that pivot changes int to float, which is reasonable in order to handle NaN values. But why is it changing from float64 to object here?
I think this is because you passed two columns with different dtypes (float64 and bool) as "values" in pivot.
Let's look at a simple example df:
   a  b  c      d
0  1  2  1   True
1  2  2  0  False
>>> df.pivot(index='a', columns='b', values=['c', 'd']).dtypes
   b
c  2    object
d  2    object
dtype: object
This happens because c is dtype int and d is dtype bool. Now if we change the dtype of c to bool and check the dtypes again:
>>> df['c'] = df['c'].astype(bool)
>>> df.pivot(index='a', columns='b', values=['c', 'd']).dtypes
   b
c  2    bool
d  2    bool
dtype: object
We get bool as expected. The same happens if we change the dtype of d to float or int: we get the expected dtypes.
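As a rough illustration of why the dtype pairing matters, the sketch below checks which dtype pandas picks when the value columns of the example df are combined into a single array. That pivot combines its value columns this way internally is my assumption here, not something stated above; the variable names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [2, 2], 'c': [1, 0], 'd': [True, False]})

# int64 and bool share no common numeric dtype in pandas,
# so the combined array falls back to object.
mixed_dtype = df[['c', 'd']].to_numpy().dtype

# Once both columns are bool, the combined array stays bool.
bool_dtype = df.assign(c=df['c'].astype(bool))[['c', 'd']].to_numpy().dtype

# int64 and float64 do share a common dtype: float64.
numeric_dtype = df.assign(d=df['d'].astype(float))[['c', 'd']].to_numpy().dtype

print(mixed_dtype, bool_dtype, numeric_dtype)
```

The three results line up with the dtypes seen in the pivot outputs: object for int plus bool, bool for bool plus bool, and a numeric dtype when both columns are numeric.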
Back to your data, if we change the dtype of the "accepted" column to numeric and then pivot:
resampled_df = measured_data_df.assign(accepted=measured_data_df['accepted'].astype(int)).pivot(
    index = 'n_trigger',
    columns = 'device_name',
    # a list rather than a bare set, since pandas 2.x rejects a set as a column indexer
    values = list(set(measured_data_df.columns) - {'n_trigger','device_name'}),
)
>>> resampled_df.dtypes
                        device_name
accepted                #6                 float64
                        Speedy Gonzalez    float64
Collected charge (V s)  #6                 float64
                        Speedy Gonzalez    float64
dtype: object
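We get the expected float64 dtype for the charge columns (note that "accepted" also becomes float64 here, since int and float values share one array).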
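If converting "accepted" to a number is not desirable, one alternative is to pivot each value column on its own and join the results afterwards, so that no two dtypes ever share one array. This is only a sketch on a small made-up frame shaped like the one in the question; the names sample_df, value_columns and wide_df are mine.

```python
import pandas as pd

# Small made-up frame with the same columns as in the question.
sample_df = pd.DataFrame({
    'n_trigger': [0, 0, 1, 1],
    'device_name': ['Speedy Gonzalez', '#6', 'Speedy Gonzalez', '#6'],
    'Collected charge (V s)': [2.9e-12, 5.5e-12, 1.5e-11, 4.8e-11],
    'accepted': [True, True, False, True],
})

value_columns = [c for c in sample_df.columns
                 if c not in ('n_trigger', 'device_name')]

# Pivot one value column at a time, then join the pieces side by side.
# The dict keys become the outer level of the resulting column MultiIndex.
wide_df = pd.concat(
    {
        col: sample_df.pivot(index='n_trigger', columns='device_name', values=col)
        for col in value_columns
    },
    axis=1,
)

print(wide_df.dtypes)
```

As long as every (n_trigger, device_name) pair is present, each column keeps its own dtype. If some pairs are missing, NaN has to be stored and the usual upcasting applies again.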
Finally, if we fill with 0 on the original pivot result (the one where every column came out as object), the float columns revert to their original dtype, while "accepted" stays object:
>>> resampled_df.fillna(0).dtypes
                        device_name
Collected charge (V s)  #6                 float64
                        Speedy Gonzalez    float64
accepted                #6                  object
                        Speedy Gonzalez     object
dtype: object
It turns out you can also convert them to float directly using astype(float):
>>> s = resampled_df['Collected charge (V s)'].astype(float)
>>> s.dtypes
device_name
#6                 float64
Speedy Gonzalez    float64
dtype: object
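Putting the astype idea back into the original function, a minimal sketch could restore every value column to the dtype it had before the pivot. The function name resample_keeping_dtypes and the small example frame are hypothetical, and the sketch assumes every trigger has a row for every device, so that no NaN values appear.

```python
import pandas as pd

def resample_keeping_dtypes(measured_data_df):
    # Value columns as a list, in a fixed order.
    value_columns = [c for c in measured_data_df.columns
                     if c not in ('n_trigger', 'device_name')]
    resampled_df = measured_data_df.pivot(
        index='n_trigger',
        columns='device_name',
        values=value_columns,
    )
    # Resample whole triggers with replacement, then go back to long format.
    resampled_df = resampled_df.sample(frac=1, replace=True)
    resampled_df = resampled_df.stack().reset_index()
    # Cast each value column back to the dtype it had in the input.
    original_dtypes = measured_data_df.dtypes[value_columns].to_dict()
    return resampled_df.astype(original_dtypes)

# Made-up data with the same columns as in the question.
example_df = pd.DataFrame({
    'n_trigger': [0, 0, 1, 1, 2, 2],
    'device_name': ['Speedy Gonzalez', '#6'] * 3,
    'Collected charge (V s)': [2.9e-12, 5.5e-12, 1.5e-11, 4.8e-11, 6.7e-12, 1.2e-11],
    'accepted': [True, True, True, False, True, True],
})

resampled = resample_keeping_dtypes(example_df)
print(resampled.dtypes)
```

Casting back with astype fails if missing (trigger, device) pairs introduce NaN into a bool column, so with incomplete data the NaN handling would have to be decided first.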