How can I convert my binned dataframe to a numpy array that I can use in sklearn's PCA?
Here's my code so far (x is my original unbinned dataframe):
bins=(2,6,10,14,20,26,32,38,44,50,56,62,68,74,80,86,92,98)
binned_data = x.groupby(pd.cut(x.Weight, bins))
I want to convert binned_data to a numpy array. Thanks in advance.
EDIT:
When I try binned_data.values, I receive this error:
AttributeError: Cannot access attribute 'values' of 'DataFrameGroupBy' objects, try using the 'apply' method
You need to apply some kind of aggregation to the GroupBy object to get a DataFrame back. Once you have that, you can use .values to extract the numpy array.
For example, if you wanted the sum or count of the data in each bin you could do:
binned_data.sum().values
binned_data.size().values
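To make the aggregation step concrete, here is a minimal self-contained sketch. The small dataframe (including the Height column) is made up purely for illustration, since the original x isn't shown, and observed=False is passed explicitly on the assumption that you want the empty bins kept in the result:

```python
import pandas as pd

# Hypothetical stand-in for the original dataframe `x`; the values are made up.
x = pd.DataFrame({
    "Weight": [3, 5, 7, 12, 15, 30],
    "Height": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})
bins = (2, 6, 10, 14, 20, 26, 32, 38, 44, 50, 56, 62, 68, 74, 80, 86, 92, 98)

# Grouping alone gives a DataFrameGroupBy object, which has no .values.
# observed=False keeps bins that contain no rows.
binned_data = x.groupby(pd.cut(x.Weight, bins), observed=False)

# Aggregating returns a DataFrame (sum) or a Series (size),
# and both of those expose .values as a numpy array.
sums = binned_data.sum().values     # one row per bin, one column per original column
counts = binned_data.size().values  # one entry per bin

print(sums.shape, counts.shape)
```

With 18 bin edges there are 17 bins, so sums has 17 rows and counts has 17 entries, including zeros for the empty bins.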
Edit: My code wasn't exactly right, because the column (Weight) and the index will have the same name, so reset_index() cannot insert the index as a column. It can be fixed by renaming the index, as below:
binned_data = x.groupby(pd.cut(x.Weight, bins)).sum()
binned_data.index.name = 'Weight_Bin'
binned_data.reset_index().values
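As a rough sketch of what the renamed version produces (again with a made-up dataframe standing in for x, and observed=False as an assumption to keep empty bins), note that the array from reset_index() carries the bin intervals in its first column, so it comes out as an object array. If you only want the numeric aggregates, you can take .values without resetting the index:

```python
import pandas as pd

# Hypothetical stand-in for the original dataframe `x`; the values are made up.
x = pd.DataFrame({
    "Weight": [3, 5, 7, 12, 15, 30],
    "Height": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})
bins = (2, 6, 10, 14, 20, 26, 32, 38, 44, 50, 56, 62, 68, 74, 80, 86, 92, 98)

binned = x.groupby(pd.cut(x.Weight, bins), observed=False).sum()
binned.index.name = "Weight_Bin"  # avoid the clash with the Weight column

# Includes the bin as the first column: mixed Interval/number content,
# so the resulting array has dtype object.
with_bins = binned.reset_index().values

# Only the aggregated numeric columns: a plain float array.
numeric_only = binned.values

print(with_bins.dtype, numeric_only.dtype)
```

The numeric-only array is probably the one you want to pass on to something like PCA, since the Interval objects in the first column of with_bins are not numbers.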