I have a dataframe with 60+ columns. I need to group by two columns and sum the values of only one column. The problem is that unless I manually type all the column labels into the groupby statement, the columns I leave out do not appear in the output.
Instead of something like this:
df_final.groupby(by=['OrderNo','ItemSKU','CustName',.......'20th Column'],as_index=False).sum()
I'd like to do something like this:
df_final.groupby(by=[:20],as_index=False).sum()
How can I do this and avoid typing all those column names?
Here is a print of the column datatypes:
>>> print(df_final.dtypes)
OrderNo float64
PledgeID int64
ReferrerID float64
FulfillmentStatus object
FundingDate object
PaymentMethod float64
Appearance object
Name object
Email object
Amount object
PlatformFee object
PerkID float64
Perk object
ShippingName object
ShippingPhoneNumber object
ShippingAddress object
ShippingAddress2 object
ShippingCity object
ShippingState/Province object
ShippingZip/PostalCode object
ShippingCountry object
ItemSKU object
ArticleName object
UPC float64
ArticleQty int64
dtype: object
>>>
You can convert the first 20 column names to a list:
df_final.groupby(by=df_final.columns[:20].tolist(),as_index=False).sum()
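To make this concrete, here is a minimal sketch on a toy frame. The four-column `df` and its values are made up for illustration and only borrow a few column names from the question. It groups by the first three columns, standing in for the first 20, and sums only `ArticleQty`:

```python
import pandas as pd

# Hypothetical toy data; the real df_final has many more columns.
df = pd.DataFrame({
    "OrderNo": [1.0, 1.0, 2.0, 2.0],
    "ItemSKU": ["A", "A", "B", "C"],
    "Name": ["Ann", "Ann", "Bob", "Bob"],
    "ArticleQty": [1, 2, 3, 4],
})

# Take the leading columns as group keys instead of typing each name.
key_cols = df.columns[:3].tolist()

# Select the one column to aggregate before calling sum(), so the
# remaining non-key columns are not summed.
result = df.groupby(by=key_cols, as_index=False)["ArticleQty"].sum()
print(result)
```

Selecting `["ArticleQty"]` before `.sum()` also avoids summing object columns such as `Amount`, where `sum()` may concatenate strings or raise an error. Note that `groupby` drops rows whose key columns contain NaN by default. If columns like `ShippingAddress2` are often empty in your data, passing `dropna=False` (available in pandas 1.1 and later) should keep those rows.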