Concatenate strings of a groupedBy pandas dataframe
From an SQL query, I got a DataFrame similar to this one:
df = pd.DataFrame([
    ['ABC', 'Order'],
    ['ABC', 'Address'],
    ['ABC', 'Zip'],
    ['XYZ', 'Customer'],
    ['XYZ', 'Name']
],
    columns=("Table", "Column"))
  Table    Column
0   ABC     Order
1   ABC   Address
2   ABC       Zip
3   XYZ  Customer
4   XYZ      Name
I am trying to save this info in a separate file, like:
Table ABC has columns: Order, Address, Zip
One line for each table (and only once).
How can I achieve this?
I already tried:
for table_name in df.TABLE_NAME:
    output = "Table" + Table_name + "are" + (df.iloc[:,2])
But I am not getting the desired output.
Doing some string manipulation while grouping by your Table name can give you what you expect.
import pandas as pd

if __name__ == '__main__':
    df = pd.DataFrame([
        ['ABC', 'Order'],
        ['ABC', 'Address'],
        ['ABC', 'Zip'],
        ['XYZ', 'Customer'],
        ['XYZ', 'Name']
    ],
        columns=("Table", "Column"))

    # Join the column names of each table, broadcast back to every row,
    # then keep a single row per table.
    pretty = pd.concat(
        (df['Table'],
         df.groupby("Table")['Column'].transform(lambda x: ", ".join(x))),
        axis=1
    ).drop_duplicates()

    for _, row in pretty.iterrows():
        print("Table '{}' has columns: {}".format(row['Table'], row['Column']))
Table 'ABC' has columns: Order, Address, Zip
Table 'XYZ' has columns: Customer, Name
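Since the question asks for the lines to end up in a separate file, here is a minimal sketch of one possible variant. It swaps the transform/drop_duplicates step for groupby(...).agg(", ".join), which already yields one value per table, and then writes the lines out. The file name table_columns.txt is an assumption made for illustration and does not come from the question.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["ABC", "Order"],
        ["ABC", "Address"],
        ["ABC", "Zip"],
        ["XYZ", "Customer"],
        ["XYZ", "Name"],
    ],
    columns=("Table", "Column"),
)

# One entry per table: join that table's column names into a single string.
# sort=False keeps the tables in order of first appearance.
joined = df.groupby("Table", sort=False)["Column"].agg(", ".join)

# Build one output line per table.
lines = [
    "Table {} has columns: {}".format(table, columns)
    for table, columns in joined.items()
]

# Hypothetical output path, chosen only for this example.
with open("table_columns.txt", "w") as handle:
    handle.write("\n".join(lines) + "\n")
```

Because agg returns a Series indexed by table name, no duplicate rows need to be dropped afterwards, and each table appears exactly once in the file.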