Concatenate strings of a grouped pandas DataFrame

Question

From an SQL query, I got a DataFrame similar to this one:

df = pd.DataFrame([
        ['ABC', 'Order'],
        ['ABC', 'Address'],
        ['ABC', 'Zip'],
        ['XYZ', 'Customer'],
        ['XYZ', 'Name']
    ],
    columns=("Table", "Column"))

  Table    Column
0   ABC     Order
1   ABC   Address
2   ABC       Zip
3   XYZ  Customer
4   XYZ      Name

I am trying to save this info in a separate file, like:

Table ABC has columns: Order, Address, Zip

One line for each table (and only once).

How can I achieve this?

I already tried:

for table_name in df.TABLE_NAME:
  output = "Table" + Table_name + "are" + (df.iloc[:,2])

But I am not getting the desired output.

Answer 1

Doing some string manipulation while grouping by your Table name gives you what you expect.

import pandas as pd

if __name__ == '__main__':
    df = pd.DataFrame([
        ['ABC', 'Order'],
        ['ABC', 'Address'],
        ['ABC', 'Zip'],
        ['XYZ', 'Customer'],
        ['XYZ', 'Name']
    ],
    columns=("Table", "Column"))

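    # Join each table's column names into one string (repeated on every row
    # of that table), then keep a single row per table.
    pretty = pd.concat(
        (df['Table'],
        df.groupby("Table")['Column'].transform(lambda x: ", ".join(x))),
        axis=1
    ).drop_duplicates()

    # Print one summary line per table.
    for _, row in pretty.iterrows():
        print("Table '{}' has columns: {}".format(row['Table'], row['Column']))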

Table 'ABC' has columns: Order, Address, Zip
Table 'XYZ' has columns: Customer, Name
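
Since the question asks for the lines to end up in a separate file, here is a minimal sketch of one way that could look. It is a variation on the answer above, not the answerer's code: it aggregates with `agg` instead of `transform` plus `drop_duplicates`. The helper names `describe_tables` and `write_summary` and the file name `tables.txt` are made up for illustration.

```python
import pandas as pd


def describe_tables(df):
    """Return one 'Table X has columns: ...' line per distinct table."""
    # sort=False keeps the tables in order of first appearance.
    # agg(", ".join) collapses each group's column names into one string.
    grouped = df.groupby("Table", sort=False)["Column"].agg(", ".join)
    return [
        "Table {} has columns: {}".format(table, columns)
        for table, columns in grouped.items()
    ]


def write_summary(lines, path):
    """Write the summary lines to a text file, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")


df = pd.DataFrame(
    [
        ["ABC", "Order"],
        ["ABC", "Address"],
        ["ABC", "Zip"],
        ["XYZ", "Customer"],
        ["XYZ", "Name"],
    ],
    columns=("Table", "Column"),
)

if __name__ == "__main__":
    # "tables.txt" is a hypothetical output file name.
    write_summary(describe_tables(df), "tables.txt")
```

Because the aggregation already yields one row per table, no `drop_duplicates` step is needed in this variant.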

Answer 1 (accepted), 2019-11-25 11:02:12
