
How to create a list of lists from a dataframe

I have a dataframe df and I want to convert it to a list of lists.

left_side                              right_side                              similarity
0114600043776001 loan payment receipt  0421209017073500 loan payment receipt   0.689008
0114600043776001 loan payment receipt  0421209017073500 loan payment receipt   0.689008
vat onverve*issuance fee*506108        vat onverve*issuance fee*5061087        0.743522
vat onverve*issuance fee*506108        verve*issuance fee*506108*********1112  0.684342
verve*issuance fee*506108              verve*issuance fee*506108*********8296  0.717817
verve*issuance fee*506108              vat onverve*issuance fee*506108**       0.684342

maint fee recovery jun 2018            vat maint fee recovery jun 2018         0.896607
maint fee recovery jun 2018            vat maint fee recovery jun 2018         0.896607
maint fee recovery jun 2018            vat maint fee recovery jun 2018         0.896607

The expected output should look like this:

[[0114600043776001 loan payment receipt, 0421209017073500 loan payment receipt,
  0421209017073500 loan payment receipt],
[vat onverve*issuance fee*506108, vat onverve*issuance fee*5061087, 
  verve*issuance fee*506108*********1112], 
[verve*issuance fee*506108*********8296, verve*issuance fee*506108,
  vat onverve*issuance fee*506108**],...]

I tried grouping the df by the left_side column and converting the result to a list, but the output is not what I expected. I would appreciate your help with this.

group_df = df.groupby(['left_side']).right_side.sum().to_frame()

group_df.values.tolist()

The output looks like this:

['0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt']
['vat maint fee recovery jun 2018vat maint fee recovery jun 2018vat maint fee recovery jun 2018maint fee recovery jul 2018maint fee recovery oct 2018maint fee recovery jul 2018maint fee recovery jul 2018']
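import pandas as pd

# Placeholder data; substitute the real left_side / right_side values.
data = {'left_side': ['string', 'string', 'string', 'string'],
        'right_side': ['string', 'string', 'string', 'string']}

df = pd.DataFrame(data, columns=['left_side', 'right_side'])
print(df)

# Convert every row to a [left_side, right_side] list.
df_list = df.values.tolist()
print(df_list)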
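
Note that df.values.tolist() returns one [left_side, right_side] pair per row, so on its own it does not group anything. A minimal sketch of one way to regroup those pairs in plain Python, assuming the goal is one list per left_side value that starts with the key; the sample values and the names rows, grouped and result are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data, much shorter than the frame in the question.
df = pd.DataFrame({
    "left_side": ["a", "a", "b"],
    "right_side": ["x", "y", "z"],
})

# One [left_side, right_side] pair per row.
rows = df.values.tolist()

# Collect the right_side values under each left_side key.
# A dict keeps insertion order, so groups appear in order of first occurrence.
grouped = {}
for left, right in rows:
    grouped.setdefault(left, [left]).append(right)

result = list(grouped.values())
print(result)  # [['a', 'x', 'y'], ['b', 'z']]
```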

You can use df.groupby:

>>> [[k, *g] for k, g in df.groupby('left_side', sort=False)['right_side']]

[['0114600043776001 loan payment receipt',
  '0421209017073500 loan payment receipt',
  '0421209017073500 loan payment receipt'],
 ['vat onverve*issuance fee*506108',
  'vat onverve*issuance fee*5061087',
  'verve*issuance fee*506108*********1112'],
 ['verve*issuance fee*506108',
  'verve*issuance fee*506108*********8296',
  'vat onverve*issuance fee*506108**'],
 ['maint fee recovery jun 2018',
  'vat maint fee recovery jun 2018',
  'vat maint fee recovery jun 2018',
  'vat maint fee recovery jun 2018']]
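
A self-contained sketch of the same approach on made-up data, in case it helps to see why it works. Iterating over the grouped right_side column yields (key, Series) pairs, and [k, *g] unpacks the Series values after the key. The agg(list) variant is an alternative under the assumption that only the right_side values are wanted; the sample values and variable names are illustrative only:

```python
import pandas as pd

# Hypothetical sample data; "b" comes first to show that sort=False
# keeps groups in order of first appearance.
df = pd.DataFrame({
    "left_side": ["b", "b", "a"],
    "right_side": ["x", "y", "z"],
})

grouped = df.groupby("left_side", sort=False)["right_side"]

# Key followed by its right_side values, one list per group.
with_keys = [[k, *g] for k, g in grouped]
print(with_keys)  # [['b', 'x', 'y'], ['a', 'z']]

# Only the right_side values of each group, without the key.
values_only = grouped.agg(list).tolist()
print(values_only)  # [['x', 'y'], ['z']]
```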

I believe you're looking for the to_records() method on a DataFrame. Try df.to_records(); see the pandas documentation for details.
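
A short sketch of what to_records() gives on made-up data. It returns a NumPy record array with one record per row, so it does not group rows by left_side and would probably need a grouping step to match the expected output in the question. The sample values are illustrative only:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "left_side": ["a", "b"],
    "right_side": ["x", "y"],
})

# index=False leaves the row index out of each record.
records = df.to_records(index=False)

# tolist() converts the record array to a list of tuples, one per row.
rows = records.tolist()
print(rows)  # [('a', 'x'), ('b', 'y')]
```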
