Concatenating two DataFrame columns using pandas

Question

I am new to Python development. I have the following DataFrame:

Document_ID  OFFSET  PredictedFeature  word
0            0       2000              Mark
0            8       2000              Bob
0            16      2200              AL
0            23      2200              NS
0            30      2200              GK
1            0       2100              sandy
1            5       2100              Rohan
1            7       2100              DV

Here Document_ID is the key.

What I am trying to do is create a file in which the result looks like this:

Mark 2000, Bob 2000, AL 2200, NS 2200, GK 2200, sandy 2100, Rohan 2100, DV 2100

I tried using groupby:

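# Join each column's values per document; cast to str because OFFSET and PredictedFeature are numeric
df = df.groupby('Document_ID').agg(lambda x: ','.join(x.astype(str)))
for name in df.index:
    print(name)
    print(df.loc[name])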

I am also trying to save it to a text or CSV file.

Can anyone help me with this?

Answer 1

Use DataFrame.stack:

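# Stack the two columns into one long Series, then transpose it into a single row
new_df = df[['word', 'PredictedFeature']].stack().to_frame().T
# Drop the original row numbers so only the column names remain as headers
new_df.columns = new_df.columns.droplevel(0)
print(new_df)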

   word PredictedFeature word PredictedFeature word PredictedFeature word  \
0  Mark             2000  Bob             2000   AL             2200   NS   

  PredictedFeature word PredictedFeature   word PredictedFeature   word  \
0             2200   GK             2200  sandy             2100  Rohan   

  PredictedFeature word PredictedFeature  
0             2100   DV             2100
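
If the goal is the literal comma-separated line from the question rather than a wide frame, one possible sketch is to build the "word feature" pairs as strings and join them. The sample frame is rebuilt from the question's table; the names pairs, all_docs, per_doc and the file name pairs.txt are assumptions made for illustration, not part of the original answer.

```python
import os
import tempfile

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Document_ID': [0, 0, 0, 0, 0, 1, 1, 1],
    'OFFSET': [0, 8, 16, 23, 30, 0, 5, 7],
    'PredictedFeature': [2000, 2000, 2200, 2200, 2200, 2100, 2100, 2100],
    'word': ['Mark', 'Bob', 'AL', 'NS', 'GK', 'sandy', 'Rohan', 'DV'],
})

# Build "word feature" pairs; the numeric column has to be cast to str first
pairs = df['word'] + ' ' + df['PredictedFeature'].astype(str)

# One line covering the whole frame
all_docs = ', '.join(pairs)
print(all_docs)

# One line per Document_ID, in case the documents should stay separate
per_doc = pairs.groupby(df['Document_ID']).agg(', '.join)
print(per_doc)

# 'pairs.txt' is a hypothetical file name, written to the temp directory here
out_path = os.path.join(tempfile.gettempdir(), 'pairs.txt')
with open(out_path, 'w') as fh:
    fh.write(all_docs + '\n')
```

The per_doc Series can also be written with per_doc.to_csv(...) if one row per document is wanted in the file.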

But if you want to keep the rest of the information, it is best to use pivot_table:

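# One row per Document_ID, one column per (word, PredictedFeature) pair, OFFSET as the cell value
new_df = df.pivot_table(columns=['word', 'PredictedFeature'], index='Document_ID',
                        values='OFFSET', fill_value=0)
print(new_df)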

word               AL  Bob   DV   GK Mark   NS Rohan sandy
PredictedFeature 2200 2000 2100 2200 2000 2200  2100  2100
Document_ID                                               
0                  16    8    0   30    0   23     0     0
1                   0    0    7    0    0    0     5     0

To save it, you need DataFrame.to_csv:

new_df.to_csv('mycsv.csv')

If it has a MultiIndex, you need:

new_df.to_csv('mycsv.csv',index_label=['word','PredictedFeature'])

To read it back, use pd.read_csv:

new_read_csv=pd.read_csv('mycsv.csv')
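
For the pivoted frame, whose columns have two levels, a plain pd.read_csv call would likely treat the second header row as data. A minimal round-trip sketch, assuming the file was written by to_csv with default settings, passes header=[0, 1] and index_col=0 when reading. The names wide, path and restored are assumptions for illustration, and the file is written to the temp directory.

```python
import os
import tempfile

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Document_ID': [0, 0, 0, 0, 0, 1, 1, 1],
    'OFFSET': [0, 8, 16, 23, 30, 0, 5, 7],
    'PredictedFeature': [2000, 2000, 2200, 2200, 2200, 2100, 2100, 2100],
    'word': ['Mark', 'Bob', 'AL', 'NS', 'GK', 'sandy', 'Rohan', 'DV'],
})

# Same pivot as in the answer: two column levels (word, PredictedFeature)
wide = df.pivot_table(columns=['word', 'PredictedFeature'], index='Document_ID',
                      values='OFFSET', fill_value=0)

path = os.path.join(tempfile.gettempdir(), 'mycsv.csv')
wide.to_csv(path)

# header=[0, 1] rebuilds the two column levels; index_col=0 restores Document_ID.
# The values of the PredictedFeature level will probably come back as strings.
restored = pd.read_csv(path, header=[0, 1], index_col=0)
print(restored)
```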
