Concatenate the two data-frame columns using pandas
I am new to Python development. I have the following dataframe:
Document_ID  OFFSET  PredictedFeature  word
0            0       2000              Mark
0            8       2000              Bob
0            16      2200              AL
0            23      2200              NS
0            30      2200              GK
1            0       2100              sandy
1            5       2100              Rohan
1            7       2100              DV
Here Document_ID is the key, so to speak.
What I am trying to do is create a file in which I will see a result like this:
Mark 2000, Bob 2000, AL 2200, NS 2200, GK 2200, sandy 2100, Rohan 2100, DV 2100
I tried using groupby:
# Cast each column to str first; ','.join fails on numeric values
df = df.groupby('Document_ID').agg(lambda x: ','.join(x.astype(str)))
for name in df.index:
    print(name)
    print(df.loc[name])
I am also trying to save it to a text or CSV file.
Can anyone help me with this?
Use DataFrame.stack:
new_df=df[['word','PredictedFeature']].stack().to_frame().T
new_df.columns=new_df.columns.droplevel(0)
print(new_df)
word PredictedFeature word PredictedFeature word PredictedFeature word \
0 Mark 2000 Bob 2000 AL 2200 NS
PredictedFeature word PredictedFeature word PredictedFeature word \
0 2200 GK 2200 sandy 2100 Rohan
PredictedFeature word PredictedFeature
0 2100 DV 2100
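For the exact single-line output asked for in the question, one possible approach is to build a "word feature" string per row and join the rows. This is only a minimal sketch: the dataframe is rebuilt from the sample data above, and the file name result.txt is an assumption chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Document_ID": [0, 0, 0, 0, 0, 1, 1, 1],
    "OFFSET": [0, 8, 16, 23, 30, 0, 5, 7],
    "PredictedFeature": [2000, 2000, 2200, 2200, 2200, 2100, 2100, 2100],
    "word": ["Mark", "Bob", "AL", "NS", "GK", "sandy", "Rohan", "DV"],
})

# Combine the two columns row by row; the numeric column must be cast to str
pairs = df["word"] + " " + df["PredictedFeature"].astype(str)

# Join all rows into one comma-separated line
line = ", ".join(pairs)
print(line)

# Save the line to a plain text file (hypothetical file name)
with open("result.txt", "w") as fh:
    fh.write(line + "\n")
```

With the sample data this prints Mark 2000, Bob 2000, AL 2200, NS 2200, GK 2200, sandy 2100, Rohan 2100, DV 2100.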
If you want to keep the rest of the information, however, it is best to use pivot_table:
new_df=df.pivot_table(columns=['word','PredictedFeature'],index='Document_ID',values='OFFSET',fill_value=0)
print(new_df)
word                AL   Bob    DV    GK  Mark    NS Rohan sandy
PredictedFeature  2200  2000  2100  2200  2000  2200  2100  2100
Document_ID
0                   16     8     0    30     0    23     0     0
1                    0     0     7     0     0     0     5     0
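If the goal is one line per Document_ID, as the groupby attempt in the question suggests, a possible sketch is to build the combined strings first and then join them within each group. The names pairs, per_doc and per_document.csv are assumptions for illustration, and the dataframe is rebuilt from the sample data.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Document_ID": [0, 0, 0, 0, 0, 1, 1, 1],
    "OFFSET": [0, 8, 16, 23, 30, 0, 5, 7],
    "PredictedFeature": [2000, 2000, 2200, 2200, 2200, 2100, 2100, 2100],
    "word": ["Mark", "Bob", "AL", "NS", "GK", "sandy", "Rohan", "DV"],
})

# One "word feature" string per row
pairs = df["word"] + " " + df["PredictedFeature"].astype(str)

# Join the strings within each document; row order inside a group is kept
per_doc = pairs.groupby(df["Document_ID"]).agg(", ".join).rename("pairs")
print(per_doc)

# Save one row per document (hypothetical file name)
per_doc.to_csv("per_document.csv")
```

Because every value passed to join is already a string here, this avoids the type error that joining a numeric column directly would raise.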
To save it, you need DataFrame.to_csv:
new_df.to_csv('mycsv.csv')
If it has a MultiIndex, you need:
new_df.to_csv('mycsv.csv',index_label=['word','PredictedFeature'])
To read it back, use pd.read_csv:
new_read_csv=pd.read_csv('mycsv.csv')
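Note that the pivot_table result has two column levels, so a plain read_csv call will not rebuild them. The sketch below is one way to round-trip such a frame, assuming the file was written with a plain to_csv call (no index_label): header=[0, 1] tells read_csv to treat the first two rows as column levels and index_col=0 restores Document_ID as the index. The name restored is an assumption for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Document_ID": [0, 0, 0, 0, 0, 1, 1, 1],
    "OFFSET": [0, 8, 16, 23, 30, 0, 5, 7],
    "PredictedFeature": [2000, 2000, 2200, 2200, 2200, 2100, 2100, 2100],
    "word": ["Mark", "Bob", "AL", "NS", "GK", "sandy", "Rohan", "DV"],
})

new_df = df.pivot_table(
    columns=["word", "PredictedFeature"],
    index="Document_ID",
    values="OFFSET",
    fill_value=0,
)

# Write the frame; the two column levels become two header rows
new_df.to_csv("mycsv.csv")

# Read it back, rebuilding the two column levels and the index
restored = pd.read_csv("mycsv.csv", header=[0, 1], index_col=0)
print(restored)
```

After reading, the values of the column levels may come back as strings rather than integers, so it is worth checking restored.columns before selecting by label.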