Python pandas DataFrame to CSV: exporting a formatted text file with a unique format for each column

Question

I am using Python 2.7.7 with Pandas on Win 7 64bit. My input data was originally space delimited and right justified. I now have the data as a Pandas dataframe, which I exported as a csv. I want to write a space-delimited, right-justified text file. The columns contain strings, ints, and floats. I tried to format one of the columns using this:

df_fg['Mem']=df_fg['Mem'].map('{:5d}'.format)

This allows me to format each column individually, which is great.

The problem is that when I use this type of formatting I cannot output a space-delimited file. Here are the various ways I tried to write the text file:

df_fg.to_csv('t.txt',index = False)

Not surprisingly, this produces a csv file that is formatted with padding spaces.

So I thought the next logical step would be to include "sep" to get rid of the commas.

df_fg.to_csv('t.txt',index = False,sep= ' ')

This produces formatted text in the text file, but each element in every column is surrounded by double quotes. So I get a column that looks like

"    1"
"    1"

I tried various combinations of the "quoting" and "doublequote" options of .to_csv. Nothing works. I end up with either formatted text within double quotation marks or formatted text within a csv file. I can't get formatted text within a text file.
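For reference, the quoting described above can be reproduced with Python's standard csv module alone. As far as I know, .to_csv hands rows to a csv writer, so the same rules should apply there, but treat that as an assumption. The sketch below is written for Python 3, and the helper name write_row is made up for illustration. With a space as the delimiter, any padded field contains the delimiter, so minimal quoting wraps it in double quotes, and turning quoting off without an escape character raises an error instead.

```python
import csv
import io

def write_row(fields, quoting):
    """Write one space-delimited row and return the resulting text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=' ', quoting=quoting, lineterminator='\n')
    writer.writerow(fields)
    return buf.getvalue()

# A padded field contains spaces, i.e. it contains the delimiter itself.
padded = ['{:5d}'.format(1), 'abc']

# QUOTE_MINIMAL quotes only the fields that contain the delimiter.
quoted = write_row(padded, csv.QUOTE_MINIMAL)
print(repr(quoted))

# QUOTE_NONE cannot write such a field unless an escapechar is set.
try:
    write_row(padded, csv.QUOTE_NONE)
    quote_none_failed = False
except csv.Error:
    quote_none_failed = True
print(quote_none_failed)
```

This suggests that a csv writer is the wrong tool for fixed-width output, and that building each line as a plain string avoids the problem.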

Maybe I should not use "map" and "format"? Any advice on how to write right-justified, space-delimited strings, ints, and floats from a dataframe or csv would be very much appreciated.

I attempted to write the dataframe to a string. I formatted each column in the dataframe using commands such as

df_g['Mem']=df_g['Mem'].map('{:4d}'.format)

df_g['Date1']=df_g['Date1'].map('{:12s}'.format)

I wrote the dataframe using the dataframe to_string command. I was hoping that the output would be right justified.

f2 = open('2.txt','w')
s=df_g.to_string(justify='right',index = False)
f2.write(s) 
f2.close()

In the text file not all columns were right justified. Column 1 contains an integer; it was right justified as expected. Column 5 contains a float with 2 decimals; it was right justified as expected. Columns 2, 3, and 4 were strings and were not right justified (I used the command below to make them strings in the dataframe):

df_g['Date1']=df_g['Date1'].map('{:12s}'.format)

1,26/04/2015 ,09:19:07 ,more-text , -1600.00,

(I am presenting the commas just to demonstrate where the fields end and begin.)

So I still cannot find a way for dataframe.to_string to output formatted strings. Most interestingly, the "map format" DOES, in fact, change the length of the strings (and the spacing), but the " justify='right' " did not work on them.
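A likely explanation for the left-justified string columns lies in Python's format specification rather than in pandas: when no alignment character is given, strings are padded on the right (left aligned) while numbers are padded on the left (right aligned). Adding '>' to the spec forces right alignment. The pandas documentation also describes the justify argument of to_string as controlling the column labels, which would explain why it has no effect on cells that are already padded. The small sketch below uses the date value from the question; the variable names are only for illustration.

```python
date = '26/04/2015'

# Default alignment for strings is left: padding goes on the right.
left = '{:12s}'.format(date)

# '>' forces right alignment: padding goes on the left.
right = '{:>12s}'.format(date)

# Default alignment for numbers is already right.
number = '{:5d}'.format(1)

for value in (left, right, number):
    print(repr(value))
```

So changing '{:12s}' to '{:>12s}' in the map call should produce right-justified string columns.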

Any advice?

Answer 1

I think this might give you what you're looking for. First pad the column entries as you suggest. Then sum along axis 1:

s = df_string.sum(axis=1)

This is a series with a string in each entry representing a row in the original df. Then just add a line break to each element and sum again:

s = (s + '\n').sum()

Then just write the file you want:

with open('t.txt', 'w') as f:
    f.write(s)
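The same pad, concatenate, and write steps can be sketched without relying on summing strings, which may be easier to adapt when every column needs its own format. This is only a sketch under assumptions: the format list, the sample rows, and the helper name format_rows are hypothetical, and the rows are plain tuples so the example runs without pandas. With a dataframe, df.itertuples(index=False, name=None) should yield rows of the same shape.

```python
import os
import tempfile

# Hypothetical per-column format specs; '>' forces right alignment for strings.
FORMATS = ['{:>5d}', '{:>12s}', '{:>10s}', '{:>10.2f}']

def format_rows(rows, formats, sep=' '):
    """Return one formatted, sep-delimited line per row."""
    return [
        sep.join(fmt.format(value) for fmt, value in zip(formats, row))
        for row in rows
    ]

# Sample data standing in for the dataframe rows.
rows = [
    (1, '26/04/2015', '09:19:07', -1600.0),
    (12, '27/04/2015', '10:02:45', 3.5),
]

lines = format_rows(rows, FORMATS)
text = '\n'.join(lines) + '\n'

# Write to a temporary location; swap in the real output path as needed.
path = os.path.join(tempfile.gettempdir(), 't.txt')
with open(path, 'w') as f:
    f.write(text)

print(text)
```

Because each line is built as an ordinary string, no quoting is applied and every column keeps the width given in its format spec.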

Here's a stupidly terse one-liner example:

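import pandas as pd

df = pd.DataFrame({'A': [1.2, 2.34], 'B': ['foo', 'bar']})
# Pad every cell to 20 characters, concatenate each row, then join the rows with newlines
print((df.applymap(lambda x: '{:>20s}'.format(str(x))).sum(axis=1) + '\n').sum())

                 1.2                 foo
                2.34                 bar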
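The one-liner pads every cell to the same fixed width of 20. If the widths should follow the data instead, one possible variation is to size each column to its longest entry. The sketch below assumes rows are given as plain tuples (for a dataframe, df.itertuples(index=False, name=None) should provide them), and the helper name right_justify is made up for this example.

```python
def right_justify(rows, sep=' '):
    """Right-justify each column to the width of its longest cell."""
    cells = [[str(value) for value in row] for row in rows]
    # zip(*cells) transposes the rows into columns.
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]
    return [
        sep.join(cell.rjust(width) for cell, width in zip(row, widths))
        for row in cells
    ]

lines = right_justify([(1.2, 'foo'), (2.34, 'barbaz')])
print('\n'.join(lines))
```

Note that this converts values with str, so numeric precision is whatever str produces; apply explicit format specs first when a fixed number of decimals is required.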

Answer 1 (accepted), 2015-05-25 20:50:10
