
python pandas dataframe to csv exporting a formatted text file with unique formats for each column

I am using Python 2.7.7 with Pandas on Win 7 64bit. My input data was originally space delimited and right justified. I now have the data as a Pandas dataframe, which I exported as a csv. I want to write a space-delimited, right-justified text file. The columns contain strings, ints, and floats. I tried to format one of the columns using this:

df_fg['Mem']=df_fg['Mem'].map('{:5d}'.format)

This allows me to format each column individually, which is great.

The problem is that when I use this type of formatting I cannot output a space-delimited file. Here are the various ways I tried to write the text file:

df_fg.to_csv('t.txt',index = False)

Not surprisingly, this produces a csv file that is formatted with padding spaces.

So I thought the next logical step would be to include "sep" to get rid of the commas.

df_fg.to_csv('t.txt',index = False,sep= ' ') 

This produces formatted text in the text file, but each element in every column is surrounded by double quotes. So I get a column that looks like

"    1"
"    1"

I tried various combinations of the "quoting" and "doublequote" options of .to_csv. Nothing works. I end up with either formatted text within double quotation marks or formatted text within a csv file. I can't get formatted text within a text file.
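One likely explanation for the quotes, assuming to_csv follows the quoting rules of Python's standard csv module: a padded field contains spaces, and when the separator is also a space the writer has to quote or escape the field to keep the file parseable. The sketch below uses only the csv module (not pandas) to show the effect; write_row is a hypothetical helper introduced for illustration.

```python
import csv
import io


def write_row(fields, quoting):
    """Write one row with a space delimiter and return the raw text (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=' ', quoting=quoting)
    writer.writerow(fields)
    return buf.getvalue()


padded = ['    1', 'abc']  # first field is padded, so it contains the delimiter

# Default (minimal) quoting: only the field containing spaces is quoted
minimal = write_row(padded, csv.QUOTE_MINIMAL)

# Disabling quoting does not help: the writer then needs an escape character
try:
    write_row(padded, csv.QUOTE_NONE)
    none_error = None
except csv.Error as exc:
    none_error = exc

print(repr(minimal))
print(none_error)
```

If this is indeed the cause, no combination of quoting options gives clean fixed-width output, and writing the padded strings directly to a file sidesteps the CSV rules altogether.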

Maybe I should not use "map" and "format"? Any advice on how to write right-justified, space-delimited strings, ints, and floats from a dataframe or csv would be very much appreciated.

I attempted to write the dataframe to a string. I formatted each column in the dataframe using commands such as df_g['Mem']=df_g['Mem'].map('{:4d}'.format) and

df_g['Date1']=df_g['Date1'].map('{:12s}'.format)

I wrote the dataframe out using the dataframe to_string command. I was hoping that the output would be right justified:

f2 = open('2.txt','w')
s=df_g.to_string(justify='right',index = False)
f2.write(s) 
f2.close() 

In the text file, not all columns were right justified. Column 1 contains an integer; it was right justified as expected. Column 5 contains a float with 2 decimals; it was right justified as expected. Columns 2, 3 and 4 were strings (I used the command below to make them strings in the dataframe) and were not right justified:

df_g['Date1']=df_g['Date1'].map('{:12s}'.format)

1,26/04/2015 ,09:19:07 ,more-text , -1600.00,

(I am presenting the commas just to demonstrate where the fields end and begin.)

So I still cannot find a way for dataframe.to_string to output formatted strings. Most interestingly, the "map format" DOES, in fact, change the length of the strings (and the spacing), but the justify='right' option did not work on them.

Any advice?
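A possible reason the string columns stayed left justified: in Python's format mini-language, strings are left-aligned by default while numbers are right-aligned, so '{:12s}' pads on the right. Adding '>' to the spec requests right alignment. As far as I know, the justify argument of to_string only affects the column headers, not the cell values. A minimal illustration using plain str.format, with 'more-text' taken from the sample row above:

```python
# Strings default to left alignment: padding is added on the right
left = '{:12s}'.format('more-text')

# '>' forces right alignment: padding is added on the left
right = '{:>12s}'.format('more-text')

# Numbers default to right alignment, which is why the int column looked fine
num = '{:5d}'.format(1)

print(repr(left))   # 'more-text   '
print(repr(right))  # '   more-text'
print(repr(num))    # '    1'
```

So changing the string columns to something like .map('{:>12s}'.format) should pad them on the left instead.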

I think this might give you what you're looking for. First pad the column entries as you suggest. Then sum along axis 1:

s = df_string.sum(axis=1)

This is a series with a string in each entry representing a row in the original df. Then just add a line break to each element and sum again:

s = (s + '\n').sum()

Then just write the file you want:

open('t.txt', 'w').write(s)

Here's a stupidly terse one-liner example:

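import pandas as pd
df = pd.DataFrame({'A': [1.2, 2.34], 'B': ['foo', 'bar']})
# Right-justify every cell to width 20, concatenate each row, then join the rows with newlines
print((df.applymap(lambda x: '{:>20s}'.format(str(x))).sum(axis=1) + '\n').sum())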

                 1.2                 foo
                2.34                 bar
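The same steps can be sketched with a different format for each column, which is what the question asks for. This is only an illustration under some assumptions: the column names other than Mem and Date1, the sample values, the widths, and the format_rows helper are all made up, and the rows are joined with str.join rather than by summing strings, which should give the same text while adding a single-space separator.

```python
import pandas as pd

# Hypothetical sample data with an int, a string, and a float column
df = pd.DataFrame({
    'Mem': [1, 23],
    'Date1': ['26/04/2015', '27/04/2015'],
    'Amount': [-1600.0, 12.5],
})

# One right-justified format spec per column (widths are arbitrary choices)
formats = {'Mem': '{:>5d}', 'Date1': '{:>12s}', 'Amount': '{:>10.2f}'}


def format_rows(frame, specs, sep=' '):
    """Return the frame as fixed-width text, one line per row (hypothetical helper)."""
    # Pad each column to its own width; every cell becomes a string
    padded = pd.DataFrame({col: frame[col].map(spec.format)
                           for col, spec in specs.items()})
    # Join the cells of each row, then join the rows with line breaks
    lines = [sep.join(row) for row in padded.itertuples(index=False)]
    return '\n'.join(lines) + '\n'


text = format_rows(df, formats)
print(text)
```

The resulting text can then be written out exactly as above, with open('t.txt', 'w').write(text), and no quotes appear because the csv writer is never involved.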
