
Converting a text file to Excel in Python 3

I have a text file that I'm trying to convert to an Excel file in Python 3. The text files contain a series of accounts; one text file looks like this:

PRODUCE_NAME: abc
PRODUCE_NUMBER: 12345
DATE: 12/1/13
PRODUCE_NAME: efg
PRODUCE_NUMBER: 987
DATE: 2/16/16
TIME: 12:54:00
PRODUCE_NAME: xyz
PRODUCE_NUMBER: 0046
DATE: 7/15/10
COLOR: blue.

I would like the Excel file to look like this: (image not included)

Some code:

import re
import pandas as pd

# open text file
with open("Comp_file_1.txt", "r", encoding="windows-1252") as op_file:
    text_file = op_file.read()

##############################################################
# location of each CAP WORD: label; group them

col_list_start = []
col_list_end = []
col_list_group = []

for mj in re.finditer(r"[A-Z]\w+(:)", text_file):
    col_list_start.append(mj.start(0))
    col_list_end.append(mj.end(0))
    col_list_group.append(mj.group())

#############################################################
# location of the end of file (the last ".") and delete index 0 of start

location = -1
endline = len(text_file)  # fallback if the file contains no "."
while True:
    # Advance location by 1.
    location = text_file.find(".", location + 1)

    # Break if not found.
    if location == -1:
        break

    # Remember the most recent match.
    endline = location

col_list_start.append(int(endline))
del col_list_start[0]

##############################################################
# cut out the values of the rows - abc, 12345, 12/1/13

index4 = []
for m in range(len(col_list_end)):
    index4.append(text_file[col_list_end[m]:col_list_start[m]].strip())

##############################################################
# group the values by their label

group_excel_list = {}
for k, v in zip(col_list_group, index4):
    group_excel_list.setdefault(k, []).append(v)

The grouped data looks like this:

key                  value
"PRODUCE_NAME:"      [abc, efg, xyz]
"PRODUCE_NUMBER:"    [12345, 987, 0046]
"DATE:"              [12/1/13, 2/16/16, 7/15/10]
"TIME:"              [12:54:00]
"COLOR:"             [blue]

df = pd.DataFrame(data=[group_excel_list], columns = col_list_group)

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("Comp_file_1" + '.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')

# Close the Pandas Excel writer and output the Excel file.
writer.close()

I'm getting just one row of the dataframe:

Header - PRODUCE_NAME: PRODUCE_NUMBER: DATE:
row 0 - [abc, efg, xyz] [12345, 987, 0046] [12/1/13, 2/16/16, 7/15/10]

Whatever help you can give would be appreciated.

Read in your data from your text file (a .txt file whose columns are separated by tabs; this was the case with my data, but yours may of course differ):

import csv

fileNumber = 1  # hypothetical example value; the original leaves fileNumber undefined
data = []

with open("file_%02d.txt" % fileNumber, 'r', newline='') as f:
    reader = csv.reader(f, dialect='excel', delimiter='\t')
    # read the rows from the imported data file and append them to a list
    for row in reader:
        print(row)
        data.append(row)

Write your data to an external file:

import pandas as pd

# 'name1', ..., 'nameN' are placeholders: substitute your own column names
newData = pd.DataFrame(data, columns=['name1', 'name2', ..., 'nameN'])
newData.to_csv("new_file_%02d.csv" % fileNumber, sep=';')

This is more or less off the top of my head, but it should do the trick. You can write out data that is in a list; just make sure that the number of elements in each row matches the number of column names.

I hope I helped a little!
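For the `KEY: value` layout shown in the question (rather than tab-separated columns), one possible approach is to collect one dictionary per record and only then build the table. The sketch below is not from the original posts: the function name `parse_records`, the `SAMPLE` text, and the assumption that every record begins with a `PRODUCE_NAME` line are all illustrative.

```python
import re

# Sample text copied from the question; in practice it would be read from the file.
SAMPLE = """PRODUCE_NAME: abc
PRODUCE_NUMBER: 12345
DATE: 12/1/13
PRODUCE_NAME: efg
PRODUCE_NUMBER: 987
DATE: 2/16/16
TIME: 12:54:00
PRODUCE_NAME: xyz
PRODUCE_NUMBER: 0046
DATE: 7/15/10
COLOR: blue.
"""

# An upper-case field name at the start of a line, a colon, then the value.
FIELD = re.compile(r"^([A-Z][A-Z_]*):\s*(.*)$")


def parse_records(text, first_field="PRODUCE_NAME"):
    """Return one dict per record (assumption: each record starts with first_field)."""
    records = []
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        key, value = match.groups()
        if key == first_field or not records:
            records.append({})  # start a new record
        records[-1][key] = value.strip()
    return records


records = parse_records(SAMPLE)
```

With pandas installed, `pd.DataFrame(records)` should then give one row per record, with fields a record lacks (such as TIME or COLOR) left empty, and `df.to_excel("Comp_file_1.xlsx", index=False)` writes it out. Values stay as strings here, so a number like 0046 keeps its leading zeros.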

I'm sorry that I can't remember the precise method, but if you create a file using f = file ... etc. and make it a comma-separated values (.csv) file, then there is a way of loading it straight into Excel so that all the items separated by commas go into separate columns and everything separated by line breaks goes into separate rows (again, sorry I can't remember the exact procedure).
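As a rough illustration of the CSV route, the standard-library `csv` module can write the rows. This is only a sketch: the `rows` data, the helper name `write_csv`, and the output file name in the trailing comment are assumptions made for the example.

```python
import csv
import io

# Hypothetical sample rows; a field missing from a record is simply left out.
rows = [
    {"PRODUCE_NAME": "abc", "PRODUCE_NUMBER": "12345", "DATE": "12/1/13"},
    {"PRODUCE_NAME": "efg", "PRODUCE_NUMBER": "987", "DATE": "2/16/16", "TIME": "12:54:00"},
]


def write_csv(rows, handle):
    """Write dict rows as comma-separated values with a header line."""
    # Collect column names in order of first appearance.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval="" leaves the cell empty when a row lacks a field.
    writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Write into an in-memory buffer so the result can be inspected.
buffer = io.StringIO()
write_csv(rows, buffer)
csv_text = buffer.getvalue()

# To produce a real file instead (newline="" avoids blank lines on Windows):
# with open("Comp_file_1.csv", "w", newline="") as f:
#     write_csv(rows, f)
```

Opening the resulting .csv file in Excel normally places each comma-separated item in its own column and each line in its own row, although the delimiter Excel expects can depend on regional settings.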

