Converting TXT to XLSX with Python
I'm very new to Python, but I'm trying to write a script that converts text files to xlsx. It has to be xlsx because there are too many entries for xls, and I can't just convert to csv because two fields don't get read correctly if I do.
This is what I have, and it works... eventually. It is extremely slow. I've tried other code I found online, but I can't get any of it to work for me. Like I said, I'm very new to this. Any suggestions on how to speed this up?
import pandas as pd
nal = pd.read_csv('Path.TXT', delimiter= '\t')
nal.columns = ['1','2','3',...]
nal.to_excel('path.xlsx', sheet_name='Sheet 1')
Edit: I have no idea how to attach the files I'm working with. They're tab-delimited. Here's one file's set-up:
10001 2020 30380000001006000001.0 "CR 512 SEBASTIAN, FL 32958" 12/11/2014 FLEMING GRANT PLAT SHOWING THE S/D OF PBB 1-72 BEING MORE PART DESC AS FOLL ALL THAT PART FLEMING GRANT SEC 6 & 15 LYING S OF FELLSMERE WATER CONT ROL DISTRICT MAIN OUTFALL CANAL & LYINGS OF TWP LINE BETWEEN TWP 30S & TWP 31S (SAME BEING THE N BDRY OF IND 00 30 38 COMM MULTI 0.00000 690102.10 69 1 1083.5100 3038000001 FLEMING GRANT 120610509041 1 72 "FLEMING GRANT PLAT SHOWING THE S/D OF PBB 1-72 BEING MORE PART DESC AS FOLL ALL THAT PART FLEMING GRANT SEC 6 & 15 LYING S OF FELLSMERE WATER CONT ROL DISTRICT MAIN OUTFALL CANAL & LYINGS OF TWP LINE BETWEEN TWP 30S & TWP 31S (SAME BEING THE N BDRY OF INDIAN RIVER C OUNTY) AS PROJECTED ACROSS THE FLEMING GRANT: LESS & EXCEPTING THAT PART OF SAID FLEMING GRANT SEC 6 DESC AS FOLL: BEG AT INTERSECTION OF FLEMING GRANT LINE & SR/W OF 400 FT WIDE R/W OF FELLSMERE MAIN OUTFALL CANAL, RUN TH S 89 DEG 49 MIN 55 SEC E ALONG SAID S R/W FELLSMERE MAINCANAL, A DIST OF 829.06 FT TO A PT OF INTERSECTION WITH E LINE OF A 170 FT WIDE FP&L TRANSMISSION LINE EASEMENT; TH RUNS 18 DEG 46 MIN 02 SEC E ALONG E LINE OF SAID EASEMENT A DIST OF 1073.84 FT; TH RUN N 89 DEG 26 MIN 54 SEC W A DIST OF166.10 FT TO A PT ON FLEMING GRANT LINE; TH RUN N 44 DEG 44 MIN 30 SEC W ALONG SAID FLEMING GRANT LINE A DIST OF 1432.63FT TO POB. (PCL 2 IN OR BK 1304 PP 2778)"
10002 2020 30380000001006000002.0 "CR 512 SEBASTIAN, FL 32958" 11/07/2012 FLEMING GRANT PLAT SHOWING THE S/D OF PBB 1-72 BEING MORE PART DESC AS FOLL PART OF GRANT SEC 6 AS IN D BK10 PP 473 ST LUCIE CO RECORDS 00 30 38 COMM MULTI 0.00000 710102.10 71 1 15.2200 3038000001 FLEMING GRANT 120610509041 1 72 FLEMING GRANT PLAT SHOWING THE S/D OF PBB 1-72 BEING MORE PART DESC AS FOLL PART OF GRANT SEC 6 AS IN D BK10 PP 473 ST LUCIE CO RECORDS
Here's the other file's set-up:
41 30380000001006000001.0 R 2020 087 2762951 2762951 2762951 0 0 0 0 0 0 2762951 2762951 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2762951 1 1083.51 47197696 1214 0 0 0 ST JOHNS RIVER WATER MANAGEMEN PO BOX 1429 PALATKA FL 32178 FL FLEMING GRANT PLAT SHOWING THE XXXXXXXXX XXXXXXXXX 69 690102.10 W 1 30 38 00 120610509041 CR 512 SEBASTIAN 32958 10001
41 30380000001006000002.0 R 2020 094 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 662983 1112 0 0 0 FELLSMERE DR DIST CR 512 SEBASTIAN FL 32958 FL FLEMING GRANT PLAT SHOWING THE XXXXXXXXX XXXXXXXXX 71 710102.10 1 30 38 00 120610509041 CR 512 SEBASTIAN 32958 10002 10002 2
Try this:
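import ntpath
from os import listdir
from os.path import isfile, join

import pandas as pd

mypath = "C:/Users/Test/Downloads/Data/"  # path where you saved the text files
files = [join(mypath, f) for f in listdir(mypath) if isfile(join(mypath, f))]
for txt in files:
    with open(txt) as fp:
        # split each line on tabs; splitlines() avoids a trailing empty row
        lst = [row.split("\t") for row in fp.read().splitlines()]
    col = list(range(1, len(lst[0]) + 1))  # number the columns 1..N from the first row
    df = pd.DataFrame(lst, columns=col)
    df.to_excel(ntpath.basename(txt)[:-4] + ".xlsx")  # written to the current directory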
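
One caveat with splitting on "\t" by hand: quoted fields such as "CR 512 SEBASTIAN, FL 32958" keep their surrounding quote characters. A minimal sketch of an alternative, assuming the files use ordinary double-quote quoting, is to let the standard-library csv module do the splitting. The helper name read_tab_rows and the shortened sample rows are made up for illustration and are not part of the original files or code.

```python
import csv
import io


def read_tab_rows(fileobj):
    """Parse tab-delimited text into a list of rows (lists of strings).

    Hypothetical helper: csv.reader strips the double quotes around
    quoted fields, which a plain str.split("\t") would leave in place.
    """
    reader = csv.reader(fileobj, delimiter="\t", quotechar='"')
    return [row for row in reader if row]  # drop blank lines


# Abbreviated sample modelled on the first file's layout (not the full rows).
sample = io.StringIO(
    '10001\t2020\t"CR 512 SEBASTIAN, FL 32958"\t12/11/2014\n'
    '10002\t2020\t"CR 512 SEBASTIAN, FL 32958"\t11/07/2012\n'
    "\n"
)
rows = read_tab_rows(sample)
print(rows[0])
# ['10001', '2020', 'CR 512 SEBASTIAN, FL 32958', '12/11/2014']
```

With a real file, open it with open(txt, newline="") and pass the file object to the helper; the resulting list of rows can then go into pd.DataFrame as in the loop above. If writing the workbook is still the slow step, DataFrame.to_excel also takes an engine argument (for example engine="xlsxwriter", if that package is installed), which may be worth timing against the default.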