
Text file to an Excel file (tab delimited) with Python

I have a txt file that looks like this:

1000  lewis     hamilton  36
1001 sebastian vettel 34
1002  lando  norris  21

I want them to look like this:

[enter image description here]

I tried the solution in here, but it gave me a blank Excel file and an error when trying to open it.

There are more than one million lines, and each line contains around 10 columns.

One last thing: I am not 100% sure the columns are tab delimited. Some columns look like they have more space between them than others, but when I press backspace once they join together, so I guess they are.

You can use pandas read_csv to read your txt file and then save it as an Excel file with .to_excel:

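import pandas as pd
# sep=r"\s+" splits on any whitespace run; header=None because the file has no header row
df = pd.read_csv('your_file.txt', sep=r"\s+", header=None)
df.to_excel('your_file.xlsx', index=False)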

Here is some documentation:

pandas.read_csv: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

DataFrame.to_excel: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html
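The question mentions more than one million lines. An .xlsx worksheet holds at most 1,048,576 rows, so a file that large may not fit on a single sheet. Below is a minimal sketch of one possible workaround, spreading the rows over several sheets of the same workbook. The helper names `split_for_excel` and `write_in_sheets` and the default of 1,000,000 rows per sheet are assumptions made for illustration, not part of the answer above.

```python
import pandas as pd

EXCEL_MAX_ROWS = 1_048_576  # row limit of a single .xlsx worksheet


def split_for_excel(df, rows_per_sheet=1_000_000):
    """Yield (sheet_name, piece) pairs, each small enough for one worksheet.

    Hypothetical helper: the sheet names Sheet1, Sheet2, ... are an assumption.
    """
    if not 0 < rows_per_sheet <= EXCEL_MAX_ROWS:
        raise ValueError("rows_per_sheet must be between 1 and EXCEL_MAX_ROWS")
    starts = range(0, len(df), rows_per_sheet)
    for number, start in enumerate(starts, start=1):
        yield f"Sheet{number}", df.iloc[start:start + rows_per_sheet]


def write_in_sheets(df, path, rows_per_sheet=1_000_000):
    """Write df to one workbook, one sheet per piece (needs an Excel engine such as openpyxl)."""
    with pd.ExcelWriter(path) as writer:
        for sheet_name, piece in split_for_excel(df, rows_per_sheet):
            # header=False and index=False keep only the data rows on each sheet
            piece.to_excel(writer, sheet_name=sheet_name, index=False, header=False)
```

Usage would look like `write_in_sheets(df, 'your_file.xlsx')` after reading the file with read_csv. Writing millions of rows to .xlsx can be slow, so it may be worth trying this on a small slice of the data first.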

If you're not sure how the fields are separated, you can use '\s+' to split on any run of whitespace.

import pandas as pd
df = pd.read_csv('f1.txt', sep=r"\s+", header=None)
# you might need: pip install openpyxl
df.to_excel('f1.xlsx', sheet_name='Sheet1')

Example of randomly separated fields (f1.txt):

1000  lewis     hamilton  2 36
1001 sebastian vettel 8 34
1002  lando  norris   6 21
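To check what actually separates the fields in a file like this (tabs, single spaces, or runs of spaces), one option is to count the whitespace runs directly. This is only a sketch using the standard library; the function name `separator_counts` is made up for illustration.

```python
import re
from collections import Counter


def separator_counts(lines):
    """Count each distinct run of whitespace found between fields."""
    counts = Counter()
    for line in lines:
        # strip() removes the trailing newline so it is not counted as a separator
        counts.update(re.findall(r"\s+", line.strip()))
    return counts


sample = ["1000\tlewis\thamilton\t36\n", "1001  sebastian vettel 34\n"]
for separator, count in separator_counts(sample).items():
    # repr() makes tabs visible as '\t' instead of blank space
    print(repr(separator), count)
```

If only '\t' shows up, the file is tab delimited and `sep='\t'` would work too. If the counts are mixed, `sep=r"\s+"` is the safer choice. For a large file, passing only the first few thousand lines (for example with `itertools.islice`) is probably enough.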

If some lines have more columns than the first one, you get:

ParserError: Error tokenizing data. C error: Expected 5 fields in line 5, saw 6

You can skip those lines by using:

# on_bad_lines (pandas 1.3+) replaces error_bad_lines, which was removed in pandas 2.0
df = pd.read_csv('f1.txt', sep=r"\s+", header=None, on_bad_lines='warn')

This is an example of data:

1000  lewis     hamilton  2 36
1001 sebastian vettel 8 34
1002  lando  norris     6 21
1003 charles leclerc           1 3
1004 carlos sainz  ferrari 2 2 

The last line will be ignored:

b'Skipping line 5: expected 5 fields, saw 6\n'
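Before dropping lines, it may help to see which ones have an unexpected number of fields. The sketch below uses only the standard library and mirrors the whitespace splitting used above. The function name `irregular_lines` is hypothetical, and treating the first non-empty line as the reference is an assumption.

```python
def irregular_lines(lines, expected=None):
    """Return (line_number, field_count) for lines whose field count differs.

    If expected is None, the field count of the first non-empty line is used.
    """
    found = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()  # no argument: split on any run of whitespace
        if not fields:
            continue  # blank lines are skipped
        if expected is None:
            expected = len(fields)
        if len(fields) != expected:
            found.append((number, len(fields)))
    return found


data = [
    "1000  lewis     hamilton  2 36\n",
    "1001 sebastian vettel 8 34\n",
    "1002  lando  norris     6 21\n",
    "1003 charles leclerc           1 3\n",
    "1004 carlos sainz  ferrari 2 2 \n",
]
print(irregular_lines(data))  # [(5, 6)]
```

For a real file, the lines can be streamed with `with open('f1.txt') as f: irregular_lines(f)`, which avoids loading the whole file into memory.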
