Text file to an Excel file (tab delimited) with Python
I have a txt file that looks like this:
1000 lewis hamilton 36
1001 sebastian vettel 34
1002 lando norris 21
I want them to look like this:
I tried the solution here, but it gave me a blank Excel file and an error when I tried to open it.
There are more than one million lines, and each line contains around 10 columns.
One last thing: I am not 100% sure they are tab delimited, because some columns look like they have more space between them than others. But when I press backspace once, they stick together, so I guess they are.
You can use pandas read_csv to read your txt file and then save it as an Excel file with .to_excel:
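import pandas as pd
# sep=r"\s+" splits on any whitespace (it replaces the deprecated delim_whitespace=True)
df = pd.read_csv('your_file.txt', sep=r"\s+", header=None)  # the file has no header row
df.to_excel('your_file.xlsx', index=False, header=False)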
Here is some documentation:
pandas.read_csv: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
.to_excel: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html
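Since the question mentions more than one million lines, keep in mind that a single Excel worksheet holds at most 1,048,576 rows, so one call to .to_excel may fail on a file that size. Below is a minimal sketch of one way around it, spreading the rows over several sheets. The helper names iter_sheets and write_in_sheets and the 1,000,000-row default are my own choices for illustration, not something from pandas.

```python
import pandas as pd

EXCEL_MAX_ROWS = 1_048_576  # row limit of a single .xlsx worksheet


def iter_sheets(df, rows_per_sheet=1_000_000):
    """Yield (sheet_name, chunk) pairs, each chunk small enough for one sheet."""
    if rows_per_sheet > EXCEL_MAX_ROWS:
        raise ValueError("rows_per_sheet exceeds the Excel worksheet limit")
    for number, start in enumerate(range(0, len(df), rows_per_sheet), start=1):
        yield f"Sheet{number}", df.iloc[start:start + rows_per_sheet]


def write_in_sheets(df, path, rows_per_sheet=1_000_000):
    """Write df to one workbook, spreading its rows over several sheets."""
    # Needs an Excel engine such as openpyxl to be installed.
    with pd.ExcelWriter(path) as writer:
        for sheet_name, chunk in iter_sheets(df, rows_per_sheet):
            chunk.to_excel(writer, sheet_name=sheet_name, index=False, header=False)


# Tiny demonstration: 5 rows, 2 rows per sheet.
demo = pd.DataFrame({"id": range(1000, 1005)})
sizes = [(name, len(chunk)) for name, chunk in iter_sheets(demo, rows_per_sheet=2)]
print(sizes)  # [('Sheet1', 2), ('Sheet2', 2), ('Sheet3', 1)]
```

For the real file you would call something like write_in_sheets(df, 'your_file.xlsx'). Writing a million rows to xlsx is slow, so expect it to take a while.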
If you're not sure how the fields are separated, you can use '\s+' to split on any run of whitespace (spaces or tabs).
import pandas as pd
df = pd.read_csv('f1.txt', sep=r"\s+", header=None)
# you might need: pip install openpyxl
df.to_excel('f1.xlsx', sheet_name='Sheet1')
Example of randomly separated fields (f1.txt):
1000 lewis hamilton 2 36
1001 sebastian vettel 8 34
1002 lando norris 6 21
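If you want to see what actually separates the columns, printing a line with repr() makes tabs visible as \t. Here is a small sketch; the sample string is made up for illustration and mixes tabs with runs of spaces. It also shows that sep=r"\s+" treats any run of whitespace as a single separator.

```python
import io
import pandas as pd

# Hypothetical sample: fields separated by a mix of tabs and runs of spaces.
raw = "1000\tlewis   hamilton\t2  36\n1001 sebastian\tvettel 8\t\t34\n"

# repr() makes the separators visible: tabs show up as \t.
first_line = raw.splitlines()[0]
print(repr(first_line))

# Any run of spaces and/or tabs counts as one separator.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None)
print(df.shape)  # (2, 5)
```

For the real file, the same check would be to open it and print repr() of the first line or two.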
If some lines have more columns than the first one, you will get:
ParserError: Error tokenizing data. C error: Expected 5 fields in line 5, saw 6
You can ignore those lines by using:
# on_bad_lines needs pandas >= 1.3; older versions used error_bad_lines=False
df = pd.read_csv('f1.txt', sep=r"\s+", header=None, on_bad_lines="warn")
This is an example of data:
1000 lewis hamilton 2 36
1001 sebastian vettel 8 34
1002 lando norris 6 21
1003 charles leclerc 1 3
1004 carlos sainz ferrari 2 2
The last line will be ignored:
b'Skipping line 5: expected 5 fields, saw 6\n'
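To check how many lines were dropped, you can compare the line count with the number of rows that were read. This is a minimal sketch using the sample data above in an in-memory string instead of f1.txt. It assumes pandas 1.3 or newer, where on_bad_lines="skip" drops malformed rows without printing anything.

```python
import io
import pandas as pd

# Same sample as above; the last row has 6 fields instead of 5.
raw = """1000 lewis hamilton 2 36
1001 sebastian vettel 8 34
1002 lando norris 6 21
1003 charles leclerc 1 3
1004 carlos sainz ferrari 2 2
"""

# on_bad_lines="skip" drops malformed rows silently (pandas >= 1.3).
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None, on_bad_lines="skip")

# Lines in the input minus rows parsed = lines that were skipped.
skipped = len(raw.splitlines()) - len(df)
print(df.shape)  # (4, 5)
print(skipped)   # 1
```

With a million lines it is worth looking at this number before trusting the output, because skipped rows are simply missing from the Excel file.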