Content from multiple txt files into single Excel file using Python
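Getting the desired output involves some logic.
First, process each input file into its own list of rows. You may need to adjust this logic depending on the actual contents of your files, since you need to be able to get at each file's columns. For the samples provided, my logic works.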
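If your files are not whitespace-separated, the parsing step is the part to change. Here is a minimal sketch of one way to do that, assuming a single-character delimiter such as a comma. The name `parse_lines` and its `delimiter` parameter are made up for illustration and are not part of the script below. Unlike the script, it also skips blank lines.

```python
from typing import Optional


def parse_lines(text: str, delimiter: Optional[str] = None) -> list:
    """Split text into rows of columns (hypothetical helper).

    delimiter=None splits on any run of whitespace, like the main script.
    """
    rows = []
    for line in text.splitlines():
        # skip blank lines so they don't turn into empty rows
        if not line.strip():
            continue
        # strip each cell so "a, b" and "a,b" give the same columns
        rows.append([cell.strip() for cell in line.split(delimiter)])
    return rows


whitespace_rows = parse_lines("a b\n\nc  d\n")
comma_rows = parse_lines("a, b\nc,d", delimiter=",")
```

Both calls above should give the same two rows of two columns each.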
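I added a safety check to see whether the input files all have the same number of rows. If they don't, it will seriously mess up the resulting Excel file. You will need to add some logic of your own to handle a length mismatch.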
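One possible way to handle a mismatch, instead of bailing out, is to pad the shorter files with blank rows. This is only a sketch under the assumption that blank cells are acceptable filler in your output. `pad_files` is a hypothetical helper name, and it takes the same list-of-files structure that the script builds in `loaded_files`.

```python
def pad_files(files: list) -> list:
    """Pad every file (a list of rows) to the length of the longest one."""
    longest = max((len(rows) for rows in files), default=0)
    padded = []
    for rows in files:
        # filler rows are as wide as the file's widest row, so the columns
        # of the files that follow stay aligned after merging
        width = max((len(row) for row in rows), default=0)
        filler = [[""] * width for _ in range(longest - len(rows))]
        padded.append(rows + filler)
    return padded


example = pad_files([[["a", "b"]], [["c"], ["d"]]])
```

The filler rows use empty strings rather than empty lists on purpose: an empty row would add no columns during the merge, which would shift the next file's columns to the left on that row.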
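For writing the Excel file, using pandas in combination with openpyxl is very easy. There are probably more elegant solutions, but I'll leave that to you.
I reference some SO answers in the code for further reading.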
requirements.txt
pandas
openpyxl
Main file
# we use pandas for easy saving as XLSX
import pandas as pd

filelist = ["file01.txt", "file02.txt", "file03.txt"]


def load_file(filename: str) -> list:
    result = []
    with open(filename) as infile:
        # splitlines() is OS agnostic and removes EOL characters
        for line in infile.read().splitlines():
            # split() without arguments splits on any run of whitespace
            result.append(line.split())
    return result

loaded_files = []
for filename in filelist:
    loaded_files.append(load_file(filename))

# you will want to check if the files have the same number of rows
# it will break stuff if they don't, you could fix it by appending empty rows
# stolen from:
# https://stackoverflow.com/a/10825126/9267296
len_first = len(loaded_files[0]) if loaded_files else None
if not all(len(i) == len_first for i in loaded_files):
    # SystemExit with a message prints it to stderr and exits with status 1
    # (419 is outside the 0-255 range of exit codes on most systems)
    raise SystemExit("length mismatch")

# generate empty list of lists so we don't get index error below
# stolen from:
# https://stackoverflow.com/a/33990699/9267296
result = [[] for _ in range(len(loaded_files[0]))]
for f in loaded_files:
    for index, row in enumerate(f):
        result[index].extend(row)
        # empty column as a separator between files
        result[index].append('')

# trim the last empty column
result = [line[:-1] for line in result]

# write as excel file
# stolen from:
# https://stackoverflow.com/a/55511313/9267296
# note that there are some other options on this SO question, but this one
# is easily readable
df = pd.DataFrame(result)
# the context manager saves and closes the file on exit;
# writer.save() was removed in pandas 2.0
with pd.ExcelWriter("output.xlsx") as writer:
    df.to_excel(writer, sheet_name="sheet_name_goes_here", index=False)
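One thing the merge loop above does not cover is rows of different widths within the same file: a short row would pull the next file's columns to the left. If your data can look like that, the merge step could be wrapped in a function that pads each row to its file's width first. This is a rough sketch rather than a drop-in replacement. `merge_side_by_side` is a hypothetical name, and it raises `ValueError` on a row-count mismatch instead of exiting.

```python
def merge_side_by_side(files: list) -> list:
    """Merge files (lists of rows) column-wise with a blank column between them."""
    if not files:
        return []
    row_count = len(files[0])
    if any(len(rows) != row_count for rows in files):
        raise ValueError("all files must have the same number of rows")

    merged = [[] for _ in range(row_count)]
    for file_index, rows in enumerate(files):
        # widest row in this file decides how many columns the file occupies
        width = max((len(row) for row in rows), default=0)
        for index, row in enumerate(rows):
            if file_index > 0:
                # blank separator column, only between files
                merged[index].append("")
            # pad short rows with empty cells so later files stay aligned
            merged[index].extend(row + [""] * (width - len(row)))
    return merged


merged_example = merge_side_by_side([[["a", "b"], ["c"]], [["x"], ["y"]]])
```

Because the separator is only added between files, there is no trailing empty column to trim afterwards. The returned list can be passed to `pd.DataFrame` the same way as `result` in the script.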