Adding filename to last column in csv using python

I have a folder full of .mpt files, each with the same data format. I need to delete the first 57 lines from all files and append these files into one CSV, output.csv. I have that section already:

import glob
import os

dir_name = 'path name'
lines_to_ignore = 57
input_file_format = '*.mpt'
output_file_name = "output.csv"

def convert():
    files = glob.glob(os.path.join(dir_name, input_file_format))
    with open(os.path.join(dir_name, output_file_name), 'w') as out_file:
        for f in files:
            with open(f, 'r') as in_file:
                content = in_file.readlines()
                content = content[lines_to_ignore:]
                for i in content:
                    out_file.write(i)

print("working")
convert()
print("done")

This part works OK.

How do I add the filename of each .mpt file as the last column of output.csv? Thank you!

This is a quick 'n' dirty solution.

In this loop the variable i is just a string (one line read from the input file):

            for i in content:
                out_file.write(i)

So you just need to strip off the end-of-line character(s) (either "\n" or "\r\n"), then append "," followed by the file name.

If you're using Unix, try:

for i in content:
  # f is the path of the .mpt file currently being read (the outer loop variable)
  i = i.rstrip("\n") + "," + f + "\n"
  out_file.write(i)

This assumes that the field separator is a comma. Another option is:

for i in content:
  i = i.rstrip() + "," + f
  # print() adds the trailing newline itself (Python 3 syntax)
  print(i, file=out_file)

This will strip all whitespace from the end of i, not just the newline.
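The difference between the two rstrip calls matters when a line ends in tabs or spaces that are part of the data. A minimal sketch, using made-up sample lines rather than real .mpt content:

```python
# Made-up sample lines; a trailing tab could stand for an empty last field.
unix_line = "1.0\t2.0\t\n"
windows_line = "1.0,2.0\r\n"

# rstrip("\n") removes only trailing newline characters.
only_newline = unix_line.rstrip("\n")       # the trailing tab is kept

# rstrip() with no argument removes all trailing whitespace.
all_whitespace = unix_line.rstrip()         # the trailing tab is removed too

# rstrip("\r\n") removes any trailing mix of "\r" and "\n".
crlf_removed = windows_line.rstrip("\r\n")
```

Note that a file opened in text mode with the default newline handling already has "\r\n" translated to "\n" when read, so the last case mostly matters for files opened in other ways.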

Add quotes if you need to quote the file name:

  i = i.rstrip(...) + ',"' + f + '"'
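Quoting by hand breaks if the file name itself contains a double quote. A small sketch of a helper that also handles that case is below; append_field is a hypothetical name, not part of the original answer, and it assumes the usual CSV convention of doubling embedded quotes:

```python
def append_field(line, value, sep=","):
    """Return `line` without its line ending, plus `value` as a quoted last field."""
    # Double any embedded quote characters (common CSV escaping convention).
    quoted = '"' + value.replace('"', '""') + '"'
    return line.rstrip("\r\n") + sep + quoted + "\n"


# A file name containing a comma stays a single field once quoted.
example = append_field("1.0,2.0\n", "run, 3.mpt")
```

Inside the loop this would be used as out_file.write(append_field(i, f)).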

The relevant part:

with open(f, 'r') as in_file:
    content = in_file.readlines()
    content = content[lines_to_ignore:]
    for i in content:   
        new_line = ",".join([i.rstrip(), f]) + "\n" #<-- this is new
        out_file.write(new_line)                    #<-- this is new
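Putting the pieces together, the whole conversion might look like the sketch below. It is one possible arrangement, not the only one: the function parameters are introduced here for illustration, the separator is assumed to be a comma, and os.path.basename is used so that only the file name (without the directory) ends up in the column.

```python
import glob
import itertools
import os


def convert(dir_name, pattern="*.mpt", output_name="output.csv", lines_to_ignore=57):
    """Append all matching files into one CSV, with the source file name as last column."""
    out_path = os.path.join(dir_name, output_name)
    # sorted() makes the order of the files in the output predictable.
    files = sorted(glob.glob(os.path.join(dir_name, pattern)))
    with open(out_path, "w") as out_file:
        for path in files:
            name = os.path.basename(path)  # file name without the directory part
            with open(path, "r") as in_file:
                # islice skips the header lines without reading the whole file into memory.
                for line in itertools.islice(in_file, lines_to_ignore, None):
                    out_file.write(line.rstrip("\r\n") + "," + name + "\n")
    return out_path
```

Calling convert('path name') with the defaults mirrors the settings in the question; replace name with path in the write call if the full path is wanted instead.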
