
Reading and parsing a CSV file in Python with a regex on the first column

I have a CSV file (a calendar) with 5 columns that I want to read and parse with a script, under the following conditions:

  • Delete the headers (done)
  • Change the format of the first column from 01/01/2019 to 20190101

The first part of the script, which skips the headers, is done. For the second part I think a regex is required, but I don't know how to first remove the / and then move the 0101 from before 2019 to after it, so that the result is 20190101.

If someone could help, that would be great!

import csv

def parse_calendar(infile, outfile):
    with open(outfile, 'w', newline='') as output:
        with open(infile, newline='') as input:
            reader = csv.reader(input, delimiter=',', quotechar='"')
            next(reader, None)  # skip the headers
            writer = csv.writer(output, delimiter=',', quotechar='"')
            for row in reader:   # process each row
                writer.writerow(row)

I expect the output to look like the following, compared to the initial file:

01/01/2019 New Year's Day NC US

20190101 New Year's Day NC US
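
The conversion described above could be sketched with a regex along these lines. This is a minimal sketch, assuming every date is zero-padded MM/DD/YYYY as in the sample row; `reformat_date` and `DATE_PATTERN` are hypothetical names introduced only for illustration.

```python
import re

# Assumption: dates are always zero-padded MM/DD/YYYY, as in the sample row.
DATE_PATTERN = re.compile(r"^(\d{2})/(\d{2})/(\d{4})$")

def reformat_date(value):
    """Turn '01/01/2019' into '20190101'; values that do not match are returned unchanged."""
    # Group 1 is the month, group 2 the day, group 3 the year;
    # the replacement emits them as year, month, day with no separators.
    return DATE_PATTERN.sub(r"\3\1\2", value)

print(reformat_date("01/01/2019"))
```

Inside the row loop this would be applied as `row[0] = reformat_date(row[0])` before `writer.writerow(row)`.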

Thanks for the responses, everyone.

With this code I now get the following output:

import csv

def parse_calendar(infile, outfile):
    with open(outfile, 'w', newline='') as output:
        with open(infile, newline='') as input:
            reader = csv.reader(input, delimiter=',', quotechar='"')
            next(reader, None)  # skip the headers
            writer = csv.writer(output, delimiter=',', quotechar='"')
            for row in reader: # process each row
                replaced = row[0].replace('/','')  
                row[0] = replaced
                writer.writerow(row)

01012018,New Year's Day,N,C,US

01012018,New Year's Day,N,C,CA

01152018,Martin L. King Day,N,C,US

What code do I need to add to the script to change the format from 01012018 to 20180101, given that the value is a string? For each line, of course.
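
Since the slashes are already stripped at this point, one possible approach is plain string slicing rather than a regex. The sketch below assumes the value is always exactly eight digits in MMDDYYYY order; `move_year_first` is a hypothetical helper name, not something from the original script.

```python
def move_year_first(mmddyyyy):
    """Reorder an 8-digit 'MMDDYYYY' string into 'YYYYMMDD'."""
    # Assumption: anything that is not exactly 8 digits is passed through untouched.
    if len(mmddyyyy) != 8 or not mmddyyyy.isdigit():
        return mmddyyyy
    # The last four characters are the year; the first four are month and day.
    return mmddyyyy[4:] + mmddyyyy[:4]

print(move_year_first("01012018"))
```

In the loop above, this would replace the assignment with `row[0] = move_year_first(replaced)`.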

I'd appreciate it a lot, thanks.
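
An alternative that avoids both the regex and the manual slicing is to let `datetime` parse and re-emit the date. The sketch below is one way the whole function might look under that approach; it assumes the first column is always a valid MM/DD/YYYY date, and `convert_date` is a hypothetical helper added for illustration.

```python
import csv
from datetime import datetime

def convert_date(value):
    """Parse 'MM/DD/YYYY' and return 'YYYYMMDD'; raises ValueError on invalid dates."""
    return datetime.strptime(value, "%m/%d/%Y").strftime("%Y%m%d")

def parse_calendar(infile, outfile):
    with open(infile, newline="") as src, open(outfile, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=",", quotechar='"')
        writer = csv.writer(dst, delimiter=",", quotechar='"')
        next(reader, None)  # skip the headers
        for row in reader:
            if not row:  # ignore completely empty lines
                continue
            row[0] = convert_date(row[0])  # 01/01/2019 -> 20190101
            writer.writerow(row)
```

Compared with string manipulation, this version also rejects impossible dates such as 13/40/2019 instead of silently rewriting them, which may or may not be what is wanted here.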


 