
Validate first 3 rows of txt file (tsv) in Python

I have been trying to build a validation rule for txt files that get uploaded to my environment. The files are tab-separated, and I need to validate the first 3 rows, which are in a format such as:

## This Text Here 
## This Text Here
## This Text Here

I need to build a pass/fail validation. I have tried doing this with Python's built-in csv module, with no luck so far. I would appreciate any advice on the best route to take.

Try this:

# It depends on how you open the file, but a with-block closes it for you.
with open("test.tsv") as in_data:
    # Split each line on tabs...
    all_lines = [line.split("\t") for line in in_data]

# ...and keep the first field of each of the first three lines.
test_lines = [fields[0] for fields in all_lines[:3]]

# Then you could use assert, wrapped in try/except so a bad file
# produces a message instead of a traceback.
try:
    # A file with fewer than three lines should fail as well.
    assert len(test_lines) == 3, "file has fewer than 3 lines;"
    for line in test_lines:
        assert line.startswith("##"), f"line does not start with '##': {line!r};"
        # ...and whatever other validation you need for the string
except AssertionError as e:
    print(e, "please use a validated file!")

Further reading: https://www.tutorialspoint.com/python/python_exceptions.htm
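Since the question asks for a pass/fail result, the same check could also be wrapped in a function that returns a boolean instead of relying on assert (assert statements are skipped when Python runs with -O). This is only a minimal sketch: the function name has_valid_header, the "##" prefix default, the n_rows parameter and the UTF-8 encoding are assumptions for illustration, not something stated in the question.

```python
from itertools import islice


def has_valid_header(path, prefix="##", n_rows=3):
    """Return True if the first n_rows lines of the file all start with prefix."""
    with open(path, encoding="utf-8") as handle:
        # islice stops after n_rows lines, so large files are not read in full.
        head = [line.rstrip("\n") for line in islice(handle, n_rows)]
    # Fail when the file is too short, or when any checked line lacks the prefix.
    return len(head) == n_rows and all(line.startswith(prefix) for line in head)
```

It could then be used as print("pass" if has_valid_header("test.tsv") else "fail"), with any further per-line checks added next to the startswith test.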

Maybe you should give pandas a try:

import pandas as pd

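file_name = "test.tsv"  # placeholder: replace with your file name
# header=None keeps the first line as data; nrows=3 reads only the rows to validate
df = pd.read_csv(file_name, sep="\t", header=None, nrows=3)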

# do your stuff

Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
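If pulling in pandas for a three-line check feels heavy, the standard-library csv module mentioned in the question can read tab-separated rows as well. The following is a rough sketch under stated assumptions: the function name first_rows_ok, the "##" prefix and the UTF-8 encoding are illustrative choices, and the check only looks at the first field of each row.

```python
import csv


def first_rows_ok(path, prefix="##", n_rows=3):
    """Return True if the first field of each of the first n_rows rows starts with prefix."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        checked = 0
        for row in reader:
            if checked == n_rows:
                break
            # A blank line yields an empty list, which counts as a failure here.
            if not row or not row[0].startswith(prefix):
                return False
            checked += 1
    # Files with fewer than n_rows rows fail the check.
    return checked == n_rows
```

The delimiter="\t" argument is what makes csv.reader treat the file as TSV; by default it splits on commas.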
