Read data separated with tab from file in Python

If I have the following file:

Pig
06-13-01    56.2
06-13-02    59.2
06-13-03    54.3
.
.
.
Cow
06-13-01   201.2
06-13-02   204.1
06-13-03   205.6
.
.
.

and want to create an instance of an object for each animal, holding its name and the associated dates and weights (values separated by a tab). How do I do that in my main program?

I have begun with this:

with open(filnamn, encoding="utf-8") as file:
    dateAndWeight = []
    lines = file.readlines()
    lines = [line.rstrip() for line in lines]
    stepsBetweenName = 68
    numberOfAnimals = int(len(lines) / stepsBetweenName)

But this is just the beginning. Does anyone have any advice?

Your data alternates between an animal name and tab-separated data. This is a good fit for itertools.groupby, which splits an iterable into consecutive sub-iterators, starting a new one each time the value of a key function (here, one based on column count) changes.

In this example, groupby starts a new sub-iteration whenever the number of columns in a row changes between 1 and not-1. When it's 1, you know you have a new animal. When it's not 1, you have rows of data. Here, I just built a dictionary that maps each animal name to its date/weight info.

import itertools
import io
import csv

# test file

file = io.StringIO("""Pig
06-13-01\t56.2
06-13-02\t59.2
06-13-03\t54.3
Cow
06-13-01\t201.2
06-13-02\t204.1
06-13-03\t205.6""")

# will hold `animal:[[date, weight], ...]` associations
animal_map = {}

# data is TSV file
reader = csv.reader(file, delimiter="\t")

# Group by rows of length 1 which start a new set of animal date, weight pairs
for new_animal, rows in itertools.groupby(reader, lambda row: len(row) == 1):
    if new_animal:
        # get animal from first row
        animal = next(rows)[0]
    else:
        # add animal and data to map
        animal_map[animal] = list(rows)
        del animal

print(animal_map)
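
Since the question asks for object instances rather than a dictionary, the same grouping can feed a small class. The following is only a sketch: the `Animal` dataclass and the `read_animals` helper are hypothetical names, not part of the answer above. It assumes every data row has exactly two tab-separated columns and that the file starts with an animal name. Dates are left as strings because the source does not say how "06-13-01" should be interpreted.

```python
import csv
import io
import itertools
from dataclasses import dataclass, field


@dataclass
class Animal:
    # Hypothetical container class, introduced only for illustration.
    name: str
    # List of (date, weight) pairs; dates are kept as plain strings.
    measurements: list = field(default_factory=list)


def read_animals(file):
    """Build Animal objects from name lines followed by date/weight rows."""
    animals = []
    # Skip blank lines, which csv.reader yields as empty lists.
    rows = (row for row in csv.reader(file, delimiter="\t") if row)
    for is_name, group in itertools.groupby(rows, lambda row: len(row) == 1):
        if is_name:
            # Every single-column row starts a new animal.
            for row in group:
                animals.append(Animal(row[0]))
        else:
            # Assumption: each data row is exactly [date, weight].
            animals[-1].measurements = [
                (date, float(weight)) for date, weight in group
            ]
    return animals


# Small in-memory sample standing in for the real file.
sample = io.StringIO("Pig\n06-13-01\t56.2\n06-13-02\t59.2\nCow\n06-13-01\t201.2\n")
for animal in read_animals(sample):
    print(animal.name, animal.measurements)
```

With a real file, the same function could be called inside `with open(filnamn, encoding="utf-8", newline="") as file:`, passing `file` in place of the sample.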
