
Python - How to sort a text file by odd and even number at the third or fourth element and produce output text files of odd- or even-numbered entries

So I have some big long lists of animal identifiers. Our convention is to use two alphabetical characters, followed by a litter identifier, a dash, and then the animal ID within that litter. Whether the number before the dash is even or odd identifies whether they are control or manipulated animals.

So it looks like this:

XL20-4 is a control animal (0 - even), XL21-4 is a manipulated animal (1 - odd),

this runs all the way through to the 300s.

So our current litters are: XL304-5 (4 - even - control), XL303-4 (3 - odd - manipulated).
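To make the convention concrete, here is a minimal sketch of how one identifier could be parsed and classified. The pattern and the function name `classify` are only illustrative assumptions based on the examples above (two letters, a litter number, a dash, an animal number); real files may need a looser pattern.

```python
import re

# Assumed shape of an identifier: two letters, litter number, dash, animal number.
ID_PATTERN = re.compile(r"^([A-Za-z]{2})(\d+)-(\d+)$")

def classify(animal_id):
    """Return 'control' for an even litter number, 'manipulated' for an odd one."""
    match = ID_PATTERN.match(animal_id.strip())
    if match is None:
        raise ValueError(f"unrecognised identifier: {animal_id!r}")
    litter = int(match.group(2))  # the number before the dash
    return "control" if litter % 2 == 0 else "manipulated"

for example in ("XL20-4", "XL21-4", "XL304-5", "XL303-4"):
    print(example, classify(example))
```

Testing the whole litter number for parity gives the same answer as looking only at its last digit, so it works for two-digit and three-digit litters alike.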

So one of my first tasks is to create ordered text files of the animals in each condition from the original text file, so they can then be read by our MATLAB code.

Essentially, it needs to retain the order of animal generation within those new text files, i.e. XL302-4, XL304-5, XL304-6, XL306-1,

Each followed by a newline (\n). I know this isn't so easy, but I've done quite a bit of looking and I think this is a bit beyond me.
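For illustration, this is roughly the ordering I mean, sketched under the assumption that the source list might not already be in generation order. The helper name `sort_key` and the sample list (including the made-up two-digit entry XL98-2) are hypothetical. Note that a plain alphabetical sort would place XL98-2 after XL306-1, which is why the numbers are compared as integers.

```python
import re

def sort_key(animal_id):
    """Order by litter number, then by animal number within the litter."""
    litter, animal = re.fullmatch(r"[A-Za-z]{2}(\d+)-(\d+)", animal_id).groups()
    return int(litter), int(animal)

# Hypothetical, deliberately shuffled sample of control animals.
ids = ["XL306-1", "XL304-6", "XL302-4", "XL304-5", "XL98-2"]
ordered = sorted(ids, key=sort_key)

# One identifier per line, each terminated by a newline.
output_text = "\n".join(ordered) + "\n"
print(output_text, end="")
```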

Thanks in advance.

Based on what you have said, this would be the way to do it, but it may need some finer tweaking because the file contents are unknown at this point (the file name and how the identifiers are laid out in the text file).

import re

def write_to_file(file_name, data_to_write):
    with open(file_name, 'w') as file:
        for item in data_to_write:
            file.write(f"{item}\n")

# read contents from file
with open('original.txt', 'r') as file:
    contents = file.readlines()

# assuming that each of the 'XL20-4,' are on a new line
control_group = []
manipulated_group = []
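for line in contents:
    # readlines() keeps the trailing newline; strip it so the output is not double-spaced
    item = line.strip()
    if not item:
        continue  # skip blank lines, which would otherwise break the regex search
    # get the litter number between the letters and the dash
    test_generation = int(item[re.search(r"\d", item).start():item.rfind('-')])
    if test_generation % 2:  # odd gives 1 (truthy); even gives 0 (falsy)
        manipulated_group.append(item)
    else:
        control_group.append(item)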

# write to files with the data
write_to_file('control.txt', control_group)
write_to_file('manipulated.txt', manipulated_group)
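Since the layout of the original file is unknown, one possible tweak is to pull the identifiers out with a regular expression instead of relying on one entry per line. This is only a sketch: the function names `extract_ids` and `split_by_parity` and the sample text are made up for illustration, and the pattern assumes every identifier looks like two letters, a number, a dash, and a number.

```python
import re

def extract_ids(text):
    """Find every identifier shaped like XL304-5, whatever separates them."""
    return re.findall(r"[A-Za-z]{2}\d+-\d+", text)

def split_by_parity(ids):
    """Split into (control, manipulated), keeping the order in which they appear."""
    control, manipulated = [], []
    for animal_id in ids:
        litter = int(re.search(r"\d+", animal_id).group())  # number before the dash
        (manipulated if litter % 2 else control).append(animal_id)
    return control, manipulated

# Hypothetical input mixing commas, spaces and newlines as separators.
sample = "XL302-4, XL303-1,\nXL304-5 XL304-6\nXL305-2,XL306-1,"
control, manipulated = split_by_parity(extract_ids(sample))
print(control)
print(manipulated)
```

The two lists can then be passed to write_to_file exactly as above. Because the trailing commas are not part of the match, the output files contain clean identifiers only.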


 