Iterate through lines changing only one character (Python)

I have a file that looks like this:

N1 1.023 2.11 3.789 

Cl1 3.124 2.4534 1.678

Cl2 # # #

Cl3 # # #

Cl4

Cl5

N2

Cl6

Cl7

Cl8

Cl9

Cl10

N3

Cl11


Cl12

Cl13

Cl14

Cl15

The three numbers continue on every line throughout the file.

What I would like to do is pretty much a permutation. There are 3 data sets: set 1 is N1-Cl5, set 2 is N2-Cl10, and set 3 is N3 to the end.

I want every combination of N's and Cl's. For example, the first output would be

Cl1

N1

Cl2

then everything else the same. The next set would be Cl1, Cl2, N1, Cl3... and so on.

I have some code, but it won't do what I want, because it wouldn't know that there are three individual data sets. Should I put the three data sets in three different files and then combine them, using code like:

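list1 = ['Cl1', 'Cl2', 'Cl3', 'Cl4', 'Cl5']

with open('file.txt', 'w') as out:
    for line in file1:
        # str.replace returns a new string, so the result has to be kept
        line = line.replace('N1', list1[0])
        list1.pop(0)
        out.write(line)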

or is there a better way? Thanks in advance.

This should do the trick:

from itertools import permutations

def print_permutations(in_file):
    separators = ['N1', 'N2', 'N3']
    cur_separator = None
    related_elements = []

    with open(in_file, 'r') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # Skip blank lines.

            # Split Nx and Clx from numbers.
            value = line.split()[0]

            if value in separators:
                # Found new Nx. Print permutations of the previous set.
                if cur_separator is not None:
                    for perm in permutations([cur_separator] + related_elements):
                        print(perm)
                cur_separator = value
                related_elements = []
            else:
                # Found new Clx. Append to the list.
                related_elements.append(value)

    # The last set has no Nx after it, so print its permutations here.
    if cur_separator is not None:
        for perm in permutations([cur_separator] + related_elements):
            print(perm)
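
If full permutations are more than is needed, and the N label should only trade places with each Cl position inside its own set (Cl1, N1, Cl2, ... then Cl1, Cl2, N1, ...), something like the following may be closer to what the question describes. This is only a sketch of one reading of the question: `shifted_label_orders` is a hypothetical helper name, and it works on a list of labels already read from the file.

```python
def shifted_label_orders(labels):
    """Yield label orderings with the first label moved to each later position.

    `labels` is one data set, e.g. ['N1', 'Cl1', 'Cl2', 'Cl3'].
    The relative order of the remaining labels is kept.
    """
    head, rest = labels[0], labels[1:]
    for pos in range(1, len(rest) + 1):
        # Put the N label after the first `pos` Cl labels.
        yield rest[:pos] + [head] + rest[pos:]


for order in shifted_label_orders(['N1', 'Cl1', 'Cl2', 'Cl3']):
    print(order)
# ['Cl1', 'N1', 'Cl2', 'Cl3']
# ['Cl1', 'Cl2', 'N1', 'Cl3']
# ['Cl1', 'Cl2', 'Cl3', 'N1']
```

Each ordering could then be paired back with the unchanged coordinate columns, row by row, before writing the lines out.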

You could use a regex to find the line numbers of the "N" patterns and then slice the file using those line numbers:

import re

n_pat = re.compile(r'N\d')
N_matches = []
# `sample` is the path to the input file.
with open(sample, 'r') as f:
    for num, line in enumerate(f):
        match = n_pat.match(line)
        if match:
            # Record (line number, matched label), e.g. (0, 'N1').
            N_matches.append((num, match.group()))

>>> N_matches
[(0, 'N1'), (12, 'N2'), (24, 'N3')]

After you figure out the line numbers where these patterns appear, you can use itertools.islice to break the file up into a list of lists:

import itertools

first = N_matches[0][0]
final = N_matches[-1][0]
# Assumes the first N is on line 0 and every set spans the same number of lines.
step = N_matches[1][0]
dataset = []
locallist = []

while first < final + step:
    with open(sample, 'r') as f:
        # Take the lines of one set and drop the blank ones.
        for item in itertools.islice(f, first, first + step):
            if item.strip():
                locallist.append(item.strip())
        dataset.append(locallist)
        locallist = []
    first += step

itertools.islice is a really nice way to take a slice of an iterable. Here's the result of the above on a sample:

>>> dataset

[['N1 1.023 2.11 3.789', 'Cl1 3.126 2.6534 1.878', 'Cl2 3.124 2.4534 1.678', 'Cl3 3.924 2.1134 1.1278', 'Cl4', 'Cl5'], ['N2', 'Cl6 3.126 2.6534 1.878', 'Cl7 3.124 2.4534 1.678', 'Cl8 3.924 2.1134 1.1278', 'Cl9', 'Cl10'], ['N3', 'Cl11', 'Cl12', 'Cl13', 'Cl14', 'Cl15']]
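
The same list of lists can probably also be built in a single pass, without recording line numbers first, by opening a new sublist whenever a line starts with an N label. The sketch below is an alternative to the islice approach, not a restatement of it: `group_by_separator` is a hypothetical helper, and it accepts any iterable of lines so it can be tried without a file.

```python
import re


def group_by_separator(lines, pattern=r'N\d'):
    """Group non-blank lines into sublists, starting a new one at each match."""
    sep = re.compile(pattern)
    groups = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between records
        if sep.match(line) or not groups:
            groups.append([])  # an N line (or the very first line) opens a group
        groups[-1].append(line)
    return groups


sample_lines = ['N1 1.023 2.11 3.789\n', '\n', 'Cl1 3.124 2.4534 1.678\n',
                'N2\n', 'Cl6\n', '\n', 'N3\n', 'Cl11\n']
print(group_by_separator(sample_lines))
# [['N1 1.023 2.11 3.789', 'Cl1 3.124 2.4534 1.678'], ['N2', 'Cl6'], ['N3', 'Cl11']]
```

With a real file, the call would look like `with open(path) as f: dataset = group_by_separator(f)`. Unlike the fixed-step slicing, this does not require every set to have the same number of lines.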

After that, I'm a bit hazy on what you're seeking to do, but I think you want permutations of each sublist of the dataset? If so, you can use itertools.permutations to find the permutations of each sublist of dataset:

for item in itertools.permutations(dataset[0]):
    print(item)
etc.

Final Note:

Assuming I understand correctly what you're doing, the number of permutations is going to be huge. You can calculate how many permutations there are by taking the factorial of the number of items. Anything over 10 items (10! = 3,628,800) is going to produce over 3,000,000 permutations.
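
To get a feel for the growth, the counts can be checked with math.factorial from the standard library. The sizes below are only illustrative; 6 is the number of lines in each set of the sample above.

```python
import math

# The number of orderings of n distinct lines is n!
for n in (6, 10, 12):
    print(n, math.factorial(n))
# 6 720
# 10 3628800
# 12 479001600
```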
