How do I read in specific characters from each line of text and write them out to another file?

I have a text file named "tclust.txt" and another named "ef_blue.txt". I'm trying to write a Python script that copies certain characters from ef_blue.txt into tclust.txt. So far, I can only read in the whole of ef_blue.txt and have everything from that file go to tclust.txt. My ef_blue.txt has multiple lines of text, but I only want to take certain characters from each line (e.g. "7.827382" and "6.432342" from line 2).

# Read the whole source file
with open("ef_blue.xpk", "rt") as blue:
    contents = blue.read()

# Append everything to tclust.txt; the with blocks close both files
with open("tclust.txt", "a") as f2:
    f2.write(contents)

Edit: My tclust.txt file looks like this:

"type rbclust

Peak 0 8.5 0.05 4.0 0.05

Atom 0 125.H8 126.H1' label dataset sw sf"

My ef_blue.xpk file looks like this:

"label dataset sw sf

1H 1H_2

NOESY_F1eF2f.nv

4807.69238281 4803.07373047

600.402832031 600.402832031

1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B 1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9

0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 0.15049 ++ {0.0} {} 0.0 4.8459 0 {} 0 0 0

1 {} 8.11276 0.02278 0.03212 ++ {0.0} {} {} 5.52142 0.07827 0.11252 ++ {0.0} {} 0.0 2.0824 0 {} 0 0 0

2 {} 7.85285 0.02369 0.02232 ++ {0.0} {} {} 5.52444 0.07280 0.06773 ++ {0.0} {} 0.0 0.8844 0 {} 0 0 0

3 {} 7.45819 0.01630 0.02914 ++ {0.0} {} {} 5.42587 0.07081 0.11733 ++ {0.0} {} 0.0 2.8708 0 {} 0 0 0

4 {} 7.89775 0.01106 0.00074 ++ {0.0} {} {} 5.23989 0.07077 0.00226 ++ {0.0} {} 0.0 0.4846 0 {} 0 0 0

5 {} 7.85335 0.02665 0.03635 ++ {0.0} {} {} 5.23688 0.09117 0.12591 ++ {0.0} {} 0.0 1.5210 0 {} 0 0 0"

So what I want to do is take values such as "7.45766" and "5.68094" from line 7 of my ef_blue.xpk and write them out starting at line 3 of my tclust.txt file.

So I would like my tclust.txt file to look like:

type rbclust
Peak 0 8.5 0.05 4.0 0.05
       7.45766   5.68094
       8.11276   5.52142
 .... etc
Atom 0 125.H8 126.H1'label dataset sw sf

Edit2: @open-source

This is the output I get

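with open("ef_blue.txt", "rt") as blue:
    contents = blue.readlines()

with open("tclust.txt", "a") as f2:
    for cont in range(len(contents)):
        # Skip the six header lines; data rows start at index 6
        if cont > 5:
            # split() with no argument splits on any run of whitespace
            a = contents[cont].split()
            # Guard against blank or short lines
            if len(a) > 9:
                print(a[2] + '  ' + a[9])
                # One pair of values per output line
                f2.write(a[2] + '  ' + a[9] + '\n')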

Try this: use readlines() to read every line of the file into a list, then use a for loop to go through the list and check whether the current index is past the header lines, and finally split the current line on spaces to turn it into a list of fields. Let me know if that works.
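
The split-based idea described above can also be sketched as a small helper that works on a list of lines, so it can be tried without touching any files. The function name extract_shifts, the header_lines=6 default, and the column indices (2, 9) are assumptions taken from the sample ef_blue.xpk shown in the question, not part of the original answer.

```python
def extract_shifts(lines, header_lines=6, cols=(2, 9)):
    """Return string pairs taken from the given columns of each data row.

    header_lines and cols are assumptions based on the sample file in the
    question; adjust them for files with a different layout.
    """
    pairs = []
    for line in lines[header_lines:]:
        fields = line.split()            # split on any run of whitespace
        if len(fields) > max(cols):      # skip blank or short lines
            pairs.append(tuple(fields[c] for c in cols))
    return pairs


# First eight lines of the sample ef_blue.xpk (header plus two data rows)
sample = [
    "label dataset sw sf\n",
    "1H 1H_2\n",
    "NOESY_F1eF2f.nv\n",
    "4807.69238281 4803.07373047\n",
    "600.402832031 600.402832031\n",
    "1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B "
    "1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9\n",
    "0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 0.15049 "
    "++ {0.0} {} 0.0 4.8459 0 {} 0 0 0\n",
    "1 {} 8.11276 0.02278 0.03212 ++ {0.0} {} {} 5.52142 0.07827 0.11252 "
    "++ {0.0} {} 0.0 2.0824 0 {} 0 0 0\n",
]

for first, second in extract_shifts(sample):
    print(first, second)
```

Selecting by column position relies on every data row having the same number of whitespace-separated fields, which holds for the sample rows but may not hold for every .xpk file.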

You can try the following:

import re

# Read tclust.txt line by line and save the last line in a variable
with open('tclust.txt') as f:
    lines = f.readlines()
last_line = lines[-1]

# Rewrite tclust.txt without its last two lines
# (the last line and the empty line before it)
with open('tclust.txt', 'w') as f:
    f.writelines(lines[:-2])

# Open both files
with open("ef_blue.xpk", "rt") as f1, open("tclust.txt", "a") as f2:
    # Read ef_blue.xpk line by line
    for line in f1:
        # Find numbers in 1.23232 format (a single leading digit 1-9);
        # each match keeps the whitespace character in front of it
        float_num = re.findall(r"\s[1-9]\.[0-9]+", line)
        # Assuming a matching line contains at least 2 such numbers
        if len(float_num) > 1:
            # Write 6 spaces, then the first two matches separated by a tab
            f2.write(' ' * 6 + float_num[0] + '\t' + float_num[1] + '\n')

    # Finally, write back the last line removed earlier
    f2.write(last_line)

Output for tclust.txt:

"type rbclust

Peak 0 8.5 0.05 4.0 0.05
       7.45766   5.68094
       8.11276   5.52142
       7.85285   5.52444
       7.45819   5.42587
       7.89775   5.23989
       7.85335   5.23688
Atom 0 125.H8 126.H1' label dataset sw sf"
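
To see which values the regular expression in this answer actually picks up, here is a small check run against one data row and one header row copied from the sample ef_blue.xpk. The variable names are only for illustration.

```python
import re

# Same pattern as in the answer, written as a raw string
PATTERN = re.compile(r"\s[1-9]\.[0-9]+")

row = ("0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 "
       "0.15049 ++ {0.0} {} 0.0 4.8459 0 {} 0 0 0")
header = "4807.69238281 4803.07373047"

row_matches = PATTERN.findall(row)
header_matches = PATTERN.findall(header)

print(row_matches)     # [' 7.45766', ' 5.68094', ' 4.8459']
print(header_matches)  # []
```

The data row yields three matches, not two: the last one (4.8459) comes from a later column, so the answer's code depends on taking only the first two matches. Values that start with 0 (such as 0.01702) or have more than one digit before the decimal point are skipped, which is why the header rows produce nothing. Each match also keeps its leading space, which accounts for the seventh space in the output lines.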

Input: ef_blue.xpk

"label dataset sw sf

1H 1H_2

NOESY_F1eF2f.nv

4807.69238281 4803.07373047

600.402832031 600.402832031

1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B 1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9

0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 0.15049 ++ {0.0} {} 0.0 4.8459 0 {} 0 0 0

1 {} 8.11276 0.02278 0.03212 ++ {0.0} {} {} 5.52142 0.07827 0.11252 ++ {0.0} {} 0.0 2.0824 0 {} 0 0 0

2 {} 7.85285 0.02369 0.02232 ++ {0.0} {} {} 5.52444 0.07280 0.06773 ++ {0.0} {} 0.0 0.8844 0 {} 0 0 0

3 {} 7.45819 0.01630 0.02914 ++ {0.0} {} {} 5.42587 0.07081 0.11733 ++ {0.0} {} 0.0 2.8708 0 {} 0 0 0

4 {} 7.89775 0.01106 0.00074 ++ {0.0} {} {} 5.23989 0.07077 0.00226 ++ {0.0} {} 0.0 0.4846 0 {} 0 0 0

5 {} 7.85335 0.02665 0.03635 ++ {0.0} {} {} 5.23688 0.09117 0.12591 ++ {0.0} {} 0.0 1.5210 0 {} 0 0 0"

Input: tclust.txt

"type rbclust

Peak 0 8.5 0.05 4.0 0.05

Atom 0 125.H8 126.H1' label dataset sw sf"
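
As an alternative to rewriting tclust.txt in two passes, the insertion step can be sketched as a function on lists of lines that places the new rows before the final line. The name insert_before_last and the indent and sep defaults are hypothetical, chosen to mimic the layout the question asks for. Unlike the answer's code, this sketch drops every blank line in the target, not just the second-to-last one.

```python
def insert_before_last(target_lines, rows, indent=" " * 7, sep="   "):
    """Return target_lines (blank lines removed) with rows placed before the last line.

    rows is a sequence of string tuples, one tuple per output line.
    """
    kept = [ln.rstrip("\n") for ln in target_lines if ln.strip()]
    formatted = [indent + sep.join(r) for r in rows]
    if not kept:
        return formatted
    return kept[:-1] + formatted + kept[-1:]


tclust = [
    "type rbclust\n",
    "\n",
    "Peak 0 8.5 0.05 4.0 0.05\n",
    "\n",
    "Atom 0 125.H8 126.H1' label dataset sw sf\n",
]
rows = [("7.45766", "5.68094"), ("8.11276", "5.52142")]

merged = insert_before_last(tclust, rows)
print("\n".join(merged))
```

Writing the result back is then a single open("tclust.txt", "w") followed by a write of the joined lines, which avoids opening the file once to truncate it and again to append.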
