Increasing numbers in strings by python or Tcl

I have some strings containing integers (2 to 5 numbers per string, separated by spaces). Here is an example:

    1    4    5   19
    1    5
    2    3    6   59
    2    6
    3    2    4   60
    3    4
    4    1    3   61
    4    3
   25   13   23   64   65
   13   18
   14   13   15   75
   14   15
   15   14   16   76
   15   14
   45   44  102  103  104

I need to increase all numbers by 129 repeatedly, so the first result will begin:

130  133  134 148
130  134
131  132  135 188  ...

The next increase will give:

259  262  263 277
259  263
260  261  264 317 ...

What is the best approach for this kind of string processing? My idea is to count the numbers first, then make a matrix filled with zeros, [0, 0, 0, 0, 0], and then fill it, so that the 1st line becomes [1, 4, 5, 19, 0] and the 2nd line becomes [1, 5, 0, 0, 0].

Then I would increase all numbers that are not zero. Am I thinking about this task in the right direction, or is there an easier way? Or is there a ready-made solution that I just don't know how to search for?

The result must have specific formatting: these are PDB file CONECT records.

If you know the final dimensions, you can pre-allocate a numpy array of zeros. Define how you want to handle each row ( process_row ), and then do that for every row in the file ( process_file ).

import numpy as np

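def process_row(row, row_num, out):
    # Convert the whitespace-separated tokens to integers and store them at
    # the start of the row; the remaining columns keep their zero padding
    values = [int(v) for v in row.split()]
    out[row_num, :len(values)] = values

def process_file(fname, shape):
    # Integer dtype, so the numbers are not stored as floats
    data = np.zeros(shape, dtype=int)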
    with open(fname) as fin:
        for i, row in enumerate(fin):
            process_row(row, i, data)
    return data

data = process_file(fname="C:/temp/temp.txt", shape=(15,5))
data[data != 0] += 129
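
The zero padding is not strictly required. As a minimal sketch in plain Python, assuming the numbers are separated by whitespace and every number goes back into a 5-character field, ragged rows can be shifted and re-formatted directly. The names shift_rows and format_rows are made up for this illustration:

```python
def shift_rows(text, offset):
    """Parse whitespace-separated integers line by line and add offset to each."""
    rows = [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    return [[value + offset for value in row] for row in rows]


def format_rows(rows, width=5):
    """Right-justify every number in a fixed-width field, one row per line."""
    return "\n".join("".join(f"{value:{width}d}" for value in row) for row in rows)


sample = "    1    4    5   19\n    1    5\n    2    3    6   59"
print(format_rows(shift_rows(sample, 129)))
```

Because each row keeps its own length, there is no need to know the final dimensions in advance or to filter out zeros afterwards.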

A Tcl one, if you only need the numbers added up:

set text {    1    4    5   19
    1    5
    2    3    6   59
    2    6
    3    2    4   60
    3    4
    4    1    3   61
    4    3
   25   13   23   64   65
   13   18
   14   13   15   75
   14   15
   15   14   16   76
   15   14
   45   44  102  103  104}

proc addevery {txt amount} {
    # Creating an alias so we can modify the variable from outside the proc
    upvar text gtext
    set result [list]
    # Split to get lines
    foreach line [split $txt \n] {
        set lineresult [list]
        # Get each number added
        foreach item [regexp -all -inline {\S+} $line] {
            lappend lineresult [expr {$item+$amount}]
        }
        lappend result $lineresult
    }
    set gtext [join $result \n]
    puts $gtext
    return
}

puts "Adding 129:"
addevery $text 129

puts "\nAdding again 129:"
addevery $text 129


EDIT: After getting an understanding of the underlying problem: we have to keep the formatting. More specifically, we have to add CONECT before each line of numbers, keep each number right-aligned in a 5-character field, and be able to output the different 'steps' of the addition to the original numbers in the same file. One last thing: the first iteration should not add anything to the original numbers.

set fin [open "Input.txt" r]
set fout [open "Output.txt" w]

set lines [split [read $fin] "\n"]

# Amount to be added each time
set amount 129
# Number of times to be added
set times 100

proc addevery {amount} {
  global lines
  # Result list
  set lresult [list]
  foreach line $lines {
    # Result line
    set result {}
    # Get every 5 chars of the line
    foreach item [regexp -all -inline {.{5}} $line] {
      # Add, format then append to result line
      append result [format %5s [expr {[string trim $item]+$amount}]]
    }
    # Add line to the result list
    lappend lresult $result
  }
  # Set the lines to the new lines
  set lines $lresult
  return $lresult
}

for {set i 0} {$i < $times} {incr i} {
  # If 0, put the original with CONECT
  if {$i == 0} {
    puts $fout [join [lmap x $lines {set x "CONECT$x"}] "\n"]
  } else {
    puts $fout [join [lmap x [addevery $amount] {set x "CONECT$x"}] "\n"]
  }
}

close $fin
close $fout

And as a bonus, the python equivalent:

amount = 129
times = 100

import re

def addevery(amount):
  global lines
  lresult = []
  for line in lines:
    result = ''

    for item in re.findall(r'.{5}', line):
      result += "%5s" % (int(item.strip()) + amount)

    lresult.append(result)

  lines = list(lresult)
  return lresult

with open('Input.txt', 'r') as fin:
  with open('Output.txt', 'w') as fout:
    lines = fin.read().split('\n')
    for i in range(0,times):
      if i == 0:
        fout.write('\n'.join(['CONECT'+i for i in lines]) + '\n')
      else:
        fout.write('\n'.join(['CONECT'+i for i in addevery(amount)]) + '\n')
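
One thing to watch with the scripts above: the pattern .{5} silently skips a trailing chunk shorter than five characters, and if the input file ends with a newline, the empty last line becomes a bare CONECT record. A hedged sketch that slices by position and skips blank lines, assuming the numbers sit in 5-character columns as in the question (split_fields, shift_line and conect_block are hypothetical helper names):

```python
def split_fields(line, width=5):
    """Cut a line into consecutive width-character fields, dropping blank ones."""
    fields = [line[i:i + width] for i in range(0, len(line), width)]
    return [field for field in fields if field.strip()]


def shift_line(line, offset, width=5):
    """Add offset to every field and right-justify the results again."""
    return "".join(f"{int(field) + offset:{width}d}"
                   for field in split_fields(line, width))


def conect_block(lines, offset):
    """Build one block of CONECT records, ignoring empty input lines."""
    return ["CONECT" + shift_line(line, offset) for line in lines if line.strip()]


print("\n".join(conect_block(["    1    4    5   19", "    1    5", ""], 129)))
```

With offset 0 the same function reproduces the original numbers, which covers the "first iteration adds nothing" requirement without a special case.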

You're looking for a two-dimensional array (which is a matrix).

First of all, you can initialize it with 0's:

Matrix = [[0 for c in range(5)] for r in range(20)]

You have to change the numbers to the size you want: 20 is the number of lines and 5 is the number of columns.

After that, you can put the numbers in the spot you want, using this:

Matrix[r][c] = 1

Where, again, r is the row and c is the column.

If you want to fill the matrix at the beginning, you can also go for:

Matrix = [ [1, 4, 5, 19, 0], [1, 5, 0, 0, 0], [0, 0, 0, 0, 0],
[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0] ]

And then use two nested for-loops to increase the numbers:

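for i in range(len(Matrix)):
   for j in range(len(Matrix[i])):
      # Skip the zero padding so only the real numbers are increased
      if Matrix[i][j] != 0:
         Matrix[i][j] = Matrix[i][j] + 129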
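
If the zero-padded matrix is used, the padding has to be dropped again when the rows are written back out. A small sketch of that step, assuming 0 never occurs as a real entry (as in the question's padding scheme) and that each number belongs in a 5-character field; matrix_to_lines is a name invented here:

```python
matrix = [
    [1, 4, 5, 19, 0],
    [1, 5, 0, 0, 0],
]


def matrix_to_lines(matrix, width=5):
    """Turn each row into a fixed-width string, leaving out the zero padding."""
    lines = []
    for row in matrix:
        values = [value for value in row if value != 0]
        lines.append("".join(f"{value:{width}d}" for value in values))
    return lines


for line in matrix_to_lines(matrix):
    print(line)
```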

A couple of Tcl solutions. Assume the original text with numbers is in the variable str (see below).

One way to do this is by replacing all numbers with command invocations that do the calculation, and then performing substitution on the string (formatting will be a little off, since the numbers will now be wider but the whitespace remains the same):

set res [subst [regsub -all {\d+} $str {[expr {&+129}]}]]
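
For comparison, a rough Python counterpart of the substitution approach that avoids the width drift: the match includes the spaces in front of each number, and the replacement is padded back to the same width. This is only a sketch; shift_in_place is a made-up name, and a result that needs more characters than its field still widens the line:

```python
import re


def shift_in_place(text, offset):
    """Add offset to every integer while keeping each field's original width."""
    def bump(match):
        field = match.group()  # leading spaces plus the digits
        return f"{int(field) + offset:>{len(field)}d}"
    return re.sub(r" *\d+", bump, text)


print(shift_in_place("    1    4    5   19\n    1    5", 129))
```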

Another way is to split the string up into a matrix of lines and numbers and traverse it:

set res {}
foreach line [split $str \n] {
    foreach n $line {
        append res [format %5s [incr n 129]]
    }
    append res \n
}

The same method using the Tcl 8.6 lmap mapping command:

set res [join [lmap line [split $str \n] {
    join [lmap n $line {
        format %5s [incr n 129]
    }] {}
}] \n]

In both cases, the resulting string will be in the variable res , with the original formatting preserved.

ETA: right-justified output:

set res [join [lmap line [split $str \n] {
    format %25s [join [lmap n $line {
        format %5s [incr n 129]
    }] {}]
}] \n]

The variable str is assigned like this, as plain text (newlines are trimmed off at the ends to avoid empty ghost elements):

set str [string trim {
    1    4    5   19
    1    5
    2    3    6   59
    2    6
    3    2    4   60
    3    4
    4    1    3   61
    4    3
   25   13   23   64   65
   13   18
   14   13   15   75
   14   15
   15   14   16   76
   15   14
   45   44  102  103  104
} \n]

Documentation: append, expr, foreach, format, incr, lmap, regsub, set, split, string, subst

Tcl: requires Tcl 8.6 for lmap

package require Tcl 8.6

# list of strings
set strings {
    {    1    4    5   19}
    {    1    5}
    {    2    3    6   59}
    {    2    6}
    {    3    2    4   60}
    {    3    4}
    {    4    1    3   61}
    {    4    3}
    {   25   13   23   64   65}
    {   13   18}
    {   14   13   15   75}
    {   14   15}
    {   15   14   16   76}
    {   15   14}
    {   45   44  102  103  104}
}

proc incr_all {listvar {n 1}} {
    upvar 1 $listvar lst
    set lst [lmap sublist $lst {lmap elem $sublist {expr {$elem + $n}}}]
}

proc format_strings {lst} {
    join [lmap sublist $lst {format [string repeat {%5s} [llength $sublist]] {*}$sublist}] \n
}

incr_all strings 119
puts [format_strings $strings]

Output:

  120  123  124  138
  120  124
  121  122  125  178
  121  125
  122  121  123  179
  122  123
  123  120  122  180
  123  122
  144  132  142  183  184
  132  137
  133  132  134  194
  133  134
  134  133  135  195
  134  133
  164  163  221  222  223

set incriment 127
set molecules 450
set fout [open "Results.txt" w]
close $fout

proc addevery {filein fileout amount times} {
  set fh [open $filein r]
  set fout [open $fileout a+]
  while {[gets $fh line] != -1} {
    set result {}
    foreach item [regexp -all -inline {.{5}} $line] {
      append result [format %5s [expr {[string trim $item]+($amount*$times)}]]
    }
    puts $fout "CONECT$result"
  }
  close $fh
  close $fout
}

for {set i 0} {$i < $molecules} {incr i} {
    addevery "Connections_.txt" "Results.txt" $incriment $i
}

Thanks to Jerry (https://stackoverflow.com/users/1578604/jerry). It is working, but not optimised yet.
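
One possible optimisation, sketched in Python rather than Tcl: the script above reopens and re-parses the input file for every copy, whereas the rows can be parsed once and every copy derived from them with an offset of increment * step. The function names and the width parameter are assumptions made for this sketch, not part of the script above:

```python
def parse_fields(line, width=5):
    """Read the integers from consecutive width-character columns."""
    chunks = (line[i:i + width] for i in range(0, len(line), width))
    return [int(chunk) for chunk in chunks if chunk.strip()]


def shifted_records(rows, increment, copies, width=5):
    """Yield CONECT records for each copy; copy 0 keeps the original numbers."""
    for step in range(copies):
        offset = increment * step
        for row in rows:
            yield "CONECT" + "".join(f"{n + offset:{width}d}" for n in row)


def write_shifted_copies(src_path, dst_path, increment, copies):
    """Parse the input once, then write all copies in a single pass."""
    with open(src_path) as src:
        rows = [parse_fields(line.rstrip("\n")) for line in src if line.strip()]
    with open(dst_path, "w") as dst:
        for record in shifted_records(rows, increment, copies):
            dst.write(record + "\n")
```

A call such as write_shifted_copies("Connections_.txt", "Results.txt", 127, 450) would then correspond to the loop above, with the input read once and the output opened once.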
