Find string and replace the next few lines with something

I am writing a Python script that will ask for a file and a name (e.g. "John").

The file contains a whole bunch of lines like this:

...
Name=John
Age=30
Pay=1000
Married=1
Name=Bob
Age=25
Pay=500
Married=0
Name=John
Age=56
Pay=3000
Married=1
...

I want to open this file, ask the user for a name, and replace the Pay value for all entries that match that name. So, for example, if the user inputs "John", I want to change the Pay for all "John"s to, say, 5000. The Pay values for other names don't change.

So far, I've opened the file and concatenated everything into one long string to make things a bit easier:

for line in file:
    file_string += line

At first, I was thinking about some sort of string replace, but that didn't pan out: I would search for "John", but I don't want to replace the "John" itself, rather the Pay value that is two lines down.

I started using regex instead and came up with something like this:

# non-greedy matching
re.findall("Name=(.*?)\nAge=(.*?)\nPay=(.*?)\n", file_string, re.S)

Okay, so that spits out a list of 3-tuples of those groupings, and it does seem to find everything fine. Now, to do the actual replacement...

I read in another question here on StackOverflow that I can set the name of a grouping and use that grouping later on...:

re.sub(r'Name=(.*?)\nAge=(.*?)\nPay=', r'5000', file_string, re.S)

I tried that to see if it would work and replace all Names with 5000, but it didn't. If it had, I would probably do a check on the first group to see if it matched the user-inputted name or something.

The other problem is that I read in the Python docs that re.sub only replaces the left-most occurrence. I want to replace all occurrences. How do I do that?

Now I am at a bit of a loss as to what to do, so if anyone can help me that would be great!
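A note on the two re.sub points raised above, with a sketch. In the re.sub(pattern, repl, string, count=0, flags=0) signature the fourth positional argument is count, so the re.S in the attempt above is read as a replacement limit, not as a flag; flags need to be passed as flags=re.S. With the default count=0, re.sub replaces every non-overlapping match, not only the first. The replacement also substitutes the whole match, so r'5000' would overwrite the Name and Age lines too. One possible way around both issues is to pass a function as the replacement and check the name inside it. The names set_pay and sample below are made up for illustration, and the pattern assumes the Name/Age/Pay order shown in the question.

```python
import re

def set_pay(file_string, name, new_pay):
    """Return file_string with Pay replaced in every record whose Name equals name."""
    # Group 1 is everything up to and including "Pay=", group 2 is the name.
    # Without re.S, "." stops at a newline, so each ".*?" stays on its own line.
    pattern = r"(Name=(.*?)\nAge=.*?\nPay=).*?\n"

    def replace(match):
        if match.group(2) == name:
            return match.group(1) + str(new_pay) + "\n"
        return match.group(0)  # records for other names are left untouched

    # count defaults to 0, which means "replace all matches".
    return re.sub(pattern, replace, file_string)

# Hypothetical sample data in the same layout as the question's file.
sample = "Name=John\nAge=30\nPay=1000\nMarried=1\nName=Bob\nAge=25\nPay=500\nMarried=0\n"
updated = set_pay(sample, "John", 5000)
print(updated)
```

Comparing match.group(2) with the user's input inside the function is the "check on the first group" idea mentioned above, done per match.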

Iterate 4 lines at a time. If the first line contains 'John', edit the line that comes two after it.

data = """
Name=John
Age=30
Pay=1000
Married=1
Name=Bob
Age=25
Pay=500
Married=0
Name=John
Age=56
Pay=3000
Married=1
"""

lines = data.split() 
for i, value in enumerate(zip(*[iter(lines)]*4)):
    if 'John' in value[0]:
        lines[i*4 + 2] = "Pay=5000"

print('\n'.join(lines))
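The zip(*[iter(lines)]*4) expression in the loop above is compact, so here is a small sketch of what it does, written as a helper. The name grouped is made up for this illustration. The point is that the list holds the same iterator object four times, so zip draws four successive items from it for each tuple.

```python
def grouped(items, size):
    """Return consecutive tuples of `size` items; leftover items are dropped."""
    iterator = iter(items)
    # [iterator] * size repeats ONE iterator object, not copies of it,
    # so zip consumes `size` successive items to build each tuple.
    return zip(*[iterator] * size)

lines = ["Name=John", "Age=30", "Pay=1000", "Married=1",
         "Name=Bob", "Age=25", "Pay=500", "Married=0"]

records = list(grouped(lines, 4))
for record in records:
    print(record)
```

Because zip stops at the shortest input, a trailing record with fewer than four lines would be silently skipped, which is worth keeping in mind if the file can be incomplete.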

I don't think that regex is the best solution to this problem. I prefer more general solutions. The other answers depend on one or more of the following assumptions:

  1. There are always 4 properties for a person.
  2. Every person has the same properties.
  3. The properties are always in the same order.

If these are true in your case, then regex could be OK.

My solution is more verbose, but it doesn't depend on these. It handles mixed or missing properties and mixed order, and it can get and set any property value. You could even extend it a little to support inserting new properties or persons if you need to.

My code:

# I omitted "data = your string" here

def data_value(person_name, prop_name, new_value=None):
    global data
    start_person = data.find("Name=" + person_name + "\n")
    while start_person != -1:
        # a record runs until the next "Name=" line, or to the end of the data
        end_person = data.find("Name=", start_person + 1)
        if end_person == -1:
            end_person = len(data)
        start_value = data.find(prop_name + "=", start_person, end_person)
        if start_value != -1:
            start_value += len(prop_name) + 1
            end_value = data.find("\n", start_value, end_person)
            if end_value == -1:
                end_value = end_person
            if new_value is None:
                return data[start_value:end_value]
            new_text = str(new_value)
            data = data[:start_value] + new_text + data[end_value:]
            # keep end_person valid after the string changed length
            end_person += len(new_text) - (end_value - start_value)
        start_person = data.find("Name=" + person_name + "\n", end_person)
    return None

print(data_value("Mark", "Pay"))    # Output: None (missing person)
print(data_value("Bob", "Weight"))  # Output: None (missing property)
print(data_value("Bob", "Pay"))     # Output: 500 (current value)
data_value("Bob", "Pay", 1234)      # (change it)
print(data_value("Bob", "Pay"))     # Output: 1234 (new value)

data_value("John", "Pay", 555)      # (change it in both Johns)
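Since the question starts from a file, here is a sketch of the same order-independent idea applied line by line, without a global string. It is an alternative under stated assumptions, not the code above: each record is assumed to begin with a Name= line, and the names replace_pay, replace_pay_in_file and path are made up for illustration.

```python
def replace_pay(lines, name, new_pay):
    """Return a new list of lines with Pay replaced in every record named `name`."""
    result = []
    current_name = None  # name of the record currently being scanned
    for line in lines:
        stripped = line.rstrip("\n")
        key, _, value = stripped.partition("=")
        if key == "Name":
            current_name = value
        elif key == "Pay" and current_name == name:
            ending = line[len(stripped):]  # keep the original line ending, if any
            line = "Pay={}{}".format(new_pay, ending)
        result.append(line)
    return result


def replace_pay_in_file(path, name, new_pay):
    # Read everything first, then overwrite the same file with the result.
    with open(path) as f:
        lines = f.readlines()
    with open(path, "w") as f:
        f.writelines(replace_pay(lines, name, new_pay))


# Properties appear in a different order and number per person on purpose.
example = ["Name=John\n", "Pay=1000\n", "Age=30\n",
           "Name=Bob\n", "Pay=500\n",
           "Name=John\n", "Married=1\n", "Pay=3000\n"]
changed = replace_pay(example, "John", 5000)
print("".join(changed))
```

Tracking only the most recent Name= line means the position of Pay inside a record does not matter, and a record without a Pay line is simply left as it is.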

The following code will do what you need:

import re

text = """
Name=John
Age=30
Pay=1000
Married=1
Name=Bob
Age=25
Pay=500
Married=0
Name=John
Age=56
Pay=3000
Married=1
"""

# the name you're looking for
name = "John"
# the new payment
pay = 500

# the name is captured as group 1 so that \1 and \2 put the name and age back unchanged
print(re.sub(r'Name=({0})\nAge=(.+?)\nPay=(.+?)\n'.format(re.escape(name)), r'Name=\1\nAge=\2\nPay={0}\n'.format(pay), text))
