
Python extract string starting with index up to character

Say I have an incoming string that varies a little:

  1. " 1 |r|=1.2e10 |v|=2.4e10"
  2. " 12 |r|=-2.3e10 |v|=3.5e-04"
  3. "134 |r|= 3.2e10 |v|=4.3e05"

I need to extract the numbers (i.e. 1.2e10, 3.5e-04, etc.), so I would like to start at the end of '|r|' and grab all characters up to the ' ' (space) after it. Same for '|v|'.

I've been looking for something that would extract a substring from a string, starting at an index and ending on a specific character, but have not found anything remotely close. Ideas?

NOTE: Added a new scenario, which is the one that is causing lots of head-scratching...
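The approach described above (start right after the marker, then read up to the next space) can be sketched with `str.find` and slicing. This is only a minimal sketch: `extract_after` is a hypothetical helper name, and skipping `=` and padding spaces after the marker is an assumption made so that the `|r|= 3.2e10` case also works.

```python
def extract_after(text, marker, stop=" "):
    """Return the characters that follow `marker`, up to the next `stop`.

    Hypothetical helper for illustration; not a standard-library function.
    """
    start = text.find(marker)
    if start == -1:
        return None  # marker not present in the string
    start += len(marker)
    # Assumption: skip the '=' sign and any padding spaces after the marker,
    # so that "|r|= 3.2e10" is handled like "|r|=3.2e10".
    while start < len(text) and text[start] in "= ":
        start += 1
    end = text.find(stop, start)
    if end == -1:
        end = len(text)  # the value runs to the end of the string
    return text[start:end]


lines = [
    "  1 |r|=1.2e10 |v|=2.4e10",
    " 12 |r|=-2.3e10 |v|=3.5e-04",
    "134 |r|= 3.2e10 |v|=4.3e05",
]
for line in lines:
    print(extract_after(line, "|r|"), extract_after(line, "|v|"))
```

The results are still strings; pass them to `float()` if numbers are needed.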

To keep it elegant and generic, let's use split:

  1. First, we split by ' |' into tokens
  2. Then we check whether each token has an equal sign and parse the key-value pair
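sabich = "134 |r|     = 3.2e10 |v|=4.3e05"

# Split on ' |' so each token starts with a key, e.g. 'r|     = 3.2e10'
parts = sabich.split(' |')
values = {}
for p in parts:
    if '=' in p:
        k, v = p.split('=')
        # Drop the leftover '|' and padding from the key; trim spaces from the value
        values[k.replace('|', '').strip()] = v.strip(' ')

# {'r': '3.2e10', 'v': '4.3e05'}
print(values)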

This can be condensed into a one-liner:

sabich = "134 |r|     = 3.2e10 |v|=4.3e05"

values = {t[0].replace('|', '').strip(): t[1].strip(' ') for t in [tuple(p.split('=')) for p in sabich.split(' |') if '=' in p]}

# {'r': '3.2e10', 'v': '4.3e05'}
print(values)

You can solve it with a regular expression.

import re

strings = [
    "  1 |r|=1.2e10 |v|=2.4e10",
    " 12 |r|=-2.3e10 |v|=3.5e-04"
]

out = []
pattern = r'(?P<name>\|[\w]+\|)=(?P<value>-?\d+(?:\.\d*)(?:e-?\d*)?)'
for s in strings:
    out.append(dict(re.findall(pattern, s)))

print(out)

Output

[{'|r|': '1.2e10', '|v|': '2.4e10'}, {'|r|': '-2.3e10', '|v|': '3.5e-04'}]

And if you want to convert the strings to numbers:

# Reuses `re` and `strings` from the snippet above
out = []
pattern = r'(?P<name>\|[\w]+\|)=(?P<value>-?\d+(?:\.\d*)(?:e-?\d*)?)'
for s in strings:
    # findall returns (name, value) tuples; convert each value to float
    out.append({
        name: float(value)
        for name, value in re.findall(pattern, s)
    })

print(out)

Output

[{'|r|': 12000000000.0, '|v|': 24000000000.0}, {'|r|': -23000000000.0, '|v|': 0.00035}]
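Note that the pattern above expects the value to follow `=` directly and to contain a decimal point, so a string like "134 |r|= 3.2e10 |v|=4.3e05" would lose its `|r|` entry. A possible variant, offered as a sketch rather than a definitive fix, allows optional whitespace around `=` and makes the fraction optional. The name `parse_line` and the choice to drop the `|` characters from the keys are assumptions for illustration.

```python
import re

# Assumed variant of the pattern: optional whitespace around '=',
# optional sign, optional fraction, optional exponent.
PATTERN = re.compile(
    r"\|(?P<name>\w+)\|\s*=\s*(?P<value>[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)"
)


def parse_line(line):
    """Map each |name| found in `line` to its value as a float.

    Hypothetical helper; keys are returned without the surrounding '|'.
    """
    return {
        m.group("name"): float(m.group("value"))
        for m in PATTERN.finditer(line)
    }


print(parse_line("134 |r|= 3.2e10 |v|=4.3e05"))
print(parse_line(" 12 |r|=-2.3e10 |v|=3.5e-04"))
```

Using `finditer` with named groups keeps the key and value lookups explicit, at the cost of a slightly longer comprehension than `dict(re.findall(...))`.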
