Getting specific information from a string

I have the same problem as the one described in this question:

partition string in python and get value of last segment after colon

My string looks like this:

IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789

I only want to get the device, so my output should look like "Fritzbox". I don't need anything else.

result = mystring.rpartition(':')[2]

Is this possible with this kind of code? If yes, what do I have to change to cut off the rest?

You can use re.split here and use the result to create a dictionary. That way you can access any key you want, e.g.:

import re

text = 'IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789 Description: something or other here test: 5'
split = re.split(r'\s*(\S+):\s+', text)
data = dict(zip(split[1::2], split[2::2]))

This gives you a data dictionary of:

{'IP-Adress': '1.1.1.1',
 'Device': 'Fritzbox',
 'Serialnumber': '123456789',
 'Description': 'something or other here',
 'test': '5'}

Then access it however you want, e.g.:

device = data.get('Device', '***No Device Found???***')

This way you get access to all key/value pairs should you ever want them, and it relies neither on the order of the keys nor on their actual presence in your text.
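If the same parsing is needed in several places, the steps above could be wrapped in a small helper. This is only a sketch: the name parse_fields is made up for illustration, and it assumes that keys contain no whitespace and are always followed by a colon and at least one space.

```python
import re

def parse_fields(text):
    """Turn 'Key: value Key: value ...' text into a dict (hypothetical helper)."""
    # Because the key is inside a capture group, re.split keeps it in the
    # result, which then alternates: leading text, key, value, key, value, ...
    parts = re.split(r'\s*(\S+):\s+', text)
    return dict(zip(parts[1::2], parts[2::2]))

fields = parse_fields('IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789')
print(fields.get('Device'))  # Fritzbox
```

Text without any "Key: " pairs simply yields an empty dictionary, so .get() with a default stays safe.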

Assuming 'Device:' is always present, the following regular expression should work for you:

s = 'IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789'

import re
re.search(r'Device:\s*(\w+)', s).group(1)
# 'Fritzbox'
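If that assumption does not hold, re.search returns None and calling .group(1) on it raises an AttributeError. A guarded variant might look like the sketch below; the function name get_device and its default parameter are hypothetical, not part of the answer above.

```python
import re

def get_device(s, default=None):
    # re.search returns None when 'Device:' is absent, so test the match
    # object before calling .group() on it.
    match = re.search(r'Device:\s*(\w+)', s)
    return match.group(1) if match else default

print(get_device('IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789'))  # Fritzbox
print(get_device('IP-Adress: 1.1.1.1'))  # None
```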

If you prefer string methods, you could instead do something like:

s.split(':')[-2].strip().split()[0]
# 'Fritzbox'
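To stay closer to the partition-style call from the question, one option is to partition on the 'Device:' label rather than on the colon alone. This is a minimal sketch, assuming the device name is a single word with no spaces; the variable names are illustrative.

```python
s = 'IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789'

# partition returns (before, separator, after); index 2 is everything
# that follows the first 'Device:' (an empty string if it is not found).
after_label = s.partition('Device:')[2]

# The device name is the first whitespace-separated token of the remainder.
words = after_label.split()
device = words[0] if words else None
print(device)  # Fritzbox
```

Unlike the split(':')[-2] version, this does not depend on how many other fields follow the device.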

Assuming 'Device:' and 'Serialnumber' are always present:

s = 'IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789'

def GetInBetween(s, st, ed):
  return (s.split(st))[1].split(ed)[0]

print(GetInBetween(s, 'Device:', 'Serialnumber').strip())

Output:

Fritzbox

Edit:

If you have a list of those strings:

sList = ['IP-Adress: 1.2.2.2 Device: Fritzbox Serialnumber: 123456789',
         'IP-Adress: 1.3.4.3 Device: Macin Serialnumber: 123456789',
         'IP-Adress: 1.123.12.11 Device: IBM Serialnumber: 123456789',
         ]

# Note: the function name must match the definition exactly (GetInBetween)
for elem in sList:
    print(GetInBetween(elem, 'Device:', 'Serialnumber').strip())

Or, using a list comprehension:

print([GetInBetween(x, 'Device:', 'Serialnumber').strip() for x in sList])

Output:

['Fritzbox', 'Macin', 'IBM']
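The split-based helper raises an IndexError when the start marker is missing from a string. If some entries in the list might lack 'Device:', a more forgiving variant could be sketched with str.partition; the name get_in_between and the None fallback are assumptions made for illustration.

```python
def get_in_between(s, start, end):
    # partition returns (before, separator, after); the separator is an
    # empty string when `start` does not occur in `s`.
    _, found, rest = s.partition(start)
    if not found:
        return None
    # If `end` is missing, partition leaves the whole remainder in slot 0.
    return rest.partition(end)[0].strip()

lines = ['IP-Adress: 1.2.2.2 Device: Fritzbox Serialnumber: 123456789',
         'IP-Adress: 1.3.4.3 Device: Macin Serialnumber: 123456789',
         'IP-Adress: 1.123.12.11 Serialnumber: 123456789']

devices = [get_in_between(x, 'Device:', 'Serialnumber') for x in lines]
print(devices)  # ['Fritzbox', 'Macin', None]
```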

Using the pygrok Python package, we can extract data from the string in a structured format. pygrok is a Python library to parse strings and extract information from structured/unstructured data:

https://pypi.org/project/pygrok/

Install it with:

pip install pygrok

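from pygrok import Grok

text = 'IP-Adress: 1.1.1.1 Device: Fritzbox Serialnumber: 123456789'
# %{WORD:device} captures a single word under the key 'device'.
# The IP address and serial number are hard-coded here, so this
# pattern only matches lines with exactly these values.
pattern = 'IP-Adress: 1.1.1.1 Device: %{WORD:device} Serialnumber: 123456789'
grok = Grok(pattern)
# match() returns a dict of the named captures, or None if nothing matches
print(grok.match(text))
# Output:
# {'device': 'Fritzbox'}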
