
How do I print a specific part of a YAML string

My YAML database:

left:
  - title: Active Indicative
    fill: "#cb202c"
    groups:
      - "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"

My Python code:

import yaml

with open("C:/Users/colin/Desktop/LBot/latin3_2.yaml", 'r', encoding="utf8") as f:
    doc = yaml.safe_load(f)
# With the excerpt above, "left" and "groups" each hold one item, so both indices are 0.
txt = doc["left"][0]["groups"][0]
print(txt)

Currently my output is Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt], but I would like the output to be ō, is, it, or imus. Is this possible in PyYAML, and if so, how would I implement it? Thanks in advance.

I don't have a PyYAML solution, but if you already have the string from the YAML file, you can use Python's re (regular expression) module to extract the text inside the [ ].

import re

txt = "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"

parts = txt.split(" | ")
print(parts)  
# ['Present', 'dūc[ō]', 'dūc[is]', 'dūc[it]', 'dūc[imus]', 'dūc[itis]', 'dūc[unt]']

pattern = re.compile(r"\[(.*?)\]")
output = []
for part in parts:
    match = pattern.search(part)
    if match:
        # group(0) is the matched part, ex. [ō]
        # group(1) is the text inside the (.*?), ex. ō
        output.append(match.group(1))
    else:
        output.append(part)

print(" | ".join(output))
# Present | ō | is | it | imus | itis | unt

The code first splits the text into individual parts on " | ", then loops through each part, calling search to look for the pattern [x]. If the pattern is found, the text inside the brackets is taken from the match object and stored in a list. If the part does not match the pattern (e.g. 'Present'), it is added as is.

At the end, all the collected strings are joined with " | " to rebuild the string without the brackets or the stems in front of them.
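If the goal is only to rebuild the string, the split/search/join loop can probably be collapsed into a single re.sub call. This is a minimal sketch, not part of the original answer: the extra \S* in the pattern is my own addition, and it assumes the stem in front of each bracket contains no whitespace.

```python
import re

txt = "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"

# \S* consumes the stem directly before the bracket (e.g. "dūc");
# \1 keeps only the text captured inside the brackets.
stripped = re.sub(r"\S*\[(.*?)\]", r"\1", txt)
print(stripped)
# Present | ō | is | it | imus | itis | unt
```

Parts without brackets, such as 'Present', are left untouched because the pattern never matches them.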


EDIT based on comment:

If you just need one of the strings inside the [ ], you can use the same regex pattern but call the findall method on the entire txt instead. It returns a list of the matching strings in the order in which they were found.

import re

txt = "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"

pattern = re.compile(r"\[(.*?)\]")
matches = pattern.findall(txt)
print(matches) 
# ['ō', 'is', 'it', 'imus', 'itis', 'unt']

Then it's just a matter of using a variable to select an item from the list:

selected_idx = 1  # 0-based indexing, so index 1 selects the 2nd match
print(matches[selected_idx])
# is
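To tie this back to the question, the lookup into the loaded YAML data and the findall step could be wrapped together, roughly as sketched below. The helper name endings and its section/group parameters are hypothetical, and the dict literal is only a stand-in for what yaml.safe_load would be expected to return for the YAML excerpt in the question, so the snippet runs without the file.

```python
import re

# Stand-in for the result of yaml.safe_load(f) on the YAML excerpt in the question.
doc = {
    "left": [
        {
            "title": "Active Indicative",
            "fill": "#cb202c",
            "groups": [
                "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"
            ],
        }
    ]
}

BRACKETED = re.compile(r"\[(.*?)\]")


def endings(doc, section=0, group=0):
    """Hypothetical helper: return the bracketed endings of one group row."""
    row = doc["left"][section]["groups"][group]
    return BRACKETED.findall(row)


found = endings(doc)
print(found)
# ['ō', 'is', 'it', 'imus', 'itis', 'unt']
print(found[0], found[3])
# ō imus
```

PyYAML itself only parses the file into Python lists, dicts, and strings; picking out the bracketed text is ordinary string processing done afterwards.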
