How do I print a specific part of a YAML string?
My YAML database:
left:
  - title: Active Indicative
    fill: "#cb202c"
    groups:
      - "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"
My Python code:
import io
import yaml
with open("C:/Users/colin/Desktop/LBot/latin3_2.yaml", 'r', encoding="utf8") as f:
    doc = yaml.safe_load(f)
txt = doc["left"][1]["groups"][1]
print(txt)
Currently my output is
Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]
but I would like the output to be ō, is, it, or imus. Is this possible in PyYAML, and if so, how would I implement it? Thanks in advance.
I don't have a PyYAML solution, but if you already have the string from the YAML file, you can use Python's re (regular expression) module to extract the text inside the [ ].
import re
txt = "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"
parts = txt.split(" | ")
print(parts)
# ['Present', 'dūc[ō]', 'dūc[is]', 'dūc[it]', 'dūc[imus]', 'dūc[itis]', 'dūc[unt]']
pattern = re.compile("\\[(.*?)\\]")
output = []
for part in parts:
    match = pattern.search(part)
    if match:
        # group(0) is the whole matched part, e.g. [ō]
        # group(1) is the text captured by (.*?), e.g. ō
        output.append(match.group(1))
    else:
        output.append(part)
print(" | ".join(output))
# Present | ō | is | it | imus | itis | unt
The code first splits the text into individual parts, then loops through each part, searching for the pattern [x]. If it finds the pattern, it extracts the text inside the brackets from the match object and stores it in a list. If the part does not match the pattern (e.g. 'Present'), it is added as is.
At the end, all the extracted strings are joined together to rebuild the string without the brackets.
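The same split, search, and join result can also be produced in a single pass with re.sub. This is only a sketch, assuming every bracketed ending is attached to a stem that contains no spaces (as in the example string); strip_stems is a hypothetical helper name, not something from the answer above.

```python
import re

# Assumption: each stem (e.g. "dūc") is a run of non-space characters
# immediately followed by the bracketed ending.
BRACKETED = re.compile(r"\S*\[(.*?)\]")


def strip_stems(line):
    """Replace every 'stem[ending]' in line with just 'ending'."""
    return BRACKETED.sub(r"\1", line)


txt = "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"
print(strip_stems(txt))
# Present | ō | is | it | imus | itis | unt
```

Parts without brackets, such as 'Present', are left untouched because the pattern never matches them.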
EDIT based on comment:
If you just need one of the strings inside the [ ], you can use the same regex pattern but call the findall method on the entire txt instead, which returns a list of matching strings in the order they were found.
import re
txt = "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]"
pattern = re.compile("\\[(.*?)\\]")
matches = pattern.findall(txt)
print(matches)
# ['ō', 'is', 'it', 'imus', 'itis', 'unt']
Then it's just a matter of using some variable to select an item from the list:
selected_idx = 1 # 0-based indexing, so this selects the 2nd item
print(matches[selected_idx])
# is
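To tie this back to the data loaded from the YAML file, the lookup and the regex can be wrapped in a small function. The sketch below rests on assumptions: get_ending is a hypothetical helper, and doc is a hand-written stand-in for what yaml.safe_load(f) would return for the snippet shown in the question. Only that one entry is included, so it sits at index 0 here, whereas the question's code indexes into a longer file.

```python
import re

PATTERN = re.compile(r"\[(.*?)\]")

# Stand-in for the result of yaml.safe_load(f); only the entry shown in the
# question is included, so it lives at index 0 here.
doc = {
    "left": [
        {
            "title": "Active Indicative",
            "fill": "#cb202c",
            "groups": [
                "Present | dūc[ō] | dūc[is] | dūc[it] | dūc[imus] | dūc[itis] | dūc[unt]",
            ],
        }
    ]
}


def get_ending(doc, entry_idx, group_idx, ending_idx):
    """Return one bracketed ending from the chosen entry and group line."""
    line = doc["left"][entry_idx]["groups"][group_idx]
    endings = PATTERN.findall(line)
    return endings[ending_idx]


print(get_ending(doc, 0, 0, 0))  # ō
print(get_ending(doc, 0, 0, 3))  # imus
```

An out-of-range index raises IndexError, so a caller that takes the index from user input may want to check it against the number of endings first.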