
Python Dictionary Mapping Not Working as Expected

I have a text file that contains the following:

<lang:Foreign> <lang:foreign>
</lang:Foreign> </lang:foreign>
<lang: Foreign> <lang:foreign>
</lang: Foreign> </lang:foreign>

My program maps the first tag on each line to the second, so an entry in the dictionary should look like this:

{<lang:Foreign> : <lang:foreign>}

flist = [line.split() for line in f]
for k, v in flist:
    fdict.update({k: v})

My mapping code is above. The problem is the last two entries in the file:

<lang: Foreign> <lang:foreign>
</lang: Foreign> </lang:foreign>

In these lines the first tag contains a space, so line.split() breaks it into <lang: and Foreign>. I want to tell the program that the first entry contains a space. I have tried the following:

<lang:\sForeign> <lang:foreign>
</lang:\sForeign> </lang:foreign>

Any idea how I can tell my program to accept this space and map it properly? Thanks!

Just use a different split argument. This should work for you:

line.split(' <')
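
One caveat with this approach: str.split drops the separator, so the second element comes back without its leading '<', and it also keeps the trailing newline. Below is a minimal sketch of one way to handle that, assuming every line holds exactly two tags separated by ' <'. The names sample_lines and fdict are illustrative only.

```python
# Illustrative stand-in for the lines read from the file in the question.
sample_lines = [
    "<lang:Foreign> <lang:foreign>\n",
    "</lang:Foreign> </lang:foreign>\n",
    "<lang: Foreign> <lang:foreign>\n",
    "</lang: Foreign> </lang:foreign>\n",
]

fdict = {}
for line in sample_lines:
    # Split once on the " <" that starts the second tag.
    key, rest = line.rstrip("\n").split(" <", 1)
    # split() consumed the "<", so put it back on the value.
    fdict[key] = "<" + rest

print(fdict["<lang: Foreign>"])
```

Limiting the split to one occurrence (the second argument) keeps the key intact even if it contains other spaces.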

I would suggest using a regex. The following pattern returns, for each line, a list of the substrings enclosed in '<>'.

    import re

    pattern = re.compile(r'<.*?>')  # non-greedy, so each match ends at the first '>'
    fdict = {}
    for line in f:  # f is the open file from the question
        flist = pattern.findall(line)  # e.g. ['<lang:Foreign>', '<lang:foreign>']
        if len(flist) == 2:
            fdict[flist[0]] = flist[1]
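
As a rough end-to-end sketch of the findall approach, the snippet below runs it over the four sample lines from the question. io.StringIO is used here only as a stand-in for the real file, and the variable names are assumptions made for illustration.

```python
import io
import re

# Stand-in for the text file in the question (assumption for the demo).
f = io.StringIO(
    "<lang:Foreign> <lang:foreign>\n"
    "</lang:Foreign> </lang:foreign>\n"
    "<lang: Foreign> <lang:foreign>\n"
    "</lang: Foreign> </lang:foreign>\n"
)

pattern = re.compile(r"<.*?>")  # non-greedy: each match stops at the first ">"

fdict = {}
for line in f:
    tags = pattern.findall(line)
    if len(tags) == 2:  # ignore blank or malformed lines
        fdict[tags[0]] = tags[1]

for key, value in fdict.items():
    print(repr(key), "->", repr(value))
```

Because the pattern matches everything between '<' and the next '>', the space inside '<lang: Foreign>' stays part of the key.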

I would suggest splitting on "> <" and then adding the ">" and "<" back to the first and second elements of the resulting list. Something like this:

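arr = line.strip().split('> <')  # strip() drops the trailing newline
arr[0] = arr[0] + '>'  # restore the '>' consumed by the split
arr[1] = '<' + arr[1]  # restore the '<' consumed by the split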

Using regular expressions probably makes the most sense here.

import re

pattern = re.compile(r'(<.*?>)\s*(<.*?>)')

fdict = {}
for line in f:  # f is the open file from the question
    match = pattern.search(line)
    if match:  # skip lines that do not contain two tags
        k, v = match.groups()
        fdict[k] = v
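
One detail worth checking with a pattern like this: when a pattern contains two capture groups, re.findall returns a list of tuples (one tuple per match) rather than a flat pair, so the findall result for a single line cannot be unpacked directly as k, v. Here is a small sketch of the difference, using a sample line from the question. search(...).groups() is shown as one possible way to get a single pair per line.

```python
import re

pattern = re.compile(r"(<.*?>)\s*(<.*?>)")
line = "<lang: Foreign> <lang:foreign>\n"

# findall with two groups: a list containing one 2-tuple per match.
print(pattern.findall(line))
# [('<lang: Foreign>', '<lang:foreign>')]

# search + groups: a single (key, value) tuple, or None if nothing matched.
match = pattern.search(line)
if match:
    k, v = match.groups()
    print(k, "->", v)
```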
