Python - iterating through list and dictionary to get a nested list output
I have a dictionary mydict which contains filenames as keys and the text within each file as values.
I am extracting a list of words from the text in each file. The words to search for are stored in a list mywords.
I have tried the following.
import re

mydict = {'File1': 'some text. \n Foo extract this. \n Bar extract this',
          'File2': 'more text. \n Bar extract this too.'}
mywords = ['Foo', 'Bar']
mylist = []
for k, v in mydict.items():
    for word in mywords:
        extracted = re.findall('^ ' + word + ".*", v, flags=re.IGNORECASE | re.MULTILINE)
        # Appends one list per (file, word) pair directly to mylist
        mylist.append(extracted[:1])
This gives me:
[[' Foo extract this. '],
[' Bar extract this'],
[],
[' Bar extract this too.']]
However, I want the output to have 2 nested lists (one for each file) instead of a separate list each time it searches for a word in a file.
Desired output:
[[' Foo extract this. '], [' Bar extract this']],
[[], [' Bar extract this too.']]
You might want to try making sublists and appending them to your list instead. Here's a possible solution:
import re

mydict = {'File1': 'some text. \n Foo extract this. \n Bar extract this',
          'File2': 'more text. \n Bar extract this too.'}
mywords = ['Foo', 'Bar']
mylist = []
for k, v in mydict.items():
    sublist = []  # One sublist per file
    for word in mywords:
        extracted = re.findall('^ ' + word + ".*", v, flags=re.IGNORECASE | re.MULTILINE)
        sublist.append(extracted[:1])
    mylist.append(sublist)
This outputs:
[[[' Foo extract this. '], [' Bar extract this']], [[], [' Bar extract this too.']]]
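The same nesting can also be written as a nested list comprehension. This is only a sketch of an equivalent form, assuming Python 3.7+ so that the dictionary keeps its insertion order; the name text is just an illustrative loop variable.

```python
import re

mydict = {'File1': 'some text. \n Foo extract this. \n Bar extract this',
          'File2': 'more text. \n Bar extract this too.'}
mywords = ['Foo', 'Bar']

# Outer comprehension: one inner list per file.
# Inner comprehension: one (possibly empty) single-item list per word.
mylist = [
    [re.findall('^ ' + word + '.*', text, flags=re.IGNORECASE | re.MULTILINE)[:1]
     for word in mywords]
    for text in mydict.values()
]
print(mylist)
```

Whether this reads better than the explicit loops is a matter of taste; the explicit version is easier to extend with extra conditions.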
If you want the strings without the surrounding list, append the first result only if there is a result:
import re

mydict = {'File1': 'some text. \n Foo extract this. \n Bar extract this',
          'File2': 'more text. \n Bar extract this too.'}
mywords = ['Foo', 'Bar']
mylist = []
for k, v in mydict.items():
    sublist = []
    for word in mywords:
        extracted = re.findall('^ ' + word + ".*", v, flags=re.IGNORECASE | re.MULTILINE)
        if extracted:  # Checks if there is at least one element in the list
            sublist.append(extracted[0])
    mylist.append(sublist)
This outputs:
[[' Foo extract this. ', ' Bar extract this'], [' Bar extract this too.']]
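When only the first match per word is needed, re.search stops at the first hit instead of collecting every match. The sketch below is one possible variant, not part of the original answer: the helper name first_matches is hypothetical, and the re.escape call is an added assumption that the search words should be treated as literal text rather than as regex patterns.

```python
import re

def first_matches(text, words):
    """Return the first matching line for each word, skipping words with no match."""
    found = []
    for word in words:
        # re.escape is an addition here: it makes characters such as '.' or '+'
        # in a search word match literally.
        match = re.search('^ ' + re.escape(word) + '.*', text,
                          flags=re.IGNORECASE | re.MULTILINE)
        if match:
            found.append(match.group(0))
    return found

mydict = {'File1': 'some text. \n Foo extract this. \n Bar extract this',
          'File2': 'more text. \n Bar extract this too.'}
mywords = ['Foo', 'Bar']

mylist = [first_matches(text, mywords) for text in mydict.values()]
print(mylist)
```

For the sample data this produces the same nested list as the loop above.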
If you want to be able to get several results from each file, you can do as follows (note that I put another match for Bar in the second file):
import re

mydict = {'File1': 'some text. \n Foo extract this. \n Bar extract this',
          'File2': 'more text. \n Bar extract this too. \n Bar extract this one as well'}
mywords = ['Foo', 'Bar']
mylist = []
for k, v in mydict.items():
    sublist = []
    for word in mywords:
        extracted = re.findall('^ ' + word + ".*", v, flags=re.IGNORECASE | re.MULTILINE)
        if extracted:
            sublist += extracted  # Keep every match, not just the first
    mylist.append(sublist)
This outputs:
[[' Foo extract this. ', ' Bar extract this'], [' Bar extract this too. ', ' Bar extract this one as well']]
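The loops above unpack the filename k but never use it, so the nested list only tells you which file a result came from by position. If that matters, one option is to key the results by filename instead. This is a sketch that goes slightly beyond the original question; the names results, filename and text are illustrative only.

```python
import re

mydict = {'File1': 'some text. \n Foo extract this. \n Bar extract this',
          'File2': 'more text. \n Bar extract this too. \n Bar extract this one as well'}
mywords = ['Foo', 'Bar']

results = {}
for filename, text in mydict.items():
    matches = []
    for word in mywords:
        # extend() adds nothing when findall returns an empty list,
        # so no explicit emptiness check is needed.
        matches.extend(re.findall('^ ' + word + '.*', text,
                                  flags=re.IGNORECASE | re.MULTILINE))
    results[filename] = matches
print(results)
```

list(results.values()) gives back the same nested list as before.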