Sorry if this is a rather easy question, but I'm short on time for this script. I've already written some code, shown below:
localNames = re.findall(r"<\*\[local-name\(\)='.*?'.*?\/@\*\[name\(\)='.*?'.*?'\]", str(nontransTagsContent[0]))
for i in localNames:
    tags = re.findall(r"local-name\(\)='(.*?)'", i)
    attributes = re.findall(r"name\(\)='(.*?)'", i)
The result of print(tags) is below:
['tag1']
['tag2', 'tag3', 'tag4']
['tag5', 'tag6']
The result of print(attributes) is below:
['attribute1', 'attribute2', 'attribute3', 'attribute4']
['attribute5', 'attribute6']
['attribute7', 'attribute8', 'attribute9']
The result I want to get is dictionaries like:
{'tag1':['attribute1', 'attribute2', 'attribute3', 'attribute4']}
{'tag2':['attribute5', 'attribute6']}
{'tag3':['attribute5', 'attribute6']}
{'tag4':['attribute5', 'attribute6']}
{'tag5':['attribute7', 'attribute8', 'attribute9']}
{'tag6':['attribute7', 'attribute8', 'attribute9']}
I think this structure would let me manipulate the data easily, since I could extract values and write them out in other forms. Below is the code I tried:
for x in tags:
    dict = zip(tags, attributes)
    print (list(dict))
But the output doesn't seem to be right. Could you take a look and show me how to fix it? Thank you very much!
tags=[
['tag1'],
['tag2', 'tag3', 'tag4'],
['tag5', 'tag6'],
]
attributes=[
['attribute1', 'attribute2', 'attribute3', 'attribute4'],
['attribute5', 'attribute6'],
['attribute7', 'attribute8', 'attribute9'],
]
# Pair every tag in a group with the attribute list at the same index.
for idx, tag_line in enumerate(tags):
    for tag in tag_line:
        print({tag: attributes[idx]})
output:
{'tag1': ['attribute1', 'attribute2', 'attribute3', 'attribute4']}
{'tag2': ['attribute5', 'attribute6']}
{'tag3': ['attribute5', 'attribute6']}
{'tag4': ['attribute5', 'attribute6']}
{'tag5': ['attribute7', 'attribute8', 'attribute9']}
{'tag6': ['attribute7', 'attribute8', 'attribute9']}
If you want one dict per group, holding all of that group's tags:
from itertools import repeat
# Repeat the group's attribute list once per tag, then zip tags to lists.
for tag, attr in zip(tags, attributes):
    print(dict(zip(tag, repeat(attr, len(tag)))))
output:
{'tag1': ['attribute1', 'attribute2', 'attribute3', 'attribute4']}
{'tag2': ['attribute5', 'attribute6'], 'tag3': ['attribute5', 'attribute6'], 'tag4': ['attribute5', 'attribute6']}
{'tag5': ['attribute7', 'attribute8', 'attribute9'], 'tag6': ['attribute7', 'attribute8', 'attribute9']}
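As a possible alternative to zip plus repeat, dict.fromkeys can build the same per-group dictionaries. This is only a minimal sketch, assuming the same tags and attributes lists as above; the name grouped is illustrative and not from the original answer.

```python
tags = [['tag1'], ['tag2', 'tag3', 'tag4'], ['tag5', 'tag6']]
attributes = [
    ['attribute1', 'attribute2', 'attribute3', 'attribute4'],
    ['attribute5', 'attribute6'],
    ['attribute7', 'attribute8', 'attribute9'],
]

# 'grouped' is an illustrative name: one dict per group of tags.
# dict.fromkeys(keys, value) maps every key to the same value object.
grouped = [dict.fromkeys(tag_group, attr)
           for tag_group, attr in zip(tags, attributes)]

for d in grouped:
    print(d)
```

Note that every tag in a group points at the same list object, so changing the list through one key changes it for the others too. The zip/repeat version behaves the same way.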
For the additional request, collect the result of each iteration into lists instead of overwriting it:
tags, attributes = [], []
for i in localNames:
    tags.append(re.findall(r"local-name\(\)='(.*?)'", i))
    attributes.append(re.findall(r"name\(\)='(.*?)'", i))
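One thing that may be worth checking: the text "name()=" also occurs inside "local-name()=", so an unanchored attribute pattern can pick up tag names as well. The sketch below uses a hypothetical input string (the real one comes from nontransTagsContent[0] and may look different) and a negative lookbehind as one possible way to exclude those matches. If your attribute lists already come out clean, your input is shaped differently and this does not apply.

```python
import re

# Hypothetical sample input, made up for illustration only.
sample = ("<*[local-name()='tag2' or local-name()='tag3']"
          "/@*[name()='attribute5' or name()='attribute6']")

tags = re.findall(r"local-name\(\)='(.*?)'", sample)

# Unanchored pattern: also matches the "name()=" inside "local-name()=".
loose = re.findall(r"name\(\)='(.*?)'", sample)

# Negative lookbehind: skip any "name()=" that is preceded by "local-".
strict = re.findall(r"(?<!local-)name\(\)='(.*?)'", sample)

print(tags)
print(loose)
print(strict)
```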
You can do this more easily and clearly if you explicitly create the dictionaries. zip doesn't create a dictionary; it only pairs up items from its arguments into tuples.
tags = [
['tag1'],
['tag2', 'tag3', 'tag4'],
['tag5', 'tag6']
]
attributes = [
['attribute1', 'attribute2', 'attribute3', 'attribute4'],
['attribute5', 'attribute6'],
['attribute7', 'attribute8', 'attribute9']
]
dict_list = []
for t_list, a_list in zip(tags, attributes):
    for t in t_list:
        dict_list.append({t: a_list})
        print(dict_list[-1])
One liner:
guten_tag = { tag: attributes[i] for i, tag_group in enumerate(tags) for tag in tag_group}
Make sure you have a list called tags and a list called attributes, like in the other examples.
tags = [
['tag1'],
['tag2', 'tag3', 'tag4'],
['tag5', 'tag6']
]
attributes = [
['attribute1', 'attribute2', 'attribute3', 'attribute4'],
['attribute5', 'attribute6'],
['attribute7', 'attribute8', 'attribute9']
]
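For reference, here is a self-contained run of the one-liner. Unlike the other answers it builds a single dictionary keyed by tag rather than one dictionary per tag. The second comprehension is an assumption on my part (the name independent is illustrative): it copies each attribute list, in case you don't want tags in the same group to share one list object.

```python
tags = [['tag1'], ['tag2', 'tag3', 'tag4'], ['tag5', 'tag6']]
attributes = [
    ['attribute1', 'attribute2', 'attribute3', 'attribute4'],
    ['attribute5', 'attribute6'],
    ['attribute7', 'attribute8', 'attribute9'],
]

# One dict for everything: each tag maps to its group's attribute list.
guten_tag = {tag: attributes[i]
             for i, tag_group in enumerate(tags)
             for tag in tag_group}

# Illustrative variant: list(attrs) gives every tag its own copy.
independent = {tag: list(attrs)
               for tag_group, attrs in zip(tags, attributes)
               for tag in tag_group}

print(guten_tag['tag3'])
print(len(guten_tag))
```

If the same tag name appears in more than one group, the later group overwrites the earlier one, because dictionary keys are unique.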
Speed comparison:
galaxy_an's answer: 1.05 µs per loop (1,000,000 loops, using the timeit module)
jeremy's answer: 1.21 µs per loop (1,000,000 loops, using the timeit module)
guten_tag (this method): 850 ns per loop (1,000,000 loops, using the timeit module)
where 1 µs is 10^-6 s and 1 ns is 10^-9 s.
On a very superficial level, the one-liner takes roughly 20-30% less time per loop (850 ns versus 1.05 µs and 1.21 µs). That is a modest gain, not orders of magnitude.
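If you want to check timings like these on your own machine, something along these lines should work. It is only a sketch: the function names are made up, the two loop functions are my approximations of the other answers (collecting results instead of printing), and the loop count is reduced to keep the run short. Actual numbers will vary by machine and Python version.

```python
import timeit

tags = [['tag1'], ['tag2', 'tag3', 'tag4'], ['tag5', 'tag6']]
attributes = [
    ['attribute1', 'attribute2', 'attribute3', 'attribute4'],
    ['attribute5', 'attribute6'],
    ['attribute7', 'attribute8', 'attribute9'],
]

def nested_loops():
    # Approximation of the enumerate-based answer, without the print.
    result = []
    for idx, tag_line in enumerate(tags):
        for tag in tag_line:
            result.append({tag: attributes[idx]})
    return result

def zip_loops():
    # Approximation of the zip-based answer, without the print.
    result = []
    for t_list, a_list in zip(tags, attributes):
        for t in t_list:
            result.append({t: a_list})
    return result

def one_liner():
    return {tag: attributes[i]
            for i, tag_group in enumerate(tags)
            for tag in tag_group}

loops = 100_000  # assumption: fewer loops than quoted above, for a quick run
timings = {}
for func in (nested_loops, zip_loops, one_liner):
    total = timeit.timeit(func, number=loops)
    timings[func.__name__] = total / loops  # seconds per call
    print(f"{func.__name__}: {timings[func.__name__] * 1e9:.0f} ns per loop")
```

Keep in mind that the functions do not return the same thing (a list of small dicts versus one dict), so the comparison is only rough.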
You can achieve this by applying nested maps:
# In Python 3, map() is lazy, so wrap each call in list() to get the dicts.
list(map(
    lambda x: list(map(
        lambda y: {y: attributes[x[0]]},
        x[1]
    )),
    enumerate(tags)
))
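The nested maps yield one inner group per tag group, not a flat sequence of dictionaries. If a flat list is what you want, itertools.chain.from_iterable is one way to get it. A minimal sketch, assuming the same tags and attributes lists as in the other answers; the names nested and flat are illustrative.

```python
from itertools import chain

tags = [['tag1'], ['tag2', 'tag3', 'tag4'], ['tag5', 'tag6']]
attributes = [
    ['attribute1', 'attribute2', 'attribute3', 'attribute4'],
    ['attribute5', 'attribute6'],
    ['attribute7', 'attribute8', 'attribute9'],
]

# Outer map walks (index, tag_group) pairs; inner map builds one dict per tag.
nested = map(
    lambda x: map(lambda y: {y: attributes[x[0]]}, x[1]),
    enumerate(tags)
)

# chain.from_iterable consumes the lazy inner maps and flattens them.
flat = list(chain.from_iterable(nested))

for d in flat:
    print(d)
```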