Python print from two dictionaries

I split a dialogue into two dictionaries, each containing the words that one person says (there are two speakers). I have to print four columns (keyword, the count from the first dictionary, i.e. how many times the first person uses that word, the count from the second dictionary, and the total of the two), ordered by keyword. Can somebody help me? The output has to look like this:

african   1  0  1
air-speed 1  0  0
an        1  1  2
arthur    1  0  1
...

As you can see, I have some text:

text = """Bridgekeeper: Hee hee heh. Stop. What... is your name?
King Arthur: It is 'Arthur', King of the Britons.
Bridgekeeper: What... is your quest?
King Arthur: To seek the Holy Grail.
Bridgekeeper: What... is the air-speed velocity of an unladen swallow?
King Arthur: What do you mean? An African or European swallow?"""

Output of bridgekeeper_w and arthur_w:

print (bridgekeeper_w) 

{'hee': 2, 'heh': 1, 'stop': 1, 'what': 3, 'is': 3, 'your': 2, 'name': 1, 'quest': 1, 'the': 1, 'air-speed': 1, 'velocity': 1, 'of': 1, 'an': 1, 'unladen': 1, 'swallow': 1}

print (arthur_w)
{'king': 4, 'it': 1, 'is': 1, 'arthur': 1, 'of': 1, 'the': 2, 'britons': 1, 'to': 1, 'seek': 1, 'holy': 1, 'grail': 1, 'what': 1, 'do': 1, 'you': 1, 'mean': 1, 'an': 1, 'african': 1, 'or': 1, 'european': 1, 'swallow': 1}

Now I need this (keyword, number from the first dict, number from the second dict, and the total):

african   1  0  1
air-speed 1  0  0
an        1  1  2
arthur    1  0  1
...

If you already have two dictionaries, the main problem is how to loop over the keys that are in either dictionary. But that's not hard:

for key in sorted(set(list(bridgekeeper_w.keys()) + list(arthur_w.keys()))):
    b_count = 0 if key not in bridgekeeper_w else bridgekeeper_w[key]
    a_count = 0 if key not in arthur_w else arthur_w[key]
    print('%-20s %3i %3i %3i' % (key, b_count, a_count, b_count+a_count))

If the integrity of the dictionaries is not important, a more elegant solution might be to add the missing keys to one of the dictionaries, and then simply loop over all its keys.

for key in arthur_w.keys():
    if key not in bridgekeeper_w:
        bridgekeeper_w[key] = 0

for key, b_count in sorted(bridgekeeper_w.items()):
    a_count = 0 if key not in arthur_w else arthur_w[key]
    print('%-20s %3i %3i %3i' % (key, b_count, a_count, b_count+a_count))

This does away with the rather tedious and slightly complex set(list(keys()...)) of the first solution, at the cost of traversing one of the dictionaries twice.
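As a variation on the first loop: in Python 3, dictionary key views support set operations directly, and dict.get supplies a default for missing keys. A minimal sketch, assuming Python 3 and using shortened stand-ins for the two dictionaries (the helper name format_rows is made up for illustration):

```python
# Shortened stand-ins for the dictionaries from the question.
bridgekeeper_w = {'what': 3, 'an': 1, 'air-speed': 1}
arthur_w = {'what': 1, 'an': 1, 'african': 1}


def format_rows(first, second):
    """Return one formatted line per keyword, sorted by keyword."""
    rows = []
    # dict.keys() returns a set-like view in Python 3, so | gives the union.
    for key in sorted(first.keys() | second.keys()):
        f_count = first.get(key, 0)   # 0 when the keyword is missing
        s_count = second.get(key, 0)
        rows.append('%-20s %3i %3i %3i' % (key, f_count, s_count, f_count + s_count))
    return rows


for line in format_rows(bridgekeeper_w, arthur_w):
    print(line)
```

Neither dictionary is modified, and each one is only looked up by key rather than traversed a second time.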

There are a few steps to achieve the dataframe below:

  1. Split the string on the '\n' newline character.
  2. Initialize the result as defaultdict(list), then split each row on ':' and use the value at index 0 as the key and the value at index 1 as the value.
  3. Convert the value list for each key back to a string via join.
  4. Remove punctuation.
  5. Use Counter to count the occurrences of each word in the string.

Finally, we'll have a dictionary of Counters like this:

{'Bridgekeeper': Counter({'Hee': 1,
          'hee': 1,
          'heh': 1,
          'Stop': 1,
          'What': 3,
          'is': 3,
          'your': 2,
          'name': 1,
          'quest': 1,
          'the': 1,
          'airspeed': 1,
          'velocity': 1,
          'of': 1,
          'an': 1,
          'unladen': 1,
          'swallow': 1}),

This dictionary can be transformed into the required output very easily if we load it into a dataframe.

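from collections import Counter, defaultdict
import string

import pandas as pd

# Collect each speaker's lines, keyed by the name before the first ':'.
result = defaultdict(list)
for row in text.split('\n'):
    speaker, speech = row.split(':', 1)
    result[speaker.strip()].append(speech.strip())

# Join each speaker's lines and strip punctuation (this also removes the hyphen in "air-speed").
result = {key: ' '.join(value).translate(str.maketrans('', '', string.punctuation)) for key, value in result.items()}
# Count the words per speaker; split() with no argument also copes with repeated spaces.
result = {key: Counter(value.split()) for key, value in result.items()}
# Words a speaker never says become NaN in the DataFrame, so fill them with 0.
df = pd.DataFrame(result).fillna(0).astype(int)
df['sum'] = df['Bridgekeeper'] + df['King Arthur']
df.to_csv('out.csv', sep='\t')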

Output dataframe:

          Bridgekeeper  King Arthur  sum
Hee                  1            0    1
hee                  1            0    1
heh                  1            0    1
Stop                 1            0    1
What                 3            1    4
is                   3            1    4
your                 2            0    2
name                 1            0    1
quest                1            0    1
the                  1            2    3
airspeed             1            0    1
velocity             1            0    1
of                   1            1    2
an                   1            0    1
unladen              1            0    1
swallow              1            1    2
It                   0            1    1
Arthur               0            1    1
King                 0            1    1
Britons              0            1    1
To                   0            1    1
seek                 0            1    1
Holy                 0            1    1
Grail                0            1    1
do                   0            1    1
you                  0            1    1
mean                 0            1    1
An                   0            1    1
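The table above keeps the original capitalisation (so "What" and "An" are counted separately from their lowercase forms) and loses the hyphen in "air-speed". If lowercase, hyphenated keywords sorted alphabetically are wanted, as in the question, something along these lines might work with the standard library only. This is a sketch under assumptions: the helper name count_words and the regular expression for what counts as a word are illustrative choices, and the resulting counts need not match the question's dictionaries exactly, since the question does not show how those were built.

```python
import re
from collections import Counter, defaultdict

text = """Bridgekeeper: Hee hee heh. Stop. What... is your name?
King Arthur: It is 'Arthur', King of the Britons.
Bridgekeeper: What... is your quest?
King Arthur: To seek the Holy Grail.
Bridgekeeper: What... is the air-speed velocity of an unladen swallow?
King Arthur: What do you mean? An African or European swallow?"""


def count_words(dialogue):
    """Map each speaker to a Counter of the lowercase words they say."""
    counts = defaultdict(Counter)
    for row in dialogue.splitlines():
        if not row.strip():
            continue  # skip blank lines
        speaker, _, speech = row.partition(':')
        # Assumed word pattern: letters, optionally joined by hyphens.
        words = re.findall(r"[a-z]+(?:-[a-z]+)*", speech.lower())
        counts[speaker.strip()].update(words)
    return counts


counts = count_words(text)
bridgekeeper, arthur = counts['Bridgekeeper'], counts['King Arthur']
for word in sorted(bridgekeeper.keys() | arthur.keys()):
    # A Counter returns 0 for words it has never seen.
    b, a = bridgekeeper[word], arthur[word]
    print('%-10s %2d %2d %2d' % (word, b, a, b + a))
```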

Or a solution without third-party libraries:

bridgekeeper_d = {'hee': 2, 'heh': 1, 'stop': 1, 'what': 3, 'is': 3, 'your': 2, 'name': 1, 'quest': 1, 'the': 1, 'air-speed': 1, 'velocity': 1, 'of': 1, 'an': 1, 'unladen': 1, 'swallow': 1}
arthur_d = {'king': 4, 'it': 1, 'is': 1, 'arthur': 1, 'of': 1, 'the': 2, 'britons': 1, 'to': 1, 'seek': 1, 'holy': 1, 'grail': 1, 'what': 1, 'do': 1, 'you': 1, 'mean': 1, 'an': 1, 'african': 1, 'or': 1, 'european': 1, 'swallow': 1}
# Give every keyword its own inner dict, starting both counts at 0.
# (dict.fromkeys(keys, {}) would share one dict between all keys.)
joined = {key: {"bridgekeeper": 0, "arthur": 0} for key in list(bridgekeeper_d) + list(arthur_d)}

for key, value in bridgekeeper_d.items():
    joined[key]["bridgekeeper"] = value

for key, value in arthur_d.items():
    joined[key]["arthur"] = value
# At this point, joined looks like this:
# {
#     'hee': {'bridgekeeper': 2, 'arthur': 0},
#     'heh': {'bridgekeeper': 1, 'arthur': 0},
#     'stop': {'bridgekeeper': 1, 'arthur': 0},
#     'what': {'bridgekeeper': 3, 'arthur': 1}
#     ...
# }

for key, dic in joined.items():
    print("%-15s %d %d %d" % (key, dic["bridgekeeper"], dic["arthur"], dic["bridgekeeper"] + dic["arthur"]))

Output:

hee             2 0 2
heh             1 0 1
stop            1 0 1
what            3 1 4
is              3 1 4
your            2 0 2
name            1 0 1
quest           1 0 1
the             1 2 3
air-speed       1 0 1
velocity        1 0 1
of              1 1 2
an              1 1 2
unladen         1 0 1
swallow         1 1 2
king            0 4 4
it              0 1 1
arthur          0 1 1
britons         0 1 1
to              0 1 1
seek            0 1 1
holy            0 1 1
grail           0 1 1
do              0 1 1
you             0 1 1
mean            0 1 1
african         0 1 1
or              0 1 1
european        0 1 1
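One thing to watch for when building a nested dictionary such as joined: dict.fromkeys(keys, {}) stores the same dict object under every key, so a write through one key is visible through all of them and every row ends up with identical counts. A small illustration with two made-up keys:

```python
keys = ['hee', 'what']

# dict.fromkeys stores the SAME dict object under every key.
shared = dict.fromkeys(keys, {})
shared['hee']['bridgekeeper'] = 2
print(shared['what'])                   # the write shows up under 'what' too
print(shared['hee'] is shared['what'])  # both keys point at one object

# A dict comprehension evaluates {} once per key, so each value is separate.
separate = {key: {} for key in keys}
separate['hee']['bridgekeeper'] = 2
print(separate['what'])                 # unaffected by the write to 'hee'
```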
