Make a script in Python that lists adjacent words through Unix?

How can I write a script in Python, using nested dictionaries, that takes a txt file written as

white,black,green,purple,lavendar:1

red,black,white,silver:3

black,white,magenta,scarlet:4

and prints, for each entry before the : character, all the neighbors it showed up next to:

white: black silver magenta

black: white green red 

green: black purple

and so on.

Edit: Well, I didn't post what I have because it is rather insubstantial. I'll update the question if I figure out anything else. I have been stuck for a while: all I have figured out how to do is print each word on a separate line with:

from sys import argv

script, filename = argv
with open(filename) as txt:
    for line in txt:
        line = line[:line.index(':')]   # keep everything left of ':'
        for word in line.split(','):
            print(word)

I guess what I want is some kind of for loop that runs through each word: if the word is not already in a dictionary, I'll add it, and then I'll search the file for the words that appear next to it.

Input

a,c,f,g,hi,lw:1

f,g,j,ew,f,h,a,w:3

fd,s,f,g,s:4

Code

import pprint

neighbours = {}

with open('4-input.txt') as input_file:
    for line in input_file:
        line = line.strip()
        if not line:
            continue    # skip empty input lines

        line = line[:line.index(':')]   # take everything left of ':'

        previous_token = ''
        for token in line.split(','):
            if previous_token:
                # adjacency is symmetric, so record it in both directions
                neighbours.setdefault(previous_token, []).append(token)
                neighbours.setdefault(token, []).append(previous_token)
            previous_token = token

# print once, after the whole file has been read
pprint.pprint(neighbours)

Output

{'a': ['c', 'h', 'w'],
 'c': ['a', 'f'],
 'ew': ['j', 'f'],
 'f': ['c', 'g', 'g', 'ew', 'h', 's', 'g'],
 'fd': ['s'],
 'g': ['f', 'hi', 'f', 'j', 'f', 's'],
 'h': ['f', 'a'],
 'hi': ['g', 'lw'],
 'j': ['g', 'ew'],
 'lw': ['hi'],
 's': ['fd', 'f', 'g'],
 'w': ['a']}

Tidying up the pretty-printed dictionary is left as an exercise for the reader (dictionaries are not sorted into any particular order, and removing the duplicates without changing the ordering of the lists is also annoying).

Easy solution:

for word, neighbour_list in neighbours.items():
    # set() drops the duplicates but does not keep the list order
    print('%s : %s' % (word, ', '.join(set(neighbour_list))))

But that does change the ordering.
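If the original ordering matters, one possible way to drop the duplicates is to remember which neighbours have already been seen. This is only a minimal sketch, written for Python 3: the helper name unique_in_order is made up for illustration, and the neighbours dictionary here is a small hand-written sample in the same shape as the one built above, not the full result.

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Hand-written sample (an assumption for this sketch), shaped like `neighbours`.
neighbours = {
    'f': ['c', 'g', 'g', 'ew', 'h', 's', 'g'],
    'g': ['f', 'hi', 'f', 'j', 'f', 's'],
}

# sorted() gives the words a stable order; each list keeps its own order.
for word in sorted(neighbours):
    print('%s: %s' % (word, ', '.join(unique_in_order(neighbours[word]))))
```

The set is only used for the membership check, so the result list still reflects the order in which each neighbour first appeared.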

Here you go:

from collections import defaultdict

char_map = defaultdict(set)
with open('input', 'r') as input_file:
    for line in input_file:
        if not line.strip():
            continue  # Skip blank lines
        a_list = line.split(':')[0]  # Discard the stuff after the :
        chars = a_list.strip().split(',')  # Get the elements before : as a list
        # Every adjacent pair is a neighbour relation in both directions,
        # so the last element also gets its left-hand neighbour recorded.
        for char, next_char in zip(chars, chars[1:]):
            char_map[char].add(next_char)
            char_map[next_char].add(char)

print(char_map)

def parse(input_file):
    char_neighbours = {}
    with open(input_file) as f:
        for line in f:
            line = line.strip().split(':')[0]   # keep everything left of ':'
            if line != "":
                csv_list = line.split(',')
                for i in range(len(csv_list)):
                    # make sure every token has an entry, including the last one
                    current = char_neighbours.setdefault(csv_list[i], [])
                    # right-hand neighbour, if there is one
                    if i + 1 < len(csv_list) and csv_list[i + 1] not in current:
                        current.append(csv_list[i + 1])
                    # left-hand neighbour, if there is one
                    if i > 0 and csv_list[i - 1] not in current:
                        current.append(csv_list[i - 1])
    return char_neighbours


if __name__ == "__main__":
    dictionary = parse('test.txt')
    print(dictionary)

The parse function returns a dictionary that maps each string to a list of its neighbours.
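To get from such a dictionary to the "word: neighbour neighbour" lines asked for in the question, something along these lines might work. It is a rough sketch for Python 3: format_neighbours is a hypothetical helper name, and sample is a hand-written dictionary in the shape described above rather than the result of actually reading a file.

```python
def format_neighbours(char_neighbours):
    """Build one 'word: neighbour neighbour ...' line per dictionary entry."""
    lines = []
    for word, neighbour_list in char_neighbours.items():
        lines.append('%s: %s' % (word, ' '.join(neighbour_list)))
    return lines


# Hand-written sample (an assumption for this sketch), shaped like the
# dictionary of neighbour lists described above.
sample = {
    'white': ['black', 'silver', 'magenta'],
    'green': ['purple', 'black'],
}

for line in format_neighbours(sample):
    print(line)
```

Returning the lines instead of printing them directly keeps the formatting step easy to check on its own; the caller decides whether to print them or write them to a file.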
