简体   繁体   中英

Make a script in python that lists adjacent words through Unix?

How can I write a script in python through nested dictionaries that takes a txt file written as,

white,black,green,purple,lavendar:1

red,black,white,silver:3

black,white,magenta,scarlet:4

and make it print for each entry before the : character, all neighbors it showed up next to

white: black silver magenta

black: white green red 

green: black purple

and so on

Edit: Well, I didn't post what I have because it is rather unsubstantial...I'll update it if I figure out anything else... I just have been stuck for a while - all I have figured out how to do is post each word/letter on a separate line with:

from sys import argv
script,filename=argv
txt=open(filename)
for line in txt:
    line=line[0:line.index(';')]
    for word in line.split(","):
        print word

I guess what I want is to have some kind of for loop that runs through each word, if the word is not in an original dictionary, I'll add it to it, then I'll search through for words that appear next to it in the file.

Input

a,c,f,g,hi,lw:1

f,g,j,ew,f,h,a,w:3

fd,s,f,g,s:4

Code

neighbours = {}

for line in file('4-input.txt'):
    line = line.strip()
    if not line:
        continue    # skip empty input lines

    line = line[:line.index(':')]   # take everything left of ':'

    previous_token = ''
    for token in line.split(','):
        if previous_token:
            neighbours.setdefault(previous_token, []).append(token)
            neighbours.setdefault(token, []).append(previous_token)
        previous_token = token

    import pprint
    pprint.pprint(neighbours)

Output

{'a': ['c', 'h', 'w'],
'c': ['a', 'f'],
'ew': ['j', 'f'],
'f': ['c', 'g', 'g', 'ew', 'h', 's', 'g'],
'fd': ['s'],
'g': ['f', 'hi', 'f', 'j', 'f', 's'],
'h': ['f', 'a'],
'hi': ['g', 'lw'],
'j': ['g', 'ew'],
'lw': ['hi'],
's': ['fd', 'f', 'g'],
'w': ['a']}

Tidying up the prettyprinted dictionary is left as an exercise for the reader. (Because dictionaries are inherently not sorted into any order, and removing the duplicates without changing the ordering of the lists is also annoying).

Easy solution:

for word, neighbour_list in neighbours.items():
    print word, ':', ', '.join(set(neighbour_list))

But that does change the ordering.

Here you go:

from collections import defaultdict

char_map = defaultdict(set)
with open('input', 'r') as input_file:
    for line in input_file:
        a_list, _ = line.split(':') # Discard the stuff after the :
        chars = a_list.split(',') # Get the elements before : as a list
        prev_char = ""
        for char, next_char in zip(chars, chars[1:]): # For every character add the 
                                                      # next and previous chars to the 
                                                      # dictionary
            char_map[char].add(next_char)
            if prev_char:
                char_map[char].add(prev_char)
            prev_char = char

print char_map
def parse (input_file):
char_neighbours = {}
File = open(input_file,'rb')
for line in File:
    line = line.strip().split(':')[0]
    if line != "":
        csv_list=line.split(',')
        for i in xrange(0,len(csv_list)-1):
            value = char_neighbours.get(csv_list[i]) or False
            if value is False:
                char_neighbours[csv_list[i]] = []
            if(i<len(csv_list)):
                if str(csv_list[i+1]) not in char_neighbours[str(csv_list[i])]:
                    char_neighbours[str(csv_list[i])].append(str(csv_list[i+1]))
            if(i>0):
                if str(csv_list[i-1]) not in char_neighbours[str(csv_list[i])]:
                    char_neighbours[str(csv_list[i])].append(str(csv_list[i-1]))
return char_neighbours

if __name__ == "__main__":
    dictionary=parse('test.txt')
    print dictionary

the parse method returns a dictionary of strings with a list of neighbours as their values

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM