
Understanding functions in Python (split, join, dictionaries)

I have this code:

def magic1(text, skip):
    return text[::skip]

def magic2(text, skip):
    return ' '.join(text.split(' ')[::skip])

def magic3(text, appearances):
    d = {}

    for w in text.split(' '):
        d[w] = d.get(w, 0) + 1

    words = [w for w,v in d.items() if v <= appearances]
    return ' '.join(words)

print magic1('abcdefghijklmnop', 2)
print magic2('hi bye doom boom', 2)
print magic3('hi bye doom boom doom bye bye', 2)

The first function returns the string while skipping every other letter: "ace...". In the second function, "' '.join" puts a space between the words, "split" returns a list of all the words in the string (splitting on spaces), and again there is skipping. But why was it necessary to use split and join together? In the third function, "d={}" creates a dictionary. Again there is split, and each word in the string is added to the dictionary. But I didn't understand this line:

words = [w for w,v in d.items() if v <= appearances]

Will every w in d.items() end up in "words" if its number of appearances is not bigger than 2?

Can someone help me, please? I have used Google and the docs; I'm just not sure whether I understood this code. Thanks.

1) "...why it was needed to use split and join together?"

In [1]: def magic2(text, skip):
   ...:     return ' '.join(text.split(' ')[::skip])
   ...:

In [2]: print magic2('hi bye doom boom', 2)
hi doom

The key part of this is the [::skip] . If it were left out, the split and join would cancel each other out and the string would remain the same:

In [3]: def magic2a(text, skip):
   ...:     return ' '.join(text.split(' '))
   ...:

In [4]: print magic2a('hi bye doom boom', 2)
hi bye doom boom

However, because [::skip] with skip = 2 keeps only every second word of the list created by .split(' ') (starting with the first one) before joining, we get hi doom .
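To make the three steps visible, here is a small sketch that performs the split, the slice and the join one at a time. The variable names are only for illustration, and it uses print() so that it also runs on Python 3. The last line shows what happens if the string is sliced directly, as magic1 does: the step then applies to characters, not words, which is why split and join are needed.

```python
text = 'hi bye doom boom'

# Step 1: split the string into a list of words.
words = text.split(' ')        # ['hi', 'bye', 'doom', 'boom']

# Step 2: slice the list, keeping every second word starting at index 0.
kept = words[::2]              # ['hi', 'doom']

# Step 3: join the remaining words back into one string.
result = ' '.join(kept)        # 'hi doom'
print(result)

# Slicing the string itself steps over characters instead of words.
by_chars = text[::2]           # 'h y ombo'
print(by_chars)
```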

2) "...every w in d.items will be in "words" if their appearance is not bigger than 2?"

d = {}

for w in text.split(' '):
    d[w] = d.get(w, 0) + 1

This counts how many times each word occurs. d.get(w, 0) looks up the current count for w (defaulting to 0 if the word has not been seen yet), and we then store that count plus one.

words = [w for w,v in d.items() if v <= appearances]

This then builds a list of the words that appear at most appearances times. So yes: a word ends up in words only if its count is not bigger than 2.
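As a rough illustration, the sketch below prints the counts for the string from the question and then applies the filter with a few different thresholds. The helper names count_words and words_up_to are made up for this example, and the results are sorted only so that the output does not depend on dictionary ordering.

```python
def count_words(text):
    """Return a dict mapping each word to its number of occurrences."""
    d = {}
    for w in text.split(' '):
        d[w] = d.get(w, 0) + 1
    return d

def words_up_to(d, appearances):
    """Return the sorted words whose count is <= appearances."""
    return sorted(w for w, v in d.items() if v <= appearances)

d = count_words('hi bye doom boom doom bye bye')
print(sorted(d.items()))   # [('boom', 1), ('bye', 3), ('doom', 2), ('hi', 1)]

for appearances in (1, 2, 3):
    print(appearances, words_up_to(d, appearances))
```

With appearances = 2, 'bye' is dropped because it occurs three times, while 'doom' (twice) and the words that occur once are kept.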

Bonus

For the counting step it would be simpler to use a defaultdict, which lets you specify a default value for every missing key in your dictionary.

from collections import defaultdict

d = defaultdict(int)

for w in text.split(' '):
    d[w] += 1
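Put together, a version of magic3 built on defaultdict might look like the sketch below. The name magic3_defaultdict is hypothetical, and print() is used so that it also runs on Python 3. The order of the words in the result follows the dictionary's iteration order, which is insertion order on Python 3.7+ but arbitrary on Python 2.

```python
from collections import defaultdict

def magic3_defaultdict(text, appearances):
    # int() returns 0, so a missing word starts with a count of 0.
    d = defaultdict(int)

    for w in text.split(' '):
        d[w] += 1

    # Keep only the words that occur at most `appearances` times.
    words = [w for w, v in d.items() if v <= appearances]
    return ' '.join(words)

print(magic3_defaultdict('hi bye doom boom doom bye bye', 2))
```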

Here you count how many times each word occurs:

d = {}

for w in text.split(' '):
    d[w] = d.get(w, 0) + 1

And here

words = [w for w,v in d.items() if v <= appearances]

you return the words that occur at most appearances (2) times.

In this list comprehension, w holds a word and v the corresponding value in the dictionary, i.e. the number of appearances. If this number is less than or equal to appearances , w is included in the result of the comprehension. Thus you get only the words with 2 or fewer appearances.
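If the comprehension is hard to read, it may help to see it unrolled into an ordinary loop. The sketch below is meant to be equivalent; filter_with_loop is a name invented for this example. Each item produced by d.items() is a (key, value) pair, and "for w, v in ..." simply unpacks that pair into two names.

```python
def filter_with_loop(d, appearances):
    words = []
    for item in d.items():
        w, v = item              # unpack the (word, count) pair
        if v <= appearances:     # same condition as in the comprehension
            words.append(w)
    return words

d = {'hi': 1, 'bye': 3, 'doom': 2, 'boom': 1}
appearances = 2

from_loop = filter_with_loop(d, appearances)
from_comprehension = [w for w, v in d.items() if v <= appearances]
print(from_loop == from_comprehension)   # True
```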

This part of the function creates a dictionary that has each word as a key and an integer as the counter of its occurrences:

for w in text.split(' '):
    d[w] = d.get(w, 0) + 1

so it looks like this:

'bye' => 3,
'doom' => 2,

so this line:

words = [w for w,v in d.items() if v <= appearances]

iterates through the dictionary, setting w to each key (the word) and v to its value (the counter), and puts into words only those keys whose number of occurrences is less than or equal to appearances (here 2) :)

