
Creating a function that counts words and characters (Including punctuation, but excluding white space)

I need to make a function that counts the number of characters (including punctuation, excluding white space) and the number of words in a given phrase. So far I've created a function that can count the number of characters, but it includes white space as well and does not count words. How can I exclude whitespace and implement word counting as well?

text = " If I compare myself to someone else, then I am playing a game I will never win. "
def count_chars_words(txt):
    chars = len(txt.replace(' ',''))
    words = len(txt.split(' '))
    return [words,chars]

print(count_chars_words(text))


Output: [19, 63]

Count characters by removing the spaces from the text with replace(' ', '') and then taking the length of the resulting string. Note that this removes only space characters, not tabs or newlines.

Count words by splitting the sentence into a list of words, and checking the length of the list.

Then, return both in a list.

text = "If I compare myself to someone else, then I am playing a game I will never win."
def count_chars_words(txt):
    chars = len(txt.replace(' ', ''))
    words = len(txt.split(' '))
    return [words, chars]

print(count_chars_words(text))

Output:

[17, 63]

To get an idea of what replace() and split() do:

>>> text.replace(' ', '')
'IfIcomparemyselftosomeoneelse,thenIamplayingagameIwillneverwin.'
>>> text.split(' ')
['If', 'I', 'compare', 'myself', 'to', 'someone', 'else,', 'then', 'I', 'am', 'playing', 'a', 'game', 'I', 'will', 'never', 'win.']
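
One caveat worth noting: split(' ') splits at every single space, so leading, trailing, or doubled spaces produce empty strings that are counted as words. That appears to be why the padded text in the question gives 19 instead of 17. A small sketch of the difference, where the variable names padded, by_space and by_any_ws are made up for illustration:

```python
# The question's text, with one leading and one trailing space.
padded = " If I compare myself to someone else, then I am playing a game I will never win. "

by_space = padded.split(' ')  # splits at every single space character
by_any_ws = padded.split()    # splits on runs of whitespace, ignoring the ends

print(len(by_space))                  # 19
print(repr(by_space[0]), repr(by_space[-1]))  # '' '' (the two extra "words")
print(len(by_any_ws))                 # 17
```

If the input may contain tabs, newlines, or repeated spaces, split() with no argument is probably the safer choice for counting words.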

The str.split() method might be useful for you! It takes a string, finds every instance of the separator you pass in (such as " ") and splits the string into a list of the pieces between those separators (pretty much one piece per word). With this you should be able to continue!

"If I compare myself to someone else, then I am playing a game I will never win.".split(" ")

gives

['If', 'I', 'compare', 'myself', 'to', 'someone', 'else,', 'then', 'I', 'am', 'playing', 'a', 'game', 'I', 'will', 'never', 'win.']

In order to avoid counting whitespace, have you considered using an if statement? You might find string.whitespace and the in operator useful here!
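
One way the if-statement idea could look, as a minimal sketch. The function name count_non_whitespace is hypothetical, not something from the question:

```python
import string


def count_non_whitespace(txt):
    """Count every character that is not whitespace (punctuation included)."""
    count = 0
    for ch in txt:
        # string.whitespace contains space, tab, newline, carriage return,
        # vertical tab and form feed.
        if ch not in string.whitespace:
            count += 1
    return count


text = "If I compare myself to someone else, then I am playing a game I will never win."
print(count_non_whitespace(text))  # 63
```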

As for counting words, str.split is your friend. In fact, if you split the words up first, is there a simple way to avoid even the if mentioned above?
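
A possible answer to that hint, sketched under the assumption that a "word" is any run of non-whitespace characters. The name count_chars_words_split is made up for illustration:

```python
def count_chars_words_split(txt):
    # Splitting first discards all whitespace, so no if statement is needed.
    words = txt.split()
    # The remaining characters are exactly the ones inside the words.
    chars = sum(len(word) for word in words)
    return [len(words), chars]


text = "If I compare myself to someone else, then I am playing a game I will never win."
print(count_chars_words_split(text))  # [17, 63]
```

Because split() with no argument ignores leading and trailing whitespace, this also returns [17, 63] for the padded string in the question.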

This is just one idea and not the most efficient way; if you need a better way to do it, use regex:

text = "If I compare myself to someone else, then I am playing a game I will never win."

total_num = len(text)        # every character, spaces included
spaces = text.count(' ')     # number of space characters
words = len(text.split())    # split() with no argument splits on whitespace

print('total characters = ', total_num)
print('words = ', words)
print('spaces=', spaces)
print('characters w/o spaces = ', total_num - spaces)

output:

total characters =  79
words =  17
spaces= 16
characters w/o spaces =  63

Edit: using regex, a more efficient version would be:

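import re

# findall returns lists, so take len() of each to get the counts
chars_without_spaces = re.findall(r'[^\s]', text)  # characters w/o whitespace
words = re.findall(r'\b\w+', text)  # words

print(len(chars_without_spaces), len(words))  # 63 17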
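
Note that the \w+ pattern treats apostrophes and hyphens as word boundaries, so it can disagree with a split-based count on other inputs. A short sketch of the difference; the sample string and variable names are made up for illustration:

```python
import re

sample = "don't stop-start"

# \w matches letters, digits and underscore only, so "don't" becomes two matches.
word_char_runs = re.findall(r'\b\w+', sample)
# \S+ matches runs of non-whitespace, which agrees with str.split().
non_space_runs = re.findall(r'\S+', sample)

print(word_char_runs)   # ['don', 't', 'stop', 'start']
print(non_space_runs)   # ["don't", 'stop-start']
```

Which pattern is appropriate depends on what should count as a word; for the sentence in the question both give 17.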


 