Word count with pattern in Python

So this is the question:

Write a program to read in multiple lines of text and count the number of words in which the rule "i before e, except after c" is broken, and the number of words which contain either ei or ie and which don't break the rule.

For this question, we only care about the c if it is the character immediately before the ie or the ei. So science counts as breaking the rule, but mischievous doesn't. If a word breaks the rule twice (like obeisancies), then it should still only be counted once.

Example given:

Line: The science heist succeeded
Line: challenge accepted
Line: 
Number of times the rule helped: 0
Number of times the rule was broken: 2

and my code:

rule = []
broken = []
line = None
while line != '':
    line = input('Line: ')

    line.replace('cie', 'broken')
    line.replace('cei', 'rule')
    line.replace('ie', 'rule')
    line.replace('ei', 'broken')

    a = line.count('rule')
    b = line.count('broken')

    rule.append(a)
    broken.append(b)

print(sum(a)); print(sum(b))

How do I fix my code, to work like the question wants it to?

Firstly, replace does not change the string in place. What you need is the return value:

line = 'hello there'                     # line = 'hello there'
line.replace('there','bob')              # line = 'hello there'
line = line.replace('there','bob')       # line = 'hello bob'

Also I would assume you want actual totals so:

print('Number of times the rule helped: {0}'.format(sum(rule)))
print('Number of times the rule was broken: {0}'.format(sum(broken)))

Your code calls sum(a) and sum(b), but a and b are only the counts for the last line processed (and, being integers, they cannot be passed to sum at all). You want the totals over all lines, which are held in rule and broken.
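Applying both fixes to the loop from the question might look like the sketch below. The function name count_occurrences and the hard-coded list of lines (used instead of input so the example runs unattended) are assumptions for illustration only. Note that this still counts occurrences rather than words, so obeisancies is tallied as two breaks, and any literal "rule" or "broken" already in the text would be miscounted.

```python
def count_occurrences(lines):
    """Tally rule-following and rule-breaking letter groups, per occurrence.

    Hypothetical helper: it wraps the loop from the question so that it can
    be run on a list of strings instead of interactive input.
    """
    rule = []
    broken = []
    for line in lines:
        # str.replace returns a new string, so the result is reassigned.
        line = line.replace('cie', 'broken')
        line = line.replace('cei', 'rule')
        line = line.replace('ie', 'rule')
        line = line.replace('ei', 'broken')
        rule.append(line.count('rule'))
        broken.append(line.count('broken'))
    # Sum the per-line counts to get the totals.
    return sum(rule), sum(broken)


helped, broke = count_occurrences(['The science heist succeeded',
                                   'challenge accepted'])
print('Number of times the rule helped: {0}'.format(helped))
print('Number of times the rule was broken: {0}'.format(broke))
```

For the example from the question this gives 0 and 2, but only because no word there breaks the rule more than once.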

As a sidenote: Regular expressions are good for things like this. re.findall would make this a lot more sturdy and pretty:

import re

line = 'foo moo goo loo foobar cheese is great '
foo_matches = len(re.findall('foo', line))   # = 2
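For the i-before-e rule specifically, one possible pattern uses a negative lookbehind so that ei only matches when it is not directly preceded by c. The names BREAKS_RULE and breaks_rule are made up for this sketch, and the pattern is only one way of expressing the rule, not a complete solution.

```python
import re

# Hypothetical pattern: 'cie', or 'ei' that is NOT immediately preceded by 'c'.
BREAKS_RULE = re.compile(r'cie|(?<!c)ei')


def breaks_rule(word):
    """Return True if the word contains at least one rule-breaking group."""
    return BREAKS_RULE.search(word) is not None


line = 'the science heist succeeded on the ceiling'
# findall lists every rule-breaking group found in the line.
breaking = BREAKS_RULE.findall(line)
print(breaking)

for word in ['science', 'mischievous', 'obeisancies', 'ceiling']:
    print(word, breaks_rule(word))
```

Because search stops at the first match, a word that breaks the rule twice is still reported only once by breaks_rule.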

I'm not going to write the code to your exact specification as it sounds like homework but this should help:

import pprint

words = ['science', 'believe', 'die', 'friend', 'ceiling',
         'receipt', 'seize', 'weird', 'vein', 'foreign']

rule = {}
rule['ie'] = []
rule['ei'] = []
rule['cei'] = []
rule['cie'] = []

for word in words:
    if 'ie' in word:
        if 'cie' in word:
            rule['cie'].append(word)
        else:
            rule['ie'].append(word)
    if 'ei' in word:
        if 'cei' in word:
            rule['cei'].append(word)
        else:
            rule['ei'].append(word)

pprint.pprint(rule)

Save it to a file like i_before_e.py and run python i_before_e.py :

{'cei': ['ceiling', 'receipt'],
 'cie': ['science'],
 'ei': ['seize', 'weird', 'vein', 'foreign'],
 'ie': ['believe', 'die', 'friend']}

You can easily count the occurrences with:

for key in rule.keys():
    print("%s occurred %d times." % (key, len(rule[key])))

Output (on Python 3.7+, where dictionaries keep insertion order):

ie occurred 3 times.
ei occurred 4 times.
cei occurred 2 times.
cie occurred 1 times.
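To get from these four buckets to the two totals the question asks for, one option is to combine them with sets, so that a word sitting in two buckets (such as obeisancies, which has both cie and ei) is counted once. This is a rough sketch that assumes the word list contains no duplicates, since sets would collapse repeated words; the names broken_words and helped_words are illustrative.

```python
words = ['science', 'believe', 'ceiling', 'seize', 'obeisancies']

# Rebuild the same four buckets as above for a smaller word list.
rule = {'ie': [], 'ei': [], 'cei': [], 'cie': []}
for word in words:
    if 'ie' in word:
        rule['cie' if 'cie' in word else 'ie'].append(word)
    if 'ei' in word:
        rule['cei' if 'cei' in word else 'ei'].append(word)

# 'cie' and plain 'ei' break the rule; a set counts each word only once.
broken_words = set(rule['cie']) | set(rule['ei'])
# 'ie' and 'cei' follow the rule, unless the word also breaks it elsewhere.
helped_words = (set(rule['ie']) | set(rule['cei'])) - broken_words

print('Number of times the rule helped: %d' % len(helped_words))
print('Number of times the rule was broken: %d' % len(broken_words))
```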

If I understand correctly, your main problem is getting a unique result per word. Is this what you are trying to achieve?

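rule_count = 0
break_count = 0
line = None
while line != '':
    line = input('Line: ')

    for word in line.split():
        # Reset the flags for every word so each word is counted at most once.
        rule_found = False
        break_found = False

        # Check the three-letter groups first and mask them with '_' so the
        # 'ie'/'ei' inside them is not matched again by the checks below.
        if 'cie' in word:
            word = word.replace('cie', '_')
            break_found = True
        if 'cei' in word:
            word = word.replace('cei', '_')
            rule_found = True
        if 'ie' in word:
            rule_found = True
        if 'ei' in word:
            break_found = True

        # A word that breaks the rule anywhere counts as broken, not helped.
        if break_found:
            break_count += 1
        elif rule_found:
            rule_count += 1

print(rule_count); print(break_count)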

Let's split the logic up into functions; that should help us reason about the code and get it right. To loop over the input lines until an empty one is entered, we can use the two-argument form of the iter function:

def rule_applies(word):
    return 'ei' in word or 'ie' in word

def complies_with_rule(word):
    if 'cie' in word:
        return False
    if word.count('ei') > word.count('cei'):
        return False
    return True

helped_count = 0
broken_count = 0
lines = iter(lambda: input("Line: "), '')
for line in lines:
    for word in line.split():
        if rule_applies(word):
            if complies_with_rule(word):
                helped_count += 1
            else:
                broken_count += 1

print("Number of times the rule helped:", helped_count)
print("Number of times the rule was broken:", broken_count)
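Because the logic now lives in small functions, it can be checked against the example from the question without typing anything in. The sketch below repeats the two functions so that it runs on its own; count_words is a hypothetical wrapper that takes a list of lines instead of calling input.

```python
def rule_applies(word):
    return 'ei' in word or 'ie' in word


def complies_with_rule(word):
    if 'cie' in word:
        return False
    # More 'ei' than 'cei' means at least one 'ei' does not follow a 'c'.
    if word.count('ei') > word.count('cei'):
        return False
    return True


def count_words(lines):
    """Return (helped, broken) word counts for an iterable of text lines."""
    helped_count = 0
    broken_count = 0
    for line in lines:
        for word in line.split():
            if rule_applies(word):
                if complies_with_rule(word):
                    helped_count += 1
                else:
                    broken_count += 1
    return helped_count, broken_count


# The example from the question: expected 0 helped, 2 broken.
print(count_words(['The science heist succeeded', 'challenge accepted']))
# Edge cases named in the question.
print(count_words(['mischievous']))   # 'c' is not directly before 'ie'
print(count_words(['obeisancies']))   # breaks the rule twice, counted once
```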

We can make the code more concise by shortening the complies_with_rule function and by using generator expressions and Counter :

from collections import Counter

def rule_applies(word):
    return 'ei' in word or 'ie' in word

def complies_with_rule(word):
    return 'cie' not in word and word.count('ei') == word.count('cei')

lines = iter(lambda: input("Line: "), '')
words = (word for line in lines for word in line.split())
words_considered = (word for word in words if rule_applies(word))
did_rule_help_count = Counter(complies_with_rule(word) for word in words_considered)

print("Number of times the rule helped:", did_rule_help_count[True])
print("Number of times the rule was broken:", did_rule_help_count[False])

tb = 0  # total number of words that break the rule
tr = 0  # total number of words that follow the rule
line = ' '
while line:
    lines = input('Line: ')
    line = lines.split()  # an empty input gives an empty list and ends the loop

    for word in line:
        # 'cie', or an 'ei' that is not part of 'cei', breaks the rule.
        # Checking this first means each word is counted only once.
        if 'cie' in word or word.count('ei') > word.count('cei'):
            tb += 1
        elif 'ie' in word or 'ei' in word:
            tr += 1

print('Number of times the rule helped: {0}'.format(tr))
print('Number of times the rule was broken: {0}'.format(tb))

Done.
