so I was recently working on this function here:
# counts owls
def owl_count(text):
# sets all text to lowercase
text = text.lower()
# sets text to list
text = text.split()
# saves indices of owl in list
indices = [i for i, x in enumerate(text) if x == ["owl"] ]
# counts occurences of owl in text
owl_count = len(indices)
# returns owl count and indices
return owl_count, indices
My goal was to count how many times "owl" occurs in the string and save the indices of it. The issue I kept running into was that it would not count "owls" or "owl." I tried splitting it into a list of individual characters but I couldn't find a way to search for three consecutive elements in the list. Do you guys have any ideas on what I could do here?
PS. I'm definitely a beginner programmer so this is probably a simple solution.
thanks!
If you don't want to use huge libraries like NLTK, you can filter words that starts with 'owl'
, not equal to 'owl'
:
indices = [i for i, x in enumerate(text) if x.startswith("owl")]
In this case words like 'owlowlowl'
will pass too, but one should use NLTK to properly tokenize words like in real world.
Python has built in functions for these.These types of matching of strings comes under something called Regular Expressions,which you can go into detail later
a_string = "your string"
substring = "substring that you want to check"
matches = re.finditer(substring, a_string)
matches_positions = [match.start() for match in matches]
print(matches_positions)
finditer() will return an iteration object and start() will return the starting index of the found matches.
Simply put,it returns indices of all the substrings in the string
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.