I have an input file containing lines like the following:
"Kansas City Chiefs 42"
Each line has a varying number of spaces between the words and the number. I am trying to find a way to split each line into its two parts (the word portion and the number portion). My ideal output would be:
"Kansas City Chiefs"
"42"
Any ideas?
Check out this regex:
import re
your_string = "Kansas City Chiefs 42"
items = re.split(r'\s+(?=\d)|(?<=\d)\s+', your_string)
print(items)
This prints:
['Kansas City Chiefs', '42']
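If the number is always the last thing on the line, a regex may not be needed at all. The following is a minimal sketch under that assumption; the function name `split_name_and_score` is made up for illustration. It relies on `str.rsplit` with no separator, which splits on runs of whitespace starting from the right.

```python
def split_name_and_score(line):
    """Split a line into (text, number) at the last run of whitespace."""
    # With no separator, rsplit treats any run of whitespace as one delimiter
    # and ignores trailing whitespace such as a newline.
    name, score = line.rsplit(maxsplit=1)
    # Strip in case the line started with spaces.
    return name.strip(), score

print(split_name_and_score("Kansas City Chiefs    42\n"))
```

Note that this raises a ValueError on a line that has no whitespace at all, so blank or single-token lines would need to be filtered out first.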
If your requirement is to split the string as soon as you reach the first digit, then the code below should work.
st = "Kansas City Chiefs 42"
text_part = ""
for each in st:
    # Stop at the first numeric character
    if each.isnumeric():
        break
    text_part += each
# Everything after the text part is the number part
number_part = st[len(text_part):]
print(text_part)
print(number_part)
You can call .strip() on either value, depending on whether you want to keep the trailing spaces or not.
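The same "split at the first digit" idea can also be sketched with `re.search`, which avoids building the text part one character at a time. This is only an illustration: the helper name `split_at_first_digit` is hypothetical, and returning an empty string when no digit is found is an assumption about the desired behavior.

```python
import re

def split_at_first_digit(line):
    """Return (text_part, number_part), splitting at the first digit."""
    match = re.search(r"\d", line)
    if match is None:
        # Assumption: a line without digits has an empty number part.
        return line.strip(), ""
    index = match.start()
    # strip() drops the spaces between the two parts and any trailing newline.
    return line[:index].strip(), line[index:].strip()

print(split_at_first_digit("Kansas City Chiefs    42"))
```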
Here's my implementation, which uses NLTK to tokenize the line:
from nltk import word_tokenize
sentence = "Kansas City Chiefs 42"
tokens = word_tokenize(sentence)
word_phrases = " ".join([token for token in tokens if not token.isnumeric()])
numeric_phrase = " ".join([token for token in tokens if token.isnumeric()])
print(word_phrases)
print(numeric_phrase)
# Python 3 program to extract all the numbers from a string
import re
# Function to extract all the numbers from the given string
def getNumbers(text):
    # findall returns every run of consecutive digits as a string
    array = re.findall(r'[0-9]+', text)
    return array
# Driver code
text = "adbv345hj43hvb42"
array = getNumbers(text)
print(*array)
Output:
345 43 42
You can use regex in Python.
import re
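def getNumber(text):
    # Return every run of digits found in the string
    return re.findall(r'[0-9]+', text)
str1 = "Kansas City Chiefs 42"
numbers = getNumber(str1)
# Everything before the first number, with surrounding spaces removed
str_val = str1[:str1.find(numbers[0])].strip()
print(str_val)
# output -> Kansas City Chiefs
print(" ".join(numbers))
# output -> 42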
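Since the question is about an input file, here is a rough sketch of how a single pattern could be applied line by line. The names `LINE_PATTERN` and `parse_lines` are invented for this example, and an in-memory `io.StringIO` stands in for the real file, whose name is not given. It assumes each useful line ends with a whole number and silently skips lines that do not.

```python
import io
import re

# Lazily capture the text, then whitespace, then the digits that end the line.
LINE_PATTERN = re.compile(r"^\s*(.*?)\s+(\d+)\s*$")

def parse_lines(lines):
    """Return a list of (text, number) tuples for the lines that match."""
    results = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:  # blank or malformed lines are skipped
            results.append((match.group(1), match.group(2)))
    return results

# Stand-in for an open file object; a real file would be iterated the same way.
sample = io.StringIO("Kansas City Chiefs    42\nDenver Broncos 7\n\n")
print(parse_lines(sample))
```

Anchoring the digits to the end of the line means a name that itself contains digits, such as "San Francisco 49ers 21", should still split before the final number.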