Why am I getting "TypeError: string indices must be integers" on my function?
I have a string as follows:
paragraph = 'Below you’ll find KPIs (key performance indicators) and valuation metrics for 50+ public SaaS and cloud companies. This includes historical share price performance and valuation multiples, an interactive regression chart, efficiency metrics (magic number, payback period, ARR / FTE, etc.), average ACV (annual contract value), and financial metrics including ARR, OpEx margins and cash flow margins. These metrics can be filtered by year-over-year ARR growth rates (filter located under the Valuation Metrics section header). Share prices and financial data are updated as of 06-May-2022 and will continue to be updated frequently.'
I am trying to write a function that retrieves the date as a string, '06-May-2022':
def get_date(inputString):
    # this will require a list with two elements, both integers
    boolean_list = [char.isdigit() for char in inputString]
    all_indexes = [i for i, x in enumerate(boolean_list) if x]
    all_indexes = all_indexes[2:]
    indexes = [all_indexes[0],all_indexes[-1]]
    index_one = int(indexes[0])
    index_two = int(indexes[1])
    date = inputString[index_one,index_two]
    return date

get_date(paragraph)
But when I run it, I get the error "TypeError: string indices must be integers".
When I run this:
type(indexes[0])
it returns "int", so I do not understand the error. Any help would be greatly appreciated. Thanks!
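The likely cause is the comma in inputString[index_one,index_two]: Python reads index_one,index_two as a tuple, so the string is indexed with a tuple instead of an int, even though each element on its own is an int. A slice needs a colon instead. Below is a minimal sketch of that fix, assuming the goal is the text from the first to the last digit of the date. The shortened paragraph_tail sample and the renamed variables are illustrative, not from the original post; with the full paragraph you would still skip the digits of "50+" as the original [2:] does. Note the end + 1, because a slice excludes its end index.

```python
# Shortened sample text (an assumption for brevity; it contains no digits
# other than those in the date).
paragraph_tail = (
    "Share prices and financial data are updated as of 06-May-2022 "
    "and will continue to be updated frequently."
)


def get_date(input_string):
    # Positions of every digit character in the string
    digit_indexes = [i for i, ch in enumerate(input_string) if ch.isdigit()]
    start, end = digit_indexes[0], digit_indexes[-1]
    # A slice uses a colon. Writing input_string[start, end] would build the
    # tuple (start, end) and raise "string indices must be integers".
    return input_string[start:end + 1]


print(get_date(paragraph_tail))  # prints 06-May-2022
```

This only works when the date holds the first and last digits in the text being scanned, so it is fragile for general input.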
This doesn't answer your question directly, but if you're looking to find the date in a given string, you might want to look at regular expressions.
In Python, you might do something like:
import re
paragraph = """Below you’ll find KPIs (key performance indicators) and valuation metrics for 50+ public SaaS and cloud companies. This includes historical share price performance and valuation multiples, an interactive regression chart, efficiency metrics (magic number, payback period, ARR / FTE, etc.), average ACV (annual contract value), and financial metrics including ARR, OpEx margins and cash flow margins. These metrics can be filtered by year-over-year ARR growth rates (filter located under the Valuation Metrics section header). Share prices and financial data are updated as of 06-May-2022 and will continue to be updated frequently."""
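# Two digits, a hyphen, any characters, a hyphen, then four digits
exp = re.compile(r"[0-9][0-9]-.+-[0-9][0-9][0-9][0-9]")
dates = exp.search(paragraph)
if dates:
    # dates[0] is the full text of the first match
    date = dates[0]
    print(str(date))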
That snippet would print '06-May-2022'. You can see more about the expression that matched that string on Regex101.
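One caveat with that pattern: .+ is greedy, so on text such as "01-Jan-2020 to 02-Feb-2021" it would match the whole span as a single "date". A tighter variant might look like the sketch below. The names DATE_PATTERN and find_dates and the sample sentence are hypothetical, and the %b month check assumes an English (or C) locale.

```python
import re
from datetime import datetime

# Two digits, a three-letter month abbreviation, four digits
DATE_PATTERN = re.compile(r"\b\d{2}-[A-Za-z]{3}-\d{4}\b")


def find_dates(text):
    """Return every DD-Mon-YYYY match in text that is also a real calendar date."""
    found = []
    for candidate in DATE_PATTERN.findall(text):
        try:
            # Raises ValueError for impossible dates such as 31-Feb-2022
            datetime.strptime(candidate, "%d-%b-%Y")
        except ValueError:
            continue
        found.append(candidate)
    return found


sample = "Data are updated as of 06-May-2022, not 31-Feb-2022."
print(find_dates(sample))  # prints ['06-May-2022']
```

Validating with datetime.strptime keeps strings that merely look like dates out of the result.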
In software engineering, there's a concept known as "don't reinvent the wheel": in other words, use existing technologies such as regular expressions rather than trying to design complex functions to parse out date strings. Moreover, there are probably packages that would extract dates from a string without your even having to write regular expressions yourself.
You can use the spaCy module for natural language processing in Python:
import spacy

# Load English tokenizer, tagger, parser and NER
nlp = spacy.load("en_core_web_sm")
doc = nlp(paragraph)

# Find named entities, phrases and concepts
for entity in doc.ents:
    # The second condition selects only the dates that start with a digit
    if entity.label_ == 'DATE' and str(entity)[0].isdigit():
        print(entity)

# Output:
# 06-May-2022