Python: Split string such that each substring is a key in a dictionary
I have a sample string:
"green apple, sly fox, cunning quick fox fur, cool water, yellow sand"
and a dictionary:
strr_dict = {"green": "color", "apple": "fruit", "sly": "behavior", "fox": "animal", "cunning": "behavior", "quick fox": "animal", "cool water": "drink", "yellow": "color", "sand": "matter"}
I want to display the substrings of the string, together with their values from the dictionary, as a dataframe. This is what I did:
import pandas as pd
sample_str = "green apple, sly fox, cunning quick fox fur, cool water, yellow sand"
strr_dict = {"green": "color", "apple": "fruit", "sly": "behavior", "fox": "animal", "cunning": "behavior", "quick fox": "animal", "cool water": "drink", "yellow": "color", "sand": "matter"}
df_list = []
stripped_list = [i.strip() for i in sample_str.split(',')]
for i in stripped_list:
    if i in strr_dict:
        df_list.append([i, strr_dict[i]])
    else:
        for j in i.split():
            if j in strr_dict:
                df_list.append([j, strr_dict[j]])
            else:
                df_list.append([j, ""])
strr_df = pd.DataFrame(df_list, columns=['Text', 'Value'])
print(strr_df)
The output I get is:
          Text     Value
0        green     color
1        apple     fruit
2          sly  behavior
3          fox    animal
4      cunning  behavior
5        quick
6          fox    animal
7          fur
8   cool water     drink
9       yellow     color
10        sand    matter
The output I want is:
         Text     Value
0       green     color
1       apple     fruit
2         sly  behavior
3         fox    animal
4     cunning  behavior
5   quick fox    animal
6         fur
7  cool water     drink
8      yellow     color
9        sand    matter
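I want to display a value only when a substring exactly matches a dictionary key, and I would like to know how to split the string accordingly. In this case, "cunning quick fox fur" should be split into "cunning", "quick fox", and "fur". But that may not always be the case; sometimes it should be split into "cunning" and "quick fox fur" so that their values can be looked up in the dictionary. I am quite confused about how to handle this situation.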
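So this does give the output you specified. I don't know how or why you want this, and I don't know whether it works for the other input cases you might have, but it should. Feel free to test it with whatever other horrible datasets you have ready.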
import pandas as pd
sample_str = "green apple, sly fox, cunning quick fox fur, cool water, yellow sand"
strr_dict = {"green": "color", "apple": "fruit", "sly": "behavior", "fox": "animal", "cunning": "behavior",
             "quick fox": "animal", "cool water": "drink", "yellow": "color", "sand": "matter"}
df_list = []
stripped_list = [i.strip() for i in sample_str.split(',')]
checklist = []  # texts that have already been added to df_list
for i in stripped_list:
    if i in strr_dict:
        # The whole phrase is a key: take it as-is.
        df_list.append([i, strr_dict[i]])
        checklist.append(i)
    else:
        # Otherwise look for every dictionary key that occurs inside the phrase.
        for z in list(strr_dict.keys()):
            # Skip keys already recorded (substring test on the list's string form).
            if z in str(checklist):
                continue
            if z in i:
                try:
                    df_list.append([i, strr_dict[i]])
                    checklist.append(i)
                except KeyError:
                    # i is not a key in this branch, so the matching key z is recorded instead.
                    df_list.append([z, strr_dict[z]])
                    checklist.append(z)
        # Words that are neither recorded nor dictionary keys get an empty value.
        for x in i.split():
            if x not in str(checklist) and x not in list(strr_dict.keys()):
                df_list.append([x, ""])
strr_df = pd.DataFrame(df_list, columns=['Text', 'Value'])
print(strr_df)
Output:
         Text     Value
0       green     color
1       apple     fruit
2         sly  behavior
3         fox    animal
4     cunning  behavior
5   quick fox    animal
6         fur
7  cool water     drink
8      yellow     color
9        sand    matter
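Another way to look at the splitting problem is as a greedy longest-match over the words of each comma-separated phrase: starting at a word, try the longest run of words that is a dictionary key, and fall back to shorter runs. The following is only a sketch under that assumption, not the method used above; the helper name split_by_keys is made up for illustration. Being greedy, it may not find the best split for every possible set of keys.

```python
def split_by_keys(phrase, mapping):
    """Greedy longest-match split of one phrase into [text, value] rows."""
    words = phrase.split()
    rows = []
    start = 0
    while start < len(words):
        # Try the longest run of words first, then shrink it one word at a time.
        for end in range(len(words), start, -1):
            candidate = " ".join(words[start:end])
            if candidate in mapping:
                rows.append([candidate, mapping[candidate]])
                start = end
                break
        else:
            # No key starts at this word, so emit it with an empty value.
            rows.append([words[start], ""])
            start += 1
    return rows


sample_str = "green apple, sly fox, cunning quick fox fur, cool water, yellow sand"
strr_dict = {"green": "color", "apple": "fruit", "sly": "behavior", "fox": "animal",
             "cunning": "behavior", "quick fox": "animal", "cool water": "drink",
             "yellow": "color", "sand": "matter"}

df_list = []
for phrase in sample_str.split(","):
    df_list.extend(split_by_keys(phrase.strip(), strr_dict))

for row in df_list:
    print(row)
```

Passing df_list to pd.DataFrame(df_list, columns=['Text', 'Value']), as in the code above, should give the same table. If the dictionary also contained a "quick fox fur" key, that longer key would be tried before "quick fox", so the phrase would be split into "cunning" and "quick fox fur".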