Given two sentences S1 and S2, find the words in S1 but not in S2
Problem statement: given two sentences S1 and S2, the task is to find the words that are in S1 but not in S2.
def string_features(s, f):
    s = s.lower()
    s1 = set(s.split())
    f = f.lower()
    s2 = set(f.split())
    s4 = s1 - s2
    s4_list = list(s4)
    s5 = s2 - s1
    s5_list = list(s5)
    for i in s4_list:
        if i.upper() in s:
            s4_list.remove(i)
            s4_list.append(i.upper())
    print(f"Words in S1 but not in S2 are : {s4_list}")

string_features(s="the first column F will contain only 5 uniques values",
                f="the second column S will contain only 3 uniques values")
With this, my expected output is ['first', 'F', '5'], but I get ['f', 'first', '5']. The 'f' is still lowercase when it should be the uppercase 'F'.
def string_features(s, f):
    s = s.lower()
    sen1 = s.split()
    f = f.lower()
    sen2 = f.split()
    group = []
    test = list(set(sen1).symmetric_difference(set(sen2)))
    for i in test:
        for x in sen1:
            if i == x:
                group.append(i)
    print(f"Words in S1 but not in S2 are : {group}")
If you want the result to have the correct case, you need to preserve the original case.
def string_features(s, f):
    # Compare case-insensitively, but return the words of s in their original case.
    f_words = set(f.lower().split())
    return [x for x in s.split() if x.lower() not in f_words]

print(string_features(s="the first column F will contain only 5 uniques values", f="the second column S will contain only 3 uniques values"))
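Why call lower() when you want a case-sensitive result? You lowercase the whole string at the start, so s no longer contains any uppercase letters, which makes the last part of your code useless. In fact, simply removing the lower() calls makes your code work.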
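As a rough sketch of that suggestion (not the asker's exact code), the function below drops the lower() calls and the upper() fix-up loop and relies on a plain set difference. The name words_only_in_first is made up for illustration, and the result is sorted only to make the printed order predictable.

```python
def words_only_in_first(s, f):
    # Hypothetical helper name. No lower() calls: words keep their
    # original case, so 'F' stays 'F'.
    first_words = set(s.split())
    second_words = set(f.split())
    # Set difference: words that occur in s but not in f (arbitrary order).
    return first_words - second_words


result = words_only_in_first(
    "the first column F will contain only 5 uniques values",
    "the second column S will contain only 3 uniques values",
)
print(sorted(result))  # ['5', 'F', 'first']
```

Note that this comparison is case-sensitive, so "The" and "the" would count as different words.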
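Because a set is an unordered collection, if you want to preserve the reading order you may need to iterate over the string with a for loop or use a list comprehension.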
def string_diff(s1, s2):
    diff = list(set(s1.split()) - set(s2.split()))
    diff2 = [c1 for c1 in s1.split() if c1 not in s2.split()]
    print(f"Words in S1 but not in S2 are : {diff}")
    print(f"Words in S1 but not in S2 are : {diff2}")

s1 = "the first column F will contain only 5 uniques values"
s2 = "the second column S will contain only 3 uniques values"
string_diff(s1, s2)
Output:
Words in S1 but not in S2 are : ['5', 'first', 'F']
Words in S1 but not in S2 are : ['first', 'F', '5']
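The first output line comes from a set, so its order can differ between runs, while the second follows the order of the words in S1. One possible variation, sketched here under the assumption that repeated words should be reported only once, keeps the reading order but builds the lookup set for S2 a single time. The name ordered_diff is hypothetical.

```python
def ordered_diff(s1, s2):
    # Hypothetical helper name. Build the lookup set once instead of
    # calling s2.split() for every word of s1.
    s2_words = set(s2.split())
    # dict.fromkeys keeps first-seen order and drops repeated words.
    return list(dict.fromkeys(w for w in s1.split() if w not in s2_words))


s1 = "the first column F will contain only 5 uniques values"
s2 = "the second column S will contain only 3 uniques values"
print(ordered_diff(s1, s2))  # ['first', 'F', '5']
```

If repeated words should appear as many times as they occur in S1, the list comprehension alone is enough and the dict.fromkeys step can be dropped.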