
How to combine two words in a long string into one & find their associated sentence?

Given a very long string:

"Given the large category of plants, the split ratio was determined to be 88.4. However, we're not sure if the split ratio was consistent across all subcategories or just a calculated average. If however, it deviated, it would be nonetheless, quite strange."

The words are "split ratio". In the output, I want them to appear as split-ratio (as a single word), and I also want to retain only the sentences where these words occur. So in this case, only the first two sentences.

Is this possible?

You can use replace in a list comprehension:

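s = """Given the large category of plants, the split ratio was
       determined to be 88.4. However, we're not sure
       if the split ratio was consistent across all subcategories
       or just a calculated average. If however, it deviated,
       it would be nonetheless, quite strange."""

# Collapse the line breaks and indentation inside the triple-quoted string.
s = ' '.join(s.split())

# Keep only the sentences containing the phrase, hyphenating it in each one.
kept = [x.replace('split ratio', 'split-ratio') for x in s.split('. ') if 'split ratio' in x]
print('. '.join(kept) + '.')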

This prints only the sentences that contain 'split ratio', with each occurrence converted to 'split-ratio'.
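If the text may also contain sentences ending in "?" or "!", or the phrase in a different case, a regex-based variant is one option. The following is only a sketch: the helper name keep_and_join and its parameters are made up for illustration, and the sentence splitter is naive (it will wrongly split after abbreviations such as "e.g.").

```python
import re

def keep_and_join(text, phrase="split ratio", joined="split-ratio"):
    """Return only the sentences containing phrase, with phrase replaced by joined."""
    # Collapse runs of whitespace (including line breaks) into single spaces.
    flat = " ".join(text.split())
    # Split after '.', '!' or '?' when followed by whitespace;
    # the lookbehind keeps the punctuation attached to its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", flat)
    # Match the phrase literally, ignoring case.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    kept = [pattern.sub(joined, s) for s in sentences if pattern.search(s)]
    return " ".join(kept)


s = """Given the large category of plants, the split ratio was
       determined to be 88.4. However, we're not sure
       if the split ratio was consistent across all subcategories
       or just a calculated average. If however, it deviated,
       it would be nonetheless, quite strange."""

print(keep_and_join(s))
```

Note that because the match ignores case, a sentence starting with "Split ratio" would come out as "split-ratio" in lower case.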

Since python is in the tags, I expect you want it in that language, right? And to be clear, I suppose a simple find-and-replace in a normal text editor isn't going to solve this; you need actual logic to apply to the text.

I would have to stop and look up Python for a bit. But in any language, the easiest way I can think of is to parse the file/stream and make the changes as you go. Read in the stream and look for the pattern you want to match ("split ratio"); as you read, write out a new stream that carries your changes. But do the comparison in the block size (or string length) of the pattern you are matching.

When the comparison against the pattern comes back true, stop. Don't output that string; instead, output the one you want to replace it with into the new target stream/file.
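For what it's worth, the read-compare-write idea described above could be sketched in Python roughly like this. The function name stream_replace and the chunk_size parameter are assumptions made for illustration, not anything from the question. Instead of reading in pattern-sized blocks, this version holds back a tail one character shorter than the pattern so that a match straddling two chunks is still found.

```python
import io

def stream_replace(src, dst, pattern, replacement, chunk_size=64):
    """Copy text from src to dst, swapping pattern for replacement on the way."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    keep = len(pattern) - 1  # longest tail that could be the start of a match
    buffer = ""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Write out everything up to each complete match, then the replacement.
        pos = buffer.find(pattern)
        while pos != -1:
            dst.write(buffer[:pos] + replacement)
            buffer = buffer[pos + len(pattern):]
            pos = buffer.find(pattern)
        # No match remains: flush all but the tail that might begin a match.
        safe = len(buffer) - keep
        if safe > 0:
            dst.write(buffer[:safe])
            buffer = buffer[safe:]
    dst.write(buffer)  # what is left is shorter than the pattern


source = io.StringIO("the split ratio was 88.4; the split ratio held")
target = io.StringIO()
# A deliberately tiny chunk size forces matches to span chunk boundaries.
stream_replace(source, target, "split ratio", "split-ratio", chunk_size=5)
print(target.getvalue())
```

For a string that already fits in memory, str.replace does the same job in one call; the streaming form only pays off for input too large to load at once.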

However, a search for "python search and replace algorithm" gives me this: https://www.geeksforgeeks.org/python-string-replace/

Someone did the hard work for you already. Love that super high level programming language that leaves folks in the dark as to what is actually happening. Oh well.

Enjoy.

atomkey.
