How to extract and save words before & after a character along with the original string in python

Here is a string I need to extract from:

Tom and Jerry, Batman and Joker, Homer and Marge

The list goes on... and here is the end result I'm hoping to get as output (saved as CSV or something):

|Tom|
|Jerry|
|Tom and Jerry|
|Batman|
|Joker|
|Batman and Joker|
|Homer|
|Marge|
|Homer and Marge|

I know I can use .split(",") to get to "Tom and Jerry" and then .split("and") to separate that further into Tom and Jerry.
However, how can I keep all three records?

Thanks

str.split returns a list, and a list has no split method, so the second split cannot be called directly on the result of the first. Keep each intermediate result in its own variable and split each element separately.
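As a minimal sketch of that point, using the string from the question: calling .split on the list returned by the first .split raises an AttributeError, while splitting each element on its own works. The variable names here are illustrative only.

```python
text = "Tom and Jerry, Batman and Joker, Homer and Marge"

pairs = text.split(", ")  # a list of strings, not a string

try:
    pairs.split(" and ")  # lists have no .split method
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)  # 'list' object has no attribute 'split'

# Split each element on its own instead.
names_per_pair = [pair.split(" and ") for pair in pairs]
print(names_per_pair)
```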

text = "Tom and Jerry, Batman and Joker, Homer and Marge"
result = []
for text_and in text.split(', '):
    # Some entries may not contain ' and '; those are only stored whole.
    if ' and ' in text_and:
        for text_name in text_and.split(' and '):
            print(f"|{text_name}|")
            result.append(text_name)
    # Always keep the original "X and Y" entry as the third record.
    print(f"|{text_and}|")
    result.append(text_and)

Output:

|Tom|
|Jerry|
|Tom and Jerry|
|Batman|
|Joker|
|Batman and Joker|
|Homer|
|Marge|
|Homer and Marge|
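If the input is less regular than the example (missing spaces after commas, a trailing comma, or an entry with no " and "), the same loop can be wrapped in a small function. This is only a sketch: the name expand_pairs and its parameters are hypothetical, and the extra input in the demo is made up to exercise those cases.

```python
def expand_pairs(text, pair_sep=",", name_sep=" and "):
    """Return each name followed by the full pair, for every pair in text."""
    records = []
    for pair in text.split(pair_sep):
        pair = pair.strip()  # tolerate "a,b" as well as "a, b"
        if not pair:
            continue  # skip empty pieces such as a trailing comma
        if name_sep in pair:
            records.extend(name.strip() for name in pair.split(name_sep))
        records.append(pair)  # the original entry is always kept
    return records


records = expand_pairs("Tom and Jerry,Batman and Joker , Popeye,")
print(records)
# ['Tom', 'Jerry', 'Tom and Jerry', 'Batman', 'Joker', 'Batman and Joker', 'Popeye']
```

Splitting on " and " with the surrounding spaces, rather than on "and" as in the question, avoids cutting names that happen to contain those letters: "Sandy".split("and") gives ['S', 'y'].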

Here is a one-line version using the itertools.chain function.

from itertools import chain
text = "Tom and Jerry, Batman and Joker, Homer and Marge"
result = list(chain.from_iterable([*text_and.split(' and '), text_and] if ' and ' in text_and else [text_and] for text_and in text.split(', ')))
# result:
# ['Tom', 'Jerry', 'Tom and Jerry', 'Batman', 'Joker', 'Batman and Joker', 'Homer', 'Marge', 'Homer and Marge']
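The question also mentions saving the records as CSV, which neither snippet above does. One possible way, sketched with the standard csv module, is to write each record as a one-column row. The helper name write_records is an assumption for illustration, and an in-memory buffer stands in for a real file so the example runs anywhere.

```python
import csv
import io

result = ['Tom', 'Jerry', 'Tom and Jerry',
          'Batman', 'Joker', 'Batman and Joker',
          'Homer', 'Marge', 'Homer and Marge']


def write_records(records, fileobj):
    """Write one record per row (a single column) to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerows([record] for record in records)


buffer = io.StringIO()  # stand-in for a file opened for writing
write_records(result, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with open("names.csv", "w", newline="") (the file name is a placeholder); the csv documentation recommends newline="" so that line endings are not translated twice.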
