
Append sections of string to list in Python

I have a particularly long, nasty string that looks something like this:

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

and so on. The key defining feature is that each "nameOfString" is followed by a \n with two spaces after it. The first nameOfString has two spaces in front of it as well.

I'm trying to create a list that would look something like this:

niceList = [nameOfString1, Inc_(stuff), nameOfString2, Inc_(stuff)] and so on.

I've tried newString = nastyString.split() as well as newString = nastyString.replace('\n ', ''), but neither solution works, because each nameOfString has a space after the comma and before the 'I' of Inc. Furthermore, not all the nameOfStrings have an 'Inc', but most do have some sort of space in their name.

Would really appreciate some guidance or direction on how I could tackle this issue, thanks!
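
For reference, here is a quick sketch of what the two attempts described in the question return on the sample string. The variable names by_whitespace and joined are only illustrative and are not from the original post.

```python
# Sample input copied from the question
nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

# split() with no arguments breaks on every run of whitespace,
# so each entry is also cut at the space after the comma
by_whitespace = nastyString.split()
print(by_whitespace)
# ['nameOfString1,', 'Inc_(stuff)', 'nameOfString2,', 'Inc_(stuff)']

# replace('\n ', '') removes each newline plus one following space,
# but the result is still one string rather than a list
joined = nastyString.replace('\n ', '')
print(repr(joined))
# '  nameOfString1, Inc_(stuff) nameOfString2, Inc_(stuff) '
```

Neither result uses the newline as the boundary between entries, which is what the desired list needs.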

Maybe you can try something like this:

 [word for word in nastyString.replace("\n", "").replace(",", "").strip().split(' ') if word !='']

Output:

['nameOfString1', 'Inc_(stuff)', 'nameOfString2', 'Inc_(stuff)']

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
# replace '\n' with ','
nastyString = nastyString.replace('\n', ',')
# split at ',' and `strip()` all extra spaces
niceList = [v.strip() for v in nastyString.split(',') if v.strip()]

Output:

niceList
['nameOfString1', 'Inc_(stuff)', 'nameOfString2', 'Inc_(stuff)']

Update: OP shared new input:

That's awesome, never knew about the strip function. However, I'm actually trying to include the "Inc" section, so I was hoping for output of: ['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)'] and so on, any advice?

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
niceList = [v.strip() for v in nastyString.split('\n') if v.strip()]

New output:

niceList
['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)']
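
The same idea could also be written with str.splitlines(), which splits on line boundaries without naming '\n' explicitly. This is only a sketch of an alternative: the helper name split_entries and the extra "Plain Name" entry are made up for illustration.

```python
def split_entries(text):
    """Return the non-blank lines of text with surrounding whitespace removed."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped if line]


nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
print(split_entries(nastyString))
# ['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)']

# Entries without an "Inc" part, or with other spaces in the name, stay whole
print(split_entries('  Plain Name\n  nameOfString2, Inc_(stuff)\n  '))
# ['Plain Name', 'nameOfString2, Inc_(stuff)']
```

Because only line breaks act as separators, commas and spaces inside a name are left untouched.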

You can use regular expressions:

import re

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

# raw string avoids the invalid "\s" escape warning; \s already covers "\n"
new_string = [i for i in re.split(r"[\s,]+", nastyString) if i]

Output:

['nameOfString1', 'Inc_(stuff)', 'nameOfString2', 'Inc_(stuff)']
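
If each entry should stay whole, including the ", Inc_(stuff)" part, a regex can match the content of each line instead of splitting on separators. This is a hedged sketch rather than part of the original answer; the pattern and the name entries are assumptions chosen for illustration.

```python
import re

nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '

# \S(?:.*\S)? matches from the first to the last non-whitespace character
# on a line; '.' does not cross newlines, so each match stays on one line
# and whitespace-only lines produce no match
entries = re.findall(r'\S(?:.*\S)?', nastyString)
print(entries)
# ['nameOfString1, Inc_(stuff)', 'nameOfString2, Inc_(stuff)']
```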

If you'd rather not replace '\n', do this:

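import re
nastyString = '  nameOfString1, Inc_(stuff)\n  nameOfString2, Inc_(stuff)\n  '
# '.' does not match a newline by default, so this collects every character except '\n'
word = re.findall(r'.', nastyString)
# join the characters back into a single string
s = ''.join(word)
print(s)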

Output: '  nameOfString1, Inc_(stuff)  nameOfString2, Inc_(stuff)  '

Now you can use split():

print(s.split(','))

