How to extract certain parts of string?
I have a string of the form
11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550 1 0 0 0 1 0 8 0 8 7 5 3 355243301884671724 17.0 15.0
and would like to write this to a CSV file in the form
1,1,1663.31578,6.045e-26,0.6292,0.0698,0.304,2724.0415,0.64,-0.00955
The only way I know how to do this in Python is something of the form
import csv
s = "11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550 1 0 0 0 1 0 8 0 8 7 5 3 355243301884671724 17.0 15.0"
with open(<path to output_csv>, "w") as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    for line in data:
        writer.writerow([s[0:2], s[2], ..., s[59:68]])
This works, of course, but seems like a very unsophisticated way to do it. Are there any better options?
If your string has spaces between the elements, the easiest way would be:
s = "11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550 1 0 0 0 1 0 8 0 8 7 5 3 355243301884671724 17.0 15.0"
s = [x for x in s.split(" ") if x != ""]
csv_string = ",".join(s)
This works even if there are multiple spaces between elements, as in the example.
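As a minimal sketch of the same idea: `str.split()` called with no argument already collapses runs of whitespace, and `csv.writer` can do the joining. The sample line with irregular spacing and the use of `io.StringIO` in place of a real output file are assumptions for illustration only. Fields that are not separated by whitespace will still end up fused in one token.

```python
import csv
import io

# Hypothetical sample line with irregular spacing (an assumption, not the original data).
line = "11   1663.315780  6.045E-26   17.0 15.0"

# split() with no argument splits on any run of whitespace and drops empty strings.
fields = line.split()

# Write to an in-memory buffer; a real file opened with newline="" behaves the same way.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
writer.writerow(fields)

csv_text = buffer.getvalue()
print(csv_text, end="")
```

Using `csv.writer` instead of `",".join(...)` also means a field containing a comma or quote would be escaped correctly.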
--- EDIT ---
According to the conversation, the elements have fixed breakpoints. That information can be used like this:
s = "11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550 1 0 0 0 1 0 8 0 8 7 5 3 355243301884671724 17.0 15.0"
breakpoints = [1,2,14,24,34,40,44,55,58,65]
breakpoints.insert(0,0) # we need starting zero to make for loop work
elements = []
for i in range(len(breakpoints)-1):
    elements.append(s[breakpoints[i]:breakpoints[i+1]].strip())
",".join(elements)
This method also gets rid of extra whitespace, because each substring is stripped before it is appended to the elements list.
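The fixed-breakpoint idea can also be sketched with `zip` and a numeric conversion, which produces the normalized number format asked for in the question. This is only a sketch under assumptions: it uses just the first 44 characters of the sample line (the spacing of the later columns cannot be recovered from the post), the leading breakpoints listed above, a hypothetical helper `parse_number`, and `io.StringIO` standing in for the output file.

```python
import csv
import io

# First 44 characters of the sample line; later columns are left out because
# their original spacing is unknown.
line = "11 1663.315780 6.045E-26 6.292E-01.06980.304"

# Column boundaries taken from the breakpoints above, with the leading 0 included.
breakpoints = [0, 1, 2, 14, 24, 34, 40, 44]


def parse_number(text):
    """Hypothetical helper: return an int when possible, otherwise a float."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        return float(text)


# Pair each boundary with the next one to get (start, end) slices.
values = [
    parse_number(line[start:end])
    for start, end in zip(breakpoints, breakpoints[1:])
]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
csv.writer(buffer, lineterminator="\n").writerow(values)

csv_row = buffer.getvalue()
print(csv_row, end="")  # 1,1,1663.31578,6.045e-26,0.6292,0.0698,0.304
```

Converting to `int` or `float` before writing is what turns `6.292E-01` into `0.6292` and `.06980` into `0.0698`; if the original text must be preserved exactly, write the stripped strings instead.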
If this:
11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550
is really supposed to be this:
s = "11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64 -.009550"
then it is very easy:
print(s.split(" "))
Otherwise, you will need to do the last bit of splitting manually:
s = " 11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550 "
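parts = s.split()  # no argument: ignores the leading/trailing spaces
# split the fused last token at the minus sign, keeping the sign on the second part
head, sign, tail = parts.pop().partition("-")
parts += [head, sign + tail]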
print(parts)
You can use s.split() to split the string on whitespace.
>>> s = " 11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550 1 0 0 0 1 0 8 0 8 7 5 3 355243301884671724 17.0 15.0"
>>> s.split()
['11', '1663.315780', '6.045E-26', '6.292E-01.06980.304', '2724.04150.64-.009550', '1', '0', '0', '0', '1', '0', '8', '0', '8', '7', '5', '3', '355243301884671724', '17.0', '15.0']
Here is a possible solution using pandas. The elements really do need to be space-separated for this to work.
import pandas as pd
txt = "1 1 1663.315780 6.045E-26 6.292E-01 .06980 .304 2724 .04150 .64 -.009550"
# split and put the list to a dataframe
df = pd.DataFrame({"a": txt.split(" ")})
# convert to numeric
df["a"] = pd.to_numeric(df["a"])
# save to csv
df.to_csv("file.csv", index=False)
You can do the following, considering this sample:
c = '11 1663.315780 6.045E-26 6.292E-01.06980.304 2724.04150.64-.009550'
d = c.replace(" ", ",")
print(d)
which will give
11,1663.315780,6.045E-26,6.292E-01.06980.304,2724.04150.64-.009550
or
print(c.split(" "))
which will give
['11', '1663.315780', '6.045E-26', '6.292E-01.06980.304', '2724.04150.64-.009550']
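One caveat with `replace(" ", ",")` is worth a small illustration: if the input contains runs of spaces, every space becomes its own comma. The input below is a hypothetical variant of the sample with repeated spaces, used only to show the difference from splitting on whitespace and re-joining.

```python
# Hypothetical input with repeated spaces (an assumption for illustration).
c = "11  1663.315780   6.045E-26"

# Every single space turns into a comma, so empty fields appear.
replaced = c.replace(" ", ",")

# Splitting on whitespace first collapses the runs.
joined = ",".join(c.split())

print(replaced)  # 11,,1663.315780,,,6.045E-26
print(joined)    # 11,1663.315780,6.045E-26
```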