Match floats with a comma and replace the comma with a dot?
I have a text file that contains several types of numbers: integers, binary numbers and floats. I want to match only the floats and replace the comma with a dot.
Example of my text file (the order is arbitrary):
1000101 33434,34 1992 [3,41,4,5]
After conversion:
1000101 33434.34 1992 [3,41,4,5]
My code is:
lines = []
in_file = open("input.txt", "r")
for line in in_file:
    line = line.split(" ")
    for x in line:
        try:
            if isinstance(float(x.replace(',', '.')), float):
                line[line.index(x)] = float(x.replace(',', '.'))
        except:
            pass
    lines.append(line)
in_file.close()
But that converts all the other data to float as well, so what is the best way to resolve this? I thought of using a regex, but I don't know how to do it in Python.
Another approach, also using regular expressions:
import re

with open('input.txt', 'r+') as f:
    # Replace the comma in tokens of the form digits,digits
    # that have whitespace on both sides
    newf = re.sub(r'(\s+[+-]?[0-9]+),([0-9]+\s+)', r'\1.\2', f.read())
    f.seek(0)
    f.write(newf)
Test file:
1000101 33434,34 1992 [3,41,4,5]
12,43 129012 91 [1,2]
1000101 33434,34 1992 [3, 41,4,5]
Result:
1000101 33434.34 1992 [3,41,4,5]
12.43 129012 91 [1,2]
1000101 33434.34 1992 [3, 41,4,5]
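The pattern above needs whitespace on both sides of the number and consumes it, so a float at the very start of the file, or the second of two floats separated by a single space, can be missed. Here is a minimal sketch of one way around that, assuming a float is a whitespace-delimited token of the form digits,digits with an optional sign. The names FLOAT_WITH_COMMA and commas_to_dots are made up for this example:

```python
import re

# Assumption: a float is a standalone token such as 33434,34 or -0,5.
# The lookarounds check for whitespace (or start/end of text) without
# consuming it, so neighbouring tokens can still match.
FLOAT_WITH_COMMA = re.compile(r'(?<!\S)([+-]?\d+),(\d+)(?!\S)')

def commas_to_dots(text):
    """Return text with the decimal comma of standalone floats replaced by a dot."""
    return FLOAT_WITH_COMMA.sub(r'\1.\2', text)

sample = "12,43 0,5 1000101 [3,41,4,5] [3, 41,4,5]"
print(commas_to_dots(sample))
# 12.43 0.5 1000101 [3,41,4,5] [3, 41,4,5]
```

The list items are left alone because they are preceded by "[" or followed by "," rather than whitespace.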
Try this:
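import re
from ast import literal_eval

# A float token: digits, a comma, digits, and nothing else
FLOAT_RE = re.compile(r'^\d+,\d+$')

lines = []
with open("input.txt", "r") as in_file:
    for line in in_file:
        line = line.strip().split(" ")
        # enumerate yields the index directly, so no search with line.index(x)
        for i, x in enumerate(line):
            if FLOAT_RE.match(x):
                # "33434,34" -> "33434.34" -> 33434.34
                line[i] = literal_eval(x.replace(',', '.'))
        lines.append(line)
print(lines)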
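If the goal is to get the text back with only the floats changed, rather than a list of mixed types, the same token-by-token idea can keep everything as strings. This is a rough sketch under the same assumption about the float format (digits,digits with no sign). fix_line is a hypothetical helper name:

```python
import re

# Assumed float format, as in the answer above: digits, comma, digits
FLOAT_RE = re.compile(r'\d+,\d+')

def fix_line(line):
    """Replace the comma in float tokens; leave every other token untouched.

    Expects a line without its trailing newline.
    """
    tokens = line.split(" ")
    fixed = [t.replace(',', '.') if FLOAT_RE.fullmatch(t) else t for t in tokens]
    return " ".join(fixed)

print(fix_line("1000101 33434,34 1992 [3,41,4,5]"))
# 1000101 33434.34 1992 [3,41,4,5]
```

When reading from a file, strip the newline first, for example with line.rstrip("\n"), otherwise a float at the end of a line will not match.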
This should work for you:
lines = []
with open("input.txt", "r") as in_file:
    for line in in_file:
        line = line.split(" ")
        for i, x in enumerate(line):
            # eval runs arbitrary code, so use this only on trusted input
            tmp = eval(x)
            if isinstance(tmp, tuple):
                # "33434,34" evaluates to the tuple (33434, 34)
                line[i] = float(x.replace(',', '.'))
            else:
                line[i] = tmp
        lines.append(line)
It will convert everything to the right type.
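Two caveats with evaluating each token: eval executes whatever the file contains, and in Python 3 a token with a leading zero (such as 0101, or the 05 in 12,05) is a syntax error. Here is a hedged sketch of a safer variant, assuming tokens are separated by whitespace and lists contain no spaces. It uses ast.literal_eval, which only accepts literals, and checks for the float format first. parse_token and DECIMAL_COMMA are names invented for this example:

```python
import re
from ast import literal_eval

# Assumed float format: optional sign, digits, comma, digits
DECIMAL_COMMA = re.compile(r'[+-]?\d+,\d+')

def parse_token(token):
    """Convert one token to float, int or list; fall back to the raw string."""
    if DECIMAL_COMMA.fullmatch(token):
        return float(token.replace(',', '.'))
    try:
        return literal_eval(token)
    except (ValueError, SyntaxError):
        # e.g. "0101": leading zeros are not valid Python 3 integer literals
        return token

line = "1000101 33434,34 1992 [3,41,4,5]"
print([parse_token(t) for t in line.split()])
# [1000101, 33434.34, 1992, [3, 41, 4, 5]]
```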
If all your strings are in the same format, you can substitute only the first occurrence of the comma:
import re

s = "1000101 33434,34 1992 [3,41,4,5]"
# count=1 replaces only the first comma in the string
print(re.sub(",", ".", s, count=1))
Output:
1000101 33434.34 1992 [3,41,4,5]
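Replacing the first comma only works when the float comes before any list on the line. Since the question says the order is arbitrary, here is a small illustration of what happens otherwise. first_comma_to_dot is just a wrapper name chosen for this example:

```python
import re

def first_comma_to_dot(line):
    """Replace only the first comma in the line with a dot."""
    return re.sub(",", ".", line, count=1)

# Float first: the comma inside the float is replaced, as intended
print(first_comma_to_dot("1000101 33434,34 1992 [3,41,4,5]"))
# 1000101 33434.34 1992 [3,41,4,5]

# List first: the first comma is inside the list, so the list is damaged
print(first_comma_to_dot("[3,41,4,5] 33434,34 1992"))
# [3.41,4,5] 33434,34 1992
```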