Remove all occurrences of dot, except for the first occurrence, in the elements of a list
I have a list that I need to convert to floats. As the data was not entered by me, some elements have an accidental extra period, for example 39.04.1450. I need to be able to automatically remove all of the periods except for the first one that appears, so that I don't get an error when I say list=float(list).
Sample list:
latitude= [' -86.57', ' 39.04.1450', ' 37.819' ,' 45.82', ' 54.42', ' 0.' ,' 53.330444',
' +45.75' ,' 52.36', ' 43.2167', ' -36.75', ' 6.8N' ,' 40.833' ,' -97.981',
' 41.720', ' 41.720', ' 37.41' ,' 37.41' ,' 37.41', ' 37.41']
As you can see, latitude[1] has an extra decimal point. Of course, I will also need to strip the N in 6.8N, but that is a separate problem.
I would do it like this:
def fix_float(s):
    return s.replace('.', '[DOT]', 1).replace('.', '').replace('[DOT]', '.')
The function replaces the first occurrence of '.' with '[DOT]'. Then it removes all remaining occurrences of '.'. Finally, it replaces '[DOT]' back with '.'.
To apply it to all the elements of your list, write:
fixed_latitudes = [fix_float(s) for s in latitude]
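The same effect can probably be had without a temporary '[DOT]' marker by using str.partition, which splits a string at the first occurrence of a separator only. The following is a minimal sketch under that assumption; the names fix_float_partition and latitude_sample are made up for this illustration and are not part of the answer above.

```python
def fix_float_partition(s):
    # head: text before the first dot, sep: '.' (or '' if there is no dot),
    # tail: everything after the first dot
    head, sep, tail = s.partition('.')
    # Keep the first dot (if any) and drop every later one
    return head + sep + tail.replace('.', '')


# A few values taken from the sample list in the question
latitude_sample = [' 39.04.1450', ' 37.819', ' 0.', ' -36.75']
print([float(fix_float_partition(s)) for s in latitude_sample])
# → [39.04145, 37.819, 0.0, -36.75]
```

Like the version above, this does nothing about a trailing letter such as the N in 6.8N.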
def my_float(s):
    # Split on every dot, then rejoin with a single dot after the first piece
    s = s.split(".")
    return float(".".join([s[0], "".join(s[1:])]))
This will split on . and rejoin, adding back only the first period. It does not, however, do anything about 6.8N.
You can use regular expressions:
import re
pattern = re.compile(r'(\d+\.\d+)\.')
new_lst = [re.sub(pattern, r'\1', i).replace('N', '') for i in latitude]
\d means any digit, + means one or more, and \. matches the dot character. The parentheses capture that part of the match, which is later used in sub() as \1 (meaning the first capturing group).
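One thing to be aware of: a single pass of this pattern removes only one extra dot per match, so a value such as '1.2.3.4' comes out as '1.23.4'. If your data may contain several stray dots, one option is to re-apply the substitution until nothing changes. This is a minimal sketch under that assumption; drop_extra_dots is a hypothetical helper name, and the pattern is the one from the answer above.

```python
import re

# Same pattern as above: digits, a dot, digits, then one extra dot
pattern = re.compile(r'(\d+\.\d+)\.')


def drop_extra_dots(s):
    # Re-apply the substitution until there is nothing left to remove
    while True:
        s, count = pattern.subn(r'\1', s)
        if count == 0:
            return s


print(repr(drop_extra_dots(' 39.04.1450')))  # → ' 39.041450'
print(repr(drop_extra_dots('1.2.3.4')))      # → '1.234'
```

re.subn returns both the new string and the number of replacements made, which is what ends the loop.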
Here is a small hack that works if the only corruption in your data is an N at the end or more than one '.'. Otherwise, you will have to add more except clauses.
latitude = [' -86.57', ' 39.04.1450', ' 37.819', ' 45.82', ' 54.42', ' 0.', ' 53.330444', ' +45.75', ' 52.36', ' 43.2167', ' -36.75', ' 6.8N', ' 40.833', ' -97.981', ' 41.720', ' 41.720', ' 37.41', ' 37.41', ' 37.41', ' 37.41']
flist = []
for i in latitude:
    try:
        flist.append(float(i))
    except ValueError:
        if i[-1] == 'N':
            flist.append(float(i[:-1]))
        else:
            flist.append(float("{}.{}".format(i.split(".")[0], ''.join(i.split(".")[1:]))))
print(flist)
Output
[-86.57, 39.04145, 37.819, 45.82, 54.42, 0.0, 53.330444, 45.75, 52.36, 43.2167, -36.75, 6.8, 40.833, -97.981, 41.72, 41.72, 37.41, 37.41, 37.41, 37.41]
You can use a regular expression to extract the numbers from the list and convert them to floats right away.
import re
lat = lambda l: float(re.search(r'[+-]*\d*\.\d*', l).group(0))
print(list(map(lat, latitude)))
Edit:
Sorry, I hadn't noticed that the digits following the second decimal point are also valid. The new solution still assumes that the first dot is correct and that all the rest are to be removed.
One of the values contains N, so I suppose there might also be S, which means southern, i.e. negative latitude. Therefore I implemented this assumption in the code.
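def valid_lat(s):
    # Leading part: optional sign, digits, the first dot, and the digits after it
    m = re.search(r'\s*[+-]*\d*\.\d*', s)
    a = m.group(0)
    # Remainder of the string (slice by position; str.lstrip strips a character set, not a prefix)
    b = s[m.end():]
    d = b.replace('.', '')
    c = re.sub('[nNsS]$', '', d)
    # A trailing S means southern, i.e. negative latitude
    sign = -1. if re.search('[sS]$', d) else 1.
    return float(a + c) * sign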
Then just map it:
list(map(valid_lat, latitude))
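The N/S assumption described above can also be sketched without regular expressions. This is only an illustration: parse_latitude is a hypothetical name, and treating a trailing S as a sign flip is the same assumption the answer makes, not something the question confirms.

```python
def parse_latitude(s):
    s = s.strip()
    sign = 1.0
    # Assumption: a trailing N or S marks the hemisphere, with S meaning negative
    if s and s[-1] in 'nNsS':
        if s[-1] in 'sS':
            sign = -1.0
        s = s[:-1]
    # Keep the first dot and drop any later ones
    head, sep, tail = s.partition('.')
    return sign * float(head + sep + tail.replace('.', ''))


print([parse_latitude(s) for s in [' 6.8N', ' 6.8S', ' 39.04.1450', ' -36.75']])
# → [6.8, -6.8, 39.04145, -36.75]
```

A value that carries both a minus sign and an S would be flipped back to positive by this sketch, so such inputs would need a rule of their own.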
What about this one?
def lol_float(_str):
    # count the non-dot characters after the first decimal point: '3.45' -> 2
    dpi = (len(_str) - _str.count('.') - _str.index('.')) if '.' in _str else 0
    # keep only the digits: '3.45' -> 345.0
    float_as_int = float(''.join(filter(lambda x: x.isdigit(), _str)))
    # dpi = 2, float_as_int = 345.0 -> 3.45
    return float_as_int / (10 ** dpi)
Output:
>>> lol_float('3.34')
3.34
>>> lol_float('3.45')
3.45
>>> lol_float('345')
345.0
>>> lol_float('34.5')
34.5
>>> lol_float('3.4.5')
3.45
>>> lol_float('3.45')
3.45
>>> lol_float('345')
345.0
>>> lol_float('3.4..5')
3.45
>>> lol_float('3.4..5.4')
3.454
Just being original... :)
You can remove any trailing letters using str.rstrip:
from string import ascii_letters
out = []
for x in latitude:
    # Strip any trailing letters such as N
    x = x.rstrip(ascii_letters)
    spl = x.split(".")
    if len(spl) > 2:
        # Keep the first dot and join the remaining pieces without dots
        out.append(float("{}.{}".format(spl[0], "".join(spl[1:]))))
    else:
        out.append(float(x))
print(out)
[-86.57, 39.04145, 37.819, 45.82, 54.42, 0.0, 53.330444, 45.75, 52.36, 43.2167, -36.75, 6.8, 40.833, -97.981, 41.72, 41.72, 37.41, 37.41, 37.41, 37.41]
You can do it in a single list comprehension, but less efficiently:
print([float(x.rstrip(ascii_letters)[::-1].replace(".", "", x.count(".") - 1)[::-1]) if x.count(".") > 1 else float(x.rstrip(ascii_letters)) for x in latitude])