Remove all occurrences of a dot, except for the first occurrence, in the elements of a list

I have a list that I need to convert to floats. As the data was not entered by me, some elements have an accidental extra period, for example 39.04.1450. I need to be able to automatically remove all of the periods except for the first one that appears, so that I don't get an error when I say list=float(list).

Sample list:

latitude= [' -86.57', ' 39.04.1450', ' 37.819' ,' 45.82', ' 54.42', ' 0.' ,' 53.330444',
  ' +45.75' ,' 52.36', ' 43.2167', ' -36.75', ' 6.8N' ,' 40.833' ,' -97.981',
  ' 41.720', ' 41.720', ' 37.41' ,' 37.41' ,' 37.41', ' 37.41']

As you can see, latitude[1] has an extra decimal point. Of course, I will also need to strip the N in 6.8N, but that is a separate problem.

I would do it like this:

def fix_float(s):
    return s.replace('.', '[DOT]', 1).replace('.', '').replace('[DOT]', '.')

The function replaces the first occurrence of '.' with '[DOT]'. Then it removes all remaining occurrences of '.'. Finally, it replaces '[DOT]' with '.' again.

To apply it to all the elements of your list, write:

fixed_latitudes = [fix_float(s) for s in latitude]

def my_float(s):
    s = s.split(".")
    # keep the first piece, then a single dot, then all remaining pieces joined
    return float(".".join([s[0], "".join(s[1:])]))

This splits on '.' and rejoins the pieces, adding back only the first period. It does not, however, do anything about 6.8N.
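For comparison, the same keep-the-first-dot idea can be sketched with str.partition, which cuts the string at the first '.' only. This is just an illustration; the name keep_first_dot is hypothetical and does not come from the answer above.

```python
def keep_first_dot(s):
    """Return s with every '.' after the first one removed (hypothetical helper)."""
    # head: text before the first dot, dot: '.' or '' if absent, tail: the rest
    head, dot, tail = s.partition(".")
    return head + dot + tail.replace(".", "")


fixed = keep_first_dot(" 39.04.1450")   # returns ' 39.041450'
value = float(fixed)                    # float() ignores the leading space
unchanged = keep_first_dot(" 45")       # no dot, so the string comes back as-is
```

Like my_float, this does nothing about a trailing letter such as the N in 6.8N.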

You can use regular expressions:

import re  

pattern = re.compile(r'(\d+\.\d+)\.')
new_lst = [re.sub(pattern, r'\1', i).replace('N', '') for i in latitude]

\d means any digit, + means one or more, and \. matches the dot character. The parentheses capture that part of the match, which sub() later refers to as \1 (meaning the first capturing group).
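To see what the capturing group does, here is a small sketch that runs the same pattern on two inputs. It also shows a limitation: because matches cannot overlap, a single pass removes only one extra dot from a value with two. The helper remove_extra_dots is a hypothetical addition, not part of the answer.

```python
import re

pattern = re.compile(r'(\d+\.\d+)\.')

# One extra dot: the captured "39.04" is kept, the dot after it is dropped.
one_extra = pattern.sub(r'\1', ' 39.04.1450')   # ' 39.041450'

# Two extra dots: a single pass leaves one of them behind.
two_extra = pattern.sub(r'\1', '3.4.5.6')       # '3.45.6'


def remove_extra_dots(s):
    """Repeat the substitution until no match is left (hypothetical helper)."""
    while pattern.search(s):
        s = pattern.sub(r'\1', s)
    return s


cleaned = remove_extra_dots('3.4.5.6')          # '3.456'
```

The loop terminates because every successful substitution removes at least one dot. Consecutive dots such as '3..4' are not matched by this pattern, since it requires digits between the two dots.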

A small hack that works if the only corruption in your data is a trailing N or more than one '.'. Otherwise you will have to add more except clauses.

latitude = [' -86.57', ' 39.04.1450', ' 37.819', ' 45.82', ' 54.42', ' 0.', ' 53.330444', ' +45.75', ' 52.36', ' 43.2167', ' -36.75', ' 6.8N', ' 40.833', ' -97.981', ' 41.720', ' 41.720', ' 37.41', ' 37.41', ' 37.41', ' 37.41']
flist = []
for i in latitude:
    try:
        flist.append(float(i))
    except ValueError:
        if (i[-1] == 'N'):
            flist.append(float(i[:-1]))
        else:
            flist.append(float("{}.{}".format(i.split(".")[0],''.join(i.split(".")[1:]))))

print (flist)

Output

[-86.57, 39.04145, 37.819, 45.82, 54.42, 0.0, 53.330444, 45.75, 52.36, 43.2167, -36.75, 6.8, 40.833, -97.981, 41.72, 41.72, 37.41, 37.41, 37.41, 37.41]
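If the data turns out to contain other kinds of damage, one alternative to adding an except clause per case is to collect the values that still fail and inspect them afterwards. The sketch below assumes that a trailing N is the only letter to strip; to_floats and its return shape are hypothetical.

```python
def to_floats(values):
    """Convert strings to floats, keeping only the first dot.

    Returns (converted, rejected), where rejected holds the original
    strings that could not be parsed. Hypothetical helper.
    """
    converted, rejected = [], []
    for raw in values:
        # drop surrounding whitespace and a trailing N (assumed to be the only letter)
        text = raw.strip().rstrip("N")
        # keep the first dot, remove the others
        head, *rest = text.split(".")
        repaired = head + "." + "".join(rest) if rest else head
        try:
            converted.append(float(repaired))
        except ValueError:
            rejected.append(raw)
    return converted, rejected


good, bad = to_floats([" 39.04.1450", " 6.8N", " 0.", "n/a"])
# good == [39.04145, 6.8, 0.0], bad == ["n/a"]
```

Returning the rejected strings makes it easier to decide which further cases are worth handling.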

You can use a regular expression to extract the numbers from the list and convert them to floats right away.

import re
# matches an optional sign, digits, the first dot and the digits after it
lat = lambda l: float(re.search(r'[+-]*\d*\.\d*', l).group(0))
print(list(map(lat, latitude)))

Edit:
Sorry, I hadn't noticed that the digits following the second decimal point are also valid. The new solution still assumes that the first dot is correct and that all the others are to be removed.

One of the values contains N, so I suppose there might also be an S, which means southern, i.e. negative latitude. Therefore I built this assumption into the code.

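def valid_lat(s):
    # leading part: optional whitespace and sign, digits, the first dot, digits
    m = re.match(r'\s*[+-]*\d*\.\d*', s)
    a = m.group(0)
    # the rest of the string (sliced, because lstrip() strips a set of characters, not a prefix)
    b = s[m.end():]
    d = b.replace('.', '')
    c = re.sub('[nNsS]$', '', d)
    sign = 1.
    # re.search, because re.match only matches at the start of the string
    if re.search('[sS]$', d):
        sign = -1.
    return float(a + c) * sign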

Then just map it:

list(map(valid_lat, latitude))
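The hemisphere handling can also be sketched with plain string methods. As in the answer, this assumes that a trailing S means a southern (negative) latitude and that a trailing N can simply be dropped; parse_latitude is a hypothetical name.

```python
def parse_latitude(s):
    """Parse a latitude string, keeping only the first dot (hypothetical helper).

    Assumption: a trailing 'S' or 's' flips the sign; a trailing 'N' or 'n'
    is dropped without changing the value.
    """
    text = s.strip()
    sign = -1.0 if text[-1:] in ("S", "s") else 1.0
    text = text.rstrip("nNsS")
    # keep everything up to and including the first dot, drop later dots
    head, dot, tail = text.partition(".")
    return sign * float(head + dot + tail.replace(".", ""))


north = parse_latitude(" 6.8N")        # 6.8
south = parse_latitude(" 6.8S")        # -6.8
fixed = parse_latitude(" 39.04.1450")  # 39.04145
```

A value that carries both a minus sign and an S would come out positive here; whether that combination can occur in the data is not stated in the question.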

What about this one?

def lol_float(_str):
    # check where decimal point is (starting from right) '3.45' -> 2
    # (only valid for digits and dots: a sign is dropped and a trailing letter shifts the point)
    dpi = (len(_str) - _str.count('.') - _str.index('.')) if '.' in _str else 0
    # '3.45' -> 345.0
    float_as_int = float(''.join(c for c in _str if c.isdigit()))
    # dpi = 2, float_as_int = 345.0 -> 3.45
    return float_as_int / (10 ** dpi)

Output:

>>> lol_float('3.34')
3.34
>>> lol_float('3.45')
3.45
>>> lol_float('345')
345.0
>>> lol_float('34.5')
34.5
>>> lol_float('3.4.5')
3.45
>>> lol_float('3.45')
3.45
>>> lol_float('345')
345.0
>>> lol_float('3.4..5')
3.45
>>> lol_float('3.4..5.4')
3.454

Just being original... :)

You can remove any trailing letters using str.rstrip:

from string import ascii_letters

out = []
for x in latitude:
    x = x.rstrip(ascii_letters)
    spl = x.split(".")
    if len(spl) > 2:
        out.append(float("{}.{}".format(spl[0],"".join(spl[1:]))))
    else:
        out.append(float(x))
print(out)

[-86.57, 39.04145, 37.819, 45.82, 54.42, 0.0, 53.330444, 45.75, 52.36, 43.2167, -36.75, 6.8, 40.833, -97.981, 41.72, 41.72, 37.41, 37.41, 37.41, 37.41]

You can do it in a single list comprehension, but less efficiently:

print([float(x.rstrip(ascii_letters)[::-1].replace(".", "", x.count(".") - 1)[::-1]) if x.count(".") > 1 else float(x.rstrip(ascii_letters)) for x in latitude])
