

Convert float to string in Numba (Python, NumPy array)

I am running an @nb.njit function in which I am trying to put a float into a string array.

import numpy as np
import numba as nb

@nb.njit(nogil=True)
def func():
    my_array = np.empty(6, dtype=np.dtype("U20"))
    my_array[0] = np.str(2.35646)
    return my_array


if __name__ == '__main__':
    a = func()
    print(a)

I am getting the following error:

numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<class 'str'>) with argument(s) of type(s): (float64)

Which function am I supposed to use to convert a float to a string within Numba?

The numpy.str function is not supported so far. A list of all the supported NumPy functions is available on Numba's website.

The built-in str is not supported either. This can be checked on the supported Python features page.

The only way to do what you are trying to do would be to write a function that converts a float to a string using only the Python and NumPy features supported by Numba.

Before going in this direction, I would nevertheless reconsider whether converting floats into strings is necessary. It may not be very efficient, and the overhead of the float-to-string conversion may cancel out the benefit of jitting the functions.

Of course, this is hard to tell without knowing more about the project.
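As an illustration of that alternative, here is a minimal sketch that keeps the jitted function purely numeric and does the string conversion afterwards in plain NumPy. The function name compute_values and the values it produces are made up for this example, and the try/except only exists so that the snippet also runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fallback so the sketch also runs without Numba
    def njit(func):
        return func


@njit
def compute_values(n):
    # Hypothetical numeric kernel: it only touches floats,
    # so there is nothing string-related for Numba to reject.
    out = np.empty(n, dtype=np.float64)
    for i in range(n):
        out[i] = 2.35646 * (i + 1)
    return out


values = compute_values(3)
# Convert to fixed-width strings outside the jitted function.
labels = values.astype("U20")
print(labels)
```

Whether this fits depends on where the strings are actually needed; if they are only used for output or labelling, converting once at the end is usually the simpler option.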

I wanted a float-to-string conversion that delivers a string like "23.45" with two fraction digits. My solution is this. Maybe it helps someone.

import math
import numba as nb


@nb.njit  # assumes a Numba version where str() on integers is supported
def floatToString(floatnumber):
    # Only meant for non-negative inputs; the fraction is truncated, not rounded.
    whole = math.floor(floatnumber)
    frac = math.floor((floatnumber % 1) * 100.0)
    frac_str = str(frac)
    # Pad with a leading zero so that e.g. 23.05 yields "23.05" instead of "23.5".
    if frac < 10:
        frac_str = "0" + frac_str
    return str(whole) + "." + frac_str

Be aware of rounding issues, though; it was enough for me.
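One way to reduce the rounding issue is to scale and round the value first and only then split it into whole and fractional parts with integer arithmetic. The following is a minimal sketch of that idea, not a tested drop-in: the name float_to_str_fixed and the decimals parameter are made up here, it assumes a Numba version where str() works on integers, and it will overflow for values whose scaled magnitude does not fit in a 64-bit integer. The try/except lets it run as plain Python when Numba is not installed.

```python
import math

try:
    from numba import njit
except ImportError:  # fallback so the sketch also runs without Numba
    def njit(func):
        return func


@njit
def float_to_str_fixed(value, decimals=2):
    # Handle the sign separately so the arithmetic below sees a non-negative value.
    sign = ""
    if value < 0.0:
        sign = "-"
        value = -value
    # Build 10**decimals with integer multiplication only.
    scale = 1
    for _ in range(decimals):
        scale *= 10
    # Round half up at the requested precision, then split with integer ops.
    scaled = math.floor(value * scale + 0.5)
    whole = scaled // scale
    frac = scaled % scale
    # Left-pad the fraction with zeros to the requested width.
    frac_str = str(frac)
    while len(frac_str) < decimals:
        frac_str = "0" + frac_str
    return sign + str(whole) + "." + frac_str


print(float_to_str_fixed(23.456))  # 23.46
print(float_to_str_fixed(-2.5))    # -2.50
```

Values that sit very close to a rounding boundary in binary can still round the "wrong" way compared with their decimal spelling, so this reduces the problem rather than removing it.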

With later versions of Numba you can use str() on integers:

import numba as nb


@nb.njit
def int2str(value):
    return str(value)


n = 10
print([int2str(x) for x in range(-n, n)])
# ['-10', '-9', '-8', '-7', '-6', '-5', '-4', '-3', '-2', '-1', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

n = 1000
%timeit [str(x) for x in range(-n, n)]
# 1000 loops, best of 5: 391 µs per loop
%timeit [int2str(x) for x in range(-n, n)]
# 1000 loops, best of 5: 813 µs per loop

This is slower than pure Python, but it may come in handy when the conversion has to live inside njit()-accelerated code.

The same functionality is not implemented yet for floats.

A relatively generic approach is provided below:

import math
import numba as nb


@nb.njit
def cut_trail(f_str):
    cut = 0
    for c in f_str[::-1]:
        if c == "0":
            cut += 1
        else:
            break
    if cut == 0:
        for c in f_str[::-1]:
            if c == "9":
                cut += 1
            else:
                cut -= 1
                break
    if cut > 0:
        f_str = f_str[:-cut]
    if f_str == "":
        f_str = "0"
    return f_str


@nb.njit
def float2str(value):
    if value == 0.0:
        return "0.0"
    elif value < 0.0:
        return "-" + float2str(-value)
    else:
        max_digits = 16
        min_digits = -4
        e10 = math.floor(math.log10(value)) if value != 0.0 else 0
        if min_digits < e10 < max_digits:
            i_part = math.floor(value)
            f_part = math.floor((1 + value % 1) * 10.0 ** max_digits)
            i_str = str(i_part)
            f_str = cut_trail(str(f_part)[1:max_digits - e10])
            return i_str + "." + f_str
        else:
            m10 = value / 10.0 ** e10
            exp_str_len = 4
            i_part = math.floor(m10)
            f_part = math.floor((1 + m10 % 1) * 10.0 ** max_digits)
            i_str = str(i_part)
            f_str = cut_trail(str(f_part)[1:max_digits])
            e_str = str(e10)
            if e10 >= 0:
                e_str = "+" + e_str
            return i_str + "." + f_str + "e" + e_str

This is less precise and slower (by a factor of about 3) than pure Python:

numbers = 0.0, 1.0, 1.000001, -2000.000014, 1234567890.12345678901234567890, 1234567890.12345678901234567890e10, 1234567890.12345678901234567890e-30, 1.234e-200, 1.234e200
k = 32
for number in numbers:
    print(f"{number!r:{k}}  {str(number)!r:{k}}  {float2str(number)!r:{k}}")
# 0.0                               '0.0'                             '0.0'                           
# 1.0                               '1.0'                             '1.0'                           
# 1.000001                          '1.000001'                        '1.000001'                      
# -2000.000014                      '-2000.000014'                    '-2000.0000139'                 
# 1234567890.1234567                '1234567890.1234567'              '1234567890.123456'             
# 1.2345678901234567e+19            '1.2345678901234567e+19'          '1.234567890123456e+19'         
# 1.2345678901234568e-21            '1.2345678901234568e-21'          '1.234567890123457e-21'         
# 1.234e-200                        '1.234e-200'                      '1.234e-200'                    
# 1.234e+200                        '1.234e+200'                      '1.2339e+200' 

%timeit -n 10 -r 10 [float2str(x) for x in numbers]
# 10 loops, best of 10: 18.1 µs per loop
%timeit -n 10 -r 10 [str(x) for x in numbers]
# 10 loops, best of 10: 6.5 µs per loop

Nevertheless, it may be used as a workaround until str() is implemented natively for float arguments in Numba.

Note that the edge cases are handled relatively coarsely, and an error in the last 1-2 digits is relatively frequent.
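To put a number on that error, one option is a round-trip check: parse the produced string back with float() and compare it with the original value. The sketch below is plain Python and does not call float2str itself; the helper name roundtrip_rel_error is made up for this example, and the sample strings are copied from the float2str column of the comparison above.

```python
def roundtrip_rel_error(text, value):
    """Relative difference between float(text) and the original value."""
    parsed = float(text)
    if value == 0.0:
        return abs(parsed)
    return abs(parsed - value) / abs(value)


# (original value, string reported for float2str in the comparison above)
samples = [
    (1.000001, "1.000001"),
    (-2000.000014, "-2000.0000139"),
    (1.234e200, "1.2339e+200"),
]
for value, text in samples:
    print(f"{value!r:>16}  {text:>16}  {roundtrip_rel_error(text, value):.1e}")
```

For these samples the first string round-trips exactly, while the last one is off by a relative error of roughly 8e-5, which is far above double-precision roundoff and shows how coarse the edge cases can be.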
