
Scale Data to log-normal. Is my approach right?

I have a one-dimensional array whose values lie between 1 and 500. The distribution of the data looks log-normal.

What I want is to transform the array to log(data).

I am not sure which function to use:

numpy.log or numpy.log1p

My rescale function currently looks like this, but I am not sure if it is right:

import numpy as np

def ScaleData(dataset):
    # Take the natural log of each element and collect the results
    datas = []
    for x in np.nditer(dataset):
        a = np.log(x)
        datas.append(a)
    return np.array(datas)

Test:

38, 48, 39, 83, 64, 57
goes to:
3.63758616, 3.87120101, 3.66356165, 4.41884061, 4.15888308, 4.04305127

Is that right?

  • If you want to fit your data to a log-normal distribution, use scipy.stats.lognorm.fit(listofdata) and check the quality of the fit with a Kolmogorov-Smirnov test: scipy.stats.kstest.
  • If you only want to transform your data, np.log(dataset) should be enough. np.log is applied element-wise to the whole array, so no explicit loop is needed.
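
The two options above could be sketched roughly as follows. This is only an illustration under some assumptions: the `data` array is synthetic (generated with made-up parameters, since the real dataset is not shown), and `floc=0` is a choice made here to pin the location parameter at zero, not something the question requires. The sketch also prints `np.log1p` next to `np.log` to show that `log1p(x)` is `log(1 + x)`, which is mainly useful when the data can contain zeros; for values between 1 and 500 plain `np.log` is the direct log transform.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the real data: synthetic log-normal values,
# clipped to the 1..500 range mentioned in the question.
rng = np.random.default_rng(0)
data = np.clip(rng.lognormal(mean=4.0, sigma=0.5, size=1000), 1, 500)

# Option 1: plain transform. np.log works on the whole array at once.
log_data = np.log(data)

# np.log1p(x) is log(1 + x), so it is a different transform from np.log(x).
sample = np.array([38, 48, 39, 83, 64, 57])
print(np.log(sample))
print(np.log1p(sample))

# Option 2: fit a log-normal distribution.
# floc=0 (an assumption of this sketch) fixes the location parameter at 0.
shape, loc, scale = stats.lognorm.fit(data, floc=0)

# Kolmogorov-Smirnov test of the data against the fitted distribution.
ks_stat, p_value = stats.kstest(data, "lognorm", args=(shape, loc, scale))
print(shape, loc, scale, ks_stat, p_value)
```

Note that the p-value tends to be optimistic here because the distribution parameters were estimated from the same data that is being tested, so treat it as a rough check rather than a strict test.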

Best
