Find max r_value**2 in Python

I have an (x, y) dataset, and I would like to calculate r_value**2 for every run of 10 consecutive elements (elements 0 to 9, then 1 to 10, and so on up to the last 10 elements).

Ideally the code should report the maximum r_value**2 and save all r-values in a list. I've written a loop, but I don't know how to tell stats.linregress to look only between test_i and test_i+10, or how to save all the r_value**2 results in a list.

So far, I have this:

import matplotlib.pyplot as plt
from scipy import stats
import numpy as np
import csv


path = '/storage/.../01_python_in/'

test = np.loadtxt(path + 'sample_data.txt', skiprows=0)

test_min = 0
test_max = len(test)

for test_i in range(test_min, test_max-10):
    slope, intercept, r_value, p_value, std_err = stats.linregress(test[:, 0], test[:, 1])
    print('i:', test_i, 'r**2:', r_value**2)

To do this manually, slice the first dimension of your array from test_i to test_i + window (with window = 10), like this:

stats.linregress(test[test_i:test_i+window, 0], test[test_i:test_i+window, 1])

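Actually, you don't have to split apart the x and y parts for linregress, because a single two-column array is split along its length-2 dimension. Note that recent SciPy releases deprecate this single-argument form, so if your version rejects it, pass the two columns separately as above:

stats.linregress(test[test_i:test_i+window])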
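
As a quick check of the slicing, here is a minimal sketch on synthetic data. The array `test` below is made up for illustration and stands in for the contents of sample_data.txt; the start index is arbitrary. The sketch passes the two columns separately, so it does not rely on the single-array form.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the real data: 30 rows, columns are x and y.
# y is an exact linear function of x, so any window should give r**2 close to 1.
x = np.arange(30, dtype=float)
test = np.column_stack((x, 2.0 * x + 1.0))

window = 10
test_i = 5  # arbitrary start index, chosen only for illustration

chunk = test[test_i:test_i + window]  # rows test_i .. test_i + window - 1
result = stats.linregress(chunk[:, 0], chunk[:, 1])

print(chunk.shape)         # (10, 2)
print(result.rvalue ** 2)  # approximately 1.0 for this exactly linear data
```

The slice keeps both columns, so `chunk[:, 0]` and `chunk[:, 1]` are the x and y values of that one window.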

You could also save the r-values by building a list in your loop. This, along with the slicing above, is shown here:

window = 10
r_values = []
# + 1 so that the final window (the last `window` rows) is included
for test_i in range(len(test) - window + 1):
    # each slice is a (window, 2) array holding the x and y columns
    slope, intercept, r_value, p_value, std_err = \
            stats.linregress(test[test_i:test_i + window])
    r_values.append(r_value)
    print('i:', test_i, 'r**2:', r_value**2)

It's actually simple enough for a list comprehension:

r_values = [stats.linregress(test[i:i+window]).rvalue for i in range(len(test)-window+1)]

You can then get the squares with:

r_values = np.asarray(r_values)
r_values2 = r_values**2

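And the start index of the window with the largest r**2, together with that maximum value, with:

max_i = np.argmax(r_values2)
r2_max = r_values2[max_i]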
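
Putting the pieces together, the following is a self-contained sketch rather than the exact script from the question. The synthetic `test` array and the helper name `rolling_r_squared` are assumptions made for illustration; with real data, `test` would come from `np.loadtxt` as in the question.

```python
import numpy as np
from scipy import stats


def rolling_r_squared(data, window=10):
    """Return r**2 of a linear fit for every run of `window` consecutive rows.

    `data` is an (n, 2) array with x in column 0 and y in column 1.
    (Hypothetical helper for this example, not part of SciPy.)
    """
    n_windows = len(data) - window + 1  # window starts run from 0 to n - window
    r_values = [
        stats.linregress(data[i:i + window, 0], data[i:i + window, 1]).rvalue
        for i in range(n_windows)
    ]
    return np.asarray(r_values) ** 2


# Synthetic data: random noise everywhere except rows 20..29,
# which lie exactly on a straight line.
rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
y = rng.normal(size=50)
y[20:30] = 3.0 * x[20:30] - 4.0
test = np.column_stack((x, y))

r_values2 = rolling_r_squared(test, window=10)
max_i = int(np.argmax(r_values2))  # start index of the best-fitting window
r2_max = r_values2[max_i]          # largest r**2 over all windows

print('windows:', len(r_values2))
print('best window starts at', max_i, 'with r**2 =', r2_max)
```

With 50 rows and a window of 10 there are 41 windows, and `r2_max` is the maximum r_value**2 asked for in the question. If several windows tie, `np.argmax` returns the first one.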
