Matrix Approximation and Predicting Timeseries in Python/R with SVD
I have an Excel file with 126 rows and 5 columns of numbers, and I have to use that data and SVD methods to predict 5-10 more rows of data. I have implemented SVD in Python successfully using numpy:
import numpy as np
from numpy import genfromtxt
my_data = genfromtxt('data.csv', delimiter=',')
U, s, V = np.linalg.svd(my_data)
print ("U:")
print (U)
print ("\nSigma:")
print (s)
print ("\nVT:")
print (V)
which outputs:
U:
[[-0.03339497 0.10018171 0.01013636 ..., -0.10076323 -0.09740801
-0.08901366]
[-0.02881809 0.0992715 -0.01239945 ..., -0.02920558 -0.04133748
-0.06100236]
[-0.02501102 0.10637736 -0.0528663 ..., -0.0885227 -0.05408083
-0.01678337]
...,
[-0.02418483 0.10993637 0.05200962 ..., 0.9734676 -0.01866914
-0.00870467]
[-0.02944344 0.10238372 0.02009676 ..., -0.01948701 0.98455034
-0.00975614]
[-0.03109401 0.0973963 -0.0279125 ..., -0.01072974 -0.0109425
0.98929811]]
Sigma:
[ 252943.48015512 74965.29844851 15170.76769244 4357.38062076
3934.63212778]
VT:
[[-0.16143572 -0.22105626 -0.93558846 -0.14545156 -0.16908786]
[ 0.5073101 0.40240734 -0.34460639 0.45443181 0.50541365]
[-0.11561044 0.87141558 -0.07426656 -0.26914744 -0.38641073]
[ 0.63320943 -0.09361249 0.00794671 -0.75788695 0.12580436]
[-0.54977724 0.14516905 -0.01849291 -0.35426346 0.74217676]]
But I am not sure how to use this data to predict my values. I am using this link http://datascientistinsights.com/2013/02/17/single-value-decomposition-a-golfers-tutotial/ as a reference, but it is in R. At the end they predict values using this R command:
approxGolf_1 <- golfSVD$u[,1] %*% t(golfSVD$v[,1]) * golfSVD$d[1]
Here is the IdeOne link to the entire R code: http://ideone.com/Yj3y6j
I'm not really familiar with R, so can anyone tell me whether there is a similar function in Python to the command above, or explain what that command is doing exactly?
Thanks.
I will use the golf course example data you linked to set the stage:
import numpy as np
A=np.matrix((4,4,3,4,4,3,4,2,5,4,5,3,5,4,5,4,4,5,5,5,2,4,4,4,3,4,5))
A=A.reshape((3,9)).T
This gives you the original table of 9 rows and 3 columns, holding the scores on 9 holes for 3 players:
matrix([[4, 4, 5],
[4, 5, 5],
[3, 3, 2],
[4, 5, 4],
[4, 4, 4],
[3, 5, 4],
[4, 4, 3],
[2, 4, 4],
[5, 5, 5]])
Now the singular value decomposition:
U, s, V = np.linalg.svd(A)
The most important thing to investigate is the vector s of singular values:
array([ 21.11673273, 2.0140035 , 1.423864 ])
It shows that the first value is much bigger than the others, indicating that the corresponding truncated SVD with only one singular value represents the original matrix A quite well.
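One way to put a number on "much bigger" is to compare the squared singular values, since they sum to the squared Frobenius norm of A. The following is a minimal sketch and not part of the original answer; it rebuilds the golf scores as a plain NumPy array, and the names scores and energy are purely illustrative.

```python
import numpy as np

# Golf scores from the example: 9 holes (rows) x 3 players (columns).
scores = np.array([
    [4, 4, 5],
    [4, 5, 5],
    [3, 3, 2],
    [4, 5, 4],
    [4, 4, 4],
    [3, 5, 4],
    [4, 4, 3],
    [2, 4, 4],
    [5, 5, 5],
], dtype=float)

# compute_uv=False returns only the singular values, largest first.
s = np.linalg.svd(scores, compute_uv=False)

# Share of the squared Frobenius norm carried by each singular value.
energy = s**2 / np.sum(s**2)
print(energy)
```

On this data the first share comes out above 0.98, which is the quantitative version of saying that one singular value is enough.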
To calculate this representation, you take column 1 of U, multiplied by the first row of V, multiplied by the first singular value. (np.linalg.svd returns the right singular vectors already transposed, so the first row of V here corresponds to the first column golfSVD$v[,1] in R.) This is what the last cited command in R does. Here is the same in Python:
U[:,0]*s[0]*V[0,:]
And here is the result of this product:
matrix([[ 3.95411864, 4.64939923, 4.34718814],
[ 4.28153222, 5.03438425, 4.70714912],
[ 2.42985854, 2.85711772, 2.67140498],
[ 3.97540054, 4.67442327, 4.37058562],
[ 3.64798696, 4.28943826, 4.01062464],
[ 3.69694905, 4.3470097 , 4.06445393],
[ 3.34185528, 3.92947728, 3.67406114],
[ 3.09108399, 3.63461111, 3.39836128],
[ 4.5599837 , 5.36179782, 5.0132808 ]])
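The one-liner relies on np.matrix, which overloads * as matrix multiplication. With a plain ndarray the same product needs np.outer or the @ operator. Below is a hedged sketch of the rank-1 approximation on ndarrays, together with a hypothetical helper rank_k_approx (my own name, not from the tutorial) that keeps the first k singular values. Passing full_matrices=False is a choice made here so that U stays 9 x 3 instead of 9 x 9.

```python
import numpy as np

# Golf scores from the example: 9 holes (rows) x 3 players (columns).
scores = np.array([
    [4, 4, 5],
    [4, 5, 5],
    [3, 3, 2],
    [4, 5, 4],
    [4, 4, 4],
    [3, 5, 4],
    [4, 4, 3],
    [2, 4, 4],
    [5, 5, 5],
], dtype=float)

# Reduced SVD: U is 9x3, s has 3 entries, Vt is 3x3 (already transposed).
U, s, Vt = np.linalg.svd(scores, full_matrices=False)

# Rank-1 approximation, the ndarray counterpart of the R command:
# u[,1] %*% t(v[,1]) * d[1]
approx_1 = s[0] * np.outer(U[:, 0], Vt[0, :])


def rank_k_approx(U, s, Vt, k):
    """Hypothetical helper: rebuild the matrix from the first k singular triplets."""
    # Scale the first k columns of U by their singular values, then project.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


print(np.round(approx_1, 4))
print(np.allclose(rank_k_approx(U, s, Vt, 3), scores))  # all 3 triplets rebuild A
```

For a table like the 126 x 5 one in the question, the same helper would apply unchanged with k between 1 and 5; how many values to keep is a judgment call based on how quickly the singular values fall off.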
Concerning the vector factors U[:,0] and V[0,:]: figuratively speaking, U can be seen as a representation of a hole's difficulty, while V encodes a player's strength.
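To make that reading concrete, the sketch below prints which hole and which player carry the largest factor under the rank-1 model. It is an illustration only, with variable names of my own choosing. The overall sign of a singular vector pair is arbitrary, so the comparison uses absolute values.

```python
import numpy as np

# Golf scores from the example: 9 holes (rows) x 3 players (columns).
scores = np.array([
    [4, 4, 5],
    [4, 5, 5],
    [3, 3, 2],
    [4, 5, 4],
    [4, 4, 4],
    [3, 5, 4],
    [4, 4, 3],
    [2, 4, 4],
    [5, 5, 5],
], dtype=float)

U, s, Vt = np.linalg.svd(scores, full_matrices=False)

# First left singular vector: one factor per hole.
hole_factor = np.abs(U[:, 0])
# First right singular vector: one factor per player.
player_factor = np.abs(Vt[0, :])

# Convert 0-based indices to 1-based hole and player numbers.
highest_hole = int(np.argmax(hole_factor)) + 1
lowest_hole = int(np.argmin(hole_factor)) + 1
highest_player = int(np.argmax(player_factor)) + 1

print("hole with largest factor:", highest_hole)
print("hole with smallest factor:", lowest_hole)
print("player with largest factor:", highest_player)
```

Because a higher golf score means more strokes, a larger hole factor points to a hole where everyone scores higher, and a larger player factor points to a player who tends to need more strokes.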