This question is about iterating over, and differentiating with respect to, vector variables of parametric size in SymPy. The sizes are symbolic and will never be given concrete values.
For instance, take the following simple setup:
I have M vectors x, each of dimension K, and I have a weights vector w, also of dimension K. (M and K never take on concrete values; they stay symbolic as M and K.)
My function sums the x vectors and then takes the dot product of that sum with the weights w:
f = dot(sum(x),w)
The derivative of f with respect to w_i should then be the sum of the i-th components of the x vectors, i.e. x_1[i] + ... + x_M[i].
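In index notation (merely restating the setup above, with x_m denoting the m-th vector):

f = \sum_{m=1}^{M} \sum_{k=1}^{K} x_{m,k} \, w_k ,
\qquad
\frac{\partial f}{\partial w_i} = \sum_{m=1}^{M} x_{m,i} .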
How do I code that in SymPy, both the summation and the differentiation?
Since Francesco's great answer made clear that this is currently not possible, I'm looking for a workaround. In other words, is it possible to:
I'm already doing step 4 manually, by breaking the derivative into simpler terms to make it more readable and then looking for patterns. For readability I'm using this code, taken from this answer:
from sympy import cse, numbered_symbols

tmpsyms = numbered_symbols("tmp")
# cse returns the list of (symbol, subexpression) replacements and the
# expression(s) rewritten in terms of those temporary symbols.
replacements, reduced_exprs = cse(grad_0, symbols=tmpsyms)
for symb, subexpr in replacements:
    print(str(symb) + ' = ' + str(subexpr))
print(reduced_exprs)
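For reference, here is a minimal self-contained run; the toy expression stands in for grad_0, which in my case is the actual gradient expression:

from sympy import cse, cos, numbered_symbols, symbols

x, y = symbols("x y")
grad_0 = (x + y)**2 + cos(x + y)  # toy stand-in for the real gradient

replacements, reduced_exprs = cse(grad_0, symbols=numbered_symbols("tmp"))
for symb, subexpr in replacements:
    print(symb, '=', subexpr)  # prints: tmp0 = x + y
print(reduced_exprs)           # prints: [tmp0**2 + cos(tmp0)]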
You can use the new tensor-array module introduced in SymPy version 1.0.
I suppose that your K and M parameters are numbers, not symbols (otherwise I'd suggest using sympy.tensor.Indexed).
Consider this example: X is an array of two vectors, each of length 3, so X is a tensor of rank 2 and shape (2, 3). I also chose a simple weight vector with symbols in it:
In [1]: from sympy import *
In [2]: from sympy.tensor.array import *
In [3]: var("a,b,c,d,e,f")
Out[3]: (a, b, c, d, e, f)
In [4]: X = Array([[a, b, c], [d, e, f]])
In [5]: var("w1,w2,w3")
Out[5]: (w1, w2, w3)
In [6]: W = Array([w1, w2, w3])
Now create a product tensor with three indices (two from X, one from W):
In [7]: tp = tensorproduct(X, W)
In [8]: tp
Out[8]: [[[a*w1, a*w2, a*w3], [b*w1, b*w2, b*w3], [c*w1, c*w2, c*w3]], [[d*w1, d*w2, d*w3], [e*w1, e*w2, e*w3], [f*w1, f*w2, f*w3]]]
In [9]: tp.shape
Out[9]: (2, 3, 3)
Let's sum over the 2nd and 3rd indices (indices 1 and 2 in Python notation, as indices start from zero); this is equivalent to what you call the dot product:
In [10]: tensorcontraction(tp, (1, 2))
Out[10]: [a*w1 + b*w2 + c*w3, d*w1 + e*w2 + f*w3]
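As a sanity check (this comparison is my addition, not part of the steps above), the contraction is the same as the ordinary matrix-vector product X*W:

from sympy import Matrix, symbols
from sympy.tensor.array import Array, tensorproduct, tensorcontraction

a, b, c, d, e, f, w1, w2, w3 = symbols("a b c d e f w1 w2 w3")
X = Array([[a, b, c], [d, e, f]])
W = Array([w1, w2, w3])

# Contracting indices 1 and 2 of the rank-3 product tensor pairs each
# row of X with W, i.e. the ordinary matrix-vector product.
contracted = tensorcontraction(tensorproduct(X, W), (1, 2))
expected = Matrix([[a, b, c], [d, e, f]]) * Matrix([w1, w2, w3])
print(list(contracted) == list(expected))  # True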
The components of the resulting array can then be summed with Python's built-in sum:
In [12]: stc = sum(tensorcontraction(tp, (1, 2)))
In [13]: stc
Out[13]: a*w1 + b*w2 + c*w3 + d*w1 + e*w2 + f*w3
For derivatives of arrays, you can use derive_by_array(...). It creates a tensor of higher rank, in which each component is the derivative with respect to one component of the second argument:
In [14]: derive_by_array(stc, W)
Out[14]: [a + d, b + e, c + f]
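To make the rank-raising behavior explicit, here is a small sketch (the variable names are mine): deriving the rank-1 array of dot products from Out[10] by W gives a rank-2 array whose (k, m) entry is the derivative of the m-th dot product with respect to the k-th weight:

from sympy import symbols
from sympy.tensor.array import Array, derive_by_array

a, b, c, d, e, f, w1, w2, w3 = symbols("a b c d e f w1 w2 w3")
dots = Array([a*w1 + b*w2 + c*w3, d*w1 + e*w2 + f*w3])  # the Out[10] array
W = Array([w1, w2, w3])

# The result's shape is W.shape + dots.shape:
jac = derive_by_array(dots, W)
print(jac.shape)  # (3, 2)
print(jac)        # [[a, d], [b, e], [c, f]]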
EDIT
Since it has now been specified that the parameters M and K are symbolic, I'll add this part.
Declare X and W as IndexedBase:
In [1]: X = IndexedBase("X")
In [2]: W = IndexedBase("W")
In [3]: var("i,j,M,K", integer=True)
Out[3]: (i, j, M, K)
Your expression is the sum of the products X[i, j]*W[j] over the indices i and j, expressed as follows:
In [4]: s = Sum(X[i, j]*W[j], (i, 1, M), (j, 1, K))
In [5]: s
Out[5]:
  K     M
 ___   ___
 ╲     ╲
  ╲     ╲   W[j]⋅X[i, j]
  ╱     ╱
 ╱     ╱
 ‾‾‾   ‾‾‾
j = 1 i = 1
Now one would like to compute s.diff(W[j]), or with a different index s.diff(W[k]). Unfortunately, this has not yet been implemented in SymPy. There is a PR on GitHub that would add support for derivatives of indexed objects, but it has not been merged so far: https://github.com/sympy/sympy/pull/9314
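Until that is merged, a possible manual workaround (just a sketch, under the assumption that Sum.doit collapses KroneckerDelta terms; the index k is my addition) is to apply the rule d(W[j])/d(W[k]) = KroneckerDelta(j, k) by hand and let SymPy sum the delta out, which recovers Sum(X[i, k], (i, 1, M)) guarded by the condition 1 <= k <= K:

from sympy import IndexedBase, KroneckerDelta, Sum, symbols

X = IndexedBase("X")
W = IndexedBase("W")
i, j, k, M, K = symbols("i j k M K", integer=True)

# Replace W[j] by its hand-computed derivative KroneckerDelta(j, k),
# then sum over j so that the delta picks out the j = k term.
inner = Sum(X[i, j]*KroneckerDelta(j, k), (j, 1, K)).doit()
ds = Sum(inner, (i, 1, M))
print(ds)  # Sum over i of Piecewise((X[i, k], (1 <= k) & (k <= K)), (0, True))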