Introduction to Xerus

This section gives an overview of different Xerus functionality. For that, let \begin{align} A \in \mathbb{R}^{n_1 \times \dots \times n_d} \end{align} be a tensor of order $d$ with $i$-th mode dimension $n_i$.

For $d = 4$ and $n_i = 2$ it could look like the following.

Creating a Tensor

In [87]:
import xerus
A = xerus.Tensor.random([2,2,2,2])
print(A)
0.196057 0.731576 / -0.706212 -0.966188 	-0.382603 -1.171656 / 0.039142 -0.381124 
-0.380762 -0.376121 / -0.968919 -0.922080 	-1.022959 -0.425644 / 0.336417 0.675811 

As you can see, xerus has a way to display the order-$4$ structure. Internally, xerus stores a tensor as a flat array in row-major order, i.e. the entry $A(i_1,\dots,i_d)$ is stored at the linear position \begin{align} \sum_{k=1}^d \left(\prod_{l=k+1}^{d} n_l \right) i_k. \end{align} With this you can access elements directly:

In [88]:
print(A[[0,0,0,0]])              # first entry in all dimensions
print(A[0])                      # first entry in 1D array format
print(A[[0,0,1,1]])              # Entry (0, 0, 1, 1)
0.19605651514632025
0.19605651514632025
-0.9661883112538733
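
As a small sanity check of the row-major formula above, the following sketch computes the linear index by hand; the helper linear_index is purely illustrative and not part of xerus.

In [ ]:
def linear_index(multi_index, dims):
    # row-major linear index: sum_k (prod_{l>k} n_l) * i_k
    idx = 0
    for k, i_k in enumerate(multi_index):
        stride = 1
        for n_l in dims[k+1:]:
            stride *= n_l
        idx += stride * i_k
    return idx

idx = linear_index([0, 0, 1, 1], [2, 2, 2, 2])
print(idx)                        # 3
print(A[idx] == A[[0, 0, 1, 1]])  # True: same entry via linear and multi-index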

In this sense, tensors are just arrays, but xerus can do more, as we will see later. The two standard ways to create a tensor are the following:

all zeros

In [89]:
A = xerus.Tensor([2,2,2])
print(A)
0.000000 / 0.000000 	0.000000 / 0.000000 
0.000000 / 0.000000 	0.000000 / 0.000000 

identity

In [24]:
A = xerus.Tensor.identity([2,2])
print(A)
A = xerus.Tensor.identity([2,2,2,2])
print(A)
1.000000 	0.000000 
0.000000 	1.000000 

1.000000 0.000000 / 0.000000 0.000000 	0.000000 1.000000 / 0.000000 0.000000 
0.000000 0.000000 / 1.000000 0.000000 	0.000000 0.000000 / 0.000000 1.000000 

Doing stuff with Tensors

In all of the above cases, the Frobenius norm \begin{align} \| A \|_F = \sqrt{\sum_{i_1,\dots,i_d} A(i_1,\dots,i_d)^2} \end{align}

can be calculated like this:

In [25]:
A = xerus.Tensor.random([2,2,2,2])
print(A.frob_norm())
3.894055873814064
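
If you want to double-check this value, the same norm can be computed with numpy from the raw entries; this is just a sketch using to_ndarray, which also appears later in this tutorial.

In [ ]:
import numpy as np
# square root of the sum of squared entries, as in the formula above
print(np.sqrt(np.sum(A.to_ndarray()**2)))  # should agree with A.frob_norm()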

Tensors are linear objects, so you can add them (if the dimensions agree) and do scalar multiplication:

In [31]:
A = xerus.Tensor.random([2,2,2])
print(A)
B = xerus.Tensor.random([2,2,2])
print(B)
print(A+2*B)
0.812391 / 0.659847 	0.363319 / -0.604920 
-0.785982 / -0.501271 	-0.328713 / -0.492868 

-1.157211 / 0.799950 	0.405037 / -1.980871 
1.115767 / -0.158546 	0.148285 / 0.588489 

-1.502030 / 2.259747 	1.173392 / -4.566663 
1.445552 / -0.818363 	-0.032143 / 0.684110 
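
Such element-wise operations can also be cross-checked against numpy; the following sketch uses to_ndarray and from_ndarray (both used again later) and should print a value close to zero.

In [ ]:
import numpy as np
# rebuild A + 2*B entry-wise in numpy and compare with the xerus result
C = xerus.Tensor.from_ndarray(A.to_ndarray() + 2 * B.to_ndarray())
print((A + 2*B - C).frob_norm())  # (numerically) zero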

But most importantly, you can multiply them. To do this you need xerus indices, which let you write contractions without spelling out tedious summations. You can get a set of indices like this:

In [33]:
i,j,k,l = xerus.indices(4)

Now, if you want to do, for example, a matrix-vector multiplication, you can do this:

In [38]:
A = xerus.Tensor([2,2])
A[[0,1]] = 1
A[[1,0]] = 1
print('A = \n '  + str(A))
x = xerus.Tensor([2])
x[[0]] = -1
x[[1]] = 1
print('x = \n '  + str(x))
b = xerus.Tensor()
b(i) << A(i,j) * x(j)
print('b = \n '  + str(b))
A = 
 0.000000 	1.000000 
1.000000 	0.000000 

x = 
 -1.000000 
1.000000 

b = 
 1.000000 
-1.000000 

But it does not stop there. You can perform all kinds of contractions; all you have to write down is the Einstein notation:

In [41]:
A = xerus.Tensor.ones([2,2,2,2])
print('A = \n '  + str(A))
x = xerus.Tensor([2,2,2])
x[[0,0,0]] = -1
x[[1,0,0]] = 1
x[[1,1,0]] = 2
print('x = \n '  + str(x))
b = xerus.Tensor()
b(i) << A(i,j,k,l) * x(j,k,l)
print('b = \n '  + str(b))
A = 
 1.000000 1.000000 / 1.000000 1.000000 	1.000000 1.000000 / 1.000000 1.000000 
1.000000 1.000000 / 1.000000 1.000000 	1.000000 1.000000 / 1.000000 1.000000 

x = 
 -1.000000 / 0.000000 	0.000000 / 0.000000 
1.000000 / 0.000000 	2.000000 / 0.000000 

b = 
 2.000000 
2.000000 
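
For comparison, the same contraction in plain numpy is a single einsum call; this is only a sketch using to_ndarray.

In [ ]:
import numpy as np
# sum over j, k, l exactly as in the index expression above
b_np = np.einsum('ijkl,jkl->i', A.to_ndarray(), x.to_ndarray())
print(b_np)  # should match b from the previous cell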

Also, you can do a basis transformation \begin{align} \Sigma = P^T A P \end{align} like this:

In [91]:
import numpy as np
A = xerus.Tensor([2,2])
A[[1,0]] = 1
A[[0,1]] = 1
P = xerus.Tensor([2,2])
P[[0,0]] = -1/np.sqrt(2)
P[[1,0]] = 1/np.sqrt(2)
P[[0,1]] = 1/np.sqrt(2)
P[[1,1]] = 1/np.sqrt(2)

S = xerus.Tensor()
s1,s2,s3,s4 = xerus.indices(4)
S(s1,s4) << P(s4,s3)*P(s1,s2)*A(s2,s3)

print('A = \n' + str(A))
print('P = \n' + str(P))
print('S = \n' + str(S))
A = 
0.000000 	1.000000 
1.000000 	0.000000 

P = 
-0.707107 	0.707107 
0.707107 	0.707107 

S = 
-1.000000 	0.000000 
0.000000 	1.000000 
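
The same transformation written with plain numpy matrices, as a sketch for comparison.

In [ ]:
import numpy as np
P_np = P.to_ndarray()
A_np = A.to_ndarray()
# Sigma = P^T A P (this P is symmetric, so it agrees with the xerus cell above)
print(P_np.T @ A_np @ P_np)  # should reproduce S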

As you can see, transposition is not a particularly important concept in tensor mathematics: you simply arrange the indices as needed.

Doing more complex stuff

You can even compute a singular value decomposition of a tensor just by using the index notation. How you distribute the indices determines which singular value decomposition you get. For example, for $A\in \mathbb{R}^{4 \times 4 \times 3 \times 2}$ an SVD could look like this (depending on the chosen matricization, here $n_1n_2 \times n_3n_4$): \begin{align} A(i_1,i_2,i_3,i_4) = \sum_{k=1}^r U(i_1,i_2,k) S(k,k) V(k,i_3,i_4) = \sum_{k_1,k_2=1}^r U(i_1,i_2,k_1) S(k_1,k_2) V(k_2,i_3,i_4), \end{align} where $r$ is the rank of this matricization and $U$ and $V^T$ have orthonormal columns. This operation is the key to everything that happens with tensors (HOSVD, HSVD etc.). And in xerus?

In [ ]:
U = xerus.Tensor()
S = xerus.Tensor()
V = xerus.Tensor()
i1,i2,i3,i4,k1,k2 = xerus.indices(6)

A = xerus.Tensor.ones([4,4,3,2])


(U(i1,i2,k1), S(k1,k2), V(k2,i3,i4)) << xerus.SVD(A(i1,i2,i3,i4))

print('S = \n '  + str(S))
print('U = \n '  + str(U))
print('V = \n '  + str(V))
A_test = xerus.Tensor()
A_test(i1,i2,i3,i4) << U(i1,i2,k1) * S(k1,k2) * V(k2,i3,i4)
print('Test if A can be recovered: ' + str((A - A_test).frob_norm()))
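
After running the cell above, a quick plausibility check is to compare with numpy's SVD of the corresponding $16 \times 6$ matricization; the reshape relies on the row-major storage described earlier and this is only a sketch.

In [ ]:
import numpy as np
# flatten (i1, i2) into rows and (i3, i4) into columns
A_mat = A.to_ndarray().reshape(4 * 4, 3 * 2)
# leading singular value should match S, the rest are (numerically) zero
print(np.linalg.svd(A_mat, compute_uv=False))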

You can also choose the following matricization: \begin{align} A(i_1,i_2,i_3,i_4) = \sum_{k=1}^r U(i_1,k) S(k,k) V(k,i_2,i_3,i_4) = \sum_{k_1,k_2=1}^r U(i_1,k_1) S(k_1,k_2) V(k_2,i_2,i_3,i_4). \end{align} This translates in xerus to:

In [94]:
A = xerus.Tensor.ones([4,4,3,2])


(U(i1,k1), S(k1,k2), V(k2,i2,i3,i4)) << xerus.SVD(A(i1,i2,i3,i4))

print('S = \n '  + str(S))
print('U = \n '  + str(U))
print('V = \n '  + str(V))
A_test = xerus.Tensor()
A_test(i1,i2,i3,i4) << U(i1,k1) * S(k1,k2) * V(k2,i2,i3,i4)
print('Test if A can be recovered: ' + str((A - A_test).frob_norm()))
S = 
 9.797959 

U = 
 -0.500000 
-0.500000 
-0.500000 
-0.500000 

V = 
 -0.204124 -0.204124 / -0.204124 -0.204124 / -0.204124 -0.204124 	-0.204124 -0.204124 / -0.204124 -0.204124 / -0.204124 -0.204124 	-0.204124 -0.204124 / -0.204124 -0.204124 / -0.204124 -0.204124 	-0.204124 -0.204124 / -0.204124 -0.204124 / -0.204124 -0.204124 

Test if A can be recovered: 7.71105639526996e-15

Tensor Trains (choo choo)

The Tensor Train format is most fascinating. It helps you find a tractable low-rank representation of a given tensor (if one exists). In formulas it looks like: \begin{align} A(i_1,\dots,i_d) = \sum_{k_1,\dots,k_{d-1} = 1}^{r_1,\dots,r_{d-1}} A_1(i_1,k_1) A_2(k_1,i_2,k_2)\dots A_{d-1}(k_{d-2},i_{d-1},k_{d-1}) A_d(k_{d-1},i_d) = A_1(i_1)\dots A_d(i_d) \end{align} The last representation is the reason why, in physics and elsewhere, it is called a Matrix Product State (MPS). In xerus you can create a random tensor train with 5 'legs', mode dimension 3 and all ranks equal to 2 like this. The components are order-3 tensors of dimensions (2, 3, 2), except for the first and last component, whose boundary rank is 1. Below you see the components and their mode dimensions.

In [138]:
A_TT = xerus.TTTensor.random([3,3,3,3,3],[2,2,2,2])

for i in range(0,5):
    print(A_TT.get_component(i))
    print(A_TT.get_component(i).dimensions)
2.154159 / -5.196785 	-12.433336 / 10.713734 	-41.781168 / 10.995844 

[1, 3, 2]
-0.619392 / 0.321871 	-0.286526 / -0.403453 	0.236852 / 0.460198 
-0.466493 / -0.224586 	0.281018 / -0.456082 	-0.505104 / -0.435699 

[2, 3, 2]
-0.232436 / -0.054978 	0.457979 / -0.357575 	-0.068001 / -0.775063 
-0.011535 / -0.065424 	-0.016182 / 0.575916 	-0.790260 / -0.197826 

[2, 3, 2]
-0.827810 / -0.219559 	0.356596 / -0.245217 	-0.278378 / -0.041678 
-0.103579 / -0.424998 	-0.078879 / -0.320550 	0.778864 / 0.305032 

[2, 3, 2]
-0.179327 	-0.640862 	0.746416 
0.820885 	-0.515634 	-0.245498 

[2, 3, 1]
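
To connect the printed components back to the formula above, the following sketch contracts them with plain numpy and compares Frobenius norms; only get_component, to_ndarray and frob_norm are used, all of which appear in this tutorial.

In [ ]:
import numpy as np
full = A_TT.get_component(0).to_ndarray()          # shape [1, 3, 2]
for c in range(1, 5):
    comp = A_TT.get_component(c).to_ndarray()
    # contract the last (rank) axis of the partial result with the first axis of the next component
    full = np.tensordot(full, comp, axes=(full.ndim - 1, 0))
full = full.reshape([3, 3, 3, 3, 3])               # strip the boundary ranks of size 1
print(np.sqrt(np.sum(full**2)), A_TT.frob_norm())  # the two norms should agree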

HT Tensors (hear them grow (because they are trees))

HT tensors are still in beta in xerus and under active development. The linear structure of tensor trains might not be the best way of capturing the low-rank approximability of a given tensor. What you see below is the result of initializing a random HT tensor and applying the hierarchical SVD. All components except the first are orthogonal with respect to a certain matricization.

In [110]:
A_HT = xerus.HTTensor.random([3,3,3,3],[2,2,2,2,2,2])

for i in range(0,7):
    print(A_HT.get_component(i))
    print(A_HT.get_component(i).dimensions)
110.995669 / 3.500361 	-7.966495 / -13.255195 

[1, 2, 2]
-0.988298 / 0.011566 	0.120790 / 0.092424 
0.076827 / -0.750537 	0.653468 / 0.061413 

[2, 2, 2]
-0.072096 / -0.055726 	-0.995821 / -0.006044 
0.477242 / -0.704921 	0.008080 / -0.524653 

[2, 2, 2]
-0.956455 	0.284378 	-0.065751 
-0.266123 	-0.942164 	-0.203727 

[2, 3]
-0.621611 	0.106156 	0.776100 
0.562669 	-0.628797 	0.536673 

[2, 3]
-0.099193 	-0.713339 	-0.693763 
0.130969 	-0.700496 	0.701536 

[2, 3]
-0.432024 	0.436448 	-0.789220 
-0.026002 	-0.880765 	-0.472839 

[2, 3]

To see the orthogonality, we contract each non-root component with itself:

In [112]:
h1,h2,h3,h4,h5,h6 = xerus.indices(6)

for i in range(1,3):
    res = xerus.Tensor()
    tmp = A_HT.get_component(i)
    res(h1,h2) << tmp(h1,h3^2)*tmp(h2,h3^2)
    print(str(i) +'th component = \n' + str(res))

for i in range(3,7):
    res = xerus.Tensor()
    tmp = A_HT.get_component(i)
    res(h1,h2) << tmp(h1,h3)*tmp(h2,h3)
    print(str(i) +'th component = \n' + str(res))
    
1th component = 
1.000000 	-0.000000 
-0.000000 	1.000000 

2th component = 
1.000000 	-0.000000 
-0.000000 	1.000000 

3th component = 
1.000000 	-0.000000 
-0.000000 	1.000000 

4th component = 
1.000000 	0.000000 
0.000000 	1.000000 

5th component = 
1.000000 	-0.000000 
-0.000000 	1.000000 

6th component = 
1.000000 	-0.000000 
-0.000000 	1.000000 

Furthermore, the norm of the core (root) tensor is the same as the norm of the whole Tensor.

In [111]:
root = A_HT.get_component(0)
print(root.frob_norm())
print(A_HT.frob_norm())
112.1225061996774
112.1225061996774

A simple ALS implementation on the Tensor Train format

To do linear algebra in tensor formats like the tensor train you can use alternating schemes such as ALS (alternating least squares). You iteratively project the big problem onto each component, solve the resulting small problem there, and then replace the component by this 'local' solution. For some reason this works quite well! (It can be shown that the residual is monotonically decreasing, but that does not always imply convergence.)

First we import numpy and specify the number of components of our tensor train (dim), the size of each 'leg' (n), the maximal rank of our operator (r) and of our solution (r+1), the error margin (eps) and the maximal number of iterations (max_iter).

In [96]:
import numpy as np
dim = 8
n = 4
r = 2
eps = 10e-8
max_iter = 100

Then we define some indices and create a random tensor train operator (A); the multiplication A*A^T makes the operator symmetric. Then we draw a random solution and compute the corresponding right-hand side (b). In the end we have \begin{align} A x = b, \quad A \in \mathbb{R}^{(n \times \dots \times n) \times (n \times \dots \times n)}, \quad b \in \mathbb{R}^{n \times \dots \times n}. \end{align}

In [97]:
ii,jj,kk = xerus.indices(3)
A = xerus.TTOperator.random([n for i in range(0,2*dim)],[n for i in range(0,dim-1)])

A(ii^dim,jj^dim) << A(ii^dim,kk^dim) * A(jj^dim,kk^dim) # A*A^T

solution = xerus.TTTensor.random([n for i in range(0,dim)],[r+1 for i in range(0,dim-1)])
b = xerus.TTTensor()
b(ii&0)  << A(ii^dim,jj^dim) * solution(jj&0)

print(b.frob_norm())
22628130339924.72

Now we initialize a random starting vector $x$ and move its core to the first component using move_core.

In [98]:
x = xerus.TTTensor.random([n for i in range(0,dim)],[r+1 for i in range(0,dim-1)])

x.move_core(0,True)

Finally, the ALS iteration is performed. There are at most max_iter sweeps from left to right, and in each sweep we touch every TT component once. For each component we build the left and right projections, assemble the 'local' operator (op) and the local right-hand side (rhs), convert them to numpy arrays, reshape, solve the resulting linear system and create a new xerus tensor from the solution (sol), which becomes the new component. After each sweep we check whether the error in the Frobenius norm is smaller than eps; if so, we stop.

In [99]:
i1,i2,i3,j1,j2,j3,k1,k2 = xerus.indices(8)
for i in range(0,max_iter):
    for pos in range(0,dim):
        rA = xerus.Tensor.ones([1,1,1])
        lA = xerus.Tensor.ones([1,1,1])
        rb = xerus.Tensor.ones([1,1])
        lb = xerus.Tensor.ones([1,1])
        for ind in range(0,pos):
            x_tmp = x.get_component(ind)
            A_tmp = A.get_component(ind)
            b_tmp = b.get_component(ind)
            lA(i1,i2,i3) << lA(j1,j2,j3) * x_tmp(j1,k1,i1) * A_tmp(j2,k1,k2,i2) * x_tmp(j3,k2,i3) 
            lb(i1,i2) << lb(j1,j2) * x_tmp(j1,k1,i1) * b_tmp(j2,k1,i2) 
        for ind in range(dim - 1, pos, -1):  # build the right stack from the last component down to pos+1
            x_tmp = x.get_component(ind)
            A_tmp = A.get_component(ind)
            b_tmp = b.get_component(ind)
            rA(i1,i2,i3) << rA(j1,j2,j3) * x_tmp(i1,k1,j1) * A_tmp(i2,k1,k2,j2) * x_tmp(i3,k2,j3) 
            rb(i1,i2) << rb(j1,j2) * x_tmp(i1,k1,j1) * b_tmp(i2,k1,j2)
        
        op = xerus.Tensor()
        rhs = xerus.Tensor()
    
        Ai = A.get_component(pos)
        bi = b.get_component(pos)   
        
        op(i1,i2,i3,j1,j2,j3) << lA(i1,k1,j1) * Ai(k1,i2,j2,k2) * rA(i3,k2,j3)
        rhs(i1,i2,i3) << lb(i1,k1) * bi(k1,i2,k2) * rb(i3,k2)
        
        op_arr = op.to_ndarray()
        rhs_arr = rhs.to_ndarray()
        
        op_arr_reshape = op_arr.reshape((op.dimensions[0] * op.dimensions[1] * op.dimensions[2], op.dimensions[3]*op.dimensions[4]*op.dimensions[5]))
        rhs_arr_reshape = rhs_arr.reshape((rhs.dimensions[0] * rhs.dimensions[1] * rhs.dimensions[2]))
        
        sol_arr = np.linalg.solve(op_arr_reshape,rhs_arr_reshape)
        sol_arr_reshape = sol_arr.reshape((op.dimensions[0] , op.dimensions[1] , op.dimensions[2]))
        sol = xerus.Tensor.from_ndarray(sol_arr_reshape)
        
        x.set_component(pos,sol)
        if pos + 1 < dim:
            x.move_core(pos+1,True)
    x.move_core(0,True)           
    error = (x-solution).frob_norm()   
    print(error)
    if error < eps:
        break
5357.754758888123
4572.739092618634
3125.6612409371874
1055.1836077946255
329.56434093541696
97.88858658078871
28.17179610786219
8.336695587807348
2.61217433332212
0.8702694600903721
0.3035866535044666
0.10879754761396497
0.03950969208985312
0.01442637557190582
0.0052758958692545755
0.00192911745114747
0.0007047607682974446
0.000257192658138099
9.376005099908139e-05
3.414802683010904e-05
1.2426772734242695e-05
4.519092702038387e-06
1.6424549429323073e-06
5.966673604349374e-07
2.1667213227886945e-07
7.865934354526905e-08
© Michael Götte, Robert Gruhlke, Manuel Marschall, Phillip Trunschke, 2018-2019