Efficient tensor contraction in Python

Question

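I have a list L of tensors (ndarray objects), each with several indices. I need to contract these indices according to a graph of connections.

The connections are encoded as a list of tuples of the form ((m,i),(n,j)), meaning "contract the i-th index of the tensor L[m] with the j-th index of the tensor L[n]".

How can I handle non-trivial connectivity graphs? The first problem is that as soon as I contract a pair of indices, the result is a new tensor that does not belong to the list L. But even if I solved this (e.g. by giving a unique identifier to every index of every tensor), there is still the issue that the contractions can be performed in any order, and some choices produce unnecessarily enormous beasts in mid-computation (even if the final result is small). Suggestions?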

Answer 1

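Memory considerations aside, I believe you can do all the contractions in a single call to einsum, although you will need some preprocessing. I'm not entirely sure what you mean by "as soon as I contract a pair of indices, the result is a new tensor that does not belong to the list L", but I think doing the contraction in a single step solves exactly this problem.

I suggest using the alternative, numerically indexed syntax of einsum: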

einsum(op0, sublist0, op1, sublist1, ..., [sublistout])
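As a quick illustration of this calling convention, here is a minimal sketch comparing the integer-sublist form with the more familiar subscript string on a plain matrix product. The arrays a and b are made up for the example.

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)

# Labels 0 and 1 index `a`, labels 1 and 2 index `b`.
# Label 1 is shared and absent from the output list, so it is summed over.
via_sublists = np.einsum(a, [0, 1], b, [1, 2], [0, 2])

# The same contraction written with a subscript string.
via_string = np.einsum('ij,jk->ik', a, b)
```

Both calls compute the ordinary matrix product a @ b; the integer form is simply easier to generate programmatically.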

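So what you need to do is encode the indices to contract as sequences of integers. First, set up a range of unique indices, one per tensor axis, and keep another copy to be used as sublistout. Then, iterating over your connectivity graph, set the contracted indices to the same label where necessary, and at the same time remove the contracted labels from sublistout.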

import numpy as np

def contract_all(tensors,conns):
    '''
    Contract the tensors inside the list tensors
    according to the connectivities in conns

    Example input:
    tensors = [np.random.rand(2,3),np.random.rand(3,4,5),np.random.rand(3,4)]
    conns = [((0,1),(2,0)), ((1,1),(2,1))]
    returned shape in this case is (2,3,5)
    '''

    ndims = [t.ndim for t in tensors]
    totdims = sum(ndims)
    dims0 = np.arange(totdims)
    # keep track of sublistout throughout
    sublistout = set(dims0.tolist())
    # cut up the index array according to tensors
    # (throw away empty list at the end)
    inds = np.split(dims0,np.cumsum(ndims))[:-1]
    # we also need to convert to a list, otherwise einsum chokes
    inds = [ind.tolist() for ind in inds]

    # if there were no contractions, we'd call
    # np.einsum(*zip(tensors,inds),sublistout)

    # instead we need to loop over the connectivity graph
    # and manipulate the indices
    for (m,i),(n,j) in conns:
        # tensors[m][i] contracted with tensors[n][j]

        # remove the old indices from sublistout which is a set
        sublistout -= {inds[m][i],inds[n][j]}

        # contract the indices
        inds[n][j] = inds[m][i]

    # zip and flatten the tensors and indices
    args = [subarg for arg in zip(tensors,inds) for subarg in arg]

    # assuming there are no multiple contractions, we're done here
    # (sublistout is a set, so pass it as a sorted list to fix the output order)
    return np.einsum(*args, sorted(sublistout))

A trivial example:

>>> tensors = [np.random.rand(2,3), np.random.rand(4,3)]
>>> conns = [((0,1),(1,1))]
>>> contract_all(tensors,conns)
array([[ 1.51970003,  1.06482209,  1.61478989,  1.86329518],
       [ 1.16334367,  0.60125945,  1.00275992,  1.43578448]])
>>> np.einsum('ij,kj',tensors[0],tensors[1])
array([[ 1.51970003,  1.06482209,  1.61478989,  1.86329518],
       [ 1.16334367,  0.60125945,  1.00275992,  1.43578448]])

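If there are multiple contractions, the bookkeeping in the loop becomes a bit more complex, because we need to handle all the duplicates. The logic, however, is the same. Furthermore, the above is obviously missing checks to ensure that the corresponding indices can actually be contracted.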
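One way to add the missing checks is sketched below. The helper check_conns is hypothetical (it is not part of the answer's code or of numpy), and it assumes non-negative axis numbers and that every axis takes part in at most one contraction, which is stricter than einsum itself requires.

```python
import numpy as np

def check_conns(tensors, conns):
    """Hypothetical helper: validate a connectivity list before contracting.

    Assumes non-negative axis numbers and at most one contraction per axis.
    """
    seen = set()
    for (m, i), (n, j) in conns:
        for leg in ((m, i), (n, j)):
            # each (tensor, axis) pair may only be used once
            if leg in seen:
                raise ValueError(f"axis {leg} appears in more than one contraction")
            seen.add(leg)
        # contracted axes must have the same length
        if tensors[m].shape[i] != tensors[n].shape[j]:
            raise ValueError(
                f"cannot contract axis {i} of tensor {m} (size {tensors[m].shape[i]}) "
                f"with axis {j} of tensor {n} (size {tensors[n].shape[j]})"
            )
```

Calling check_conns(tensors, conns) before contract_all(tensors, conns) turns a shape mismatch into an error message that names the offending pair of axes.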

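In hindsight, I realized that you don't have to specify the default sublistout, since einsum uses that order anyway. I decided to leave the variable in the code because, if we want a non-trivial order of the output indices, we will have to handle that variable appropriately, and it might come in handy.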
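To make the role of the output list concrete, here is a small sketch with made-up arrays: omitting the output list, passing it in ascending order, and passing it in a different order to get a transposed result.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((2, 3))
b = rng.random((4, 3))

# Label 1 is shared by both operands and is summed over in all three calls.
implicit = np.einsum(a, [0, 1], b, [2, 1])          # no output list given
explicit = np.einsum(a, [0, 1], b, [2, 1], [0, 2])  # ascending label order
swapped = np.einsum(a, [0, 1], b, [2, 1], [2, 0])   # non-trivial output order
```

Here implicit and explicit agree and have shape (2, 4), while swapped has shape (4, 2) and equals the transpose of explicit.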

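As for optimizing the contraction order, np.einsum can do this internally as of version 1.12 (as noted by @hpaulj in a now-deleted comment). This version introduced the optional optimize keyword argument to np.einsum, which allows choosing a contraction order that cuts down on computation time at the cost of memory. Passing 'greedy' or 'optimal' as the optimize keyword makes numpy choose a contraction order in roughly decreasing order of the sizes of the dimensions.

The options available for the optimize keyword come from the apparently undocumented (as far as online documentation goes; help() fortunately works) function np.einsum_path: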
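A minimal sketch of the optimize keyword on a three-matrix chain follows. The shapes are arbitrary and chosen only for illustration; all variants should give the same numbers, differing only in how the contraction is carried out internally.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((8, 30))
b = rng.random((30, 30))
c = rng.random((30, 4))

# Default: no contraction-order optimization.
plain = np.einsum('ij,jk,kl->il', a, b, c)

# Let numpy pick a pairwise contraction order.
greedy = np.einsum('ij,jk,kl->il', a, b, c, optimize='greedy')
optimal = np.einsum('ij,jk,kl->il', a, b, c, optimize='optimal')
```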

einsum_path(subscripts, *operands, optimize='greedy')

Evaluates the lowest cost contraction order for an einsum expression by
considering the creation of intermediate arrays.

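The contraction path returned by np.einsum_path can also be passed as the optimize argument of np.einsum. In your question you were worried about too much memory being used, so I suspect that the default of no optimization (with potentially longer runtime and a smaller memory footprint) is what you want.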
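The following sketch shows the round trip: compute a path once with np.einsum_path, then reuse it in np.einsum. The arrays and their shapes are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((8, 30))
b = rng.random((30, 30))
c = rng.random((30, 4))

# einsum_path returns the path (a list starting with the string 'einsum_path',
# followed by tuples of operand positions) and a human-readable report string.
path, report = np.einsum_path('ij,jk,kl->il', a, b, c, optimize='optimal')

# Reuse the precomputed path instead of searching for an order again.
result = np.einsum('ij,jk,kl->il', a, b, c, optimize=path)
```

Printing report shows the estimated cost and the largest intermediate array, which helps when judging whether an optimized order is worth its memory use.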

Answer 2

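Maybe helpful: take a look at https://arxiv.org/abs/1402.0939, which describes an efficient framework for contracting so-called tensor networks in a single function ncon(...). As far as I can see, implementations are directly available for Matlab (can be found within the link) and Python3 (https://github.com/mhauru/ncon).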

Answer 1
5, accepted, 2017-02-04 00:07:32

Answer 2
1, 2020-05-05 07:48:58