#### Retrieve the original index of sequentially removed columns (the corresponding row is also removed) of a matrix in Julia or Python

``````
using InvertedIndices  # provides Not(); it is also re-exported by DataFrames

# a matrix of random numbers
mat = rand(10, 10);

# iteratively remove the column with the largest sum
idxColRemoveList = Int[];
idxOrig = collect(1:size(mat, 2));  # original index of each remaining column
matTemp = mat;

for i in 1:4  # suppose 4 columns need to be removed

    # 1. find the index of the column with the largest column sum at the current iteration
    sumTemp = sum(matTemp, dims=1);
    idxColRemoveTemp = argmax(sumTemp)[2];

    # 2. record the original index of the removed column
    #    (matching sums against the original matrix fails once a row has been removed,
    #    so the original indices are tracked explicitly instead)
    push!(idxColRemoveList, idxOrig[idxColRemoveTemp]);
    deleteat!(idxOrig, idxColRemoveTemp);

    # 3. update the matrix. Note that the corresponding row is also removed.
    matTemp = matTemp[Not(idxColRemoveTemp), Not(idxColRemoveTemp)];

end
``````
2 replies

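``````
import numpy as np

mat = np.random.rand(5, 5)
n_remove = 3

# original[k] is the original index of the k-th remaining column
original = np.arange(len(mat)).tolist()
removed = []

for i in range(n_remove):
    col_sum = np.sum(mat, axis=0)
    col_rm = np.argsort(col_sum)[-1]
    # translate the current position back to the original index
    removed.append(original.pop(col_rm))
    # delete the row and the column at the same position
    mat = np.delete(np.delete(mat, col_rm, 0), col_rm, 1)

print(removed)
print(original)
print(mat)
``````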

``````
import numpy as np

n = 1000
mat = np.random.rand(n, n)
n_remove = 500
removed = []

for i in range(n_remove):
    # get the sum of each column
    col_sum = np.sum(mat, axis=0)
    col_rm = np.argmax(col_sum)
    # record the column ID (the matrix keeps its shape, so this is the original index)
    removed.append(col_rm)

    # overwrite the col_rm-th column and row with a tiny value (1e-10) instead of deleting them,
    # so this column is not selected again (assumes all remaining column sums are larger)
    mat[:, col_rm] = 1e-10
    mat[col_rm, :] = 1e-10

print(removed)
``````
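
The in-place idea above relies on the entries being positive, so that a column filled with `1e-10` can never have the largest sum. As a rough sketch of how the same bookkeeping might be done without that assumption, a boolean mask can mark which rows and columns are still present. The function name `remove_largest_columns` and the `alive` mask are hypothetical, introduced only for this illustration, and the matrix is assumed to be square.

```python
import numpy as np


def remove_largest_columns(mat, n_remove):
    """Return (removed, remaining) original indices.

    Hypothetical helper: at each step the remaining column with the largest
    sum, taken over the remaining rows only, is dropped together with the
    row of the same index. The input matrix is not modified.
    """
    mat = np.asarray(mat, dtype=float)
    alive = np.ones(mat.shape[0], dtype=bool)  # True = row/column still present
    removed = []

    for _ in range(n_remove):
        # column sums over the rows that are still present
        col_sum = mat[alive].sum(axis=0)
        # columns that were already removed must never win the argmax
        col_sum[~alive] = -np.inf
        col_rm = int(np.argmax(col_sum))
        removed.append(col_rm)
        alive[col_rm] = False

    return removed, np.flatnonzero(alive)


mat = np.array([
    [1.0, 9.0, 2.0, 3.0],
    [4.0, 8.0, 1.0, 2.0],
    [7.0, 1.0, 1.0, 6.0],
    [2.0, 3.0, 1.0, 5.0],
])
removed, remaining = remove_largest_columns(mat, 2)
print(removed)    # [1, 3]
print(remaining)  # [0 2]
```

Because nothing is deleted, every index returned is already an index into the original matrix, and the reduced matrix can be recovered afterwards with `mat[np.ix_(remaining, remaining)]`.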