Get the subarray with same numbers and consecutive index

Question

I have a text file like this:

   0, 23.00, 78.00, 75.00, 105.00,  2,0.97
   1, 371.00, 305.00, 38.00, 48.00,  0,0.85
   1, 24.00, 78.00, 75.00, 116.00,  2,0.98
   1, 372.00, 306.00, 37.00, 48.00,  0,0.84
   2, 28.00, 87.00, 74.00, 101.00,  2,0.97
   2, 372.00, 307.00, 35.00, 47.00,  0,0.80
   3, 32.00, 86.00, 73.00, 98.00,  2,0.98
   3, 363.00, 310.00, 34.00, 46.00,  0,0.83
   4, 40.00, 77.00, 71.00, 98.00,  2,0.94
   4, 370.00, 307.00, 38.00, 47.00,  0,0.84
   4, 46.00, 78.00, 74.00, 116.00,  2,0.97
   5, 372.00, 308.00, 34.00, 46.00,  0,0.57
   5, 43.00, 66.00, 67.00, 110.00,  2,0.96

Code I tried

frames = []
x = []
y = []
labels = []
with open(file, 'r') as lb:
    for line in lb:
        line = line.replace(',', ' ')
        arr = line.split()
        frames.append(arr[0])
        x.append(arr[1])
        y.append(arr[2])
        labels.append(arr[5])
    print(np.shape(frames))
    for d, a in enumerate(frames):
        compare = []
        if a == frames[d+2]:
            compare.append(x[d])
            compare.append(x[d+1])
            compare.append(x[d+2])
            xm = np.argmin(compare)
            label = {0: int(labels[d]), 1: int(labels[d+1]), 2: int(labels[d+2])}.get(xm)
        elif a == frames[d+1]:
            compare.append(x[d])
            compare.append(x[d+1])
            xm = np.argmin(compare)
            label = {0: int(labels[d]), 1: int(labels[d+1])}.get(xm)

In the first line, the first number (0) is unique, so I can extract the sixth number (2) easily. After that, however, several lines share the same first number. I want to collect all the lines with the same first number, compare their second numbers, and then extract the sixth number from the line with the lowest second number. Can someone suggest a Python solution? I tried readline() and next() but could not work out how to do it.

Answer 1

You can read the file with pandas.read_csv instead, which makes this much easier:

import pandas as pd
df = pd.read_csv(file_path, header = None)

This reads the file as a table:

    0      1      2     3      4  5     6
0   0   23.0   78.0  75.0  105.0  2  0.97
1   1  371.0  305.0  38.0   48.0  0  0.85
2   1   24.0   78.0  75.0  116.0  2  0.98
3   1  372.0  306.0  37.0   48.0  0  0.84
4   2   28.0   87.0  74.0  101.0  2  0.97
5   2  372.0  307.0  35.0   47.0  0  0.80
6   3   32.0   86.0  73.0   98.0  2  0.98
7   3  363.0  310.0  34.0   46.0  0  0.83
8   4   40.0   77.0  71.0   98.0  2  0.94
9   4  370.0  307.0  38.0   47.0  0  0.84
10  4   46.0   78.0  74.0  116.0  2  0.97
11  5  372.0  308.0  34.0   46.0  0  0.57
12  5   43.0   66.0  67.0  110.0  2  0.96

Then you can group the rows into sub-tables based on one of the columns (in your case column 0):

for group, sub_df in df.groupby(0):
    row = sub_df[1].idxmin()  # index label of the row with the minimum value in column 1
    label = df.loc[row, 5]    # this is the number you are looking for
    print(group, label)
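
The grouping approach can be tried end to end without a file on disk. The following is a minimal sketch, not a definitive implementation: it assumes the sample rows from the question are held in a string (SAMPLE), and the helper name labels_per_frame is hypothetical. skipinitialspace=True is added here to tolerate the spaces after the commas.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice pass the file path instead.
SAMPLE = """\
0, 23.00, 78.00, 75.00, 105.00,  2,0.97
1, 371.00, 305.00, 38.00, 48.00,  0,0.85
1, 24.00, 78.00, 75.00, 116.00,  2,0.98
1, 372.00, 306.00, 37.00, 48.00,  0,0.84
2, 28.00, 87.00, 74.00, 101.00,  2,0.97
2, 372.00, 307.00, 35.00, 47.00,  0,0.80
3, 32.00, 86.00, 73.00, 98.00,  2,0.98
3, 363.00, 310.00, 34.00, 46.00,  0,0.83
4, 40.00, 77.00, 71.00, 98.00,  2,0.94
4, 370.00, 307.00, 38.00, 47.00,  0,0.84
4, 46.00, 78.00, 74.00, 116.00,  2,0.97
5, 372.00, 308.00, 34.00, 46.00,  0,0.57
5, 43.00, 66.00, 67.00, 110.00,  2,0.96
"""


def labels_per_frame(source):
    """Map each frame (column 0) to the label (column 5) of its smallest-x row."""
    df = pd.read_csv(source, header=None, skipinitialspace=True)
    # Per frame, the index label of the row holding the smallest value in column 1.
    rows = df.groupby(0)[1].idxmin()
    # Select the frame and label columns on exactly those rows.
    picked = df.loc[rows, [0, 5]]
    return {int(frame): int(label) for frame, label in zip(picked[0], picked[5])}


result = labels_per_frame(io.StringIO(SAMPLE))
print(result)
```

For the sample data every frame resolves to label 2, because the row with the smallest second number always carries that label.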

Answer 2

I think this is what you need, using pandas:

import pandas as pd

df = pd.read_table('./test.txt', sep=',', names = ('1','2','3','4','5','6','7'))
print(df)
#     1      2      3     4      5  6     7
# 0   0   23.0   78.0  75.0  105.0  2  0.97
# 1   1  371.0  305.0  38.0   48.0  0  0.85
# 2   1   24.0   78.0  75.0  116.0  2  0.98
# 3   1  372.0  306.0  37.0   48.0  0  0.84
# 4   2   28.0   87.0  74.0  101.0  2  0.97
# 5   2  372.0  307.0  35.0   47.0  0  0.80
# 6   3   32.0   86.0  73.0   98.0  2  0.98
# 7   3  363.0  310.0  34.0   46.0  0  0.83
# 8   4   40.0   77.0  71.0   98.0  2  0.94
# 9   4  370.0  307.0  38.0   47.0  0  0.84
# 10  4   46.0   78.0  74.0  116.0  2  0.97
# 11  5  372.0  308.0  34.0   46.0  0  0.57
# 12  5   43.0   66.0  67.0  110.0  2  0.96

# Column "1" is the frame number and column "2" is the value to minimise.
df_new = df.loc[df.groupby("1")["2"].idxmin()]
print(df_new)
#     1     2     3     4      5  6     7
# 0   0  23.0  78.0  75.0  105.0  2  0.97
# 2   1  24.0  78.0  75.0  116.0  2  0.98
# 4   2  28.0  87.0  74.0  101.0  2  0.97
# 6   3  32.0  86.0  73.0   98.0  2  0.98
# 8   4  40.0  77.0  71.0   98.0  2  0.94
# 12  5  43.0  66.0  67.0  110.0  2  0.96
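
If pandas is not available, the same idea can be sketched with the standard library alone. This is only an illustration under stated assumptions: the function name labels_per_frame and the SAMPLE string are hypothetical, and the rows are assumed to be ordered by frame number as in the question, since itertools.groupby only merges adjacent rows.

```python
import csv
import io
from itertools import groupby

# A few rows from the question, used here in place of the real file.
SAMPLE = """\
0, 23.00, 78.00, 75.00, 105.00,  2,0.97
1, 371.00, 305.00, 38.00, 48.00,  0,0.85
1, 24.00, 78.00, 75.00, 116.00,  2,0.98
1, 372.00, 306.00, 37.00, 48.00,  0,0.84
2, 28.00, 87.00, 74.00, 101.00,  2,0.97
2, 372.00, 307.00, 35.00, 47.00,  0,0.80
"""


def labels_per_frame(lines):
    """Map each frame to the label of the row with the lowest second number."""
    # Strip the padding around each field and skip empty lines.
    rows = [[field.strip() for field in row] for row in csv.reader(lines) if row]
    result = {}
    # groupby collects consecutive rows that share the same first number.
    for frame, group in groupby(rows, key=lambda r: int(r[0])):
        best = min(group, key=lambda r: float(r[1]))  # lowest second number
        result[frame] = int(best[5])                  # its sixth number
    return result


labels = labels_per_frame(io.StringIO(SAMPLE))
print(labels)
```

With a real file, the open file object can be passed in place of the io.StringIO wrapper, for example inside a with open(...) block.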

solution1
0 ACCEPTED 2019-10-11 11:32:11

solution2
0 2019-10-11 12:09:07
