Compare pandas DataFrame indices and update the rows

Question

I have two Excel files that I read with pandas. I am comparing the index of file 1 with the index of file 2; the two are not the same length (for example, 10 and 100). If an index value matches, the corresponding row in the second file should be set to zeros; otherwise it should stay unchanged. I am using for and if loops for this, but the more data I want to process (1e3, 5e3), the longer the run time becomes. Is there a better way to perform such a comparison? Here's an example of what I am using.

import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])
df1 = pd.DataFrame([['w'], ['y'], ['z']],
                   index=[4, 5, 1])
for j in df1.index:
    for i in df.index:
        if i == j:
            df.loc[i, :] = 0
        else:
            df.loc[i, :] = df.loc[i, :]
print(df)

Answer 1

Loops are not necessary here. You can set the matching rows to 0 with DataFrame.mask and Series.isin (the index has to be converted to a Series to avoid ValueError: Array conditional must be same shape as self):

df = df.mask(df.index.to_series().isin(df1.index), 0)
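
For reference, here is a self-contained sketch of the mask approach on the question's data. The names matches and result are introduced only for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])
df1 = pd.DataFrame([['w'], ['y'], ['z']], index=[4, 5, 1])

# Boolean Series indexed like df: True where the label also occurs in df1.index
matches = df.index.to_series().isin(df1.index)

# mask replaces every row where the condition is True with 0
result = df.mask(matches, 0)
print(result)
```

Because matches is a Series that shares df's index, pandas can align it row by row, which is what the plain boolean array returned by Index.isin does not allow here.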

Or use Index.isin with numpy.where if you want better performance:

import numpy as np

arr = np.where(df.index.isin(df1.index)[:, None], 0, df)
df = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(df)
    A   B   C
4   0   0   0
5   0   0   0
6  10  20  30
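
If changing df in place is acceptable, boolean indexing with .loc is another possible route. This is only a sketch of an alternative and not something the answer above proposes; in_both is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])
df1 = pd.DataFrame([['w'], ['y'], ['z']], index=[4, 5, 1])

# Boolean array with one entry per row of df: True where the label is in df1.index
in_both = df.index.isin(df1.index)

# Assign 0 to every column of the matching rows, modifying df in place
df.loc[in_both] = 0
print(df)
```

Unlike mask, this overwrites the original frame, so take a copy first (df.copy()) if the unmodified data is still needed.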

solution1
2 ACCEPTED 2020-04-12 12:28:49