How to replace values in pandas dataframe

My goal is to write a program that replaces each unique value in a column of a pandas dataframe.

The following code attempts the operation:

    # replace values
    print(f" {s1['A1'].value_counts().index}")
    for i in s1['A1'].value_counts().index:
        s1['A1'].replace(i,1)

    print(f" {s2['A1'].value_counts().index}")
    for i in s2['A1'].value_counts().index:
        s2['A1'].replace(i,2)

    print("s1 after replacing values")
    print(s1)
    print("******************")
    print("s2 after replacing values")
    print(s2)
    print("******************")

Expected: The values in the first dataframe s1 should be replaced with 1s. The values in the second dataframe s2 should be replaced with 2s.

Actual:

 Int64Index([8, 5, 2, 7, 6], dtype='int64')
 Int64Index([2, 8, 5, 6, 7, 4, 3], dtype='int64')
s1 after replacing values
    A1        A2   A3  Class
3    5  0.440671  2.3      1
9    8  0.070035  2.9      1
14   2  0.868410  1.5      1
29   6  0.587487  2.6      1
34   8  0.652936  3.0      1
38   8  0.181508  3.0      1
45   8  0.953230  3.0      1
54   7  0.737604  2.7      1
68   5  0.187475  2.2      1
70   5  0.511385  2.3      1
71   8  0.688134  3.0      1
73   2  0.054908  1.5      1
87   8  0.461797  3.0      1
90   2  0.756518  1.5      1
91   2  0.761448  1.5      1
93   5  0.858036  2.3      1
94   5  0.306459  2.2      1
98   5  0.692804  2.2      1
******************
s2 after replacing values
    A1        A2   A3  Class
0    2  0.463134  1.5      3
1    8  0.746065  3.0      3
2    6  0.264391  2.5      2
4    2  0.410438  1.5      3
5    2  0.302902  1.5      2
..  ..       ...  ...    ...
92   5  0.775842  2.3      2
95   5  0.844920  2.2      2
96   5  0.428071  2.2      2
97   5  0.356044  2.2      3
99   5  0.815400  2.2      3

Any help understanding how to replace the values in these dataframes would be greatly appreciated. Thank you.

The documentation for the replace method can be confusing on this point. By default, replace returns a new object and leaves the original unchanged, so you need to assign the result back to the column.

    # replace values
    print(f" {s1['A1'].value_counts().index}")
    for i in s1['A1'].value_counts().index:
        print(f"s1['A1'].replace({i},1)")
        s1['A1'] = s1['A1'].replace(i,1)

    print(f" {s2['A1'].value_counts().index}")
    for i in s2['A1'].value_counts().index:
        print(f"s2['A1'].replace({i},2)")
        s2['A1'] = s2['A1'].replace(i,2)
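
To make the difference concrete, here is a minimal sketch on made-up data. The small `s1` frame below is a hypothetical stand-in, since the question does not show how the real one is built. It shows that `replace` leaves the original column alone unless the result is assigned back, and that a list of values can be passed in a single call, so the loop is probably not needed.

```python
import pandas as pd

# Hypothetical stand-in for s1; the real data is not shown in the question.
s1 = pd.DataFrame({"A1": [5, 8, 2, 6, 8], "A3": [2.3, 2.9, 1.5, 2.6, 3.0]})

# replace() returns a new Series by default; s1 itself is not modified.
result = s1["A1"].replace(8, 1)
unchanged = s1["A1"].tolist()  # the column as it still is in s1

# Assigning the result back is what makes the change stick.
# Passing a list of values replaces all of them in one call.
s1["A1"] = s1["A1"].replace(s1["A1"].unique().tolist(), 1)

print(result.tolist())
print(unchanged)
print(s1["A1"].tolist())
```

If every value in the column is being replaced with the same number and the column has no missing values, plain assignment such as `s1['A1'] = 1` should give the same result with less work.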

The docs do not say that: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html .
