Updating a subset DataFrame to update the parent DataFrame

Question

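I have a 4x4 DataFrame (df). I created two child DataFrames from it, (4x1) and (4x2), and updated both. In the first case the parent is updated; in the second it is not. How can I make sure the parent DataFrame is updated when a child DataFrame is updated?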

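I have a 4x4 DataFrame (df). Using it as the parent, I created two child DataFrames: dfA with a single column (4x1) and dfB with two columns (4x2). Both subsets contain NaN values. When I call fillna on each of them, I can see the NaN values replaced with the given value in dfA and dfB respectively. So far so good. However, when I then check the parent DataFrame, the updated values are reflected in the first case (4x1) but not in the second (4x2). Why is that, and what should I do so that changes to a child DataFrame are reflected in the parent DataFrame?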

import numpy as np
import pandas as pd

studentnames = ['Maths','English','Soc.Sci', 'Hindi', 'Science']
semisteronemarks = [15, 50, np.nan, 50, np.nan]
semistertwomarks = [25, 53, 45, 45, 54]
semisterthreemarks = [20, 50, 45, 15, 38]
semisterfourmarks = [26, 33, np.nan, 35, 34]
semisters = ['Rakesh','Rohit', 'Sam', 'Sunil']
df1 = pd.DataFrame([semisteronemarks,semistertwomarks,semisterthreemarks,semisterfourmarks],semisters, studentnames)

# case 1
dfA = df['Soc.Sci']
dfA.fillna(value = 98, inplace = True)
print(dfA)
print(df)

# case 2
dfB = df[['Soc.Sci', 'Science']]
dfB.fillna(value = 99, inplace = True)
print(dfB)
print(df)

## contents of parent df ->>
## Actual Output -
# case 1
        Maths  English  Soc.Sci  Hindi  Science
Rakesh     15       50     98.0     50      NaN
Rohit      25       53     45.0     45     54.0
Sam        20       50     45.0     15     38.0
Sunil      26       33     98.0     35     34.0

# case 2
        Maths  English  Soc.Sci  Hindi  Science
Rakesh     15       50      NaN     50      NaN
Rohit      25       53     45.0     45     54.0
Sam        20       50     45.0     15     38.0
Sunil      26       33      NaN     35     34.0

## Expected Output -
# case 1
        Maths  English  Soc.Sci  Hindi  Science
Rakesh     15       50     98.0     50      NaN
Rohit      25       53     45.0     45     54.0
Sam        20       50     45.0     15     38.0
Sunil      26       33     98.0     35     34.0

# case 2
        Maths  English  Soc.Sci  Hindi  Science
Rakesh     15       50     99.0     50      NaN
Rohit      25       53     45.0     45     54.0
Sam        20       50     45.0     15     38.0
Sunil      26       33     99.0     35     34.0

# note the difference in output for column Soc.Sci in case 2.

Answer 1

In your code, df1 is defined but df is not.

Sticking with your approach:

# case 1
dfA = df1['Soc.Sci']   # changed df to df1
dfA.fillna(value = 98, inplace = True)

df1['Soc.Sci'] = dfA  # dfA is a Series, not a DataFrame, so assign it directly
# If you would rather write
#     df1['Soc.Sci'] = dfA['Soc.Sci']
# then dfA has to be a DataFrame, so select it with double brackets:
#     dfA = df1[['Soc.Sci']]


# case 2
dfB = df1[['Soc.Sci', 'Science']] # changed df to df1
dfB.fillna(value = 99, inplace = True)

df1[['Soc.Sci','Science']] = dfB[['Soc.Sci','Science']]

print(df1)

I would suggest simply calling fillna on the parent df.

df1['Soc.Sci'].fillna(value=99,inplace=True)
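
The same idea can be extended to several columns at once. The sketch below is only an illustration: it rebuilds the question's frame under the name `df` (with the shorter helper names `subjects`, `students` and `marks`, which are not in the original), passes `fillna` a dict that maps column names to fill values, and assigns the result back. Assigning back is used instead of `inplace=True` on a selected column because, on recent pandas versions, a chained in-place call on a selected column can emit a warning and may not update the parent.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (named df here; helper names are illustrative).
subjects = ['Maths', 'English', 'Soc.Sci', 'Hindi', 'Science']
students = ['Rakesh', 'Rohit', 'Sam', 'Sunil']
marks = [
    [15, 50, np.nan, 50, np.nan],
    [25, 53, 45, 45, 54],
    [20, 50, 45, 15, 38],
    [26, 33, np.nan, 35, 34],
]
df = pd.DataFrame(marks, index=students, columns=subjects)

# Fill each column with its own value, directly on the parent.
# fillna returns a new frame, so rebind the name to keep the result.
df = df.fillna({'Soc.Sci': 98, 'Science': 99})
print(df)
```

Columns that are not named in the dict are left alone, so the other subjects keep their original values.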

Answer 2

You should see a warning:

Warning (from warnings module):
...
SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

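This means that dfB may be a copy rather than a view, and judging by the result, it is. There is little you can do about that here; in particular, you cannot force pandas to produce a view. The choice depends on parameters known only to pandas and its developers.

But it is always possible to assign to the columns of the parent DataFrame: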
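
A small sketch of the copy behaviour, under stated assumptions: the frame and the names `parent` and `child` are made up for illustration, and the copy is taken explicitly with `.copy()` so that it is deliberate and independent of the parent. Changes made to the child stay in the child until they are written back, which is done here with `DataFrame.update`.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame that mirrors the question's layout.
parent = pd.DataFrame(
    {
        'Soc.Sci': [np.nan, 45.0, 45.0, np.nan],
        'Science': [np.nan, 54.0, 38.0, 34.0],
        'Hindi': [50, 45, 15, 35],
    },
    index=['Rakesh', 'Rohit', 'Sam', 'Sunil'],
)

# An explicit copy of two columns: an independent frame.
child = parent[['Soc.Sci', 'Science']].copy()
child = child.fillna(99)

# The parent still has its missing values at this point.
print(parent[['Soc.Sci', 'Science']].isna().sum())

# Write the child's values back into the matching rows and columns of the parent.
parent.update(child)
print(parent)
```

`update` aligns on index and column labels and only touches the columns present in `child`, so `Hindi` is left as it was.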

# case 2
df = pd.DataFrame([semisteronemarks,semistertwomarks,semisterthreemarks,semisterfourmarks],semisters, studentnames)
df[['Soc.Sci', 'Science']] = df[['Soc.Sci', 'Science']].fillna(value = 99)
print(df)


Answer 1
0 2019-05-02 12:00:50

Answer 2
0 2019-05-02 12:57:06
