How to replace a slice of a dataframe column with values from a slice of another dataframe's column?

Question

I have two dataframes with several columns, including a timestamp column. I would like to copy the first 1000 timestamps from the second dataframe to the first one.

import pandas as pd

df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')
df1.timestamp.iloc[:1000] = df2.timestamp.iloc[:1000]

I tried various things: adding .copy() to the right-hand side, using .loc[:1000, 'timestamp'] instead of the columnname.iloc syntax, and converting the column series into a numpy array first. I keep getting errors ranging from "too many indexers", to a directive to use .loc[rowindexing, columnindexing] (which doesn't fix the issue), and other error messages.

Answer 1

Use Index.get_loc to get a column's integer position from its name, so that it can be passed to DataFrame.iloc:

s = df2.iloc[:1000, df2.columns.get_loc('timestamp')]  
df1.iloc[:1000, df1.columns.get_loc('timestamp')] = s
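
As a small illustration of what get_loc contributes here, the sketch below uses a made-up three-row frame (the names df and pos are illustrative, not from the question). For a unique column name, get_loc returns a plain integer position that .iloc accepts as its column indexer.

```python
import pandas as pd

# Hypothetical toy frame, only to show what get_loc returns.
df = pd.DataFrame({"A": list("abc"), "timestamp": [1, 2, 3]})

# Integer position of the 'timestamp' column (0-based).
pos = df.columns.get_loc("timestamp")

# .iloc takes positions for both rows and columns.
first_two = df.iloc[:2, pos]
print(pos, first_two.tolist())
```

This keeps the whole selection positional, which is why it avoids the "too many indexers" style of error that comes from mixing labels into .iloc.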

Or use DataFrame.loc with a label slice. Because .loc slices include the end label, take the label at position 999 to get the first 1000 rows. This works only if both DataFrames have at least 1000 rows, and the assigned values are aligned on the index labels:

df1.loc[:df1.index[999], 'timestamp'] = df2.loc[:df2.index[999], 'timestamp']

I think your solution failed because the DataFrames have different lengths.

Sample:

df1 = pd.DataFrame({ "timestamp" : [2000, 2001, 2002, 2003, 1990, 1991,
                                    1992, 1993, 1994, 2010, 2011, 2012]})
df2 = pd.DataFrame({
        'A':list('abcdef'),
         'timestamp':[4,5,4,5,5,4],
})

s = df2.iloc[:1000, df2.columns.get_loc('timestamp')]  
df1.iloc[:1000, df1.columns.get_loc('timestamp')] = s
print (df1)
    timestamp
0         4.0
1         5.0
2         4.0
3         5.0
4         5.0
5         4.0
6         NaN
7         NaN
8         NaN
9         NaN
10        NaN
11        NaN

df1 = pd.DataFrame({ "timestamp" : [2000, 2001, 2002, 2003, 1990, 1991,
                                    1992, 1993, 1994, 2010, 2011, 2012]})
df2 = pd.DataFrame({
        'A':list('abcdef'),
         'timestamp':[4,5,4,5,5,4],
})

s = df1.iloc[:1000, df1.columns.get_loc('timestamp')]  
df2.iloc[:1000, df2.columns.get_loc('timestamp')] = s

print (df2)
   A  timestamp
0  a       2000
1  b       2001
2  c       2002
3  d       2003
4  e       1990
5  f       1991
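
The NaN rows in the first sample appear to come from index alignment: df2 has only six rows, so labels 6 through 11 in df1 find no match. If the goal is a purely positional copy that leaves the remaining rows untouched, one option (a sketch, not part of the original answer) is to cap the slice at the shorter length and assign a NumPy array, which carries no index to align on. The variable n and the shortened df1 below are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"timestamp": [2000, 2001, 2002, 2003, 1990, 1991, 1992, 1993]})
df2 = pd.DataFrame({"A": list("abcdef"), "timestamp": [4, 5, 4, 5, 5, 4]})

# Never copy more rows than either frame actually has.
n = min(1000, len(df1), len(df2))

col1 = df1.columns.get_loc("timestamp")
col2 = df2.columns.get_loc("timestamp")

# to_numpy() drops the index, so values are written strictly by position.
df1.iloc[:n, col1] = df2.iloc[:n, col2].to_numpy()
print(df1)
```

With this approach the rows beyond position n keep their original values instead of becoming NaN.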

Answer 2

Given df1 and df2:

df1 = pd.DataFrame({'timestamp': range(0,2000)})
df2 = -df1

Using .loc:

df1.loc[:999,'timestamp'] = df2.loc[:999,'timestamp']
df1.loc[997:1002,'timestamp']

997     -997
998     -998
999     -999
1000    1000
1001    1001
1002    1002
Name: timestamp, dtype: int64

Or using .iloc (optionally using get_loc to convert a column label into the position that .iloc expects):

df1.iloc[:1000,0] = df2.iloc[:1000,0]
df1.loc[997:1002,'timestamp']

997     -997
998     -998
999     -999
1000    1000
1001    1001
1002    1002
Name: timestamp, dtype: int64

Note that slicing behaves differently in .iloc and .loc: .loc includes the right endpoint of the slice, while .iloc excludes it (like range).
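
A minimal sketch of that difference, assuming a default integer index (the frame and variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"timestamp": range(10)})

# Label-based slice: the end label 3 is included, so rows 0, 1, 2, 3.
by_label = df.loc[:3, "timestamp"]

# Position-based slice: the end position 3 is excluded, so rows 0, 1, 2.
by_position = df.iloc[:3, 0]

print(len(by_label), len(by_position))
```

This is why the .loc version above stops at 999 while the .iloc version stops at 1000 to select the same 1000 rows.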
