
pd.merge generating new column names

I am merging two dataframes that have some column names in common and some that differ. The result has new column names that are in neither dataframe but appear to be derived from names in each.

The two dataframes:

df.columns has, among others, 'particle', 'frame', 'x old', 'y old'.
corrected_traj.columns has 'particle', 'frame', 'x', 'y'.

Neither dataframe has 'frame_x' or 'frame_y'.

Yet when I try to merge, I end up with no column named 'frame' but two new columns, 'frame_x' and 'frame_y'.

Neither dataframe's index is currently named, although both are linked to the frame number. I have been trying to avoid an error that occurs when the index and a column have the same name, hence some of the code removing index names etc. I'm not sure whether this is relevant, so I have included it.
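For context, the index/column name clash mentioned above can be reproduced with a small sketch. The toy frames `left` and `right` and their values are made up for illustration and are not the actual trackpy output; the point is only that pandas refuses a merge key that names both an index level and a column.

```python
import pandas as pd

# Hypothetical toy data standing in for trackpy output.
left = pd.DataFrame({"frame": [0, 1], "particle": [7, 7], "x": [1.0, 2.0]})
left.index = left["frame"]  # the index is now named 'frame', like the column

right = pd.DataFrame({"frame": [0, 1], "particle": [7, 7], "y": [5.0, 6.0]})

try:
    pd.merge(left, right, on="frame")
    ambiguous = False
except ValueError as err:
    # pandas complains that 'frame' is both an index level and a column label
    ambiguous = True
    print(err)

# Clearing the index name removes the ambiguity, so the merge goes through.
left.index.name = None
merged = pd.merge(left, right, on=["particle", "frame"])
print(merged.columns.tolist())
```

Setting `index.name = None` keeps the index values; `reset_index(drop=True)` would be an alternative if the index itself is not needed.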

The dataframes are produced by functions from trackpy, but I think the issue is related to pd.merge.

The overall aim is to subtract the mean drift of some particles from the motion of the particles. I want to move the old x and y to 'x old' and 'y old' and put the corrected values in 'x' and 'y'.


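import pandas as pd
import trackpy as tp

# df is the trajectory dataframe produced earlier by trackpy
drift = tp.motion.compute_drift(df)
corrected_traj = tp.motion.subtract_drift(df[['frame', 'x', 'y', 'particle']].copy(), drift)

# keep the uncorrected positions under new names
df['x old'] = df['x'].copy()
df['y old'] = df['y'].copy()

df = df.drop(columns=['x', 'y'])
corrected_traj.index.name = None

df = pd.merge(df, corrected_traj, on='particle')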

Apologies, I've tried indenting and using the code button but can't seem to get it to format correctly.

I was expecting a dataframe df with 'x', 'y', 'frame', 'particle', 'x old', 'y old'.

Instead I'm getting 'x', 'y', 'frame_x', 'frame_y', 'x old', 'y old', 'particle'.

The contents of 'frame_x' and 'frame_y' do seem to be the frame number values.
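The behaviour can be reproduced without trackpy. The sketch below uses made-up stand-ins for df and corrected_traj (one particle, two frames; the values are hypothetical) and merges on 'particle' only, as in the code above.

```python
import pandas as pd

# Hypothetical stand-ins for df and corrected_traj: one particle, two frames.
df = pd.DataFrame({
    "particle": [1, 1],
    "frame": [0, 1],
    "x old": [10.0, 11.0],
    "y old": [20.0, 21.0],
})
corrected_traj = pd.DataFrame({
    "particle": [1, 1],
    "frame": [0, 1],
    "x": [9.5, 10.5],
    "y": [19.5, 20.5],
})

merged = pd.merge(df, corrected_traj, on="particle")

# 'frame' exists on both sides but is not a join key, so both copies are kept.
print(merged.columns.tolist())
# ['particle', 'frame_x', 'x old', 'y old', 'frame_y', 'x', 'y']

# Each row of particle 1 on the left is paired with each row on the right.
print(len(merged))  # 4
```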

To get a dataframe df with 'x', 'y', 'frame', 'particle', 'x old', 'y old', include both 'particle' and 'frame' in the join keys, as below. Any column name that appears in both dataframes but is not listed in "on" is kept from each side, and pandas appends its default suffixes _x and _y to tell the two copies apart, which is where frame_x and frame_y come from. Merging on 'particle' alone also pairs every row of a particle in df with every row of that particle in corrected_traj, so the result has more rows than expected.

df = pd.merge(df, corrected_traj, on=['particle', 'frame'])
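A minimal sketch of this fix on toy data follows. The frames and values are hypothetical, and the `validate="one_to_one"` argument is an optional extra not used in the line above; it asks pandas to raise an error if a (particle, frame) pair is duplicated on either side.

```python
import pandas as pd

# Hypothetical toy data: two particles tracked over two frames.
df = pd.DataFrame({
    "particle": [1, 1, 2, 2],
    "frame": [0, 1, 0, 1],
    "x old": [10.0, 11.0, 30.0, 31.0],
    "y old": [20.0, 21.0, 40.0, 41.0],
})
corrected_traj = pd.DataFrame({
    "particle": [1, 1, 2, 2],
    "frame": [0, 1, 0, 1],
    "x": [9.5, 10.5, 29.5, 30.5],
    "y": [19.5, 20.5, 39.5, 40.5],
})

# Joining on both keys keeps a single 'frame' column and one row per
# (particle, frame) pair.
merged = pd.merge(
    df,
    corrected_traj,
    on=["particle", "frame"],
    validate="one_to_one",
)

print(merged.columns.tolist())
# ['particle', 'frame', 'x old', 'y old', 'x', 'y']
print(len(merged))  # 4
```

With both keys in "on", no suffixes are needed because no non-key column name is shared between the two dataframes.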
