Copy dataframe rows and replace them in the same dataframe
I have a dataframe in which two records have very few values. I want to replace those records with copies of other records that have more values. Does anyone know how to do this in pandas or vaex?
For example, I want to replace the values for 148 with the values for 140. Can someone help?
Edit: my dataframe is this
I would like to replace all values where day_of_week = 148 with the values where day_of_week = 140, because day_of_week = 148 has 1,000 records and day_of_week = 140 has 200,000 records.
I want to copy all rows where day_of_year == 140 and use them to replace all rows where day_of_year == 148.
If I understand you correctly, this should be straightforward in vaex:
df['new_col'] = df.func.where(df.day_of_week==148, 140, df.day_of_week)
In vaex the new column will be virtual, i.e. it will not take any memory. So it does not matter whether you overwrite your existing column or keep a separate one with the mapping (it may be better to keep a separate one, in case you need to debug your process later on).
I think something similar can be done with pandas, using numpy.where, as was already mentioned in a comment before mine.
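For the pandas side, here is a minimal sketch of what that could look like. The sample data and the "value" column are made up for illustration, and the column name day_of_week simply follows the vaex line above. Since the question can be read two ways, the sketch shows both: relabelling 148 as 140 (which is what the vaex line does), and actually swapping the 148 rows for copies of the 140 rows. The variable names relabelled, donor and replaced are just placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name day_of_week comes from the thread.
df = pd.DataFrame({
    "day_of_week": [140, 140, 140, 148, 148, 150],
    "value": [1.0, 2.0, 3.0, 9.0, 8.0, 5.0],
})

# Reading 1 (same idea as the vaex line): keep every row, but map 148 to 140
# in a separate column so the original column stays available for debugging.
relabelled = df.assign(
    new_col=np.where(df["day_of_week"] == 148, 140, df["day_of_week"])
)

# Reading 2: remove the 148 rows and put copies of the 140 rows in their place.
donor = df[df["day_of_week"] == 140].copy()  # rows to duplicate
donor["day_of_week"] = 148                   # relabel the copies as day 148
replaced = pd.concat(
    [df[df["day_of_week"] != 148], donor],   # everything except the old 148 rows
    ignore_index=True,
)

print(relabelled)
print(replaced)
```

Unlike the vaex virtual column, both results here are materialised in memory, so with a few hundred thousand rows the second reading roughly doubles the footprint of the day-140 rows.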