Assign data.table column values based on a value being in a certain range
I have two data.tables.
library(data.table)
size_categories = data.table(category = c("S", "M", "L"), size_min = c(0, 10, 25),
                             size_max = c(10, 25, Inf), bin = c("blue", "red", "green"))
products = data.table(object_id = 1:10, size = seq(1, 37, 4))
I want to merge the tables so that each row of the products table is assigned a bin and a size category based on its size.
The ham-fisted way I know is to assign a category to each row of products and then merge:
products[size >= 0 & size < 10, category := "S"]
products[size >= 10 & size < 25, category := "M"]
products[size >= 25, category := "L"]
merge(products, size_categories)
Of course, this is not flexible at all, and I would have to rewrite it if size_categories changed.
I am open to using other packages, but would prefer a solution that uses only data.table.
Thanks!
I would do it with a non-equi join:
products[size_categories, `:=`(category = i.category, bin = i.bin),
on = .(size >= size_min, size < size_max)]
# > products
# object_id size category bin
# 1: 1 1 S blue
# 2: 2 5 S blue
# 3: 3 9 S blue
# 4: 4 13 M red
# 5: 5 17 M red
# 6: 6 21 M red
# 7: 7 25 L green
# 8: 8 29 L green
# 9: 9 33 L green
# 10: 10 37 L green
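To see how the join conditions treat boundary values, here is a small self-contained sketch. The table products2 and its sizes are made up for illustration and are not part of the question: size 10 and size 25 sit exactly on a boundary, and size -3 falls outside every range.

```r
library(data.table)

size_categories <- data.table(
  category = c("S", "M", "L"),
  size_min = c(0, 10, 25),
  size_max = c(10, 25, Inf),
  bin      = c("blue", "red", "green")
)

# Hypothetical data, chosen only to exercise the edge cases.
products2 <- data.table(object_id = 1:4, size = c(9, 10, 25, -3))

# Same update join as above: size_min is inclusive, size_max is exclusive.
products2[size_categories,
          `:=`(category = i.category, bin = i.bin),
          on = .(size >= size_min, size < size_max)]

print(products2)
```

Because the lower bound uses >= and the upper bound uses <, each size can match at most one row of size_categories, and a size that matches none is left as NA in the new columns.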
For reference, here's an approach using foverlaps:
foverlaps(setkey(size_categories, size_min, size_max),
setkey(products[, size2 := size], size, size2))[, size2 := NULL][]
# object_id size category size_min size_max bin
# 1: 1 1 S 0 10 blue
# 2: 2 5 S 0 10 blue
# 3: 3 9 S 0 10 blue
# 4: 4 13 M 10 25 red
# 5: 5 17 M 10 25 red
# 6: 6 21 M 10 25 red
# 7: 7 25 M 10 25 red
# 8: 7 25 L 25 Inf green
# 9: 8 29 L 25 Inf green
# 10: 9 33 L 25 Inf green
# 11: 10 37 L 25 Inf green
This is probably most helpful when your size_categories table has more columns that you want included in the final output. Note that foverlaps treats intervals as closed at both ends, so size 25 matches both M (10 to 25) and L (25 to Inf), which is why object 7 appears twice above.
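If the range columns are wanted in the output but each product should match exactly one category, one option is a non-modifying non-equi join. This is only a sketch, not part of the original answers: the name result is arbitrary, and the x. and i. prefixes are used to say explicitly which table each column comes from, since the join columns are otherwise easy to mix up in the result.

```r
library(data.table)

size_categories <- data.table(
  category = c("S", "M", "L"),
  size_min = c(0, 10, 25),
  size_max = c(10, 25, Inf),
  bin      = c("blue", "red", "green")
)
products <- data.table(object_id = 1:10, size = seq(1, 37, 4))

# For each row of products, look up the category whose half-open
# range [size_min, size_max) contains its size.
result <- size_categories[
  products,
  .(object_id, size = i.size, category, bin,
    size_min = x.size_min, size_max = x.size_max),
  on = .(size_min <= size, size_max > size)
]

print(result)
```

Any further columns of size_categories would need to be listed in j as well, so this trades the convenience of foverlaps for explicit control over the boundaries.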