I have a dataset whose stock codes range from 2 to 90214 but take only around 3000 unique values, so many values in that range never occur. I want to recode them so that they run from 1 to 3000, and to do so consistently: if the original stock code 1234 is assigned the new code 100, then every occurrence of 1234 should become 100.
In short, I want to convert:
Stock_Code
1234
5678
4321
1234
5678
into:
Stock_Code
100
101
102
100
101
How do I do this in R?
We can convert the numbers to a factor and then convert the factor to numeric. Factor levels are sorted by value, so the new codes follow the numeric order of the old codes rather than their order of appearance:
as.numeric(factor(df$Stock_Code))
#[1] 1 3 2 1 3
If the codes need to start from 100, we can add 99:
as.numeric(factor(df$Stock_Code)) + 99
#[1] 100 102 101 100 102
Identical values share the same factor level, so they always convert to the same number.
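To see which old code ends up with which new one, the factor levels can be kept as a lookup table. The following is a minimal sketch under the assumption that the data look like the example in the question; the data frame `df` and the names `f`, `new_code` and `lookup` are made up for illustration.

```r
# Hypothetical sample data mirroring the question
df <- data.frame(Stock_Code = c(1234, 5678, 4321, 1234, 5678))

# factor() stores one level per distinct value, sorted by value
f <- factor(df$Stock_Code)

# The integer codes of the factor are the new stock codes (1..number of levels)
new_code <- as.integer(f)

# Levels are stored as character strings, so convert them back to numbers
lookup <- data.frame(old = as.numeric(levels(f)),
                     new = seq_along(levels(f)))

print(new_code)
print(lookup)
```

With real data, the largest new code equals the number of distinct stock codes, which is a quick way to check that no gaps remain.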
We can use match to get the index of each value among the unique values, which numbers the codes in order of first appearance, and then add 99:
df1$Stock_Code <- match(df1$Stock_Code, unique(df1$Stock_Code)) + 99
df1$Stock_Code
[1] 100 101 102 100 101
Another option is to convert to a factor whose levels are in order of appearance, and coerce it to integer:
with(df1, as.integer(factor(Stock_Code, levels = unique(Stock_Code))) + 99)
#[1] 100 101 102 100 101
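If the recoding is needed in more than one place, the match approach can be wrapped in a small function. This is only a sketch: `recode_sequential` and its `start` argument are hypothetical names, not part of base R or any package.

```r
# Hypothetical helper: recode values to consecutive integers,
# numbered in order of first appearance and beginning at `start`
recode_sequential <- function(x, start = 1L) {
  match(x, unique(x)) + (as.integer(start) - 1L)
}

codes <- c(1234, 5678, 4321, 1234, 5678)

from_one <- recode_sequential(codes)                   # 1 2 3 1 2
from_hundred <- recode_sequential(codes, start = 100)  # 100 101 102 100 101

print(from_one)
print(from_hundred)
```

Leaving `start` at its default gives the 1 to 3000 range asked for in the question, while `start = 100` reproduces the example output.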
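Using dplyr, dense_rank gives consecutive ranks without gaps. Like the default factor approach, it numbers the codes by value rather than by order of appearance:
library(dplyr)
dense_rank(df$Stock_Code) + 99
#[1] 100 102 101 100 102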