
Assign random values to column according to another column's values in R

I have a dataset whose stock codes range from 2 to 90214, with around 3000 unique values, so many numbers in that range are never used. I want to convert these stock codes so that they range from 1 to 3000, with every old code mapped consistently to one new code: if the old stock code is 1234, then every time it occurs it is assigned the same new stock code (say 100).

In short, I want to convert:

Stock_Code
 1234
 5678
 4321
 1234
 5678

into:

Stock_Code
 100
 101
 102
 100
 101

How do I do this in R?

We can convert the codes to a factor and then convert the factor to numeric:

as.numeric(factor(df$Stock_Code))

#[1] 1 3 2 1 3

To start the numbering at 100, add 99:

as.numeric(factor(df$Stock_Code)) + 99

Identical stock codes share the same factor level, so they get the same number once the factor is converted to numeric.
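The result above is 1 3 2 rather than 1 2 3 because factor() sorts its levels by default. A minimal sketch of that behaviour, assuming a hypothetical data frame df built from the five values in the question:

```r
# Hypothetical sample data mirroring the question
df <- data.frame(Stock_Code = c(1234, 5678, 4321, 1234, 5678))

# factor() stores the sorted unique values as its levels
f <- factor(df$Stock_Code)
levels(f)
#[1] "1234" "4321" "5678"

# The integer codes therefore follow sorted order, not order of appearance
new_code <- as.integer(f) + 99
new_code
#[1] 100 102 101 100 102
```

The mapping is still consistent (the same old code always gets the same new code); only the numbering order differs from the example in the question.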

We can use match() to get the position of each value among the unique values (in order of first appearance), and then add 99:

df1$Stock_Code <- match(df1$Stock_Code, unique(df1$Stock_Code)) + 99
df1$Stock_Code
#[1] 100 101 102 100 101
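If the old codes need to be recoverable later, the same match() idea can be kept as an explicit lookup table. This is only a sketch: the names lookup, New_Code and recovered are hypothetical and not part of the original answer.

```r
# Hypothetical sample data mirroring the question
df1 <- data.frame(Stock_Code = c(1234, 5678, 4321, 1234, 5678))

# One row per distinct old code, numbered in order of first appearance
old_codes <- unique(df1$Stock_Code)
lookup <- data.frame(old = old_codes, new = seq_along(old_codes) + 99)

# Translate old codes to new codes via the lookup table
df1$New_Code <- lookup$new[match(df1$Stock_Code, lookup$old)]
df1$New_Code
#[1] 100 101 102 100 101

# Translate back to check that the mapping is reversible
recovered <- lookup$old[match(df1$New_Code, lookup$new)]
identical(recovered, df1$Stock_Code)
#[1] TRUE
```

Keeping the table separate also means new rows can be recoded later with the same mapping, as long as their codes already appear in lookup$old.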

Another option is to convert to a factor whose levels are in order of first appearance, and then coerce it to integer:

with(df1, as.integer(factor(Stock_Code, levels = unique(Stock_Code))) + 99)
#[1] 100 101 102 100 101

Using dplyr, dense_rank() numbers the codes by sorted value without gaps:

library(dplyr)
dense_rank(df$Stock_Code) + 99
#[1] 100 102 101 100 102
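All of the approaches above are deterministic, whereas the title asks for random values. If the new codes really should be assigned at random, one possible base R sketch is to shuffle the new numbers with sample() before matching. The seed value and the names random_ids and New_Code are assumptions made for illustration only.

```r
set.seed(42)  # arbitrary seed, used only to make the sketch reproducible

# Hypothetical sample data mirroring the question
df1 <- data.frame(Stock_Code = c(1234, 5678, 4321, 1234, 5678))

old_codes <- unique(df1$Stock_Code)

# Random permutation of 1..n, one new code per distinct old code
random_ids <- sample(seq_along(old_codes))

# Every occurrence of an old code receives the same randomly chosen new code
df1$New_Code <- random_ids[match(df1$Stock_Code, old_codes)]
df1
```

Because sample() without replacement returns a permutation, no two old codes can end up sharing a new code.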
