R: set a column's value from other columns based on a string search

I'm trying to find a clean way to get the first column of my DT, for each row, to be equal to the user_id found in other columns. That is, I must perform a search of "user_id" across each row, and return the entirety of the cell where the instance is found.

I first tried to get the index of the column where the partial match is found, and then use this to set the first column's values, but it did not work. Example:

       user_id          1             2
   1:     N/A          300       user_id154
   2:     N/A   user_id301    user_id125040
   3:     N/A          302         user_id2

For instance, I want to obtain the following

   **user_id**
  user_id154
  user_id301
  user_id2

Please bear in mind that I am new to this kind of data formatting in R (most of my work does not involve cleaning JSON files), and that my data.table has over 1M rows. The answer does not need to be super efficient, but it definitely shouldn't take more than 5 minutes, or my boss will consider it too slow.

Hopefully this is understandable.

I'm sure someone will provide a more elegant solution, but this does the trick:

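dt[, user_id := str_extract(str_c(`1`, `2`, sep = " "), "user_id[0-9]+")]

This first combines the columns row by row, separated by a space so that digits from adjacent cells cannot run together, and then, for each row, extracts the first user_id found in the combined value. The column names `1` and `2` need backticks; without them, str_c(1, 2) just concatenates the numbers 1 and 2.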

(Requires the stringr package)

For each row of the table, grep the first value that contains "user_id" and put the result in the user_id column.

df$user_id <- apply(df, 1, function(x) grep("user_id", x, value = TRUE)[1])
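For a table with over 1M rows, a column-wise loop may be worth trying as well, since it makes one vectorised pass per column instead of one function call per row. The following is only a sketch under assumptions: the toy table mirrors the example in the question, the placeholder "N/A" is assumed to be a real NA, and the variable names search_cols, col and hit are made up for illustration.

```r
library(data.table)

# Toy table mirroring the question's example.
# Assumption: the "N/A" placeholder in user_id is a genuine NA.
dt <- data.table(
  user_id = NA_character_,
  `1` = c("300", "user_id301", "302"),
  `2` = c("user_id154", "user_id125040", "user_id2")
)

# Search every column except the target column itself.
search_cols <- setdiff(names(dt), "user_id")

# Walk the columns left to right and fill user_id only where it is
# still NA, so the first matching column wins for each row.
for (col in search_cols) {
  hit <- is.na(dt$user_id) & grepl("user_id", dt[[col]], fixed = TRUE)
  set(dt, i = which(hit), j = "user_id", value = dt[[col]][hit])
}

print(dt$user_id)
```

Like the apply() version, this copies the whole matching cell rather than extracting a pattern from it, and rows with no match are left as NA.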
