
Move column values to new column based on several conditions in R

I am cleaning a dataset for analysis and need to move values between columns depending on how one column matches another.

My data contain participants who are each assigned to a varying number of projects and who have rated each project on several variables. Name is the participant, Project Name is the project's name, Project Number is the number of that project for that participant, and the variables with "voice" in their names are questions about the project, each holding a rating (1-5). Each person has rated anywhere from 1 to 17 projects.

An example of my data in its current state can be reproduced as follows:

colnames<-c("Name", "Project Name", "Project Number", "T2_voice1", "T2_voice2", "T2project1_voice1", "T2project1_voice2", "T2project2_voice1", "T2project2_voice2")
r1<- c("Bob", "ProjectX", "Project1", NA, NA, 5, 2, 4 ,5)
r2<- c("Bob", "ProjectZ", "Project2", NA, NA, 5, 2, 4 ,5)
r3<- c("Amy", "ProjectQ", "Project1", NA, NA, 1, 2, 1 ,1)
r4<- c("Amy", "ProjectD", "Project2", NA, NA, 1, 2, 1 ,1)

data<-rbind(r1, r2, r3, r4)
colnames(data)<-colnames

What I would like to do is put the value from T2project1_voice1 into the column T2_voice1 on each participant's Project1 row, and likewise for every other project number and voice variable. Once I delete the unneeded columns, the final product would look like this:

Name  Project Name  Project Number  T2_voice1  T2_voice2
Bob   ProjectX      1               5          2
Bob   ProjectZ      2               4          5
Amy   ProjectQ      1               1          2
Amy   ProjectD      2               1          1

The only way I have thought of to do this is with some sort of grepl or substr to match the Project1 in Project Number to the column names containing project1. Alternatively, it could be done positionally, since there are columns for 17 projects (2 in the example) for every participant; some are just NA if the participant did not have that many projects.

Any guidance or ideas would be extremely appreciated!

I would split your data into separate tables of project information and ratings; tidyr::pivot_longer() the ratings table; then merge back together:

library(dplyr)
library(stringr)
library(tidyr)

# convert example data from matrix to dataframe 
data <- as.data.frame(data)

projects <- data %>% 
  select(Name, `Project Name`, `Project Number`) %>% 
  mutate(`Project Number` = str_extract(`Project Number`, "\\d+$"))

ratings <- data %>%
  distinct(Name, across(T2project1_voice1:T2project2_voice2)) %>% 
  pivot_longer(
    !Name, 
    names_to = c("Project Number", ".value"), 
    names_pattern = "T2project(\\d+)_(.+)"
  ) %>% 
  rename_with(.cols = voice1:voice2, ~ str_c("T2_", .x))

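# join on the shared key columns (stated explicitly to avoid the join message)
final <- full_join(projects, ratings, by = c("Name", "Project Number"))
final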
  Name Project Name Project Number T2_voice1 T2_voice2
1  Bob     ProjectX              1         5         2
2  Bob     ProjectZ              2         4         5
3  Amy     ProjectQ              1         1         2
4  Amy     ProjectD              2         1         1
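
For comparison, the name-matching idea from the question can also be sketched in base R. This is only a rough sketch under a few assumptions: the example data are rebuilt as a data frame so the ratings stay numeric (the rbind() version stores everything as text), pick_rating is a hypothetical helper rather than an existing function, and every column name follows the T2project<N>_<question> pattern.

```r
# Rebuild the example as a data frame so the ratings stay numeric
# (assumption: same values as the question's rbind() example).
data <- data.frame(
  Name = c("Bob", "Bob", "Amy", "Amy"),
  `Project Name` = c("ProjectX", "ProjectZ", "ProjectQ", "ProjectD"),
  `Project Number` = c("Project1", "Project2", "Project1", "Project2"),
  T2project1_voice1 = c(5, 5, 1, 1),
  T2project1_voice2 = c(2, 2, 2, 2),
  T2project2_voice1 = c(4, 4, 1, 1),
  T2project2_voice2 = c(5, 5, 1, 1),
  check.names = FALSE
)

# "Project2" -> "project2", the fragment used inside the column names
proj <- tolower(data[["Project Number"]])

# Hypothetical helper: for each row, take the value from the column whose
# name is built from that row's project number and the question suffix.
pick_rating <- function(df, proj, suffix) {
  cols <- paste0("T2", proj, "_", suffix)      # e.g. "T2project2_voice1"
  ratings <- as.matrix(df[unique(cols)])       # only the columns needed
  # two-column (row, column) index picks one cell per row
  ratings[cbind(seq_len(nrow(df)), match(cols, colnames(ratings)))]
}

final <- data[c("Name", "Project Name", "Project Number")]
final[["Project Number"]] <- as.integer(sub("\\D+", "", final[["Project Number"]]))
final$T2_voice1 <- pick_rating(data, proj, "voice1")
final$T2_voice2 <- pick_rating(data, proj, "voice2")
final
```

With many voice variables, the two assignment lines could be replaced by a loop over the question suffixes. The pivot_longer() version above is probably easier to maintain, since it does not need one call per question.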

