Change value of a cell when it is repeated in the row - R

Let's start with this dataset:

structure(list(Etiqueta = structure(c(17L, 19L, 4L, 26L, 25L, 
11L, 23L, 5L, 10L, 8L, 13L, 15L, 12L, 9L, 14L, 18L, 1L, 19L, 
4L, 26L), .Label = c("70th Anniversary of First Soviet Stamp", 
"Biathlon", "Buy Now:", "Catalog codes:", "Colors:", "Cross-country skiing", 
"Description:", "Emission:", "Face value:", "Format:", "Issued on:", 
"Paper:", "Perforation:", "Print run:", "Printing:", "Related items:", 
"Sable (Martes zibellina), Cedar", "Score:", "Series:", "Sheet of 8 x SU5789", 
"Sheet of 8 x SU5790", "Similar:", "Size:", "Slalom", "Themes:", 
"Variants:", "XV Winter Olympic Games in Calgary."), class = "factor"), 
    Valor = structure(c(72L, 52L, 54L, 44L, 38L, 11L, 15L, 43L, 
    78L, 51L, 47L, 66L, 70L, 20L, 23L, 28L, 32L, 32L, 55L, 44L
    ), .Label = c("", "1 (See)", "10 Russian kopek", "11%\tAccuracy: Very High", 
    "13%\tAccuracy: Very High", "15 Russian kopek", "15,000", 
    "15%\tAccuracy: Very High", "18%\tAccuracy: Very High", "1988-01-04", 
    "1988-03", "20 Russian kopek", "22%\tAccuracy: Very High", 
    "23%\tAccuracy: Very High", "26 x 37 mm", "28 x 40 mm", "3 sale offers from US$ 0.09", 
    "3,000,000", "3,320,000", "35 Russian kopek", "4 sale offers from US$ 0.09", 
    "4 sale offers from US$ 0.20", "4,000,000", "4,120,000", 
    "40 Russian kopek", "5 Russian kopek", "5,320,000", "53%\tAccuracy: Medium", 
    "56 x 40 mm", "6 sale offers from US$ 0.21", "7 sale offers from US$ 0.07", 
    "70th Anniversary of First Soviet Stamp", "8 sale offers from US$ 0.05", 
    "8*15 Russian kopek", "80%\tAccuracy: Medium", "81%\tAccuracy: Medium", 
    "83%\tAccuracy: Medium", "Animals (Fauna) | Mammals", "Anniversaries and Jubilees | Hands | Stamps", 
    "Biathlon", "Biathlon | Olympic Games | Sports", "Biathlon | Olympic Games | Sports | Winter Sports", 
    "Brown black", "Click to see variants", "coated", "comb 11½", 
    "comb 12½ x 12", "Commemorative", "Cross-country skiing", 
    "Cross-country Skiing | Olympic Games | Sports | Winter Sports", 
    "Definitive", "Definitive Issue No.12", "frame 11½", "Mi:SU 5427AwI", 
    "Mi:SU 5786-5787, Sn:SU 5626A, Yt:SU 5472-5473, Sg:SU 5836-5837, AFA:SU 5726-27", 
    "Mi:SU 5786, Sn:SU 5625, Yt:SU 5472, Sg:SU 5836, AFA:SU 5726", 
    "Mi:SU 5787, Sn:SU 5626, Yt:SU 5473, Sg:SU 5837, AFA:SU 5727", 
    "Mi:SU 5788, Sn:SU 5627, Yt:SU 5474, Sg:SU 5830", "Mi:SU 5788KB", 
    "Mi:SU 5789, Sn:SU 5628, Yt:SU 5475, Sg:SU 5831", "Mi:SU 5789KB", 
    "Mi:SU 5790, Yt:SU 5476, Sg:SU 5832", "Mi:SU 5790KB", "Mini Sheet", 
    "Multicolor", "Offset lithography", "Olympic Games | Skiing | Slalom | Sports | Winter Sports", 
    "Olympic Games | Skiing | Sports", "Olympic Games | Slalom | Sports | Winter Sports", 
    "ordinary", "Photogravure", "Sable (Martes zibellina), Cedar", 
    "Se-tenant", "Severing the chain of bondage", "Sheet of 8 x SU5789", 
    "Sheet of 8 x SU5790", "Slalom", "Stamp", "Winter Olympic Games 1988, Calgary", 
    "XV Winter Olympic Games in Calgary."), class = "factor")), .Names = c("Etiqueta", 
"Valor"), row.names = c(NA, 20L), class = "data.frame")

As you can see, it is a data frame with 2 columns and 20 rows. This is the data frame:

                                 Etiqueta                                                                          Valor
1         Sable (Martes zibellina), Cedar                                                Sable (Martes zibellina), Cedar
2                                 Series:                                                         Definitive Issue No.12
3                          Catalog codes:                                                                  Mi:SU 5427AwI
4                               Variants:                                                          Click to see variants
5                                 Themes:                                                      Animals (Fauna) | Mammals
6                              Issued on:                                                                        1988-03
7                                   Size:                                                                     26 x 37 mm
8                                 Colors:                                                                    Brown black
9                                 Format:                                                                          Stamp
10                              Emission:                                                                     Definitive
11                           Perforation:                                                                  comb 12½ x 12
12                              Printing:                                                             Offset lithography
13                                 Paper:                                                                       ordinary
14                            Face value:                                                               35 Russian kopek
15                             Print run:                                                                      4,000,000
16                                 Score:                                                          53%\tAccuracy: Medium
17 70th Anniversary of First Soviet Stamp                                         70th Anniversary of First Soviet Stamp
18                                Series:                                         70th Anniversary of First Soviet Stamp
19                         Catalog codes: Mi:SU 5786-5787, Sn:SU 5626A, Yt:SU 5472-5473, Sg:SU 5836-5837, AFA:SU 5726-27
20                              Variants:                                                          Click to see variants

By looking at the table, you can see that rows 1 and 17 contain the same value in both columns. In these cases, I would like to change the value in the left column to Title.

Note that this is just an example, where I could do it manually. However, the original data frame is significantly larger.

So, how can I change the left column's value to Title in the rows where both columns hold the same value? The resulting data frame should look like this:

                                 Etiqueta                                                                          Valor
1                                   Title                                                Sable (Martes zibellina), Cedar
2                                 Series:                                                         Definitive Issue No.12
3                          Catalog codes:                                                                  Mi:SU 5427AwI
4                               Variants:                                                          Click to see variants
5                                 Themes:                                                      Animals (Fauna) | Mammals
6                              Issued on:                                                                        1988-03
7                                   Size:                                                                     26 x 37 mm
8                                 Colors:                                                                    Brown black
9                                 Format:                                                                          Stamp
10                              Emission:                                                                     Definitive
11                           Perforation:                                                                  comb 12½ x 12
12                              Printing:                                                             Offset lithography
13                                 Paper:                                                                       ordinary
14                            Face value:                                                               35 Russian kopek
15                             Print run:                                                                      4,000,000
16                                 Score:                                                          53%\tAccuracy: Medium
17                                  Title                                         70th Anniversary of First Soviet Stamp
18                                Series:                                         70th Anniversary of First Soviet Stamp
19                         Catalog codes: Mi:SU 5786-5787, Sn:SU 5626A, Yt:SU 5472-5473, Sg:SU 5836-5837, AFA:SU 5726-27
20                              Variants:                                                          Click to see variants

Use == to test whether the two columns are equal in each row, then replace conditionally. Because Etiqueta is a factor, you'll need to add "Title" to its levels beforehand.

levels(dat$Etiqueta) <- c(levels(dat$Etiqueta), "Title")
dat[apply(dat, 1, function(x) x[1] == x[2]), 1] <- "Title"
#          Etiqueta                                                                          Valor
# 1           Title                                                Sable (Martes zibellina), Cedar
# 2         Series:                                                         Definitive Issue No.12
# 3  Catalog codes:                                                                  Mi:SU 5427AwI
# 4       Variants:                                                          Click to see variants
# 5         Themes:                                                      Animals (Fauna) | Mammals
# 6      Issued on:                                                                        1988-03
# 7           Size:                                                                     26 x 37 mm
# 8         Colors:                                                                    Brown black
# 9         Format:                                                                          Stamp
# 10      Emission:                                                                     Definitive
# 11   Perforation:                                                                  comb 12½ x 12
# 12      Printing:                                                             Offset lithography
# 13         Paper:                                                                       ordinary
# 14    Face value:                                                               35 Russian kopek
# 15     Print run:                                                                      4,000,000
# 16         Score:                                                           53%\tAccuracy: Medium
# 17          Title                                         70th Anniversary of First Soviet Stamp
# 18        Series:                                         70th Anniversary of First Soviet Stamp
# 19 Catalog codes: Mi:SU 5786-5787, Sn:SU 5626A, Yt:SU 5472-5473, Sg:SU 5836-5837, AFA:SU 5726-27
# 20      Variants:                                                          Click to see variants
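
The apply() call works because it coerces the data frame to a character matrix first. Comparing the two factor columns directly with == would fail, since their level sets differ. A minimal vectorized sketch of the same idea, on a small made-up data frame (toy and its values are hypothetical, not the original data):

```r
# Toy data (hypothetical, not the original dataset): two factor columns
# whose level sets differ, as in the question.
toy <- data.frame(
  Etiqueta = factor(c("Sable", "Series:", "Size:")),
  Valor    = factor(c("Sable", "Definitive", "26 x 37 mm"))
)

# Compare the labels as character vectors; `==` on two factors with
# different level sets stops with an error instead of comparing labels.
same <- as.character(toy$Etiqueta) == as.character(toy$Valor)

# "Title" must be a level before it can be assigned into the factor.
levels(toy$Etiqueta) <- c(levels(toy$Etiqueta), "Title")
toy$Etiqueta[same] <- "Title"

toy
```

This avoids the row-by-row function call, which may matter on a much larger data frame.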

If you don't have any special reason to keep the data as factors, convert the columns to character; then you can assign the value directly.

df[] <- lapply(df, as.character)
df$Etiqueta[df$Etiqueta == df$Valor] <- 'Title'
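
For illustration, here is the same approach as a self-contained sketch on a small made-up data frame (toy and its values are assumptions, not the original data). It also shows that only rows where both columns match are changed:

```r
# Hypothetical toy data; stringsAsFactors = TRUE mimics the factor
# columns in the question.
toy <- data.frame(
  Etiqueta = c("Slalom", "Series:", "Slalom"),
  Valor    = c("Slalom", "Slalom", "Stamp"),
  stringsAsFactors = TRUE
)

# `toy[] <-` keeps the data frame itself and replaces every column in place.
toy[] <- lapply(toy, as.character)

# Character columns can be compared and overwritten directly.
toy$Etiqueta[toy$Etiqueta == toy$Valor] <- "Title"

toy
```

Row 3 keeps "Slalom" in Etiqueta because its Valor differs, even though the same label was replaced in row 1.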

We can use tidyverse methods:

library(dplyr)
df %>%
  mutate_all(as.character) %>%   # convert the factor columns to character
  mutate(Etiqueta = case_when(Etiqueta == Valor ~ "Title", TRUE ~ Etiqueta))

Or, if the column needs to remain a factor, use forcats:

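library(dplyr)    # provides %>% and mutate()
library(forcats)
# rows where both columns hold the same text
i1 <- with(df, as.character(Etiqueta) == as.character(Valor))
# named vector of the form c(Title = "old level", ...), as fct_recode() expects
newlvl <- setNames(as.character(df$Etiqueta[i1]), rep("Title", sum(i1)))
df <- df %>%
  mutate(Etiqueta = fct_recode(Etiqueta, !!!newlvl))
df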
#Etiqueta                                                                          Valor
#1           Title                                                Sable (Martes zibellina), Cedar
#2         Series:                                                         Definitive Issue No.12
#3  Catalog codes:                                                                  Mi:SU 5427AwI
#4       Variants:                                                          Click to see variants
#5         Themes:                                                      Animals (Fauna) | Mammals
#6      Issued on:                                                                        1988-03
#7           Size:                                                                     26 x 37 mm
#8         Colors:                                                                    Brown black
#9         Format:                                                                          Stamp
#10      Emission:                                                                     Definitive
#11   Perforation:                                                                  comb 12½ x 12
#12      Printing:                                                             Offset lithography
#13         Paper:                                                                       ordinary
#14    Face value:                                                               35 Russian kopek
#15     Print run:                                                                      4,000,000
#16         Score:                                                          53%\tAccuracy: Medium
#17          Title                                         70th Anniversary of First Soviet Stamp
#18        Series:                                         70th Anniversary of First Soviet Stamp
#19 Catalog codes: Mi:SU 5786-5787, Sn:SU 5626A, Yt:SU 5472-5473, Sg:SU 5836-5837, AFA:SU 5726-27
#20      Variants:                                                          Click to see variants
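
One thing to keep in mind: recoding renames a level everywhere it occurs, not only in the matching rows. The sample data does not seem to hit this case, but a larger data frame might. The base R sketch below uses made-up vectors (f, v, by_level and by_row are hypothetical names) and uses levels<- as a stand-in for level-wise recoding, to contrast it with row-wise assignment:

```r
# Hypothetical toy vectors: "Slalom" is a title in row 1, but in row 3
# it is an ordinary label whose value differs.
f <- factor(c("Slalom", "Series:", "Slalom"))
v <- c("Slalom", "Slalom", "Stamp")
hit <- as.character(f) == v

# Level-wise rename: every occurrence of the matched level changes.
by_level <- f
levels(by_level)[levels(by_level) %in% as.character(f[hit])] <- "Title"

# Row-wise assignment: only the rows flagged in `hit` change.
by_row <- f
levels(by_row) <- c(levels(by_row), "Title")
by_row[hit] <- "Title"

as.character(by_level)
as.character(by_row)
```

If a label can appear both as a title and as a regular entry, the row-wise form is the safer choice.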
