In each data frame row, match each column to a key in another data frame, and sum the values of the key in a new data frame

I'm not quite sure how to word my question. I think what I want is a loop that takes each value in a data frame row, matches it to a key in another data frame, and sums the matched key values column by column, storing the result in a new data frame with the same dimensions as the key.

It should be much easier to explain using an example. I'm a complete novice to R and programming and am still learning the vocabulary.

I have a dataframe of words where each column corresponds to a phoneme (unique speech sound).

Words_DF <- data.frame( word = c("CAT", "BAT", "APPLE"), Phoneme1 = c("K", "B", "AE"), Phoneme2 = c("AE", "AE", "P"), Phoneme3 = c("T", "T", "AH"), Phoneme4 = c("Null", "Null", "L"))

   word Phoneme1 Phoneme2 Phoneme3 Phoneme4
1   CAT        K       AE        T     Null
2   BAT        B       AE        T     Null
3 APPLE       AE        P       AH        L

I have another data frame where each phoneme corresponds to a series of binary values.

Phoneme_DF <- data.frame( phoneme = c("AE", "AH", "B", "K", "T", "P", "L"), is_consonant = c(0, 0, 1, 1, 1, 1, 1), is_labial = c(0, 0, 1, 0, 0, 1, 0))

   phoneme is_consonant is_labial
1      AE            0         0
2      AH            0         0
3       B            1         1
4       K            1         0
5       T            1         0
6       P            1         1
7       L            1         0

I'm trying to figure out a way to go through each row of my Words_DF, look up the value in each phoneme column in my Phoneme_DF, and sum the results in a new data frame that looks like this:

New_DF <- data.frame( word = c("CAT", "BAT", "APPLE"), consonants_in_word = c(2, 2, 3), labials_in_word = c(0, 1, 1))

    word consonants_in_word labials_in_word
1   CAT                  2               0
2   BAT                  2               1
3 APPLE                  2               1

I have tried writing a loop that goes through each row of Words_DF, looks up the value in each phoneme column in Phoneme_DF, and then sums the matching rows:

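   # Start with zero counts, one row per word
   New_DF <- data.frame( word = c("CAT", "BAT", "APPLE"), consonants_in_word = c(0, 0, 0), labials_in_word = c(0, 0, 0))

   for (i in seq_len(nrow(Words_DF))) {
     # Look up this word's phonemes in Phoneme_DF ("Null" has no match and gives NA)
     idx <- match(unlist(Words_DF[i, -1]), Phoneme_DF$phoneme)
     # Sum each feature column over the matched rows, skipping the NA rows
     New_DF[i, -1] <- colSums(Phoneme_DF[idx, -1], na.rm = TRUE)
   }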

I hope my question made sense. Thanks for your help! :)

I think your desired output is off: APPLE should have only 2 consonants (P and L). Try this:


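library(dplyr)
library(tidyr)

Words_DF %>% 
  # long format: one row per word/phoneme pair (the phoneme ends up in the column named "key")
  gather(value, key, -word) %>% 
  # attach the features; "Null" has no match, so its features are NA
  left_join(Phoneme_DF, by = c("key" = "phoneme")) %>% 
  group_by(word) %>% 
  mutate(consonants_in_word = sum(is_consonant, na.rm = TRUE),
         labials_in_word = sum(is_labial, na.rm = TRUE)) %>% 
  # keep one row per word
  distinct(word, .keep_all = TRUE) %>% 
  select(word, consonants_in_word, labials_in_word)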

Which returns:

# A tibble: 3 x 3
# Groups:   word [3]
   word consonants_in_word labials_in_word
  <chr>              <int>           <int>
1   CAT                  2               0
2   BAT                  2               1
3 APPLE                  2               1
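
For readers on a recent tidyverse, where `gather()` is superseded by `pivot_longer()`, the same pipeline can be sketched as below. This is only a sketch: it assumes tidyr >= 1.0.0 and dplyr >= 1.0.0, the column name `position` and the object name `tidy_result` are made up for illustration, and the data is retyped from the question with B marked as labial (as in the printed Phoneme_DF table).

```r
library(dplyr)
library(tidyr)

# Example data retyped from the question (assumption: B is labial, as in the printed table)
Words_DF <- data.frame(
  word = c("CAT", "BAT", "APPLE"),
  Phoneme1 = c("K", "B", "AE"),
  Phoneme2 = c("AE", "AE", "P"),
  Phoneme3 = c("T", "T", "AH"),
  Phoneme4 = c("Null", "Null", "L"),
  stringsAsFactors = FALSE
)
Phoneme_DF <- data.frame(
  phoneme = c("AE", "AH", "B", "K", "T", "P", "L"),
  is_consonant = c(0, 0, 1, 1, 1, 1, 1),
  is_labial = c(0, 0, 1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

tidy_result <- Words_DF %>%
  # one row per word/phoneme pair; "position" (hypothetical name) holds Phoneme1..Phoneme4
  pivot_longer(-word, names_to = "position", values_to = "phoneme") %>%
  # "Null" has no match in Phoneme_DF, so its features come back as NA
  left_join(Phoneme_DF, by = "phoneme") %>%
  group_by(word) %>%
  # collapse to one row per word, ignoring the NA features
  summarise(
    consonants_in_word = sum(is_consonant, na.rm = TRUE),
    labials_in_word = sum(is_labial, na.rm = TRUE),
    .groups = "drop"
  )
```

Using `summarise()` in place of `mutate()` plus `distinct()` gives one row per word directly; note that the rows come back sorted by `word` and not in the original order.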

I have the data.table counterpart, for anyone interested:

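library(data.table)
setDT(Words_DF); setDT(Phoneme_DF)  # convert both data frames to data.tables by reference

Phoneme_DF[melt(Words_DF, id.vars = "word", value.name = "phoneme"), on = "phoneme"][
  , lapply(.SD, sum, na.rm = TRUE),
  .SDcols = c("is_consonant", "is_labial"), by = word]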


    word is_consonant is_labial
1:   CAT            2         0
2:   BAT            2         1
3: APPLE            2         1

The procedure is similar to what tyluRp proposed: reshape Words_DF into long format, join it with Phoneme_DF, and then sum is_consonant and is_labial by word.
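
For anyone without extra packages, the same three steps (reshape, join, sum) can be sketched in base R. Treat this as one possible way to do it, not the only one: the object names `long`, `joined` and `base_result` are made up for illustration, and the data is retyped from the question with B marked as labial (as in the printed Phoneme_DF table).

```r
# Example data retyped from the question (assumption: B is labial, as in the printed table)
Words_DF <- data.frame(
  word = c("CAT", "BAT", "APPLE"),
  Phoneme1 = c("K", "B", "AE"),
  Phoneme2 = c("AE", "AE", "P"),
  Phoneme3 = c("T", "T", "AH"),
  Phoneme4 = c("Null", "Null", "L"),
  stringsAsFactors = FALSE
)
Phoneme_DF <- data.frame(
  phoneme = c("AE", "AH", "B", "K", "T", "P", "L"),
  is_consonant = c(0, 0, 1, 1, 1, 1, 1),
  is_labial = c(0, 0, 1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

# 1. Reshape to long format: one row per word/phoneme pair.
#    unlist() walks the phoneme columns one after another, so the
#    words are repeated once per phoneme column to line up with it.
long <- data.frame(
  word = rep(Words_DF$word, times = ncol(Words_DF) - 1),
  phoneme = unlist(Words_DF[-1], use.names = FALSE),
  stringsAsFactors = FALSE
)

# 2. Join on phoneme; all.x = TRUE keeps the "Null" rows, with NA features
joined <- merge(long, Phoneme_DF, by = "phoneme", all.x = TRUE)

# 3. Sum each feature by word; na.pass keeps the NA rows so that
#    na.rm = TRUE can drop them inside sum()
base_result <- aggregate(
  cbind(is_consonant, is_labial) ~ word,
  data = joined, FUN = sum, na.rm = TRUE, na.action = na.pass
)
```

`base_result` has one row per word with the summed `is_consonant` and `is_labial` columns; the rows are sorted by `word`, so the order differs from Words_DF.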

