
R Programming - Prevent “apply” from repeating results beyond the range of data

I have a data set in a data frame. I am using apply to add a new column whose value in each row is computed from other columns in that same row. apply works, but after it has applied the function to every row, it continues beyond the range of the data and keeps repeating the values over and over again.

Here is the data I begin with:

Abbreviation    Name    X    Y    Z     A    B    C
JM              Jim     3    4    5     6    7    8
JS              Jess    5    6    7     8    9    10

Using the command below, I get the following results. Command:

df_new$Test <- apply(df_new,1, function(row) (df_new[,8]/df_new[,6])/(df_new[,5]/df_new[,3]))

Returned Data (from View(df_new))

Abbreviation    Name    X    Y    Z     A    B    C     Test
JM              Jim     3    4    5     6    7    8     .8
JS              Jess    5    6    7     8    9    10    .89
                                                        .8
                                                        .89
                                                        .8
                                                        .89

Also, when I write this data to a CSV file using the command below, I get the following output. Command:

write.csv(df_new,file="Df_new.csv", row.names=FALSE)

Abbreviation    Name    X    Y    Z     A    B    C     Test Test.1  Test.2    Test.3
JM              Jim     3    4    5     6    7    8     .8   .8      .8        .8
JS              Jess    5    6    7     8    9    10    .89  .89     .89       .89 

Ideally, from the above, I just want df_new[1:2,1:9]; however, even creating an object that retains only that information still results in the extra rows (in View(df_new)) or extra columns (when writing to a .csv).

Notice that the function you supply to apply takes a parameter "row", but you never use it inside the function. I also don't see why you would need apply at all, as I think that

df_new$Test <- (df_new[,8]/df_new[,6])/(df_new[,5]/df_new[,3])

should give you what you want.
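
To make the point about the unused "row" parameter concrete, here is a minimal sketch. It assumes the question's data frame can be rebuilt by hand as below; the names bad, num and good are only illustrative. Because the function ignores row, every call returns the full column-wise result, and apply binds those results into a matrix. That matrix column is probably what shows up as the repeated values in View and as Test.1, Test.2, ... in the CSV.

```r
# Rebuild the question's data by hand (assumed to match the original df_new).
df_new <- data.frame(
  Abbreviation = c("JM", "JS"),
  Name = c("Jim", "Jess"),
  X = c(3, 5), Y = c(4, 6), Z = c(5, 7),
  A = c(6, 8), B = c(7, 9), C = c(8, 10)
)

# The anonymous function never touches `row`, so each call returns the
# whole vector; apply() collects one such vector per row into a matrix.
bad <- apply(df_new, 1, function(row) (df_new[, 8] / df_new[, 6]) / (df_new[, 5] / df_new[, 3]))
dim(bad)  # a matrix, not a plain vector

# If apply() is wanted anyway, pass only the numeric columns (apply()
# converts a mixed data frame to a character matrix) and actually use `row`.
num <- df_new[, c("X", "Z", "A", "C")]
good <- apply(num, 1, function(row) (row[["C"]] / row[["A"]]) / (row[["Z"]] / row[["X"]]))
length(good)  # one value per row
```

Assigning good (or the vectorised expression above) to df_new$Test should then add a single ordinary column.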

You don't really need to use apply in this case. Take advantage of the fact that R is vectorised and simply do:

df_new$Test <- (df_new$C / df_new$A) / (df_new$Z / df_new$X)
# Abbreviation Name X Y Z A B  C      Test
# 1           JM  Jim 3 4 5 6 7  8 0.8000000
# 2           JS Jess 5 6 7 8 9 10 0.8928571

R will treat each column in the expression as a vector and operate on them element-wise. It uses the first element of every vector to compute the first value, then the second element of every vector to compute the second value; at that point no vector has any elements left, so the result is a vector of two numbers.
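
The element-wise behaviour can be checked on plain vectors. This is only a sketch: the vectors x, z, a and cc stand in for the columns X, Z, A and C from the question (cc is used to avoid masking the c() function), and first_by_hand is an illustrative name.

```r
# Stand-ins for the columns X, Z, A and C of the question's data frame.
x  <- c(3, 5)
z  <- c(5, 7)
a  <- c(6, 8)
cc <- c(8, 10)

# Vectorised: the whole expression is evaluated position by position.
test <- (cc / a) / (z / x)

# The same calculation done by hand for the first position only.
first_by_hand <- (cc[1] / a[1]) / (z[1] / x[1])

length(test)                            # same length as the inputs
isTRUE(all.equal(test[1], first_by_hand))  # first element matches
```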
