Apply a function to a certain column in a list of data frames
I'm trying to convert numeric months (1, 2, 3, ..., 12) to month abbreviations (see mymonths) in a list of data frames, df_list, by using lapply, and can't seem to get it to output properly. All data frames in the list have the same variables.
Using the code below, the new df_list2 contains only the new months column, and no other data from the original frames. Sorry for the poor example data, but I think I'm just missing a simple command for getting the whole original data set, rather than just the months column.
# create example data
d1 <- data.frame(month = c(1:3), val = c(1,2,5))
d2 <- data.frame(month = c(1:5), val = c(1,2,5,6,8))
df_list <- list(d1, d2)
> df_list
[[1]]
month val
1 1 1
2 2 2
3 3 5
[[2]]
month val
1 1 1
2 2 2
3 3 5
4 4 6
5 5 8
mymonths <- c("JAN","FEB","MAR",
"APR","MAY","JUN",
"JUL","AUG","SEP",
"OCT","NOV","DEC")
df_list2 <- lapply(df_list , function(x) {
x[,1] <- mymonths [ x[,1] ]
})
> df_list2
[[1]]
[1] "JAN" "FEB" "MAR"
[[2]]
[1] "JAN" "FEB" "MAR" "APR" "MAY"
You just need to return the data frame from the function inside your lapply call.
# create example data
d1 <- data.frame(month = c(1:3), val = c(1,2,5))
d2 <- data.frame(month = c(1:5), val = c(1,2,5,6,8))
df_list <- list(d1, d2)
mymonths <- c("JAN","FEB","MAR",
"APR","MAY","JUN",
"JUL","AUG","SEP",
"OCT","NOV","DEC")
If the month column holds the month number, then:
df_list2 <- lapply(df_list , function(x) {
x[,1] <- mymonths[ x[,1] ]
x
})
df_list2
[[1]]
month val
1 JAN 1
2 FEB 2
3 MAR 5
[[2]]
month val
1 JAN 1
2 FEB 2
3 MAR 5
4 APR 6
5 MAY 8
If the val column holds the month number instead, then:
df_list2 <- lapply(df_list , function(x) {
x[,1] <- mymonths[ x[,2] ]
x
})
df_list2
[[1]]
month val
1 JAN 1
2 FEB 2
3 MAY 5
[[2]]
month val
1 JAN 1
2 FEB 2
3 MAY 5
4 JUN 6
5 AUG 8
But in either case you have to return each data.frame from the function passed to lapply.
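To see why the original call returned only character vectors: an R function returns the value of its last evaluated expression, and the value of an assignment such as x[,1] <- ... is the right-hand side, not the modified x. A minimal sketch of the difference (the function names no_return and with_return are made up for illustration):

```r
d1 <- data.frame(month = 1:3, val = c(1, 2, 5))
mymonths <- c("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
              "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

# Last expression is the assignment, so its value (the right-hand side) comes back
no_return <- function(x) {
  x[, 1] <- mymonths[x[, 1]]
}

# Last expression is x, so the modified data frame comes back
with_return <- function(x) {
  x[, 1] <- mymonths[x[, 1]]
  x
}

res_a <- no_return(d1)    # only the month abbreviations
res_b <- with_return(d1)  # the full data frame, month column replaced
class(res_a)
class(res_b)
```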
There is a very minor mistake in your lapply usage. Please change the code to:
df_list2 <- lapply(df_list , function(x) {
x[,2] <- mymonths [ x[,2] ]
x
})
The values that hold the month number should be used to index the mymonths vector. This answer assumes they are in the second column, hence it passes x[,2].
One more point is that x should be returned from the function. Hence an additional line has been added.
Now the output of df_list2 will be:
> df_list2
[[1]]
month val
1 1 JAN
2 2 FEB
3 3 MAY
[[2]]
month val
1 1 JAN
2 2 FEB
3 3 MAY
4 4 JUN
5 5 AUG
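Because positional indexing makes it easy to convert the wrong column, one alternative is to refer to the column by name. A sketch, assuming the month numbers live in a column called month, as in the question's example data:

```r
d1 <- data.frame(month = 1:3, val = c(1, 2, 5))
d2 <- data.frame(month = 1:5, val = c(1, 2, 5, 6, 8))
df_list <- list(d1, d2)
mymonths <- c("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
              "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

df_list2 <- lapply(df_list, function(x) {
  x$month <- mymonths[x$month]  # index the lookup vector by month number
  x                             # return the whole data frame, not just the column
})
df_list2
```

This leaves val untouched regardless of the column order in each data frame.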
Isn't the operation you are looking for called a join?
library(dplyr)
library(purrr)
# create example data
df_list <- list(data.frame(month = c(1:3), val = c(1,2,5)),
data.frame(month = c(1:5), val = c(1,2,5,6,8)))
mymonths <- data.frame(month_name=c("JAN","FEB","MAR",
"APR","MAY","JUN",
"JUL","AUG","SEP",
"OCT","NOV","DEC"),
month=seq(12))
# join each data frame to the lookup table on the shared month column
map(df_list, left_join, mymonths, by = "month")
We get a list of data frames back:
[[1]]
month val month_name
1 1 1 JAN
2 2 2 FEB
3 3 5 MAR
[[2]]
month val month_name
1 1 1 JAN
2 2 2 FEB
3 3 5 MAR
4 4 6 APR
5 5 8 MAY
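For comparison, the same lookup can be sketched in base R with merge, without dplyr or purrr. This assumes a lookup table keyed on month like the one above; the name month_lookup is made up here. Note that merge sorts the result by the key column by default, so row order may differ from the input when the months are not already in order.

```r
df_list <- list(data.frame(month = 1:3, val = c(1, 2, 5)),
                data.frame(month = 1:5, val = c(1, 2, 5, 6, 8)))

# hypothetical lookup table: month number -> upper-case abbreviation
month_lookup <- data.frame(month = 1:12,
                           month_name = toupper(month.abb),
                           stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of each data frame, like a left join
joined <- lapply(df_list, merge, y = month_lookup, by = "month", all.x = TRUE)
joined
```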
Simply use the transform function. Depending on the name you assign to the new variable, you can either overwrite the existing variable or create a totally new one:
rewriting an existing variable:
lapply(df_list,transform,month=mymonths[month])
[[1]]
month val
1 JAN 1
2 FEB 2
3 MAR 5
[[2]]
month val
1 JAN 1
2 FEB 2
3 MAR 5
4 APR 6
5 MAY 8
creating a new variable:
lapply(df_list,transform,newcolumn=mymonths[month])
[[1]]
month val newcolumn
1 1 1 JAN
2 2 2 FEB
3 3 5 MAR
[[2]]
month val newcolumn
1 1 1 JAN
2 2 2 FEB
3 3 5 MAR
4 4 6 APR
5 5 8 MAY
Using the tidyverse package, the map function from the purrr package, and the month.abb constant in base R:
library(tidyverse)
d1 <- data.frame(month = c(1:3), val = c(1,2,5))
d2 <- data.frame(month = c(1:5), val = c(1,2,5,6,8))
df_list <- list(d1, d2)
month_abbreviation <- function(x)
transform(x, MonthAbb = month.abb[month])
Let's use the map function from the purrr package to apply your function to each data frame without writing a for loop:
list_of_df <- map(df_list, month_abbreviation)
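One thing to be aware of: month.abb holds title-case abbreviations ("Jan", "Feb", ...), whereas the question uses upper case. If the upper-case form is wanted, wrapping it in toupper is one option. A base-R sketch, where lapply stands in for map so it runs without extra packages, and month_abbreviation_upper is a made-up variant of the function above:

```r
d1 <- data.frame(month = 1:3, val = c(1, 2, 5))
d2 <- data.frame(month = 1:5, val = c(1, 2, 5, 6, 8))
df_list <- list(d1, d2)

# same idea as month_abbreviation, but upper-cased to match mymonths in the question
month_abbreviation_upper <- function(x) {
  transform(x, MonthAbb = toupper(month.abb[month]))
}

list_of_df <- lapply(df_list, month_abbreviation_upper)
list_of_df
```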