for-while ifelse loop? (R-Programming)
To be honest, I am completely stuck, and I'm not quite sure how to phrase the title either. I have two datasets; let's say they look something like this:
Dataset1 (i.e. GDP-related):
Year | Country |
---|---|
2000 | Austria |
2001 | Austria |
2000 | Belgium |
2001 | Belgium |
Dataset2 (tax-related):
Year | Austria | Belgium |
---|---|---|
2000 | 55 | 48 |
2001 | 51 | 45 |
So what I would like is to write some sort of function/loop that essentially says:
if the country variable in dataset1 has a value that is also a column name in dataset2, use those observations.
Then, conditional on the year and country, I want to create a new variable in dataset1 called tax, and copy the country's tax rate from dataset2 into dataset1.
So for instance, we know Austria (an observation in dataset1) is also the name of a variable in dataset2, so I want to get its tax rate from dataset2 and apply 55 for year 2000 and 51 for year 2001 in dataset1. And this should go on for all countries and years.
Dataset1 (i.e. GDP-related) should thus look like this:
Year | Country | Tax |
---|---|---|
2000 | Austria | 55 |
2001 | Austria | 51 |
2000 | Belgium | 48 |
2001 | Belgium | 45 |
My dataset is quite big, so I would much prefer some sort of algorithm for this rather than doing it by hand.
Thanks!
Assuming the first dataset has more columns, reshape the second dataset to long format with `pivot_longer`, then join it to the first dataset with `left_join`, matching on 'Year' and 'Country'.
library(dplyr)
library(tidyr)
# Reshape df2 to one row per Year/Country pair, then attach Tax to df1
df2 %>%
  pivot_longer(cols = -Year, names_to = 'Country', values_to = 'Tax') %>%
  left_join(df1, ., by = c('Year', 'Country'))
-output
Year Country Tax
1 2000 Austria 55
2 2001 Austria 51
3 2000 Belgium 48
4 2001 Belgium 45
df1 <- structure(list(Year = c(2000L, 2001L, 2000L, 2001L), Country = c("Austria",
"Austria", "Belgium", "Belgium")), class = "data.frame", row.names = c(NA,
-4L))
df2 <- structure(list(Year = 2000:2001, Austria = c(55L, 51L), Belgium = c(48L,
45L)), class = "data.frame", row.names = c(NA, -2L))
This should also work:
library(dplyr)
library(tidyr)
df2 %>%
# pivot_longer(-Year) %>% first solution
pivot_longer(cols = -Year, names_to = 'Country', values_to = 'Tax') %>% # taken from @akrun
arrange(Country)
Year Country Tax
<int> <chr> <int>
1 2000 Austria 55
2 2001 Austria 51
3 2000 Belgium 48
4 2001 Belgium 45
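The lookup described in the question (find the column of df2 whose name matches the country, then the row that matches the year) can also be sketched in base R without any packages. This is only an illustration under the assumption that the data look like the df1 and df2 shown above; the helper names tax_mat, row_idx and col_idx are made up for this example.

```r
# Same example data as above, rebuilt so the snippet runs on its own
df1 <- data.frame(Year = c(2000L, 2001L, 2000L, 2001L),
                  Country = c("Austria", "Austria", "Belgium", "Belgium"))
df2 <- data.frame(Year = 2000:2001,
                  Austria = c(55L, 51L),
                  Belgium = c(48L, 45L))

# Tax values as a plain matrix: rows follow df2$Year, columns are countries
tax_mat <- as.matrix(df2[, setdiff(names(df2), "Year"), drop = FALSE])

# Position of each df1 observation in that matrix (NA where nothing matches)
row_idx <- match(df1$Year, df2$Year)
col_idx <- match(df1$Country, colnames(tax_mat))

# Indexing with a two-column matrix picks exactly one cell per df1 row
df1$Tax <- tax_mat[cbind(row_idx, col_idx)]
df1
```

Countries or years in df1 that do not appear in df2 get NA in Tax, which mirrors what a left join would give. Because this is vectorised, no explicit for/while loop is needed even for a large dataset.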