Nested loop to create data frames in R

Question

I am trying to create and store data.frames using a nested for loop.

I have some data on countries in a variable called countries (USA, UK, Germany, etc.), which I have labeled 1, 2, 3 respectively.

I also have data on specific industries in a variable called industries, for example textiles, retail, other. Again, I have labeled these industries 1, 2, 3.

What I am trying to do is create a new data.frame for each of the following combinations:

country 1, industry 1
country 1, industry 2
country 1, industry 3

country 2, industry 1
country 2, industry 2
country 2, industry 3

country 3, industry 1
country 3, industry 2
country 3, industry 3

etc.

I am hoping to carry out an analysis on each data.frame.

What I am currently working with is the following:

m <- 3 # m countries
k <- 3 # k industries

for(i in 1:length(m)){
  country.ID <- m[i]
  for(j in 1:length(k)){
    sector.ID <- k[j]
    S1 <- which(DF$COUNTRY.id == country.ID)
    S2 <- which(DF$INDUSTRY.id == sector.ID)
    rows.2.consider <- intersect(S1, S2)

    # Here is where I am trying to save the data.frames for analysis

  }
}

If I have gone wrong at any point, please point it out. I am trying to create one data.frame for each combination of country and industry, i.e. 3 countries * 3 industries in this example would give 9 data.frames.

Here is some sample data (I am actually using regional data rather than country data, but the same principle still applies).


 ratios <- structure(list(IDVar = 1:40, Major.sectors = structure(c(5L, 9L, 3L, 15L, 11L, 7L, 18L, 18L, 18L, 3L, 3L, 3L, 3L, 17L, 3L, 11L, 7L, 17L, 3L, 11L, 3L, 18L, 3L, 17L, 9L, 18L, 9L, 19L, 3L, 11L, 11L, 2L, 5L, 3L, 18L, 17L, 4L, 2L, 3L, 3L), .Label = c("Banks", "Chemicals, rubber, plastics, non-metallic products", "Construction", "Education, Health", "Food, beverages, tobacco", "Gas, Water, Electricity", "Hotels & restaurants", "Insurance companies", "Machinery, equipment, furniture, recycling", "Metals & metal products", "Other services", "Post & telecommunications", "Primary sector", "Public administration & defense", "Publishing, printing", "Textiles, wearing apparel, leather", "Transport", "Wholesale & retail trade", "Wood, cork, paper"), class = "factor"), Region.in.country = structure(c(15L, 8L, 8L, 8L, 10L, 15L, 19L, 10L, 8L, 10L, 3L, 18L, 4L, 12L, 4L, 15L, 13L, 4L, 15L, 15L, 7L, 15L, 12L, 1L, 7L, 10L, 15L, 8L, 13L, 15L, 12L, 8L, 7L, 15L, 15L, 10L, 8L, 10L, 10L, 15L), .Label = c("Andalucia", "Aragon", "Asturias", "Canary Islands", "Cantabria", "Castilla-La Mancha", "Castilla y Leon", "Cataluna", "Ceuta", "Comunidad Valenciana", "Extremadura", "Galicia", "Islas Baleares", "La Rioja", "Madrid", "Melilla", "Murcia", "Navarra", "Pais Vasco"), class = "factor"), EBIT.TA = c(-0.234432635519391, -0.884337466274593, -0.00446559204081373, 0.11109107677028, -0.137203773525798, -0.582114677880617, 0.0190497663203189, -3.04252763094666, 0.113157822682219, -0.0255533180037229, 0.281767142199724, 0.0326641697396841, -0.00879974750993553, 0.0542074697816672, -0.112104697294392, -0.191945591325174, -0.00380586115226597, -0.0363239884169068, -0.273949107908537, 0.435398668004486, -0.00563436099927988, -2.75971618056051, -0.1047327709263, 0.151283793741506, -0.0373197549569126, 0.00912639083178201, -0.0386627754065697, -0.018235399636112, -0.0118104711362467, -0.701299939137125, NA, 0.0191819361175666, -0.0104887983706721, -0.801677105519484, -0.402194475974272, 
-0.124125227730062, 0.143020458476649, -0.601186271451194, 0.0163269364787831, 5.09955167591238), EBIT.TA_l1 = c(-0.443687074746458, -0.561864166134075, -0.0345769510044604, 0.0282541797531804, -0.0181173929170762, 0.0147211350970115, 0.0588534950162799, -1.14097109926961, 0.060100343733096, -0.0386426338471025, 0.049684095221329, 0.0558174150334904, 0.00214962169435867, 0.0399960114646072, 0.0402934579830171, -0.612359147433149, -0.0115916125659674, 0.00739473610413031, 0.0174576615247567, 0.68624861825246, 0.0305807338940829, -3.88006243913616, 0.0410122725022661, -0.089491343996377, -0.215219123182103, 0.00967853324842811, -0.0336715197882038, 0.362424791356667, 0.221203934329637, -0.654387857513823, 0.0656934439915892, 0.0652005453654772, 0.0339559014267185, 0.0259085077216708, -0.303606048856146, 0.0280113794301873, 0.109307291990628, -0.470048555841697, -0.00157699300508027, -0.350519090107081 ), EBIT.TA_l2 = c(-0.351308186716873, 0.00159428805074234, -0.00604587147802615, 0.0761894448922952, -0.00348378141492824, NA, 0.0346370866793768, -0.552226781084599, 0.00220031803369861, -0.0285840972149053, 0.065316579236306, 0.4090851643341, -0.0188362202518351, 0.0403848986306371, 0.091146090480032, -0.0154168449752466, -0.0694803621032671, 0.0511978643139393, -0.452924037757731, -0.0091835704914724, 0.0119918914092344, 0.0858960833880717, NA, 0.104901526886479, -0.23096183545392, -0.0163058345980967, 0.100643431561465, 0.0527859573541712, 0.250207316117438, NA, 0.00193240515291123, 0.0624210741756767, 0.0178136227732972, -0.0321294913646274, -0.0699629484084657, -0.00417176180400133, 0.209612573099415, 0.0285645570852926, 0.0551624216079071, 0.0172738293439595), Major.sectors.id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 7L, 7L, 3L, 3L, 3L, 3L, 8L, 3L, 5L, 6L, 8L, 3L, 5L, 3L, 7L, 3L, 8L, 2L, 7L, 2L, 9L, 3L, 5L, 5L, 10L, 1L, 3L, 7L, 8L, 11L, 10L, 3L, 3L), Region.in.country.id = c(1L, 2L, 2L, 2L, 3L, 1L, 4L, 3L, 2L, 3L, 5L, 6L, 7L, 8L, 7L, 1L, 9L, 7L, 1L, 1L, 10L, 1L, 8L, 11L, 
10L, 3L, 1L, 2L, 9L, 1L, 8L, 2L, 10L, 1L, 1L, 3L, 2L, 3L, 3L, 1L)), .Names = c("IDVar", "Major.sectors", "Region.in.country", "EBIT.TA", "EBIT.TA_l1", "EBIT.TA_l2", "Major.sectors.id", "Region.in.country.id"), row.names = c(NA, 40L), class = "data.frame")

Answer 1

You can do

m <- 3 # m countries
k <- 3 # k industries
d <- data.frame(country=rep(1:m, each=k), industry=rep(1:k, m) )

for a single data.frame.

You can split that into 9 data.frames:

split(d,d)
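
The grid above only holds the ID pairs, not the observations themselves. As a minimal sketch of applying the same `split` idea to actual data, the block below splits a toy data frame by two ID columns at once. The data frame `DF` and its `value` column are hypothetical stand-ins for the asker's data; only the column names `COUNTRY.id` and `INDUSTRY.id` come from the question.

```r
# Hypothetical toy data standing in for the asker's DF.
DF <- data.frame(
  COUNTRY.id  = c(1, 1, 2, 2, 3, 3, 1),
  INDUSTRY.id = c(1, 2, 1, 3, 2, 3, 1),
  value       = c(10, 20, 30, 40, 50, 60, 70)
)

# Passing a list of two grouping vectors splits on their interaction.
# drop = TRUE omits country/industry pairs that never occur in the data.
pieces <- split(DF, list(DF$COUNTRY.id, DF$INDUSTRY.id), drop = TRUE)

names(pieces)    # names have the form "<country>.<industry>"
pieces[["1.1"]]  # the rows for country 1, industry 1
```

The result is a named list of data frames, so a later analysis can be run over all of them with `lapply(pieces, ...)` without creating separately named objects.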

Answer 2

One option is to use expand.grid. Prepare a data.frame with the desired countries and industries, then pass it to expand.grid to generate all possible combinations.

df <- data.frame(c= c("country1","country2", "country3"), 
            i = c("industry1", "industry2","industry3"))

library(dplyr)
expand.grid(df) %>% arrange(c)

         c         i
1 country1 industry1
2 country1 industry2
3 country1 industry3
4 country2 industry1
5 country2 industry2
6 country2 industry3
7 country3 industry1
8 country3 industry2
9 country3 industry3
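
A grid like this could also drive the subsetting that the nested loop in the question was meant to do. Note that with `m <- 3`, `1:length(m)` iterates only once, because `length(3)` is 1; `seq_len(m)` is assumed to be what was intended. The sketch below uses a hypothetical toy `DF` (only the column names `COUNTRY.id` and `INDUSTRY.id` are taken from the question) and stores one data frame per pair in a named list. The list name `subsets` and the `c<i>_i<j>` naming scheme are illustrative choices.

```r
# Hypothetical toy data standing in for the asker's DF.
DF <- data.frame(
  COUNTRY.id  = c(1, 1, 2, 2, 3, 3, 1),
  INDUSTRY.id = c(1, 2, 1, 3, 2, 3, 1),
  value       = c(10, 20, 30, 40, 50, 60, 70)
)

m <- 3  # number of countries
k <- 3  # number of industries

# Every country/industry ID pair, one row per pair.
grid <- expand.grid(country = seq_len(m), industry = seq_len(k))

# Build one data frame per pair and keep them all in a list.
subsets <- lapply(seq_len(nrow(grid)), function(r) {
  DF[DF$COUNTRY.id == grid$country[r] & DF$INDUSTRY.id == grid$industry[r], ]
})
names(subsets) <- paste0("c", grid$country, "_i", grid$industry)

length(subsets)     # one entry per pair, including pairs with no rows
subsets[["c1_i1"]]  # rows where country 1 meets industry 1
```

Unlike `split` with `drop = TRUE`, this keeps an empty data frame for pairs that do not occur, which may or may not be what the analysis needs.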

Answer 3

You don't actually need to split the data or create indexes. You can run the analysis for each industry and country like this:

YourAnalysis <- function(x) mean(x$EBIT.TA)

by(data = ratios, INDICES = list(ratios$Region.in.country, ratios$Major.sectors), FUN = YourAnalysis)
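
The sample `ratios` data contains an `NA` in `EBIT.TA`, so a plain `mean` returns `NA` for any group that includes it. As one possible alternative, the sketch below uses `aggregate` with a formula, which by default drops rows with missing values before grouping and returns a regular data frame. The toy data frame `ratios_toy` and its shortened column names `region` and `sector` are hypothetical.

```r
# Hypothetical toy data with the same kind of columns as `ratios`.
ratios_toy <- data.frame(
  region  = c("Madrid", "Madrid", "Cataluna", "Cataluna", "Madrid"),
  sector  = c("Construction", "Construction", "Transport", "Transport", "Transport"),
  EBIT.TA = c(0.10, 0.30, NA, 0.20, -0.40)
)

# Mean EBIT.TA per region/sector pair; the formula interface omits
# rows with NA before computing the group means.
means <- aggregate(EBIT.TA ~ region + sector, data = ratios_toy, FUN = mean)
means
```

If the `by` call is preferred, defining the analysis function as `function(x) mean(x$EBIT.TA, na.rm = TRUE)` should have a similar effect on the missing values.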

3 answers

Answer 1
1 2018-03-01 22:28:06

Answer 2
0 2018-03-01 22:41:08

Answer 3
0 2018-03-01 22:44:36
