Extract Data Frames from List in R

I have a list in R that holds data for all counties, and I am trying to extract each county from it as a separate data frame. My code is below. For the sake of illustration, I have split it into step 1 (download the data from the URL into a list; this part works well) and step 2 (extract individual data frames from the list; this does not work and yields a list containing only the last list item).

## Step 1: Extract data from URL
library(data.table)

# List of counties (just a sample here)
x <- data.frame(county = c("12001", "12003", "12005"))

idx <- x$county

# Extract data from URL for each county in the list
qcew_q1 <- lapply(seq_len(nrow(x)), function(area) {
  url <- "http://data.bls.gov/cew/data/api/YEAR/QTR/area/AREA.csv"
  url <- sub("YEAR", 2020, url, ignore.case = FALSE)
  url <- sub("QTR", 1, url, ignore.case = FALSE)
  url <- sub("AREA", idx[area], url, ignore.case = FALSE)
  fread(url, header = TRUE, sep = ",", quote = "\"", dec = ".", na.strings = "", skip = 0)
})

Once the data from the URL is in a list, I try to extract the individual counties as separate data frames. This is the part that causes problems: it returns only the last item and writes it to a list instead of a data.frame.

## Step 2: Extract data from step 1 as separate data frames. 
## Writes only last list (12005) to another list.

# Using a for statement
for (c in 1:nrow(x)) {
  for (i in 1:3) {
    q1_idx[c] <- qcew_q1[i]
  }
}

# Using lapply
lapply(1:nrow(x), function(cnty) {
  for (i in 1:3) {
    q1_idx[cnty] <- qcew_q1[i]
  }
})

Any insights on how to fix this would be much appreciated.

TIA,

Krishnan

You have a list of data frames in qcew_q1, and it is almost always better to keep them that way if you want to perform any further analysis on them. A list is easier to manage and does not pollute the global environment. To make clear which data frame comes from which county, you can assign names to the list elements.

names(qcew_q1) <- x$county

If you want the data frame for a specific county, you can extract it with qcew_q1[['12001']] or qcew_q1[['12003']]. You can use lapply to iterate over the list and apply a function to each individual data frame.
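
As a minimal sketch of this list-based workflow, the snippet below uses three small made-up data frames in place of the downloaded tables, so it runs without network access. The columns `area_fips` and `employment` and their values are placeholders chosen for illustration, not real BLS data. It also shows why the original attempt ended up with a list: single brackets return a list, while double brackets return the data frame itself.

```r
# Toy stand-in for the list produced in step 1 (all values are made up)
counties <- c("12001", "12003", "12005")
qcew_q1 <- lapply(counties, function(area) {
  data.frame(area_fips = area, employment = c(10, 20, 30),
             stringsAsFactors = FALSE)
})

# Label each list element with its county code
names(qcew_q1) <- counties

# Double brackets return the data frame itself
one_county <- qcew_q1[["12001"]]

# Single brackets return a list of length one that contains the data frame
as_list <- qcew_q1["12001"]

# Apply the same function to every county; the results keep the county names
n_rows <- sapply(qcew_q1, nrow)
total_emp <- sapply(qcew_q1, function(df) sum(df$employment))
```

Because the results of `sapply` carry the list names, each value can still be traced back to its county without creating separate objects.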

If you still want individual data frames in the global environment, give the list elements names of your choice and use list2env.

names(qcew_q1) <- paste0('county_', x$county)
list2env(qcew_q1, .GlobalEnv)

The individual data frames are now called county_12001, county_12003, etc.
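
A hedged illustration of `list2env`, again with made-up data frames standing in for the downloaded tables. This sketch writes into a fresh environment (`county_env`, a name chosen only for this example) so that nothing in your workspace is overwritten. Passing `.GlobalEnv` instead, as above, creates the same objects at the top level.

```r
# Toy stand-in for the list produced in step 1 (all values are made up)
counties <- c("12001", "12003", "12005")
qcew_q1 <- lapply(counties, function(area) {
  data.frame(area_fips = area, value = 1:2, stringsAsFactors = FALSE)
})

# list2env uses the element names as object names, so set them first
names(qcew_q1) <- paste0("county_", counties)

# Copy every list element into the target environment as its own object
county_env <- new.env()
list2env(qcew_q1, envir = county_env)

# List the objects that were created and access one of them
created <- ls(county_env)
county_env$county_12003
```

Note that `list2env` silently overwrites existing objects with the same names, which is one more reason to pick a distinctive prefix such as `county_`.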
