Error reading Corona data from a URL with read.csv in R

Question

I used to read data from GitHub without any problems, but now the same simple code gives an error.

x <- getURL("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
y <- read.csv(x, header = FALSE)

Error in file(file, "rt") : cannot open the connection

In addition: Warning message:
In file(file, "rt") : cannot open file 'HTTP/1.1 200 OK

Answer 1

Use data.table::fread. I just checked and it works:

data.table::fread("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")

Example:

df <- data.table::fread("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
df[,1:5]
 [0%] Downloaded 0 bytes...
                Province/State        Country/Region       Lat       Long 1/22/20
  1:                                     Afghanistan  33.00000  65.000000       0
  2:                                         Albania  41.15330  20.168300       0
  3:                                         Algeria  28.03390   1.659600       0
  4:                                         Andorra  42.50630   1.521800       0
  5:                                          Angola -11.20270  17.873900       0
 ---                                                                             
260: Saint Pierre and Miquelon                France  46.88520 -56.315900       0
261:                                     South Sudan   6.87700  31.307000       0
262:                                  Western Sahara  24.21550 -12.885800       0
263:                           Sao Tome and Principe   0.18636   6.613081       0
264:                                           Yemen  15.55273  48.516388       0
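Note that fread returns a data.table rather than a plain data.frame. If the rest of your code expects a data.frame, a minimal sketch is to pass data.table = FALSE. The CSV text below is a made-up stand-in for the real file (an assumption, so the snippet runs without network access):

```r
library("data.table")

# Small inline sample in the same layout as the real file (illustrative values only)
csv_text <- "Province/State,Country/Region,Lat,Long,1/22/20\n,Afghanistan,33,65,0\n,Albania,41.1533,20.1683,0\n"

# data.table = FALSE makes fread return a plain data.frame;
# column names such as "Province/State" are kept as they are
df <- fread(text = csv_text, data.table = FALSE)
class(df)
```

The same data.table = FALSE argument can be used when the first argument is the URL.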

Answer 2

Use the URL directly as the file and set header=TRUE:

file <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"

y <- read.csv(file, header = TRUE)
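As for why the code in the question fails: the first argument of read.csv is a file name or connection, so a string holding the downloaded content is treated as a (nonexistent) file name. If you already have the content in a string, a minimal sketch is to pass it through the text argument instead. The string below is a made-up stand-in for what getURL would return:

```r
# Stand-in for the downloaded content (illustrative sample, not the real data)
x <- "Province/State,Country/Region,Lat,Long,1/22/20\n,Afghanistan,33,65,0\n,Albania,41.1533,20.1683,0\n"

# read.csv(x) would try to open a file named after the whole string;
# text = x tells read.csv to parse the string itself.
# check.names = FALSE keeps names such as "Province/State" unchanged.
y <- read.csv(text = x, header = TRUE, check.names = FALSE)
str(y)
```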

Or use read_delim from the readr package:

library("readr")
dat <- read_delim(file, delim=",")

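The covid data comes in a wide format (one column per date) that is not well suited for further analysis in R, but it can easily be converted using the reshape2 package or dplyr (together with tidyr, which provides pivot_longer):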

names(dat)[1:2] <- c("Province_State", "Country_Region")

library("dplyr")
library("tidyr")  # pivot_longer() comes from tidyr, not dplyr
dat2 <-
  dat %>%
  ## summarize Country/Region duplicates
  group_by(Country_Region) %>% summarise_at(vars(-(1:4)), sum) %>%
  ## make it a long table
  pivot_longer(cols = -Country_Region, names_to = "time") %>%
  ## convert to ISO 8601 date
  mutate(time = as.POSIXct(time, format="%m/%e/%y"))
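To see what the reshaping does, here is a small self-contained sketch on a made-up table in the same wide layout. It is a variation on the pipeline above, not the same code: it selects the date columns by name pattern with across() (assuming dplyr 1.0 or later) and uses as.Date instead of as.POSIXct. The table values are invented for illustration.

```r
library("dplyr")
library("tidyr")

# Tiny made-up table in the same wide layout (illustrative values only)
wide <- tibble(
  Province_State = c("A", "B", NA),
  Country_Region = c("X", "X", "Y"),
  Lat = c(1, 2, 3),
  Long = c(4, 5, 6),
  `1/22/20` = c(1, 2, 3),
  `1/23/20` = c(4, 5, 6)
)

long <- wide %>%
  ## sum the date columns over rows that share a Country_Region
  group_by(Country_Region) %>%
  summarise(across(matches("^[0-9]+/[0-9]+/[0-9]+$"), sum), .groups = "drop") %>%
  ## one row per country and date
  pivot_longer(cols = -Country_Region, names_to = "time") %>%
  ## parse the m/d/yy column names into Date values
  mutate(time = as.Date(time, format = "%m/%d/%y"))

long
```

The result has one row per country and date, with the counts in a column named value, which is the layout most plotting and modelling functions expect.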

2 answers

Answer 1
1 2020-04-23 11:53:24

Answer 2
0 2020-04-23 15:55:02
