
Scraping a Table into R using the XML package

I am trying to scrape this table into R.

I am reading in the data using the XML library with the following command.

library(XML)
acsi <- htmlParse("https://www.theacsi.org/index.php?option=com_content&view=article&id=147&catid=&Itemid=212&i=Wireless+Telephone+Service")

However, I immediately get this: Warning: XML content does not seem to be XML: 'ss+Telephone+Service'. What am I doing wrong? Why isn't my table being read in properly?
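
A likely cause, offered as a hedged sketch rather than a confirmed diagnosis: htmlParse() hands URLs to libxml2, which does not speak https, so the URL string ends up being treated as the document content itself. One common workaround is to download the page with another package and pass the text to htmlParse() with asText = TRUE. In the code below, parse_https is a hypothetical helper name, the use of httr is an assumption (it does not appear in the question), and the small inline table is made up so the parsing step can be tried without network access.

```r
library(XML)

# Hypothetical helper: fetch the page over https with httr (an assumption,
# not part of the original question), then parse the downloaded text.
parse_https <- function(url) {
  resp <- httr::GET(url)
  page_text <- httr::content(resp, as = "text", encoding = "UTF-8")
  htmlParse(page_text, asText = TRUE)
}

# Offline illustration: a tiny made-up page parsed from a string.
page <- htmlParse(
  "<html><body><table>
     <tr><th>Company</th><th>Score</th></tr>
     <tr><td>T-Mobile</td><td>76</td></tr>
   </table></body></html>",
  asText = TRUE
)

# readHTMLTable() returns a list with one data frame per <table>.
tables <- readHTMLTable(page, stringsAsFactors = FALSE)
```

For the real page, acsi <- parse_https(url) followed by readHTMLTable(acsi) would be the equivalent call, assuming the site serves the table as plain HTML.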

Not sure about the package you tried, but here's a way to do it using rvest.

library(rvest)

# Download and parse the page
raw <- read_html("https://www.theacsi.org/index.php?option=com_content&view=article&id=147&catid=&Itemid=212&i=Wireless+Telephone+Service")
# html_table() returns a list with one data frame per <table> node
df <- raw %>% html_nodes("table") %>% html_table()
head(df)
[[1]]
                           X1        X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
1                             Base-line 95 96 97 98 99  0  1   2   3  04  05  06  07
2                  All Others           NA NA NA NA NA NA NA  NA  NA  70  65  68  68
3           TracFone Wireless           NA NA NA NA NA NA NA  NA  NA  NM  NM  NM  NM
4                    T-Mobile           NA NA NA NA NA NA NA  NA  NA  NM  64  69  70
5            Verizon Wireless           NA NA NA NA NA NA NA  NA  NA  68  67  69  71
6  Wireless Telephone Service           NA NA NA NA NA NA NA  NA  NA  65  63  66  68
7                        AT&T           NA NA NA NA NA NA NA  NA  NA  63  62  63  68
8               U.S. Cellular           NA NA NA NA NA NA NA  NA  NA  NM  NM  NM  NM
9           Sprint (T-Mobile)           NA NA NA NA NA NA NA  NA  NA  59  63  63  61
10      Nextel Communications           NA NA NA NA NA NA NA  NA  NA  NM  59   #    
11              AT&T Wireless           NA NA NA NA NA NA NA  NA  NA  61   #        
12                     Sprint           NA NA NA NA NA NA NA  NA  NA  59  63  63  61
   X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29                 X30
1   08  09  10  11  12  13  14  15  16  17  18  19  20  21 PreviousYear%Change
2   71  73  76  77  76  78  78  79  77  79  80  81  77  NA                -4.9
3   NM  NM  NM  NM  NM  NM  NM  77  75  77  78  78  76  NA                -2.6
4   71  71  73  70  69  68  69  70  74  73  76  76  75  NA                -1.3
5   72  74  73  72  70  73  75  71  71  74  74  74  74  NA                 0.0
6   68  69  72  71  70  72  72  70  71  73  74  75  74  NA                -1.3
7   71  67  69  66  69  70  68  70  71  72  74  74  74  NA                 0.0
8   NM  NM  NM  NM  NM  NM  NM  NM  72  74  74  74  71  NA                -4.1
9   56  63  70  72  71  71  68  65  70  73  70  69  70  NA                 1.4
10                                  NA  NA  NA  NA  NA  NA                 N/A
11                                  NA  NA  NA  NA  NA  NA                 N/A
12  56  63  70  72  71  71  68  65  70  73  70  69  NA  NA                -1.4
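
In the output above, the real column headers (Base-line, the two-digit years, and the change column) sit in row 1, and several cells hold non-numeric placeholders such as "NM". A minimal cleanup sketch in base R follows, assuming you want the first row as column names and numeric scores. The data frame tbl is a small stand-in for df[[1]] built from two of the printed columns (years 14 and 15), and the name "Company" for the first column is an assumption.

```r
# Stand-in for df[[1]]: header in row 1, values copied from the output above.
tbl <- data.frame(
  X1 = c("", "TracFone Wireless", "T-Mobile"),
  X2 = c("14", "NM", "69"),
  X3 = c("15", "77", "70"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names, then drop that row.
header <- unlist(tbl[1, ], use.names = FALSE)
header[1] <- "Company"  # assumed label; the scraped cell is empty
names(tbl) <- header
tbl <- tbl[-1, ]
rownames(tbl) <- NULL

# Turn placeholder strings into NA and convert the score columns to numeric.
score_cols <- setdiff(names(tbl), "Company")
tbl[score_cols] <- lapply(tbl[score_cols], function(x) {
  x[x %in% c("NM", "#", "N/A", "")] <- NA
  as.numeric(x)
})
```

On the full table the same steps would apply to df[[1]], though the wide layout may be easier to work with after reshaping to one row per company and year.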
