How to scrape all pages of a PDF in R?

I would like to scrape information from the first and second pages of this PDF: https://www2.census.gov/programs-surveys/popest/tables/2010-2018/state/asrh/PEPSR6H.pdf

I managed to obtain a (messy) data frame from the first page's table but was unable to scrape the second page.

Here's the code I use to obtain the data frame from the first page:

library(tabulizer)
library(tidyverse)

link <- "https://www2.census.gov/programs-surveys/popest/tables/2010-2018/state/asrh/PEPSR6H.pdf"
df_page1 <- extract_tables(link, output = "data.frame", header = FALSE)

I have no idea why the second page's table is not extracted. Can someone help me with this?

You can get the tables from all pages, including the second page, by explicitly setting method = "stream" instead of relying on the default method detection:

library("tabulizer")
link <- "https://www2.census.gov/programs-surveys/popest/tables/2010-2018/state/asrh/PEPSR6H.pdf"
extract_tables(link, output = "data.frame", header = FALSE, method="stream")
#> [[1]]
#>                      V1        V2          V3          V4
#> 1                       Geography       Total            
#> 2                                                   White
#> 3                                                        
#> 4                                                        
#> 5         United States           308,745,538 241,937,061
#> 6               Alabama             4,779,736   3,362,877
#> 7                Alaska               710,231     483,873
#> 8               Arizona             6,392,017   5,418,483
#> 9              Arkansas             2,915,918   2,342,403
#> 10           California            37,253,956  27,636,403
#> 11             Colorado             5,029,196   4,450,623
#> 12          Connecticut             3,574,097   2,950,820
#> 13             Delaware               897,934     645,770
#> 14 District of Columbia               601,723     251,265
#> 15              Florida            18,801,310  14,808,867
#> 16              Georgia             9,687,653   6,144,931
#> 17               Hawaii             1,360,301     349,051
#> 18                Idaho             1,567,582   1,476,097
#> 19             Illinois            12,830,632  10,030,587
#> 20              Indiana             6,483,802   5,638,833
#> 21                 Iowa             3,046,355   2,839,615
#> 22               Kansas             2,853,118   2,501,057
#> 23             Kentucky             4,339,367   3,864,193
#> 24            Louisiana             4,533,372   2,902,875
#> 25                Maine             1,328,361   1,269,764
#> 26             Maryland             5,773,552   3,541,379
#> 27        Massachusetts             6,547,629   5,524,937
#> 28             Michigan             9,883,640   7,949,497
#> 29            Minnesota             5,303,925   4,623,461
#> 30          Mississippi             2,967,297   1,789,391
#> 31             Missouri             5,988,927   5,038,407
#> 32              Montana               989,415     891,529
#> 33             Nebraska             1,826,341   1,649,264
#> 34               Nevada             2,700,551   2,106,494
#> 35        New Hampshire             1,316,470   1,248,321
#> 36           New Jersey             8,791,894   6,546,498
#> 37           New Mexico             2,059,179   1,720,992
#> 38             New York            19,378,102  13,901,661
#> 39       North Carolina             9,535,483   6,898,296
#> 40         North Dakota               672,591     609,136
#> 41                 Ohio            11,536,504   9,664,524
#> 42             Oklahoma             3,751,351   2,851,510
#> 43               Oregon             3,831,074   3,403,252
#> 44         Pennsylvania            12,702,379  10,663,774
#>                                  V5         V6
#> 1                        Race Alone           
#> 2  Black or African American Indian      Asian
#> 3               American and Alaska           
#> 4                            Native           
#> 5              40,250,635 3,739,506 15,159,516
#> 6                  1,259,224 32,903     55,240
#> 7                    24,441 106,268     38,882
#> 8                   280,905 335,278    188,456
#> 9                    454,021 26,134     37,537
#> 10                2,486,549 622,107  5,038,123
#> 11                   214,919 78,144    144,819
#> 12                   392,131 16,734    140,516
#> 13                    196,281 5,929     29,342
#> 14                    310,379 3,264     21,705
#> 15                 3,078,067 89,119    474,199
#> 16                 2,993,927 48,599    323,459
#> 17                     22,473 4,960    531,633
#> 18                    10,950 25,782     20,034
#> 19                 1,903,458 73,846    604,399
#> 20                   603,797 24,487    105,535
#> 21                    91,695 13,563     54,232
#> 22                   173,298 33,044     69,628
#> 23                   342,804 12,105     50,177
#> 24                 1,462,969 33,037     71,829
#> 25                     16,269 8,771     13,783
#> 26                 1,731,513 30,885    326,655
#> 27                   504,365 29,944    359,673
#> 28                 1,416,067 68,396    243,062
#> 29                   280,949 67,325    217,792
#> 30                 1,103,101 16,837     26,477
#> 31                   700,178 30,595    100,213
#> 32                     4,215 63,495      6,379
#> 33                    85,971 23,418     33,322
#> 34                   231,224 42,965    203,478
#> 35                     16,365 3,530     28,933
#> 36                 1,282,005 49,907    746,212
#> 37                   49,006 208,890     31,253
#> 38                3,378,047 183,046  1,481,555
#> 39                2,088,362 147,566    215,952
#> 40                     8,248 36,948      7,032
#> 41                 1,426,861 29,674    196,693
#> 42                  284,332 335,664     67,126
#> 43                    74,414 66,784    145,009
#> 44                 1,431,826 39,735    358,195
#> 
#> [[2]]
#>                V1        V2         V3         V4
#> 1                 Geography      Total           
#> 2                                           White
#> 3                                                
#> 4                                                
#> 5    Rhode Island            1,052,567    910,253
#> 6  South Carolina            4,625,364  3,164,143
#> 7    South Dakota              814,180    706,690
#> 8       Tennessee            6,346,105  5,056,311
#> 9           Texas           25,145,561 20,389,793
#> 10           Utah            2,763,885  2,547,329
#> 11        Vermont              625,741    598,592
#> 12       Virginia            8,001,024  5,725,432
#> 13     Washington            6,724,540  5,535,262
#> 14  West Virginia            1,852,994  1,746,513
#> 15      Wisconsin            5,686,986  5,036,923
#> 16        Wyoming              563,626    529,110
#>                                  V5        V6
#> 1                        Race Alone          
#> 2  Black or African American Indian     Asian
#> 3               American and Alaska          
#> 4                            Native          
#> 5                      75,073 9,173    31,768
#> 6                  1,302,865 24,665    61,247
#> 7                     10,533 72,782     7,775
#> 8                  1,068,010 26,256    93,897
#> 9                 3,070,440 251,209 1,000,473
#> 10                    33,864 40,729    57,800
#> 11                      6,456 2,308     8,069
#> 12                 1,579,414 41,525   449,149
#> 13                  252,333 122,649   491,685
#> 14                     63,885 3,975    12,637
#> 15                   367,021 60,100   131,828
#> 16                     5,135 14,457     4,649
#> 
#> [[3]]
#>                      V1        V2                V3          V4
#> 1                       Geography        Race Alone Two or More
#> 2                                                         Races
#> 3                                   Native Hawaiian            
#> 4                                 and Other Pacific            
#> 5                                          Islander            
#> 6         United States                     674,625   6,984,195
#> 7               Alabama                       5,208      64,284
#> 8                Alaska                       7,662      49,105
#> 9               Arizona                      16,112     152,783
#> 10             Arkansas                       6,685      49,138
#> 11           California                     181,431   1,289,343
#> 12             Colorado                       8,420     132,271
#> 13          Connecticut                       3,491      70,405
#> 14             Delaware                         690      19,922
#> 15 District of Columbia                         770      14,340
#> 16              Florida                      18,790     332,268
#> 17              Georgia                      10,454     166,283
#> 18               Hawaii                     138,292     313,892
#> 19                Idaho                       2,786      31,933
#> 20             Illinois                       7,436     210,906
#> 21              Indiana                       3,532     107,618
#> 22                 Iowa                       2,419      44,831
#> 23               Kansas                       2,864      73,227
#> 24             Kentucky                       3,199      66,889
#> 25            Louisiana                       2,588      60,074
#> 26                Maine                         377      19,397
#> 27             Maryland                       5,391     137,729
#> 28        Massachusetts                       5,971     122,739
#> 29             Michigan                       3,442     203,176
#> 30            Minnesota                       2,958     111,440
#> 31          Mississippi                       1,700      29,791
#> 32             Missouri                       7,178     112,356
#> 33              Montana                         734      23,063
#> 34             Nebraska                       2,061      32,305
#> 35               Nevada                      19,307      97,083
#> 36        New Hampshire                         532      18,789
#> 37           New Jersey                       7,731     159,541
#> 38           New Mexico                       3,132      45,906
#> 39             New York                      24,000     409,793
#> 40       North Carolina                      10,309     174,998
#> 41         North Dakota                         334      10,893
#> 42                 Ohio                       5,336     213,416
#> 43             Oklahoma                       5,354     207,365
#> 44               Oregon                      14,649     126,966
#> 45         Pennsylvania                       7,115     201,734
#> 46         Rhode Island                       1,602      24,698
#> 47       South Carolina                       3,957      68,487
#> 48         South Dakota                         517      15,883
#> 49            Tennessee                       5,426      96,205
#> 50                Texas                      31,242     402,404
#> 51                 Utah                      26,049      58,114
#> 52              Vermont                         175      10,141
#> 53             Virginia                       8,201     197,303
#> 54           Washington                      43,505     279,106
#> 55        West Virginia                         485      25,499
#> 56            Wisconsin                       2,505      88,609
#> 57              Wyoming                         521       9,754

Created on 2020-06-13 by the reprex package (https://reprex.tidyverse.org) (v0.3.0)
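
The extracted data frames are still messy: header text is spread over several rows, the first page's table is split across [[1]] and [[2]], and V5 holds two figures in one cell. Below is a minimal base R sketch of one way to tidy that up. It does not call tabulizer; instead it works on a few rows copied by hand from the printed output, and the names part1, part2, to_number and clean, as well as the cleaned column names, are made up for illustration. Treat it as a starting point under those assumptions rather than a finished solution.

```r
# A few rows copied by hand from the printed output (assumed sample data),
# so the sketch runs without Java, tabulizer or network access.
part1 <- data.frame(
  V1 = c("", "Alabama", "Alaska"),
  V3 = c("Total", "4,779,736", "710,231"),
  V4 = c("White", "3,362,877", "483,873"),
  V5 = c("Race Alone", "1,259,224 32,903", "24,441 106,268"),
  V6 = c("Asian", "55,240", "38,882"),
  stringsAsFactors = FALSE
)
part2 <- data.frame(
  V1 = c("", "Rhode Island", "Wyoming"),
  V3 = c("Total", "1,052,567", "563,626"),
  V4 = c("White", "910,253", "529,110"),
  V5 = c("Race Alone", "75,073 9,173", "5,135 14,457"),
  V6 = c("Asian", "31,768", "4,649"),
  stringsAsFactors = FALSE
)

# Turn "1,234,567" into the number 1234567
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

# Stack the two pieces of the page-1 table, then keep only rows whose
# "Total" cell looks like a number (this drops the header rows)
raw <- rbind(part1, part2)
rows <- raw[grepl("^[0-9][0-9,]*$", raw$V3), ]

# V5 contains two figures separated by a space; split them apart
parts <- strsplit(trimws(rows$V5), " +")

clean <- data.frame(
  geography       = rows$V1,
  total           = to_number(rows$V3),
  white           = to_number(rows$V4),
  black           = to_number(vapply(parts, `[`, character(1), 1)),
  american_indian = to_number(vapply(parts, `[`, character(1), 2)),
  asian           = to_number(rows$V6),
  stringsAsFactors = FALSE
)

print(clean)
```

With the real result, the same steps would apply to the first two list elements returned by extract_tables in place of part1 and part2, assuming both have the same columns as in the output shown.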
