简体   繁体   中英

Coordinates extracted from PDF are not exact

I'm working on rendering a georeferenced pdf within a map, I was able to retrieve the geolocation information from the pdf, but the coordinates I receive are not correct, they are a few meters apart from the places they really should be.

Opening the same PDF in Avenza Maps, it indicates this list of coordinates, and these are correct:

[-26.413082, -51.561534, -26.435838, -51.561643, -26.435909, -51.543773,-26.413152, -51.543667]

In the format I'm doing (reading the PDF as a String and doing a RegEx) I get these values:

[-26.43302 -51.56133 -26.41418 -51.56124 -26.41424 -51.54409 -26.43309 -51.54418]
[-26.45579 -51.59842 -26.41777 -51.59822 -26.41811 -51.51036 -26.45613 -51.51053]

But unfortunately none of the two reflect in the correct place (as in avenza). That said, I opened the PDF in Notepad and found other values (more related to conversion and information), and I believe that maybe there is some way to convert the coordinates that I got through this other information, to the correct coordinates.

Follow the informations:

<?xpacket end="w"?>
endstream
endobj
294 0 obj
3495
endobj
295 0 obj
/DeviceRGB
endobj
296 0 obj
<</Length 297 0 R>>stream
/GS_init gs
/Group_6 Do

endstream
endobj
297 0 obj
24
endobj
298 0 obj
<</ExtGState 2 0 R/ColorSpace << /CS_P 295 0 R >>/XObject << /Group_6 6 0 R >>>>endobj
299 0 obj
<</Type /Group/S /Transparency/CS 295 0 R/I false/K false>>endobj
300 0 obj
<</Type /Page/Parent 301 0 R/Contents 296 0 R/Resources 298 0 R/MediaBox [0 0 841.88808 1190.5488]/ArtBox [0 0 841.88808 1190.5488]/UserUnit 1/Group 299 0 R/VP[<</Type /Viewport/BBox [14.1732 147.400915455 822.0456 1133.350548016]/Name (þÿ T S B I I)/Measure<</Type /Measure/Subtype /GEO/Bounds [0 0 0 1 1 1 1 0 0 0]/GPTS [ -26.43302 -51.56133 -26.41418 -51.56124 -26.41424 -51.54409 -26.43309 -51.54418]/LPTS [ 0 0 0 1 1 1 1 0]/GCS<</Type /PROJCS/WKT (PROJCS["SIRGAS_2000_UTM_Zone_22S",GEOGCS["GCS_SIRGAS_2000",DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",10000000.0],PARAMETER["Central_Meridian",-51.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]])>>>>>><</Type /Viewport/BBox [14.1732 14.1732 239.961243463 122.688692878]/Name (þÿ R e f e r e n c i a _ M a p a)/Measure<</Type /Measure/Subtype /GEO/Bounds [0 0 0 1 1 1 1 0 0 0]/GPTS [ -26.45579 -51.59842 -26.41777 -51.59822 -26.41811 -51.51036 -26.45613 -51.51053]/LPTS [ 0 0 0 1 1 1 1 0]/GCS<</Type /PROJCS/WKT (PROJCS["SIRGAS_2000_UTM_Zone_22S",GEOGCS["GCS_SIRGAS_2000",DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",10000000.0],PARAMETER["Central_Meridian",-51.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]])>>>>>>]>>endobj
301 0 obj
<</Type /Pages/Kids [ 300 0 R ]/Count 1>>endobj
302 0 obj
<<>>endobj
303 0 obj
<</Type /Catalog/Pages 301 0 R/PageMode /UseNone/PageLayout /SinglePage/ViewerPreferences <</PrintScaling /None /FitWindow true /DisplayDocTitle true>>/OpenAction [300 0 R /Fit]/OCProperties<</OCGs [ 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R 15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 35 0 R 36 0 R 43 0 R 44 0 R 47 0 R 50 0 R 53 0 R 56 0 R 59 0 R 62 0 R 63 0 R 64 0 R 65 0 R 66 0 R 67 0 R 68 0 R 69 0 R 76 0 R 77 0 R 80 0 R 83 0 R 90 0 R 93 0 R 96 0 R 99 0 R 102 0 R 105 0 R 108 0 R 111 0 R 114 0 R 117 0 R 120 0 R 123 0 R 126 0 R 129 0 R 132 0 R 135 0 R 138 0 R 141 0 R 148 0 R 149 0 R 152 0 R 155 0 R 158 0 R 161 0 R 176 0 R ]/D<</Name (Layers Tree)/Order [ 176 0 R 161 0 R 158 0 R 148 0 R [ 155 0 R 152 0 R 149 0 R ] 141 0 R 138 0 R 135 0 R 132 0 R 129 0 R 126 0 R 123 0 R 120 0 R 117 0 R 114 0 R 111 0 R 108 0 R 105 0 R 102 0 R 99 0 R 96 0 R 93 0 R 90 0 R 83 0 R 80 0 R 62 0 R [ 76 0 R [ 77 0 R ] 63 0 R [ 69 0 R 68 0 R 67 0 R 64 0 R [ 66 0 R 65 0 R ] ] ] 10 0 R [ 59 0 R 43 0 R [ 56 0 R 53 0 R 50 0 R 47 0 R 44 0 R ] 11 0 R [ 36 0 R 35 0 R 22 0 R 21 0 R 12 0 R [ 20 0 R 19 0 R 18 0 R 17 0 R 16 0 R 15 0 R 14 0 R 13 0 R ] ] ] ]/ListMode /VisiblePages>>>>/Metadata 293 0 R>>endobj
304 0 obj
<</Type/XRef/Size 305/W[1 4 2]/Filter/FlateDecode/Info 292 0 R/Root 303 0 R/ID [<c9167b70223726438d277b1b4409c053> <c9167b70223726438d277b1b4409c053>]/Length 923>>stream

I needed someone to tell me some way to get the correct coordinates, I hope this information helps to find

The PDF content in your question includes two ViewPort dictionaries. These dictionaries map a location on the page ("BBox") onto the GPTS referencing the specified WKT.

This is covered in the PDF 2.0 reference ISO-32000-2 section 12.9 & 12.10. Unfortunately, this spec is not freely available, and it's not cheap.

Here are some definitions from the spec:


BBox:

A rectangle in default user space coordinates specifying the location of the viewport on the page. The two coordinate pairs of the rectangle shall be specified in normalised form; that is, lower-left followed by upper-right, relative to the measuring coordinate system. This ordering shall determine the orientation of the measuring coordinate system (that is, the direction of the positive x and y axes) in this viewport, which may have a different rotation from the page.

GPTS:

(Required; PDF 2.0) An array of numbers that shall be taken pairwise, defining points in geographic space as degrees of latitude and longitude, respectively when defining a geographic coordinate system. These values shall be based on the geographic coordinate system described in the GCS dictionary. When defining a projected coordinate system, this array contains values in a planar projected coordinate space as eastings and northings. For Geospatial3D, when Geospatial feature information is present (requirement type Geospatial3D) in a 3D annotation, the GPTS array is required to hold 3D point coordinates as triples rather than pairwise where the third value of each tripe is an elevation value.

NOTE 2 Any projected coordinate system includes an underlying geographic coordinate system.

WKT:

A string of Well Known Text describing the geographic coordinate system.


The assumption is, if you're interested in Geospatial coordinates, then you know what a WKT is, and what the projection means.

This may be enough information for you to map the geo coordinates for the separate viewports to their locations on the page. Here are the PDF Viewports in more readable form:

/VP [
    <<
        /Type
        /Viewport
        /BBox [14.1732 147.400915455 822.0456 1133.350548016]
        /Name (TSBII)
        /Measure <<
            /Type
            /Measure
            /Subtype
            /GEO
            /Bounds [0 0 0 1 1 1 1 0 0 0]
            /GPTS [ -26.43302 -51.56133 -26.41418 -51.56124
                    -26.41424 -51.54409 -26.43309 -51.54418]
            /LPTS [ 0 0 0 1 1 1 1 0]
            /GCS<<
                /Type
                /PROJCS
                /WKT (
PROJCS["SIRGAS_2000_UTM_Zone_22S",
    GEOGCS["GCS_SIRGAS_2000",
        DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],
        PRIMEM["Greenwich",0.0],
        UNIT["Degree",0.0174532925199433]
    ],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["False_Easting",500000.0],
    PARAMETER["False_Northing",10000000.0],
    PARAMETER["Central_Meridian",-51.0],
    PARAMETER["Scale_Factor",0.9996],
    PARAMETER["Latitude_Of_Origin",0.0],
    UNIT["Meter",1.0]
]
)
            >>
        >>
    >>
    <<
        /Type
        /Viewport
        /BBox [14.1732 14.1732 239.961243463 122.688692878]
        /Name (Referencia_Mapa)
        /Measure <<
            /Type
            /Measure
            /Subtype
            /GEO
            /Bounds [0 0 0 1 1 1 1 0 0 0]
            /GPTS [ -26.45579 -51.59842 -26.41777 -51.59822
                    -26.41811 -51.51036 -26.45613 -51.51053]
            /LPTS [ 0 0 0 1 1 1 1 0]
            /GCS<<
                /Type
                /PROJCS
                /WKT (
PROJCS["SIRGAS_2000_UTM_Zone_22S",
    GEOGCS["GCS_SIRGAS_2000",
        DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],
        PRIMEM["Greenwich",0.0],
        UNIT["Degree",0.0174532925199433]
    ],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["False_Easting",500000.0],
    PARAMETER["False_Northing",10000000.0],
    PARAMETER["Central_Meridian",-51.0],
    PARAMETER["Scale_Factor",0.9996],
    PARAMETER["Latitude_Of_Origin",0.0],
    UNIT["Meter",1.0]])
            >>
        >>
    >>
]
>>

Note that a PDF file is a structured document and not parsable as a string. These specific elements could be compressed, or might occur multiple times for different pages. You'll need a toolkit that can access Pages and Resources and Dictionaries in order to locate the ViewPorts.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM