
ST_GeogFromGeoJSON fails in BigQuery while successful in Postgres

We have GeoJSON polygons that we would like to convert to a geography object in BigQuery using ST_GeogFromGeoJSON. The conversion fails in BigQuery but succeeds in Postgres using the equivalent command ST_GeomFromGeoJSON.

I am familiar with the SAFE prefix that can be added to the BigQuery call, but we would like to use the object, not just ignore it when the conversion fails. I tried converting the object using ST_CONVEXHULL but wasn't able to make it work.

Is there a workaround in BigQuery?

Example:

Running the following command in BigQuery

select ST_GeogFromGeoJSON('{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}')

returns

Query failed: ST_GeogFromGeoJSON failed: Invalid polygon loop: Edge 4 crosses edge 9

while the equivalent command runs successfully in Postgres

select ST_GeomFromGeoJSON('{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}')
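
One likely explanation for the difference, offered as a sketch rather than a definitive diagnosis: PostGIS parses geometries without validating them, so ST_GeomFromGeoJSON accepts a ring whose edges cross, while BigQuery rejects such a ring on input. Assuming a Postgres database with the PostGIS extension installed, the validity of the same polygon can be inspected like this (the `shape` CTE and the column aliases are illustrative names):

```sql
-- Assumes PostGIS is installed; ST_IsValid, ST_IsValidReason and ST_MakeValid are PostGIS functions.
WITH shape AS (
  SELECT ST_GeomFromGeoJSON('{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}') AS geom
)
SELECT
  ST_IsValid(geom)              AS is_valid,      -- the ring crosses itself, so this is expected to be false
  ST_IsValidReason(geom)        AS reason,        -- text describing why the geometry is invalid
  ST_AsText(ST_MakeValid(geom)) AS repaired_wkt   -- a repaired version of the geometry as WKT
FROM shape;
```

In other words, Postgres loading the polygon does not mean the polygon is valid there; it only means the check is deferred until ST_IsValid is called explicitly.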

October 2020 update for this post

No more tricks are needed: the ST_GEOGFROMGEOJSON and ST_GEOGFROMTEXT geography functions now support a new make_valid parameter. If set to TRUE, the function attempts to correct polygon issues when importing geography data.

So the simple statement below now works:

select ST_GeogFromGeoJSON(
  '{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}' 
  , make_valid => true
) 

and returns the expected output.
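
When the input is a whole column of GeoJSON strings, it may be useful to know which rows needed repair. The following is a minimal sketch, assuming the make_valid parameter described above. The inline `shapes` rows, the `id` column and the `was_repaired` flag are hypothetical names standing in for a real table.

```sql
#standardSQL
-- Hypothetical input: these inline rows stand in for a real table of GeoJSON strings.
WITH shapes AS (
  SELECT 1 AS id, '{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}' AS geojson
  UNION ALL
  SELECT 2, '{"type":"Polygon","coordinates":[[[0,0],[1,0],[1,1],[0,1],[0,0]]]}'
)
SELECT
  id,
  -- Try the strict parse first; the SAFE. prefix turns a failure into NULL
  -- instead of aborting the query. Fall back to make_valid only in that case.
  COALESCE(
    SAFE.ST_GEOGFROMGEOJSON(geojson),
    ST_GEOGFROMGEOJSON(geojson, make_valid => TRUE)
  ) AS geo,
  -- TRUE when the strict parse returned NULL, i.e. the row went through make_valid.
  SAFE.ST_GEOGFROMGEOJSON(geojson) IS NULL AS was_repaired
FROM shapes
```

The fallback call is deliberately not wrapped in SAFE., so input that cannot be repaired at all still surfaces as a query error rather than a silent NULL.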

Below is for BigQuery Standard SQL

Query failed: ST_GeogFromGeoJSON failed: Invalid polygon loop: Edge 4 crosses edge 9
... Is there some workaround in BigQuery? ...

The proposed workaround is an admittedly naive and simple way of fixing this specific issue, though it can easily be extended to more generic cases. The idea is to extract the coordinates and reorder them so that the edges no longer cross.

WITH test AS (
  SELECT '{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}' AS geojson
)
-- Rebuild the GeoJSON from the reordered ring and parse it.
SELECT ST_GEOGFROMGEOJSON('{"type":"Polygon","coordinates":' || fixed_coordinates || '}') AS geo
FROM (
  -- Join the sorted "lon,lat" pairs back into a ring, repeating the first pair to close it.
  -- ORDER BY pos keeps the sorted order explicit inside STRING_AGG.
  SELECT '[[[' || STRING_AGG(lon_lat, '],[' ORDER BY pos) || '],[' || ANY_VALUE(ordered_coordinates[OFFSET(0)]) || ']]]' AS fixed_coordinates
  FROM (
    SELECT
      -- Extract each "lon,lat" pair and sort by longitude, then latitude.
      ARRAY( SELECT lon_lat
        FROM UNNEST(REGEXP_EXTRACT_ALL(JSON_EXTRACT(geojson, '$.coordinates'), r'\[+(.*?)\]+')) lon_lat
        ORDER BY CAST(SPLIT(lon_lat)[OFFSET(0)] AS FLOAT64), CAST(SPLIT(lon_lat)[OFFSET(1)] AS FLOAT64)
      ) ordered_coordinates
    FROM test
  ) t, UNNEST(t.ordered_coordinates) lon_lat WITH OFFSET pos
)

This produces the correct output

POLYGON((-82.022982 26.69785, -81.999574 26.109253, -81.8073135 26.1074055, -81.615053 26.105558, -81.606813 26.710698, -81.8148975 26.704274, -82.022982 26.69785))    

[Image: visualization of the resulting polygon]

Below is for BigQuery Standard SQL

My previous answer is based on oversimplified logic for reordering coordinates. Obviously, it will not work in more complex cases like the one below

{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}

Is there some more advanced sorting logic that can be applied?

So more complex logic can be used to address this

#standardSQL
WITH test AS (
  SELECT '{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}' geojson
), coordinates AS (
  -- Parse every "lon,lat" pair of the ring into numeric columns.
  SELECT CAST(SPLIT(lon_lat)[OFFSET(0)] AS FLOAT64) lon, CAST(SPLIT(lon_lat)[OFFSET(1)] AS FLOAT64) lat
  FROM test, UNNEST(REGEXP_EXTRACT_ALL(JSON_EXTRACT(geojson, '$.coordinates'), r'\[+(.*?)\]+')) lon_lat
), stats AS (
  -- Centroid of all vertices; the points are ordered by their angle around it.
  SELECT ST_CENTROID(ST_UNION_AGG(ST_GEOGPOINT(lon, lat))) centroid FROM coordinates
)
SELECT ST_MAKEPOLYGON(ST_MAKELINE(ARRAY_AGG(point ORDER BY sequence))) AS polygon
FROM (
  SELECT point,
    -- Turn the acute angle into a full-circle angle based on the quadrant
    -- the point falls in relative to the centroid. ACOS(-1) is pi.
    CASE
      WHEN ST_X(point) > ST_X(centroid) AND ST_Y(point) > ST_Y(centroid) THEN ACOS(-1) - angle
      WHEN ST_X(point) > ST_X(centroid) AND ST_Y(point) < ST_Y(centroid) THEN ACOS(-1) + angle
      WHEN ST_X(point) < ST_X(centroid) AND ST_Y(point) < ST_Y(centroid) THEN 2 * ACOS(-1) - angle
      ELSE angle
    END sequence
  FROM (
    SELECT point, centroid,
      -- Acute angle between the centroid's parallel and the line from centroid to point.
      ACOS(ST_DISTANCE(centroid, anchor) / ST_DISTANCE(centroid, point)) angle
    FROM (
      SELECT centroid,
        ST_GEOGPOINT(lon, lat) point,
        -- Same longitude as the point, same latitude as the centroid.
        ST_GEOGPOINT(lon, ST_Y(centroid)) anchor
      FROM coordinates, stats
    )
  )
)

This approach produces the correct output

POLYGON((-0.49075 51.46989, -0.48664 51.46987, -0.48664 51.47341, -0.48923 51.47336, -0.48921 51.4737, -0.49072 51.47462, -0.49114 51.47446, -0.49044 51.4737, -0.4907 51.4737, -0.49075 51.46989))

[Image: visualization of the resulting polygon]
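
The quadrant CASE in the query above can also be expressed with ATAN2, which returns a full-circle angle directly. The following is a minimal sketch under stated assumptions, not the answer's method. It treats longitude and latitude as planar coordinates (reasonable only for small areas) and uses the plain average of the distinct vertices as the center. Like the query above, it only yields a sensible ring when the vertices can be ordered by angle around that center. The names `vertices`, `center`, `theta` and `ring` are illustrative.

```sql
#standardSQL
WITH test AS (
  SELECT '{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}' AS geojson
), vertices AS (
  -- DISTINCT drops the repeated closing vertex of the ring.
  SELECT DISTINCT
    CAST(SPLIT(lon_lat)[OFFSET(0)] AS FLOAT64) AS lon,
    CAST(SPLIT(lon_lat)[OFFSET(1)] AS FLOAT64) AS lat
  FROM test, UNNEST(REGEXP_EXTRACT_ALL(JSON_EXTRACT(geojson, '$.coordinates'), r'\[+(.*?)\]+')) AS lon_lat
), center AS (
  -- Assumption: a planar average is close enough to the centroid for a small area.
  SELECT AVG(lon) AS c_lon, AVG(lat) AS c_lat FROM vertices
)
-- Repeat the first vertex at the end so the ring is explicitly closed.
SELECT ST_MAKEPOLYGON(ST_MAKELINE(ARRAY_CONCAT(ring, [ring[OFFSET(0)]]))) AS polygon
FROM (
  -- Order the vertices by their angle around the center.
  SELECT ARRAY_AGG(ST_GEOGPOINT(lon, lat) ORDER BY theta) AS ring
  FROM (
    SELECT lon, lat, ATAN2(lat - c_lat, lon - c_lon) AS theta
    FROM vertices, center
  )
)
```

Both this sketch and the query above reorder the vertices, so the resulting polygon can differ in shape from the ring described by the original GeoJSON. Where the original vertex order matters, the make_valid parameter from the October 2020 update is the safer route.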
