Ambiguous reference to fields StructField in Databricks Delta Live Tables

I have set up Auto Loader to regularly read JSON files and store them in a "bronze" table called fixture_raw using Delta Live Tables in Databricks. This works fine and the JSON data is stored in the specified table, but when I add a "silver" table called fixture_prepared and try to extract some of the JSON elements from the bronze table, I get an error:

org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(id,LongType,true), StructField(id,LongType,true)

How can I get around this?

Delta Live Table code:

CREATE OR REFRESH STREAMING LIVE TABLE fixture_raw AS 
SELECT *, input_file_name() AS InputFile, now() AS LoadTime FROM cloud_files(
  "/mnt/input/fixtures/", 
  "json",
  map(
    "cloudFiles.inferColumnTypes", "true",
    "cloudFiles.schemaLocation", "/mnt/dlt/schema/fixture",
    "cloudFiles.schemaEvolutionMode", "addNewColumns"
  )
);

CREATE OR REFRESH LIVE TABLE fixture_prepared AS
WITH FixtureData (
  SELECT 
    explode(response) AS FixtureJson
  FROM live.fixture_raw
)
SELECT
  FixtureJson.fixture.id AS FixtureID,
  FixtureJson.fixture.date AS StartTime,
  FixtureJson.fixture.venue.name AS Venue,
  FixtureJson.teams.home.id AS HomeTeamID,
  FixtureJson.teams.home.name AS HomeTeamName,
  FixtureJson.teams.away.id AS AwayTeamID,
  FixtureJson.teams.away.name AS AwayTeamName
FROM FixtureData;

JSON data:

{
    "get": "fixtures",
    "parameters": {
        "league": "39",
        "season": "2022"
    },
    "response": [
        {
            "fixture": {
                "id": 867946,
                "date": "2022-08-05T19:00:00+00:00",
                "venue": {
                    "id": 525,
                    "name": "Selhurst Park"
                }
            },
            "teams": {
                "home": {
                    "id": 52,
                    "name": "Crystal Palace"
                },
                "away": {
                    "id": 42,
                    "name": "Arsenal"
                }
            }
        },
        {
            "fixture": {
                "id": 867947,
                "date": "2022-08-06T11:30:00+00:00",
                "venue": {
                    "id": 535,
                    "name": "Craven Cottage"
                }
            },
            "teams": {
                "home": {
                    "id": 36,
                    "name": "Fulham"
                },
                "away": {
                    "id": 40,
                    "name": "Liverpool"
                }
            }
        }
    ]
}

Assigning the size of a DataFrame and calling the DataFrame are two different things. Check where you assign the DataFrame size and where you call the DataFrame before joining, and go through the official documentation. I followed the same scenario with the sample code in my environment and added a silver table; it works for me without error. The GitHub reference below has detailed information.

Reference:

https://docs.microsoft.com/en-us/azure/databricks/data-engineering/delta-live-tables/delta-live-tables-quickstart#sql

Delta Live Tables Demo: Modern software engineering for ETL processing.
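
Since the sample data reportedly works without error, it may help to check the schema that Auto Loader actually inferred for fixture_raw. Spark raises "Ambiguous reference to fields" when a struct contains more than one field matching the name being extracted, and by default it compares names case-insensitively. The sample JSON above shows no such collision, so if this is the cause, it would have to come from fields not shown in the excerpt or from the inferred schema. The following is only a diagnostic sketch under that assumption, not a confirmed fix. The helper name find_ambiguous_fields is hypothetical. It expects the dictionary form of a schema, as returned by PySpark's StructType.jsonValue().

```python
from collections import defaultdict


def find_ambiguous_fields(dtype, path=""):
    """Return (struct_path, [field_names]) for every struct whose field
    names collide when compared case-insensitively.

    `dtype` is a schema in the dict form produced by StructType.jsonValue().
    """
    hits = []
    if not isinstance(dtype, dict):
        return hits  # primitive types are plain strings such as "long"

    kind = dtype.get("type")
    if kind == "struct":
        # Group sibling field names by their lower-cased form.
        groups = defaultdict(list)
        for field in dtype["fields"]:
            groups[field["name"].lower()].append(field["name"])
        for names in groups.values():
            if len(names) > 1:
                hits.append((path or "<root>", names))
        # Recurse into each field's own type.
        for field in dtype["fields"]:
            child = f"{path}.{field['name']}" if path else field["name"]
            hits.extend(find_ambiguous_fields(field["type"], child))
    elif kind == "array":
        hits.extend(find_ambiguous_fields(dtype["elementType"], path + "[]"))
    elif kind == "map":
        hits.extend(find_ambiguous_fields(dtype["valueType"], path + "{}"))
    return hits


# Assumed usage in a Databricks notebook (table name depends on your pipeline):
#   schema = spark.table("fixture_raw").schema.jsonValue()
#   for struct_path, names in find_ambiguous_fields(schema):
#       print(struct_path, names)
```

An empty result would suggest the cause lies elsewhere. A non-empty result points at the struct whose fields need to be renamed or selected more explicitly before extracting them in fixture_prepared.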
