Ambiguous reference to fields StructField in Databricks Delta Live Tables
I have set up Auto Loader to periodically read JSON files and store them in a "bronze" table called fixture_raw, using Delta Live Tables in Databricks. This works fine and the JSON data lands in the specified table, but when I add a "silver" table called fixture_prepared and try to extract some JSON elements from the bronze table, I get an error:
org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(id,LongType,true), StructField(id,LongType,true)
How can I fix this?
Delta Live Tables code:
CREATE OR REFRESH STREAMING LIVE TABLE fixture_raw AS
SELECT *, input_file_name() AS InputFile, now() AS LoadTime FROM cloud_files(
"/mnt/input/fixtures/",
"json",
map(
"cloudFiles.inferColumnTypes", "true",
"cloudFiles.schemaLocation", "/mnt/dlt/schema/fixture",
"cloudFiles.schemaEvolutionMode", "addNewColumns"
)
);
CREATE OR REFRESH LIVE TABLE fixture_prepared AS
WITH FixtureData AS (
SELECT
explode(response) AS FixtureJson
FROM live.fixture_raw
)
SELECT
FixtureJson.fixture.id AS FixtureID,
FixtureJson.fixture.date AS StartTime,
FixtureJson.fixture.venue.name AS Venue,
FixtureJson.teams.home.id AS HomeTeamID,
FixtureJson.teams.home.name AS HomeTeamName,
FixtureJson.teams.away.id AS AwayTeamID,
FixtureJson.teams.away.name AS AwayTeamName
FROM FixtureData;
JSON data:
{
"get": "fixtures",
"parameters": {
"league": "39",
"season": "2022"
},
"response": [
{
"fixture": {
"id": 867946,
"date": "2022-08-05T19:00:00+00:00",
"venue": {
"id": 525,
"name": "Selhurst Park"
}
},
"teams": {
"home": {
"id": 52,
"name": "Crystal Palace"
},
"away": {
"id": 42,
"name": "Arsenal"
}
}
},
{
"fixture": {
"id": 867947,
"date": "2022-08-06T11:30:00+00:00",
"venue": {
"id": 535,
"name": "Craven Cottage"
}
},
"teams": {
"home": {
"id": 36,
"name": "Fulham"
},
"away": {
"id": 40,
"name": "Liverpool"
}
}
}
]
}
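For reference, the transformation the silver table is aiming for can be sketched in plain Python against the sample payload above (this is only an illustration of the intended flattening, not Spark code; the payload below is truncated to the first `response` element):

```python
import json

# One element of the sample "response" array from the question.
payload = json.loads("""
{
  "response": [
    {
      "fixture": {"id": 867946, "date": "2022-08-05T19:00:00+00:00",
                  "venue": {"id": 525, "name": "Selhurst Park"}},
      "teams": {"home": {"id": 52, "name": "Crystal Palace"},
                "away": {"id": 42, "name": "Arsenal"}}
    }
  ]
}
""")

def flatten(item):
    """Mirror the silver table's SELECT list for one exploded element."""
    return {
        "FixtureID": item["fixture"]["id"],
        "StartTime": item["fixture"]["date"],
        "Venue": item["fixture"]["venue"]["name"],
        "HomeTeamID": item["teams"]["home"]["id"],
        "HomeTeamName": item["teams"]["home"]["name"],
        "AwayTeamID": item["teams"]["away"]["id"],
        "AwayTeamName": item["teams"]["away"]["name"],
    }

# The explode(response) step corresponds to iterating over the array.
rows = [flatten(r) for r in payload["response"]]
print(rows[0]["FixtureID"], rows[0]["Venue"])  # 867946 Selhurst Park
```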
There is a difference between the size of the dataframe you assign and the dataframe you then call. Before joining, check the dimensions of the assigned dataframe and of the dataframe you call, and please go through the official documentation. I followed the same scenario with sample code in my environment: I added a silver table and it worked fine for me with no errors. Follow this GitHub reference; it has detailed information.
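Separately, the error message itself ("Ambiguous reference to fields StructField(id,LongType,true), StructField(id,LongType,true)") says Spark found two fields with the identical name `id` at the same level of the resolved schema, which can happen after schema inference or evolution merges nested structures. A plain-Python sketch (not Spark code; the helper name is illustrative) of how prefixing each field with its parent path removes such a collision:

```python
def flatten_keys(obj, prefix=""):
    """Recursively flatten a nested dict, prefixing child keys with their
    parent path so that sibling 'id' fields no longer collide."""
    out = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten_keys(value, prefix=name + "_"))
        else:
            out[name] = value
    return out

# Two different 'id' fields live at different paths; flattening them
# without a parent prefix would produce a duplicate 'id' column.
record = {"fixture": {"id": 867946, "venue": {"id": 525}}}
flat = flatten_keys(record)
print(flat)  # {'fixture_id': 867946, 'fixture_venue_id': 525}
```

In the DLT query, the analogous move is to alias each extracted field explicitly (as the silver table's SELECT list already does) rather than selecting whole structs with `*`.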
Reference: