
How do I resolve the HAWQ error: missing data for column "SoldToAddr2"?

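We have a small Pivotal Hadoop-HAWQ cluster and need to read an external table, i.e.:

select * from ext_table

When I issue this query, HAWQ fails with the following error:

missing data for column "SoldToAddr2"

What we have tried so far:

We tried different special characters in the FORMAT clause of the ext_table definition, for example: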
CREATE READABLE EXTERNAL TABLE ext_table
(
 "ID" INTEGER,
 time timestamp,
 "Customer" char(7),
 "Name" varchar,
 "ShortName" char(10),
 "ExemptFinChg" char(1),
 "MaintHistory" char(1),
 "CustomerType" char(1),
 "MasterAccount" char(7),
 "StoreNumber" char(7),
 "PrtMasterAdd" char(1),
 "CreditStatus" char(1),
 "CreditLimit" decimal(14),
 "InvoiceCount" decimal(7),
 "Salesperson" char(3),
 "Salesperson1" char(3),
 "Salesperson2" char(3),
 "Salesperson3" char(3),
 "PriceCode" char(2),
 "CustomerClass" char(2),
 "Branch" char(2),
 "TermsCode" char(2),
 "InvDiscCode" char(2),
 "BalanceType" char(1),
 "Area" char(2),
 "LineDiscCode" char(2),
 "TaxStatus" char(1),
 "TaxExemptNumber" char(30),
 "SpecialInstrs" char(30),
 "PriceCategoryTable" char(52),
 "DateLastSale" date,
 "DateLastPay" date,
 "OutstOrdVal" decimal(16),
 "NumOutstOrd" decimal(6),
 "Telephone" char(20),
 "Contact" varchar,
 "AddTelephone" char(20),
 "Fax" char(20),
 "Telex" char(10),
 "TelephoneExtn" char(5),
 "Currency" text,
 "UserField1" char(10),
 "UserField2" decimal(16),
 "GstExemptFlag" char(1),
 "GstExemptNum" char(15),
 "GstLevel" char(1),
 "DetailMoveReqd" char(1),
 "InterfaceFlag" char(1),
 "ContractPrcReqd" char(1),
 "BuyingGroup1" char(2),
 "BuyingGroup2" char(2),
 "BuyingGroup3" char(2),
 "BuyingGroup4" char(2),
 "BuyingGroup5" char(2),
 "StatementReqd" char(1),
 "BackOrdReqd" char(1),
 "ShippingInstrs" char(30),
 "StateCode" char(3),
 "DateCustAdded" date,
 "StockInterchange" char(1),
 "MaintLastPrcPaid" char(1),
 "IbtCustomer" char(1),
 "SoDefaultDoc" char(1),
 "CounterSlsOnly" char(1),
 "PaymentStatus" char(1),
 "Nationality" char(3),
 "HighestBalance" decimal(16),
 "CustomerOnHold" char(1),
 "InvCommentCode" char(3),
 "EdiSenderCode" char(40),
 "RelOrdOsValue" decimal(16),
 "EdiFlag" char(1),
 "SoDefaultType" char(1),
 "Email" char(50),
 "ApplyOrdDisc" char(1),
 "ApplyLineDisc" char(1),
 "FaxInvoices" char(1),
 "FaxStatements" char(1),
 "HighInvDays" decimal(5),
 "HighInv" char(6),
 "DocFax" char(20),
 "DocFaxContact" char(40),
 "SoldToAddr1" char(150),
 "SoldToAddr2" char(80),
 "SoldToAddr3" char(40),
 "SoldToAddr4" char(40),
 "SoldToAddr5" char(40),
 "SoldPostalCode" char(9),
 "ShipToAddr1" char(40),
 "ShipToAddr2" char(40),
 "ShipToAddr3" char(40),
 "ShipToAddr4" char(40),
 "ShipToAddr5" char(40),
 "ShipPostalCode" char(9),
 "State" char(2),
 "CountyZip" char(5),
 "City" char(3),
 "State1" char(2),
 "CountyZip1" char(5),
 "City1" char(3),
 "DefaultOrdType" char(2),
 "PoNumberMandatory" char(1),
 "CreditCheckFlag" char(1),
 "CompanyTaxNumber" char(15),
 "DeliveryTerms" char(3),
 "TransactionNature" decimal(5),
 "DeliveryTermsC" char(3),
 "TransactionNatureC" decimal(5),
 "RouteCode" char(10),
 "FaxQuotes" char(1),
 "RouteDistance" decimal(6),
 "TpmCustomerFlag" char(1),
 "SalesWarehouse" text,
 "TpmPricingFlag" char(1),
 "ArStatementNo" char(2),
 "TpmCreditCheck" char(1),
 "WholeOrderShipFlag" char(1),
 "MinimumOrderValue" decimal(12),
 "MinimumOrderChgCod" char(6),
 "UkVatFlag" char(1),
 "UkCurrency" char(3),
 "TimeStamp" bytea
 )
LOCATION (
    'pxf://hostname/path/to/hdfs?profile=HdfsTextSimple')
FORMAT 'CSV' (delimiter '^' null 'null' quote '"')
ENCODING 'UTF8';

Error details:

The bad row is reported at line 20:

DETAIL:  External table tablename, line 20 of pxf://hostname/path/to/hdfs?profile=HdfsTextSimple: "23020^2015-12-02 11:14:26.52^0023482^Carlos iglesias               ^          ^N^Y^ ^       ^       ..."

What is the best way to resolve this error on a Pivotal Hadoop-HAWQ system?

Any help would be much appreciated.

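Because you are using the CSV format with " as the quote character, your data most likely contains a stray " that terminates a field incorrectly. An unbalanced quote makes the parser treat the delimiters that follow as part of the quoted field, so the row runs out of fields and a later column such as "SoldToAddr2" is reported as missing. You will need to either remove the extra quote or escape it with "\\".

You can define the external table to log errors and then read them with gp_read_error_log().

For more details on logging errors, see the admin guide: http://gpdb.docs.pivotal.io/4360/admin_guide/load/topics/g-define-an-external-table-with-single-row-error-isolation.html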
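As a sketch of that workflow, the rejected rows could be inspected as shown below. It assumes the external table was created with a LOG ERRORS ... SEGMENT REJECT LIMIT clause and that gp_read_error_log() is available in your release. The function is documented for Greenplum 4.3; whether a given HAWQ version provides it is an assumption to check.

```sql
-- List the most recently rejected rows for ext_table.
-- cmdtime, linenum, errmsg and rawdata are columns of the error-log format.
SELECT linenum, errmsg, rawdata
FROM gp_read_error_log('ext_table')
ORDER BY cmdtime DESC
LIMIT 20;
```

rawdata holds the rejected line as it was read, so a stray quote or a missing delimiter can usually be spotted directly.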

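Try using a PXF external table with a single text column, then parse (debug) the external data inside HAWQ with regular expressions.
Error tables can also help. CSV ("comma"-separated values) is tricky. In your statement you have null 'null'. If there are empty fields, e.g. "", PXF will raise an error.
My suggestion: use '' as the null string and post-process the literal "null" values afterwards. For example: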

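CREATE READABLE EXTERNAL TABLE ext_table
(raw_text text)
LOCATION ('pxf://hostname/path/to/hdfs?profile=HdfsTextSimple')
-- QUOTE applies only to the CSV format, so it is dropped for 'text'.
-- To capture each line whole in raw_text, the delimiter must be a character that never occurs in the data.
FORMAT 'text' (delimiter '^' null '')
ENCODING 'UTF8'
LOG ERRORS INTO pxf_errors SEGMENT REJECT LIMIT 1000 ROWS;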
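Once each line is available as a single text value, a query along these lines can flag suspicious rows. It is only a sketch under three assumptions: raw_text holds the complete line, the full table has 122 columns (the count in the definition in the question; verify it against your own DDL), and no '^' legitimately appears inside a quoted field.

```sql
-- Flag lines whose field count is not the expected 122,
-- or that contain an odd (unbalanced) number of double quotes.
SELECT raw_text,
       array_upper(string_to_array(raw_text, '^'), 1) AS field_count,
       length(raw_text) - length(replace(raw_text, '"', '')) AS quote_count
FROM ext_table
WHERE array_upper(string_to_array(raw_text, '^'), 1) <> 122
   OR (length(raw_text) - length(replace(raw_text, '"', ''))) % 2 = 1;
```

Rows returned with an odd quote_count are the likeliest candidates for the stray-quote problem. Rows with a low field_count point to a missing delimiter or a line break embedded in the data.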
