How to identify rows with missing column data caused by a hidden # in a .txt file
I have the .txt file below, exported from a source system. One field in the source system contains a # character, and when the .txt file is exported, the fields that follow the # come out with no data in them.
For example:
LINE|PANO| INOW|DEL|EASLN|EBSAP|LIM1IT|NOMIT|VALUE|KTE1|
1|7870|1000000||40500369|10|25624.0||0.00|SERVI TORNG|33277|
2|294|1000000||500324|10|590.84 ||0.00|REFUDIAL GATNGWAM|30448|
3|9410|1000000||200500325|10|5905.61||0.00|SUPLIVER EXTRACNS|37478|
4|573|1000000||600004075|10||||||||
5|739|1000000||700500290|10|40917.37|||||||
6|741|1000000||50500289|10|2782.53 ||0.00|SECUERVIC LUWE|29161|
7|948|1000000||||||||||||
8|996|1000000||960050035|10|7497.3||0.00|SCOUOUT URBISH IDM647 |38271|
9|1320|1000000||800500319|10|1395.93||0.00|TUATO AIRS|36427|
10|12054|1000000||9000287|10|458.42||0.00|SECURICE GOLA|||||
In the example above, lines 4, 5, 7 and 10 are missing data after a certain field because of the # in the source system field. The source system does hold data for these line items.
How can I flag these line items as having missing information when the .txt file is large, around 10 million line items?
Please share a SQL query, or any other way, to identify the line items with missing data.
Another example:
LINE|PANO| INOW|DEL|EASLN|EBSAP|LIM1IT|NOMIT|VALUE|KTE1|
1|7870|1000000||40500369|10|25624.0||0.00|SERVI TORNG|33277|
2|294|1000000||500324|10|590.84 ||0.00|REFUDIAL GATNGWAM|30448|
3|9410|1000000||200500325|10|5905.61||0.00|SUPLIVER EXTRACNS|37478|
4|573|1000000||600004075|10
5|739|1000000||700500290|10|40917.37
6|741|1000000||50500289|10|2782.53 ||0.00|SECUERVIC LUWE|29161|
7|948|1000000
8|996|1000000||960050035|10|7497.3||0.00|SCOUOUT URBISH IDM647 |38271|
9|1320|1000000||800500319|10|1395.93||0.00|TUATO AIRS|36427|
10|12054|1000000||9000287|10|458.42||0.00|SECURICE GOLA
Here the data is truncated wherever a # exists.
Would the following do what you require?
I created a temporary table, #HiddenHash, and populated it with some of your example data. You will obviously load the data with a BULK INSERT or whatever mechanism you are using.
CREATE TABLE #HiddenHash
(
     LINE   VARCHAR(2)
    ,PANO   VARCHAR(25)
    ,INOW   VARCHAR(25)
    ,DEL    VARCHAR(25)
    ,EASLN  VARCHAR(25)
    ,EBSAP  VARCHAR(25)
    ,LIM1IT VARCHAR(25)
    ,NOMIT  VARCHAR(25)
    ,VALUE  VARCHAR(25)
    ,KTE1   VARCHAR(25)
);
INSERT INTO #HiddenHash
VALUES
('1','7870','1000000','','40500369','10','25624.0','0.00','SERVI TORNG','33277')
,('2','294','1000000','',' 500324','10','590.84 ','0.00','REFUDIAL GATNGWAM','30448')
,('3','9410','1000000','','200500325','10','5905.61','0.00','SUPLIVER EXTRACNS','37478')
,('4','573','1000000','','600004075','10','','','','')
,('5','739','1000000','','700500290','10','40917.37','','','')
,('6','741','1000000','','50500289','10','2782.53 ','0.00','SECUERVIC LUWE','29161')
,('7','948','1000000','','','','','','','')
,('8','996','1000000','','960050035','10','7497.3','0.00','SCOUOUT URBISH IDM647 ','38271')
,('9','1320','1000000','','800500319','10','1395.93','0.00','TUATO AIRS','36427')
,('10','12054','1000000','','9000287','10','458.42','0.00','SECURICE GOLA','')
Then I count how many columns there are in the table, leaving out DEL.
DECLARE @CountColumns INT;
-- Number of columns expected to hold data; DEL is excluded from the count.
SET @CountColumns = (
    SELECT COUNT(*)
    FROM tempdb.sys.columns
    WHERE name <> 'DEL'
      AND object_id = OBJECT_ID('tempdb.dbo.#HiddenHash')
);
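As a quick sanity check, which is an addition and not one of the steps needed for the result, the columns behind that count can be listed. This sketch assumes the #HiddenHash table created above still exists in the current session and reads the same tempdb.sys.columns catalog view. With the table as defined it should list nine columns, the ten in the table minus DEL, which is empty on every sample line.

```sql
-- List the columns that contribute to @CountColumns (everything except DEL).
-- Assumes #HiddenHash from the CREATE TABLE above exists in this session.
SELECT c.column_id, c.name
FROM tempdb.sys.columns AS c
WHERE c.object_id = OBJECT_ID('tempdb..#HiddenHash')
  AND c.name <> 'DEL'
ORDER BY c.column_id;
```

If the real file has other columns that are legitimately empty on every line, they would need the same exclusion, otherwise every row is reported.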
Then I count the non-blank columns in each row and return the rows where that count does not match the number of columns held in the variable.
SELECT LINE, PANO, INOW, EASLN, EBSAP, LIM1IT, NOMIT, VALUE, KTE1
FROM (
    SELECT
        LINE, PANO, INOW, EASLN, EBSAP, LIM1IT, NOMIT, VALUE, KTE1,
        (
            -- Turn the row's columns into rows and count the non-blank ones.
            SELECT COUNT(*)
            FROM (VALUES (LINE), (PANO), (INOW), (EASLN), (EBSAP), (LIM1IT),
                         (NOMIT), (VALUE), (KTE1)) AS Cnt(col)
            WHERE Cnt.col <> ''
        ) AS NotBlank
    FROM #HiddenHash
) AS cc
WHERE cc.NotBlank <> @CountColumns;
Which gives the following result:
LINE  PANO   INOW     EASLN      EBSAP  LIM1IT    NOMIT  VALUE          KTE1
4     573    1000000  600004075  10
5     739    1000000  700500290  10     40917.37
7     948    1000000
10    12054  1000000  9000287    10     458.42    0.00   SECURICE GOLA
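The query above fits the first example, where the delimiters survive and the fields are merely empty. For the second example, where each line is cut off at the #, one possible variation is to load every line unsplit into a single-column staging table and count the | delimiters per line. This is only a sketch under assumptions: the table name #RawLines, the VARCHAR(4000) width and the handful of sample rows are made up for the demo, and it assumes that the lines with the most delimiters are complete and that | never occurs inside a field.

```sql
-- Hypothetical staging table: one row per raw, unsplit line of the export.
CREATE TABLE #RawLines (RawLine VARCHAR(4000));

-- Sample lines copied from the second example in the question.
INSERT INTO #RawLines (RawLine)
VALUES
 ('3|9410|1000000||200500325|10|5905.61||0.00|SUPLIVER EXTRACNS|37478|')
,('4|573|1000000||600004075|10')
,('5|739|1000000||700500290|10|40917.37')
,('6|741|1000000||50500289|10|2782.53 ||0.00|SECUERVIC LUWE|29161|')
,('7|948|1000000')
,('10|12054|1000000||9000287|10|458.42||0.00|SECURICE GOLA');

-- Count the '|' characters per line by comparing lengths before and after
-- removing them. DATALENGTH is used because LEN ignores trailing spaces.
WITH Counted AS (
    SELECT RawLine,
           DATALENGTH(RawLine) - DATALENGTH(REPLACE(RawLine, '|', '')) AS PipeCount
    FROM #RawLines
)
-- Assumption: a complete line has the highest delimiter count in the file.
SELECT RawLine, PipeCount
FROM Counted
WHERE PipeCount < (SELECT MAX(PipeCount) FROM Counted);
```

With these six sample lines, lines 4, 5, 7 and 10 are returned. On the real file the staging table would be filled by your load mechanism instead of the INSERT, and the header line should be skipped, since in your sample it has one delimiter fewer than a complete data line.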