Classifying the values present in a single column using SQL into their respective datatypes?
I have been trying to classify a set of values present in a single column (like the sample below) into datatypes. The problem is that I am using the Aster SQL environment, where the set of available functions, and the environment as a whole, is very limited. Another problem is that the column contains a lot of junk values (symbols, stray characters, and so on), which makes it hard even to hard-code a solution. The structure is something like:
FeatureValue
123
24
15.6
17:15
abc
12/18/2014
17/222222
abc1200
001001oo
positve+
+1
I would like the solution to be a SQL query. The end result should look something like this:
FeatureValue   Type
123            Numeric
24             Numeric
15.6           Numeric
17:15          String (?time)
abc            String
12/18/2014     Date
17/222222      String
abc1200        String
001001oo       String
positve+       String
+1             String
I wrote some code, but this solution is not very reliable. What I did was:
case
when upper(trim(feature_value)) not like '%A%' and
upper(trim(feature_value)) not like '%B%' and
upper(trim(feature_value)) not like '%C%' and
upper(trim(feature_value)) not like '%D%' and
upper(trim(feature_value)) not like '%E%' and
upper(trim(feature_value)) not like '%F%' and
upper(trim(feature_value)) not like '%G%' and
upper(trim(feature_value)) not like '%H%' and
upper(trim(feature_value)) not like '%I%' and
upper(trim(feature_value)) not like '%J%' and
upper(trim(feature_value)) not like '%K%' and
upper(trim(feature_value)) not like '%L%' and
upper(trim(feature_value)) not like '%M%' and
upper(trim(feature_value)) not like '%N%' and
upper(trim(feature_value)) not like '%O%' and
upper(trim(feature_value)) not like '%P%' and
upper(trim(feature_value)) not like '%Q%' and
upper(trim(feature_value)) not like '%R%' and
upper(trim(feature_value)) not like '%S%' and
upper(trim(feature_value)) not like '%T%' and
upper(trim(feature_value)) not like '%U%' and
upper(trim(feature_value)) not like '%V%' and
upper(trim(feature_value)) not like '%W%' and
upper(trim(feature_value)) not like '%X%' and
upper(trim(feature_value)) not like '%Y%' and
upper(trim(feature_value)) not like '%Z%' and
upper(trim(feature_value)) <>'' and
upper(trim(feature_value)) not like '%+%' and
upper(trim(feature_value)) is not null and
--upper(trim(feature_value))<>'-' and
upper(trim(feature_value))<>'NULL' and
upper(trim(feature_value)) not like '%/%' and
upper(trim(feature_value)) not like '%-%' and
upper(trim(feature_value)) not like '%:%' and
feature_value is not null
then 'NUMERIC'
else 'STRING'
end as value_type
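You could try to get the CASE nightmare a bit more under control by using a character range in the LIKE pattern. Note that bracketed ranges in LIKE are an extension found in SQL Server (T-SQL) rather than standard SQL, so check that your environment accepts them. The hyphen is listed first so that it is read as a literal character and not as a range operator:
CASE WHEN upper(trim(feature_value)) NOT LIKE '%[-A-Z/+:]%'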
AND upper(trim(feature_value)) NOT LIKE ''
AND upper(trim(feature_value)) IS NOT NULL
THEN 'NUMERIC'
ELSE 'STRING'
END AS value_type
Modify/extend as needed.
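If bracketed ranges in LIKE are not accepted, another option is to match what a value should look like, not list everything it must not contain. The following is only a rough sketch: it assumes PostgreSQL-style POSIX regular expression matching (the ~ operator) is available in your Aster environment, which I have not verified, and it uses a hypothetical table name, feature_table. The date branch only checks the MM/DD/YYYY shape, not whether the date is valid.

```sql
-- Sketch only. Assumptions: the ~ regex operator works as in PostgreSQL,
-- and the data lives in a (hypothetical) table called feature_table.
SELECT feature_value,
       CASE
           -- NULL and empty/blank values are treated as strings
           WHEN feature_value IS NULL OR trim(feature_value) = ''
               THEN 'STRING'
           -- unsigned integer or decimal, e.g. 123, 24, 15.6
           WHEN trim(feature_value) ~ '^[0-9]+([.][0-9]+)?$'
               THEN 'NUMERIC'
           -- shaped like MM/DD/YYYY, e.g. 12/18/2014 (shape check only)
           WHEN trim(feature_value) ~ '^[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}$'
               THEN 'DATE'
           ELSE 'STRING'
       END AS value_type
FROM feature_table;
```

With the sample values from the question, this would label 123, 24 and 15.6 as NUMERIC, 12/18/2014 as DATE, and everything else (including +1 and 17/222222) as STRING. Add further WHEN branches, for example for times such as 17:15, as needed.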