Is there a way to convert a String to a Java type using Jackson and/or one of its associated libraries (csv, json, etc.)?
Is there a mechanism to apply a standard set of checks to detect and then transform a String to the detected type, using one of Jackson's standard text-related libs (csv, json, or even jackson-core)? I can imagine using it along with a label associated with that value (CSV header, for example) to do something sorta like the following:
JavaTypeAndValue typeAndValue = StringToJavaType.fromValue(Object x, String label);
typeAndValue.type() // FQN of Java type, maybe
typeAndValue.label() // where label might be a column header value, for example
typeAndValue.value() // returns Object of typeAndValue.type()
A set of 'extractors' would be required to apply the transform, and the consumer of the class would have to be aware of the 'ambiguity' of the 'Object' return type, but would still be capable of consuming and using the information, given its purpose.
The example I'm currently thinking about involves constructing SQL DDL or DML, like a CREATE TABLE statement, using the information from a List derived from evaluating a row from a CSV file.
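To make that goal concrete, here is a rough sketch of how detected Java types could be turned into a CREATE TABLE statement. DdlSketch, sqlTypeFor and createTable are hypothetical names, the Java-to-SQL type mapping is an assumption that would differ per database, and identifiers are not quoted or escaped.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of Jackson or any library.
class DdlSketch {

    // Assumed mapping from detected Java types to SQL column types.
    static String sqlTypeFor(Class<?> javaType) {
        if (javaType == Boolean.class) return "BOOLEAN";
        if (javaType == Integer.class) return "INTEGER";
        if (javaType == Long.class) return "BIGINT";
        if (javaType == Float.class || javaType == Double.class) return "DOUBLE PRECISION";
        if (javaType == BigInteger.class || javaType == BigDecimal.class) return "NUMERIC";
        return "VARCHAR(255)"; // fallback for String and anything unrecognised
    }

    // Builds the statement in the iteration order of the map (use a LinkedHashMap
    // to keep the CSV column order). Table and column names are used verbatim.
    static String createTable(String table, Map<String, Class<?>> columns) {
        StringJoiner cols = new StringJoiner(", ", "CREATE TABLE " + table + " (", ")");
        columns.forEach((name, type) -> cols.add(name + " " + sqlTypeFor(type)));
        return cols.toString();
    }
}
```

The map of column name to type would come from whatever detection step runs over the sampled rows.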
After more digging, hoping to find something out there, I wrote the start of what I had in mind.
Please keep in mind that my intention here isn't to present something 'complete', as I'm sure there are several things missing here, edge cases not addressed, etc.
The parse(List<Map<String, String>> rows, List<String> headers) signature comes from the idea that this could be a sample of rows from a CSV file read in by Jackson, for example.
Again, this isn't complete, so I'm not looking to pick at everything that's wrong with the following. The question isn't 'how would we write this?', it's 'is anyone familiar with something that exists that does something like the following?'.
import gms.labs.cassandra.sandbox.extractors.Extractor;
import gms.labs.cassandra.sandbox.extractors.Extractors;
import lombok.Builder;
import lombok.Getter;
import lombok.Setter;
import lombok.experimental.Accessors;
@Accessors(fluent=true, chain=true)
public class TypeAndValue
{
    @Builder
    TypeAndValue(Class<?> type, String rawValue){
        this.type = type;
        this.rawValue = rawValue;
        label = DEFAULT_LABEL;
    }

    @Getter
    final Class<?> type;

    @Getter
    final String rawValue;

    @Setter
    @Getter
    String label;

    // Converts the raw string using the extractor registered for this type.
    public Object value(){
        return Extractors.extractorFor(this).value(rawValue);
    }

    static final String DEFAULT_LABEL = "NONE";
}
A simple parser. The parse method came from a context where I have a List<Map<String,String>> from a CSVReader.
import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;
import java.util.*;
import java.util.function.BiFunction;
public class JavaTypeParser
{
    public static final List<TypeAndValue> parse(List<Map<String, String>> rows, List<String> headers)
    {
        List<TypeAndValue> typesAndVals = new ArrayList<TypeAndValue>();
        for (Map<String, String> row : rows) {
            for (String header : headers) {
                String val = row.get(header);
                // isNull, isBoolean, isNumber, then fall back to String.
                // orElseGet is lazy, so isBoolean is never called with a null value.
                TypeAndValue typeAndValue =
                    isNull(val).orElseGet(() ->
                        isBoolean(val).orElseGet(() ->
                            isNumber(val).orElseGet(() ->
                                _typeAndValue.apply(String.class, val).get())));
                typesAndVals.add(typeAndValue.label(header));
            }
        }
        return typesAndVals;
    }

    public static Optional<TypeAndValue> isNumber(String val)
    {
        if (!NumberUtils.isCreatable(val)) {
            return Optional.empty();
        } else {
            return _typeAndValue.apply(NumberUtils.createNumber(val).getClass(), val);
        }
    }

    public static Optional<TypeAndValue> isBoolean(String val)
    {
        boolean bool = (val.equalsIgnoreCase("true") || val.equalsIgnoreCase("false"));
        if (bool) {
            return _typeAndValue.apply(Boolean.class, val);
        } else {
            return Optional.empty();
        }
    }

    public static Optional<TypeAndValue> isNull(String val){
        if(Objects.isNull(val) || val.equals("null")){
            return _typeAndValue.apply(ObjectUtils.Null.class,val);
        }
        else{
            return Optional.empty();
        }
    }

    static final BiFunction<Class<?>, String, Optional<TypeAndValue>> _typeAndValue = (type, value) -> Optional.of(
        TypeAndValue.builder().type(type).rawValue(value).build());
}
Extractors. Just an example of how the 'extractors' for the values (contained in strings) might be registered somewhere for lookup. They could be referenced any number of other ways, too.
import gms.labs.cassandra.sandbox.TypeAndValue;
import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;
public class Extractors
{
    private static final List<Class<?>> NUMS = Arrays.asList(
        BigInteger.class,
        BigDecimal.class,
        Long.class,
        Integer.class,
        Double.class,
        Float.class);

    public static final Extractor<?> extractorFor(TypeAndValue typeAndValue)
    {
        if (NUMS.contains(typeAndValue.type())) {
            return (Extractor<Number>) value -> NumberUtils.createNumber(value);
        } else if(typeAndValue.type().equals(Boolean.class)) {
            return (Extractor<Boolean>) value -> Boolean.valueOf(value);
        } else if(typeAndValue.type().equals(ObjectUtils.Null.class)) {
            return (Extractor<ObjectUtils.Null>) value -> null; // should we just return the raw value? some frameworks coerce to null.
        } else if(typeAndValue.type().equals(String.class)) {
            return (Extractor<String>) value -> typeAndValue.rawValue(); // just return the raw value.
        }
        else{
            throw new RuntimeException("unsupported");
        }
    }
}
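The Extractor interface is imported above but not shown. A minimal sketch that is consistent with how it is used (a lambda taking the raw String, called through value(rawValue)) might look like the following; the exact shape is an assumption.

```java
// Assumed shape of the Extractor interface; the real one is not shown in this post.
@FunctionalInterface
interface Extractor<T> {
    // Converts the raw string into a value of type T.
    T value(String rawValue);
}
```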
I ran this from within the JavaTypeParser class, for reference.
public static void main(String[] args)
{
    Optional<TypeAndValue> num = isNumber("-1230980980980980980980980980980988009808989080989809890808098292");
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass()); // BigInteger
    });
    num = isNumber("-123098098097987");
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass()); // Long
    });
    num = isNumber("-123098.098097987"); // Double
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());
    });
    num = isNumber("-123009809890898.0980979098098908080987"); // BigDecimal
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());
    });
    Optional<TypeAndValue> bool = isBoolean("FaLse");
    bool.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass()); // Boolean
    });
    Optional<TypeAndValue> nulll = isNull("null");
    nulll.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        //System.out.println(typeAndVal.value().getClass()); would throw null pointer exception
        System.out.println(typeAndVal.type()); // ObjectUtils.Null (from apache commons lang3)
    });
}
I don't know of any library that does this, and I have never seen anything working this way on an open set of possible types.
For a closed set of types (you know all the possible output types), the easier way would be to have the class FQN written in the string (from your description I couldn't tell whether you control the written string).
The complete FQN, or an alias to it.
Otherwise I think there is no way around writing all the checks.
Furthermore, it will be very delicate once you consider edge cases.
Suppose you use JSON as the serialization format in the string: how would you differentiate between a String value like Hello World and a Date written in some ISO format (e.g. 2020-09-22)? To do it you would need to introduce some priority in the checks you do (first try to check if it is a date using some regex, if not go on to the next check, with the simple string check being the last one).
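A minimal sketch of that priority idea, assuming non-null input: checks are kept in an ordered map and the first one that matches wins, with String as the catch-all at the end. OrderedTypeDetector is a hypothetical name, and the date check uses LocalDate.parse instead of a regex, which is a substitution on my part.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of Jackson or any library.
class OrderedTypeDetector {

    // LinkedHashMap keeps insertion order, so earlier entries have higher priority.
    private static final Map<Class<?>, Predicate<String>> CHECKS = new LinkedHashMap<>();

    static {
        CHECKS.put(Boolean.class, s -> s.equalsIgnoreCase("true") || s.equalsIgnoreCase("false"));
        CHECKS.put(LocalDate.class, OrderedTypeDetector::isIsoDate);
        CHECKS.put(String.class, s -> true); // catch-all, must stay last
    }

    // True when the value parses as an ISO-8601 local date such as 2020-09-22.
    static boolean isIsoDate(String s) {
        try {
            LocalDate.parse(s);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Returns the type of the first check that accepts the (non-null) value.
    static Class<?> detect(String value) {
        for (Map.Entry<Class<?>, Predicate<String>> check : CHECKS.entrySet()) {
            if (check.getValue().test(value)) {
                return check.getKey();
            }
        }
        throw new IllegalStateException("unreachable: the String check accepts everything");
    }
}
```

With this ordering, 2020-09-22 is reported as a LocalDate and Hello World falls through to String; swapping the last two entries would make every date come back as a String.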
What if you have two objects:
class Person { // class name assumed
    String name;
    String surname;
}
class Employee {
    String name;
    String surname;
    Integer salary;
}
And you receive a serialized value of the second type, but with a null salary (null, or the property missing completely).
How can you tell the difference between a set and a list?
I don't know if what you intended is that dynamic, or if you already know all the possible deserializable types; maybe some more details in the question can help.
UPDATE
Just saw the code, now it seems clearer. If you know all the possible outputs, that is the way to go.
The only change I would make is to abstract the extraction process, so that adding new types to manage becomes easier.
To do this I think a small change should be made, like:
interface Extractor {
    Boolean match(String value);
    Object extract(String value);
}
Then you can define an extractor per type:
class NumberExtractor implements Extractor {
    public Boolean match(String val) {
        return NumberUtils.isCreatable(val);
    }
    public Object extract(String value) {
        return NumberUtils.createNumber(value);
    }
}
class StringExtractor implements Extractor {
    public Boolean match(String s) {
        return true; //<-- catch all
    }
    public Object extract(String value) {
        return value;
    }
}
And then register and automate the checks:
public class JavaTypeParser {
    // Order matters: the first extractor that matches wins.
    private static final List<Extractor> EXTRACTORS = List.of(
        new NullExtractor(),
        new BooleanExtractor(),
        new NumberExtractor(),
        new StringExtractor()
    );
    public static final List<TypeAndValue> parse(List<Map<String, String>> rows, List<String> headers) {
        List<TypeAndValue> typesAndVals = new ArrayList<TypeAndValue>();
        for (Map<String, String> row : rows) {
            for (String header : headers) {
                String val = row.get(header);
                typesAndVals.add(extract(header, val));
            }
        }
        return typesAndVals;
    }
    public static final TypeAndValue extract(String header, String value) {
        for (Extractor e : EXTRACTORS) {
            if (e.match(value)) {
                Object v = e.extract(value);
                return TypeAndValue.builder()
                    .label(header)
                    .value(v) //<-- you can put the real value here, and remove the type field
                    .build();
            }
        }
        throw new IllegalStateException("Can't find an extractor for: " + header + " | " + value);
    }
}
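NullExtractor and BooleanExtractor are registered above but not defined. A possible sketch, mirroring the isNull and isBoolean checks from the question, could be the following (the Extractor interface is repeated only so the snippet stands alone, and both implementations are assumptions about what was intended):

```java
// Same two-method interface as above, repeated so the sketch is self-contained.
interface Extractor {
    Boolean match(String value);
    Object extract(String value);
}

// Matches a missing value or the literal text "null".
class NullExtractor implements Extractor {
    public Boolean match(String value) {
        return value == null || value.equals("null");
    }
    public Object extract(String value) {
        return null;
    }
}

// Matches "true"/"false" in any letter case; null-safe because the literal is the receiver.
class BooleanExtractor implements Extractor {
    public Boolean match(String value) {
        return "true".equalsIgnoreCase(value) || "false".equalsIgnoreCase(value);
    }
    public Object extract(String value) {
        return Boolean.valueOf(value);
    }
}
```

Keeping NullExtractor first in the list means the later extractors never have to deal with a null input.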
To parse CSV I would suggest https://commons.apache.org/proper/commons-csv, as CSV parsing can run into nasty issues.
What you are actually trying to do is write a parser. You translate a fragment into a parse tree. The parse tree captures the type as well as the value. For hierarchical types like arrays and objects, each tree node contains child nodes.
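As a rough illustration of such a tree, a node could carry the detected type, the value, and its children. ParseNode and its factory methods are hypothetical names; real parsers such as ANTLR or Jackson's JsonNode define their own node types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical node type for illustration only.
class ParseNode {
    private final Class<?> type;   // detected Java type of this node
    private final Object value;    // scalar value, or null for container nodes
    private final List<ParseNode> children = new ArrayList<>();

    private ParseNode(Class<?> type, Object value) {
        this.type = type;
        this.value = value;
    }

    // Leaf node: the type is taken from the (non-null) value itself.
    static ParseNode scalar(Object value) {
        return new ParseNode(value.getClass(), value);
    }

    // Container node: typed as List, holding the given items as children.
    static ParseNode array(ParseNode... items) {
        ParseNode node = new ParseNode(List.class, null);
        Collections.addAll(node.children, items);
        return node;
    }

    Class<?> type() { return type; }
    Object value() { return value; }
    List<ParseNode> children() { return Collections.unmodifiableList(children); }
}
```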
One of the most commonly used parsers (albeit a bit overkill for your use case) is Antlr. Antlr brings out-of-the-box support for Json.
I recommend taking the time to digest all the involved concepts. Even though it might seem overkill initially, it quickly pays off when you do any kind of extension. Changing a grammar is relatively easy; the generated code is quite complex. Additionally, all parser generators verify your grammars to show logic errors.
Of course, if you are limiting yourself to just parsing CSV or JSON (and not both at the same time), you should rather use the parser of an existing library. For example, Jackson has ObjectMapper.readTree to get the parse tree. You could also use ObjectMapper.readValue(<fragment>, Object.class) to simply get the canonical Java classes.
Try this:
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Iterator;
import java.util.Map;

String j = "{\"id\": 1, \"name\": \"x\"}"; // your JSON string (example value)
JsonFactory jsonFactory = new JsonFactory();
ObjectMapper jsonMapper = new ObjectMapper(jsonFactory);
JsonNode jsonRootNode = jsonMapper.readTree(j); // throws a checked exception on invalid JSON
// Walk the top-level fields of the JSON object.
Iterator<Map.Entry<String,JsonNode>> jsonIterator = jsonRootNode.fields();
while (jsonIterator.hasNext()) {
    Map.Entry<String,JsonNode> jsonField = jsonIterator.next();
    String k = jsonField.getKey();
    String v = jsonField.getValue().toString(); // JSON text of the field's value
    // ...
}