
Is there a way to convert a String to a Java type using Jackson and/or one of its associated libraries (csv, json, etc.)?

Is there a mechanism to apply a standard set of checks to detect the type of a String's content and then transform the String to the detected type, using one of Jackson's standard text-related libraries (csv, json, or even jackson-core)? I can imagine using it along with a label associated with that value (a CSV header, for example) to do something like the following:

JavaTypeAndValue typeAndValue = StringToJavaType.fromValue(Object x, String label);  
typeAndValue.type() // FQN of Java type, maybe
typeAndValue.label() // where label might be a column header value, for example
typeAndValue.value() // returns Object  of typeAndValue.type()

A set of 'extractors' would be required to apply the transform, and the consumer of the class would have to be aware of the 'ambiguity' of the 'Object' return type, but would still be capable of consuming and using the information, given its purpose.

The example I'm currently thinking about involves constructing SQL DDL or DML, like a CREATE TABLE statement, using the information from a List derived from evaluating a row from a CSV file.
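To make the CREATE TABLE idea concrete, here is a minimal sketch of how the detected types might be consumed. The `DdlSketch` class, the `createTable` method, and the Java-to-SQL type mapping are all hypothetical illustrations, not part of Jackson or any other library, and the column types would need adjusting for a real database.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: turns (column label -> detected Java type) into a CREATE TABLE statement.
class DdlSketch {

    // Assumed mapping from detected Java types to generic SQL column types.
    private static final Map<Class<?>, String> SQL_TYPES = new LinkedHashMap<>();
    static {
        SQL_TYPES.put(Integer.class, "INTEGER");
        SQL_TYPES.put(Long.class, "BIGINT");
        SQL_TYPES.put(BigInteger.class, "NUMERIC");
        SQL_TYPES.put(Float.class, "REAL");
        SQL_TYPES.put(Double.class, "DOUBLE PRECISION");
        SQL_TYPES.put(BigDecimal.class, "NUMERIC");
        SQL_TYPES.put(Boolean.class, "BOOLEAN");
        SQL_TYPES.put(String.class, "VARCHAR(255)");
    }

    // Columns keep the iteration order of the map passed in; unknown types fall back to VARCHAR.
    static String createTable(String table, Map<String, Class<?>> columns) {
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, Class<?>> column : columns.entrySet()) {
            String sqlType = SQL_TYPES.getOrDefault(column.getValue(), "VARCHAR(255)");
            cols.add(column.getKey() + " " + sqlType);
        }
        return "CREATE TABLE " + table + " " + cols;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> columns = new LinkedHashMap<>();
        columns.put("id", Long.class);
        columns.put("name", String.class);
        columns.put("active", Boolean.class);
        // prints: CREATE TABLE people (id BIGINT, name VARCHAR(255), active BOOLEAN)
        System.out.println(createTable("people", columns));
    }
}
```

The sketch does no identifier quoting or escaping, so table and column names taken from a CSV header would need sanitizing before being used this way.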

After more digging, hoping to find something out there, I wrote the start of what I had in mind.

Please keep in mind that my intention here isn't to present something 'complete', as I'm sure there are several things missing here, edge cases not addressed, etc.

The parse(List<Map<String, String>> rows, List<String> headers) signature comes from the idea that this could be a sample of rows from a CSV file read in with Jackson, for example.

Again, this isn't complete, so I'm not looking to pick at everything that's wrong with the following. The question isn't 'how would we write this?', it's 'is anyone familiar with something that exists that does something like the following?'.

import gms.labs.cassandra.sandbox.extractors.Extractor;
import gms.labs.cassandra.sandbox.extractors.Extractors;
import lombok.Builder;
import lombok.Getter;
import lombok.Setter;
import lombok.experimental.Accessors;

@Accessors(fluent=true, chain=true)
public class TypeAndValue
{

    @Builder
    TypeAndValue(Class<?> type, String rawValue){
        this.type = type;
        this.rawValue = rawValue;
        label = DEFAULT_LABEL;
    }

    @Getter
    final Class<?> type;

    @Getter
    final String rawValue;

    @Setter
    @Getter
    String label;

    public Object value(){
        return Extractors.extractorFor(this).value(rawValue);
    }

    static final String DEFAULT_LABEL = "NONE";

}

A simple parser, where parse came from a context where I have a List<Map<String,String>> from a CSVReader.

import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;

import java.util.*;
import java.util.function.BiFunction;

public class JavaTypeParser
{
public static final List<TypeAndValue> parse(List<Map<String, String>> rows, List<String> headers)
{
    List<TypeAndValue> typesAndVals = new ArrayList<TypeAndValue>();
    for (Map<String, String> row : rows) {
        for (String header : headers) {
            String val = row.get(header);
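            // Checks run in priority order: isNull, isBoolean, isNumber, then String as the catch-all.
            // orElseGet defers each later check, so isBoolean is never called with a null value.
            TypeAndValue typeAndValue = isNull(val)
                    .orElseGet(() -> isBoolean(val)
                    .orElseGet(() -> isNumber(val)
                    .orElseGet(() -> _typeAndValue.apply(String.class, val).get())));
            typesAndVals.add(typeAndValue.label(header));
        }
    }
    return typesAndVals;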
}

public static Optional<TypeAndValue> isNumber(String val)
{
    if (!NumberUtils.isCreatable(val)) {
        return Optional.empty();
    } else {
        return _typeAndValue.apply(NumberUtils.createNumber(val).getClass(), val);
    }
}

public static Optional<TypeAndValue> isBoolean(String val)
{
    boolean bool = (val.equalsIgnoreCase("true") || val.equalsIgnoreCase("false"));
    if (bool) {
        return _typeAndValue.apply(Boolean.class, val);
    } else {
        return Optional.empty();
    }
}

public static Optional<TypeAndValue> isNull(String val){
    if(Objects.isNull(val) || val.equals("null")){
        return _typeAndValue.apply(ObjectUtils.Null.class,val);
    }
    else{
        return Optional.empty();
    }
}

static final BiFunction<Class<?>, String, Optional<TypeAndValue>> _typeAndValue = (type, value) -> Optional.of(
        TypeAndValue.builder().type(type).rawValue(value).build());

}

Extractors. Just an example of how the 'extractors' for the values (contained in strings) might be registered somewhere for lookup. They could be referenced any number of other ways, too.

import gms.labs.cassandra.sandbox.TypeAndValue;
import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;

import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;

public class Extractors
{

private static final List<Class> NUMS = Arrays.asList(
        BigInteger.class,
        BigDecimal.class,
        Long.class,
        Integer.class,
        Double.class,
        Float.class);

public static final Extractor<?> extractorFor(TypeAndValue typeAndValue)
{
    if (NUMS.contains(typeAndValue.type())) {
        return (Extractor<Number>) value -> NumberUtils.createNumber(value);
    } else if(typeAndValue.type().equals(Boolean.class)) {
        return  (Extractor<Boolean>) value -> Boolean.valueOf(value);
    } else if(typeAndValue.type().equals(ObjectUtils.Null.class)) {
        return  (Extractor<ObjectUtils.Null>) value -> null; // should we just return the raw value.  some frameworks coerce to null.
    } else if(typeAndValue.type().equals(String.class)) {
        return  (Extractor<String>) value -> typeAndValue.rawValue(); // just return the raw value.  some frameworks coerce to null.
    }
    else{
        throw new RuntimeException("unsupported");
    }
}
}

I ran this from within the JavaTypeParser class, for reference.

public static void main(String[] args)
{

    Optional<TypeAndValue> num = isNumber("-1230980980980980980980980980980988009808989080989809890808098292");
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());  // BigInteger
    });
    num = isNumber("-123098098097987");
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass()); // Long
    });
    num = isNumber("-123098.098097987"); // Double
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());
    });
    num = isNumber("-123009809890898.0980979098098908080987"); // BigDecimal
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());
    });

    Optional<TypeAndValue> bool = isBoolean("FaLse");
    bool.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass()); // Boolean
    });

    Optional<TypeAndValue> nulll = isNull("null");
    nulll.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        //System.out.println(typeAndVal.value().getClass());  would throw null pointer exception
        System.out.println(typeAndVal.type()); // ObjectUtils.Null (from apache commons lang3)
    });

}

I don't know of any library that does this, and I have never seen anything work this way on an open set of possible types.

For a closed set of types (you know all the possible output types), the easier way would be to have the class FQN written in the string (from your description I couldn't tell whether you are in control of the written string).
It could be the complete FQN, or an alias for it.
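As a rough illustration of the alias idea, the sketch below assumes each value is written as `alias:value`. The `TaggedValues` class, the alias names, and the `alias:value` layout are all made up for this example; they are not a Jackson feature.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: each value carries an alias for its type, e.g. "long:42".
class TaggedValues {

    // Closed set of supported aliases; the alias names and the "alias:value" layout are assumptions.
    private static final Map<String, Function<String, Object>> CONVERTERS = new HashMap<>();
    static {
        CONVERTERS.put("long", s -> Long.valueOf(s));
        CONVERTERS.put("decimal", s -> new BigDecimal(s));
        CONVERTERS.put("bool", s -> Boolean.valueOf(s));
        CONVERTERS.put("string", s -> s);
    }

    // Splits on the first ':' only, so the value part may itself contain colons.
    static Object parse(String tagged) {
        int sep = tagged.indexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("Missing type alias: " + tagged);
        }
        Function<String, Object> converter = CONVERTERS.get(tagged.substring(0, sep));
        if (converter == null) {
            throw new IllegalArgumentException("Unknown type alias: " + tagged);
        }
        return converter.apply(tagged.substring(sep + 1));
    }

    public static void main(String[] args) {
        System.out.println(parse("long:42").getClass());      // class java.lang.Long
        System.out.println(parse("decimal:1.50").getClass()); // class java.math.BigDecimal
    }
}
```

With the type stated explicitly there is nothing to guess, which is why this only works when you control how the strings are written.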

Otherwise I don't think there is any way to avoid writing all the checks yourself.

Furthermore, it will be very delicate once you start thinking about edge cases.

Suppose you use JSON as the serialization format in the string. How would you differentiate between a String value like Hello World and a Date written in some ISO format (e.g. 2020-09-22)? To do it you would need to introduce some priority in the checks you do (first check whether it is a date using some regex; if not, go on to the next check, with the plain string check being the last one).
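A minimal sketch of such a priority order could look like the following. It is only an illustration: the `PriorityDetection` class and its methods are hypothetical, `Void.class` is used as an arbitrary marker for "no value", and the date check uses `LocalDate.parse` from the JDK instead of a regex.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical sketch: checks run from most specific to least specific.
class PriorityDetection {

    static Class<?> detect(String raw) {
        if (raw == null || raw.equals("null")) {
            return Void.class; // arbitrary stand-in marker for "no value"
        }
        if (raw.equalsIgnoreCase("true") || raw.equalsIgnoreCase("false")) {
            return Boolean.class;
        }
        if (isIsoDate(raw)) {
            return LocalDate.class;
        }
        return String.class; // catch-all, must stay last
    }

    // LocalDate.parse expects the ISO-8601 local date format (yyyy-MM-dd) by default.
    static boolean isIsoDate(String raw) {
        try {
            LocalDate.parse(raw);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(detect("2020-09-22"));  // class java.time.LocalDate
        System.out.println(detect("Hello World")); // class java.lang.String
    }
}
```

Moving the String check anywhere but last would swallow every value, which is the ordering problem described above.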

What if you have two objects:

class Person { // class name assumed; the declaration line is missing from the source
   String name;
   String surname;
}

class Employee {
   String name;
   String surname;
   Integer salary;
}

And you receive a serialized value of the second type, but with a null salary (null, or the property missing completely). In that case nothing distinguishes it from a value of the first type.

How can you tell the difference between a set and a list?

I don't know whether what you intend is that dynamic, or whether you already know all the possible deserializable types; some more details in the question might help.

UPDATE

Just saw the code; now it seems clearer. If you know all the possible outputs, that is the way to go.
The only change I would make is to abstract the extraction process, so that adding new types to manage becomes easier.
To do this I think a small change should be made, like:

interface Extractor {
    Boolean match(String value);
    Object extract(String value);
}

Then you can define an extractor per type:

class NumberExtractor implements Extractor {
    public Boolean match(String val) {
        return NumberUtils.isCreatable(val);
    }
    public Object extract(String value) {
        return NumberUtils.createNumber(value);
    }
}
class StringExtractor implements Extractor {
    public Boolean match(String s) {
        return true; //<-- catch all
    }
    public Object extract(String value) {
        return value;
    }
}

And then register the extractors and automate the checks:

public class JavaTypeParser {
  // Order matters: the first matching extractor wins, so the catch-all StringExtractor goes last.
  private static final List<Extractor> EXTRACTORS = List.of(
      new NullExtractor(),
      new BooleanExtractor(),
      new NumberExtractor(),
      new StringExtractor()
  );

  public static final List<TypeAndValue> parse(List<Map<String, String>> rows, List<String> headers) {
    List<TypeAndValue> typesAndVals = new ArrayList<TypeAndValue>();
    for (Map<String, String> row : rows) {
        for (String header : headers) {
            String val = row.get(header);

            typesAndVals.add(extract(header, val));
        }
    }
    return typesAndVals;
  }

  public static final TypeAndValue extract(String header, String value) {
       for (Extractor e : EXTRACTORS) {
           if (e.match(value)) {
               Object v = e.extract(value);
               return TypeAndValue.builder()
                         .label(header)
                         .value(v) //<-- you can put the real value here, and remove the type field
                         .build();
           }
       }
       throw new IllegalStateException("Can't find an extractor for: " + header + " | " + value);
  }
}
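The registration code refers to a NullExtractor and a BooleanExtractor that are not shown. A possible sketch of the two, mirroring the checks from the question's isNull and isBoolean methods, is below. These class bodies are my assumption of what was intended, and the Extractor interface is repeated only so the sketch compiles on its own.

```java
// Same shape as the Extractor interface in this answer, repeated so the sketch is self-contained.
interface Extractor {
    Boolean match(String value);
    Object extract(String value);
}

// Assumed implementation: treats a missing value or the literal "null" as null.
class NullExtractor implements Extractor {
    public Boolean match(String value) {
        return value == null || value.equals("null");
    }
    public Object extract(String value) {
        return null;
    }
}

// Assumed implementation: accepts "true" or "false" in any letter case.
class BooleanExtractor implements Extractor {
    public Boolean match(String value) {
        // Calling equalsIgnoreCase on the literal keeps this null-safe.
        return "true".equalsIgnoreCase(value) || "false".equalsIgnoreCase(value);
    }
    public Object extract(String value) {
        return Boolean.valueOf(value);
    }
}

class ExtractorDemo {
    public static void main(String[] args) {
        Extractor bool = new BooleanExtractor();
        System.out.println(bool.match("FaLse"));   // true
        System.out.println(bool.extract("FaLse")); // false
    }
}
```

Because NullExtractor returns null from extract, a caller that checks the returned value's class needs a null guard, as in the question's own main method.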

To parse CSV I would suggest https://commons.apache.org/proper/commons-csv, as CSV parsing can run into nasty issues.

What you are actually trying to do is write a parser. You translate a fragment into a parse tree. The parse tree captures the type as well as the value. For hierarchical types like arrays and objects, each tree node contains child nodes.

One of the most commonly used parser generators (albeit a bit overkill for your use case) is Antlr. Antlr brings out-of-the-box support for JSON.

I recommend taking the time to digest all the concepts involved. Even though it might seem overkill initially, it quickly pays off when you do any kind of extension. Changing a grammar is relatively easy; the generated code is quite complex. Additionally, all parser generators verify your grammars to show logic errors.

Of course, if you are limiting yourself to just parsing CSV or JSON (and not both at the same time), you should rather use the parser of an existing library. For example, Jackson has ObjectMapper.readTree to get the parse tree. You could also use ObjectMapper.readValue(<fragment>, Object.class) to simply get the canonical Java classes.

Try this:

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.Iterator;
import java.util.Map;

String j = // json string;

            JsonFactory jsonFactory = new JsonFactory();
            ObjectMapper jsonMapper = new ObjectMapper(jsonFactory);
            JsonNode jsonRootNode = jsonMapper.readTree(j);
            Iterator<Map.Entry<String,JsonNode>> jsonIterator = jsonRootNode.fields();

            while (jsonIterator.hasNext()) {
                Map.Entry<String,JsonNode> jsonField = jsonIterator.next();
                String k = jsonField.getKey();
                String v = jsonField.getValue().toString();
                ...

            }
