
Gson: parsing a non-standard JSON format

Is there a way for Gson to read a non-standard JSON file?

Instead of a typical file such as:

[{obj1},{objN}]

I have a file like this:

{obj1}
{objN}

There are no square brackets or commas, and each object is separated by a newline.

Yes, there is. Gson supports lenient reading. Take, for example, the following JSON document (non-standard.json):

{
    "foo": 1
}
{
    "bar": 1
}

You can read it as follows:

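import static com.google.gson.stream.JsonToken.END_DOCUMENT;

import java.io.IOException;
import java.io.Reader;

import com.google.gson.Gson;
import com.google.gson.JsonElement;
import com.google.gson.TypeAdapter;
import com.google.gson.stream.JsonReader;

private static final Gson gson = new Gson();
private static final TypeAdapter<JsonElement> jsonElementTypeAdapter = gson.getAdapter(JsonElement.class);

public static void main(final String... args)
        throws IOException {
    // getPackageResourceReader is a helper that is not shown here; presumably it opens a classpath resource
    try ( final Reader reader = getPackageResourceReader(Q43528208.class, "non-standard.json") ) {
        final JsonReader jsonReader = new JsonReader(reader);
        jsonReader.setLenient(true); // this makes it work: lenient mode accepts multiple top-level values
        // Read one top-level value at a time until the stream is exhausted
        while ( jsonReader.peek() != END_DOCUMENT ) {
            final JsonElement jsonElement = jsonElementTypeAdapter.read(jsonReader);
            System.out.println(jsonElement);
        }
    }
}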

Output:

{"foo":1}  
{"bar":1}  

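However, I'm not sure whether you can write a robust deserializer this way.

Update

To make this easier to use with Gson, we can implement a few convenient reading methods: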

// A shortcut method for the below implementation: aggregates the whole result into a single list
private static <T> List<T> parseToListLenient(final JsonReader jsonReader, final IMapper<? super JsonReader, ? extends T> mapper)
        throws IOException {
    final List<T> list = new ArrayList<>();
    parseLenient(jsonReader, in -> list.add(mapper.map(in)));
    return list;
}

// Puts the given JsonReader into lenient mode, reads until the end of the document, then restores the previous mode
// The consumer decides what to do at the current JsonReader position
private static void parseLenient(final JsonReader jsonReader, final IConsumer<? super JsonReader> consumer)
        throws IOException {
    final boolean isLenient = jsonReader.isLenient();
    try {
        jsonReader.setLenient(true);
        while ( jsonReader.peek() != END_DOCUMENT ) {
            consumer.accept(jsonReader);
        }
    } finally {
        jsonReader.setLenient(isLenient);
    }
}

// The Java 8 Consumer interface does not allow checked exceptions to be thrown, hence these custom interfaces
private interface IConsumer<T> {

    void accept(T value)
            throws IOException;

}

private interface IMapper<T, R> {

    R map(T value)
            throws IOException;

}
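As a side note on the comment about Consumer above, here is a minimal sketch of how such a checked-exception interface could be bridged to java.util.function.Consumer when an API only accepts the standard one. The names CheckedConsumers, IOConsumer and unchecked are made up for this illustration and are not part of Gson or of the code above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

final class CheckedConsumers {

    private CheckedConsumers() {
    }

    // Hypothetical counterpart of IConsumer: accept() is allowed to throw IOException
    @FunctionalInterface
    interface IOConsumer<T> {

        void accept(T value)
                throws IOException;

    }

    // Adapts an IOConsumer to a standard Consumer by wrapping the checked exception
    static <T> Consumer<T> unchecked(final IOConsumer<? super T> consumer) {
        return value -> {
            try {
                consumer.accept(value);
            } catch ( final IOException ex ) {
                // UncheckedIOException keeps the original exception as its cause
                throw new UncheckedIOException(ex);
            }
        };
    }

}
```

The trade-off is that callers now have to catch UncheckedIOException and unwrap getCause() themselves, which is why plain custom interfaces such as IConsumer and IMapper are often simpler for private helper methods.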

With these in place, simple reading really is simple; we just use the methods above:

final Gson gson = new Gson();
final TypeToken<Map<String, Integer>> typeToken = new TypeToken<Map<String, Integer>>() {
};
final TypeAdapter<Map<String, Integer>> typeAdapter = gson.getAdapter(typeToken);
try ( final JsonReader jsonReader = getPackageResourceJsonReader(Q43528208.class, "non-standard.json") ) {
    final List<Map<String, Integer>> maps = parseToListLenient(jsonReader, typeAdapter::read);
    System.out.println(maps);
}

Deserializing directly through Gson requires a more complicated implementation:

// This is just a marker not meant to be instantiated but to create a sort of "gateway" to dispatch types in Gson
@SuppressWarnings("unused")
private static final class LenientListMarker<T> {
    private LenientListMarker() {
        throw new AssertionError("must not be instantiated");
    }
}

private static void doDeserialize()
        throws IOException {
    final Gson gson = new GsonBuilder()
            .registerTypeAdapterFactory(new TypeAdapterFactory() {
                @Override
                public <T> TypeAdapter<T> create(final Gson gson, final TypeToken<T> typeToken) {
                    // Check if the given type is the lenient list marker class
                    if ( !LenientListMarker.class.isAssignableFrom(typeToken.getRawType()) ) {
                        // Not the case? Just delegate the job to Gson
                        return null;
                    }
                    final Type listElementType = getTypeParameter0(typeToken.getType());
                    final TypeAdapter<?> listElementAdapter = gson.getAdapter(TypeToken.get(listElementType));
                    @SuppressWarnings("unchecked")
                    final TypeToken<List<?>> listTypeToken = (TypeToken<List<?>>) TypeToken.getParameterized(List.class, listElementType);
                    final TypeAdapter<List<?>> listAdapter = gson.getAdapter(listTypeToken);
                    final TypeAdapter<List<?>> typeAdapter = new TypeAdapter<List<?>>() {
                        @Override
                        public void write(final JsonWriter out, final List<?> value)
                                throws IOException {
                            // Always write a well-formed list
                            listAdapter.write(out, value);
                        }

                        @Override
                        public List<?> read(final JsonReader in)
                                throws IOException {
                            // Delegate the job to the reading method - we only have to tell how to obtain the list values
                            return parseToListLenient(in, listElementAdapter::read);
                        }
                    };
                    @SuppressWarnings("unchecked")
                    final TypeAdapter<T> castTypeAdapter = (TypeAdapter<T>) typeAdapter;
                    return castTypeAdapter;
                }

                // A simple method to resolve actual type parameter
                private Type getTypeParameter0(final Type type) {
                    if ( !(type instanceof ParameterizedType) ) {
                        // List or List<?>
                        return Object.class;
                    }
                    return ((ParameterizedType) type).getActualTypeArguments()[0];
                }
            })
            .create();
    // This type declares a marker specialization to be used during deserialization
    final Type type = new TypeToken<LenientListMarker<Map<String, Integer>>>() {
    }.getType();
    try ( final JsonReader jsonReader = getPackageResourceJsonReader(Q43528208.class, "non-standard.json") ) {
        // This is where we cheat a little:
        // we ask Gson to deserialize LenientListMarker<Map<String, Integer>>, but the type adapter above returns a List
        final List<Map<String, Integer>> maps = gson.fromJson(jsonReader, type);
        System.out.println(maps);
    }
}

The output is now a list of Map<String, Integer> values rather than JsonElement values:

[{foo=1}, {bar=1}]

Update 2

A workaround for TypeToken.getParameterized (for example, if your Gson version does not have it):

@SuppressWarnings("unchecked")
final TypeToken<List<?>> listTypeToken = (TypeToken<List<?>>) TypeToken.get(new ParameterizedType() {
    @Override
    public Type getRawType() {
        return List.class;
    }

    @Override
    public Type[] getActualTypeArguments() {
        return new Type[]{ listElementType };
    }

    @Override
    public Type getOwnerType() {
        return null;
    }
});
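The anonymous class above inherits identity-based equals and hashCode from Object. If the hand-made type is also going to be used outside Gson (as a map key, say, or compared with types obtained through reflection), a small named class with value semantics may be more convenient. The following is only a sketch: SimpleParameterizedType is a made-up name, and it assumes top-level raw types, so the owner type is always null.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.stream.Collectors;

final class SimpleParameterizedType implements ParameterizedType {

    private final Class<?> rawType;
    private final Type[] typeArguments;

    SimpleParameterizedType(final Class<?> rawType, final Type... typeArguments) {
        this.rawType = rawType;
        this.typeArguments = typeArguments.clone();
    }

    @Override
    public Type getRawType() {
        return rawType;
    }

    @Override
    public Type[] getActualTypeArguments() {
        // Defensive copy so that callers cannot modify the internal array
        return typeArguments.clone();
    }

    @Override
    public Type getOwnerType() {
        // Assumption: only top-level classes such as List or Map are supported
        return null;
    }

    @Override
    public boolean equals(final Object o) {
        if ( !(o instanceof ParameterizedType) ) {
            return false;
        }
        final ParameterizedType that = (ParameterizedType) o;
        return that.getOwnerType() == null
                && rawType.equals(that.getRawType())
                && Arrays.equals(typeArguments, that.getActualTypeArguments());
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(typeArguments) ^ rawType.hashCode();
    }

    @Override
    public String toString() {
        return rawType.getTypeName() + Arrays.stream(typeArguments)
                .map(Type::getTypeName)
                .collect(Collectors.joining(", ", "<", ">"));
    }

}
```

With this class, the workaround could be written as TypeToken.get(new SimpleParameterizedType(List.class, listElementType)), keeping the same unchecked cast as above.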

We could have one more program that introduces the commas (,) and constructs well-formed JSON.
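A rough sketch of such a pre-processing step, using only the JDK, might look like the following. It assumes that every top-level value is an object or an array; the class and method names (ConcatenatedJson, split, toJsonArray) are invented for this example. It tracks nesting depth and string literals, so braces inside strings do not confuse it, and it does not validate the JSON itself.

```java
import java.util.ArrayList;
import java.util.List;

final class ConcatenatedJson {

    private ConcatenatedJson() {
    }

    // Splits concatenated top-level JSON objects/arrays into separate documents
    static List<String> split(final String input) {
        final List<String> documents = new ArrayList<>();
        int depth = 0;            // current nesting level of {} and []
        boolean inString = false; // true while inside a "..." literal
        boolean escaped = false;  // true right after a backslash inside a string
        int start = -1;           // index where the current top-level value began
        for ( int i = 0; i < input.length(); i++ ) {
            final char c = input.charAt(i);
            if ( inString ) {
                if ( escaped ) {
                    escaped = false;
                } else if ( c == '\\' ) {
                    escaped = true;
                } else if ( c == '"' ) {
                    inString = false;
                }
                continue;
            }
            switch ( c ) {
            case '"':
                inString = true;
                break;
            case '{':
            case '[':
                if ( depth == 0 ) {
                    start = i;
                }
                depth++;
                break;
            case '}':
            case ']':
                // Ignore stray closing brackets; emit a document when we are back at the top level
                if ( depth > 0 && --depth == 0 ) {
                    documents.add(input.substring(start, i + 1));
                    start = -1;
                }
                break;
            default:
                break;
            }
        }
        return documents;
    }

    // Joins the documents with commas and wraps them in square brackets
    static String toJsonArray(final String input) {
        return "[" + String.join(",", split(input)) + "]";
    }

}
```

The resulting string can then be handed to any strict JSON parser as a regular array. For large files this approach holds the whole input in memory, so the streaming lenient reader shown in the earlier answer is probably the better fit there.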

With Spark 2, we can pass multiLine as a read option:

spark.read.option("multiLine", "true").json("data.json")

