Gson: parsing a non-standard JSON format
Is there a way for Gson to read a non-standard JSON file?
Instead of a typical file such as:
[{obj1},{objN}]
I have a file like this:
{obj1}
{objN}
That is, there are no square brackets or commas, and each object is separated by a newline character.
Yes, there is. Gson supports lenient reading. For example, take the following JSON document (non-standard.json):
{
    "foo": 1
}
{
    "bar": 1
}
You can read it as follows:
private static final Gson gson = new Gson();
private static final TypeAdapter<JsonElement> jsonElementTypeAdapter = gson.getAdapter(JsonElement.class);

// END_DOCUMENT is JsonToken.END_DOCUMENT (static import); getPackageResourceReader is a helper not shown here
public static void main(final String... args)
        throws IOException {
    try ( final Reader reader = getPackageResourceReader(Q43528208.class, "non-standard.json") ) {
        final JsonReader jsonReader = new JsonReader(reader);
        jsonReader.setLenient(true); // this makes it work
        // In lenient mode the reader accepts several top-level values, so read until the input ends
        while ( jsonReader.peek() != END_DOCUMENT ) {
            final JsonElement jsonElement = jsonElementTypeAdapter.read(jsonReader);
            System.out.println(jsonElement);
        }
    }
}
Output:
{"foo":1}
{"bar":1}
I'm not sure, though, whether you can write a robust deserializer this way.
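As a possible alternative to driving JsonReader by hand, Gson also ships JsonStreamParser, which iterates over consecutive top-level JSON values in a stream. The following is only a minimal sketch: the class name StreamParserSketch and the method readAll are made up for illustration, and the input is inlined as a string rather than loaded from non-standard.json.

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonStreamParser;

import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch
public final class StreamParserSketch {

    private StreamParserSketch() {
    }

    // Reads every top-level JSON value from the reader, in document order
    public static List<JsonElement> readAll(final Reader reader) {
        final JsonStreamParser parser = new JsonStreamParser(reader);
        final List<JsonElement> elements = new ArrayList<>();
        while ( parser.hasNext() ) {
            elements.add(parser.next());
        }
        return elements;
    }

    public static void main(final String... args) {
        // Same content as non-standard.json, inlined to keep the sketch self-contained
        final String json = "{\n\"foo\": 1\n}\n{\n\"bar\": 1\n}\n";
        for ( final JsonElement element : readAll(new StringReader(json)) ) {
            System.out.println(element);
        }
    }
}
```

This yields JsonElement trees only; mapping them to typed objects would still need a separate step such as gson.fromJson(element, type).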
To make this easier to use with Gson, we can implement a few convenient reading methods:
// A shortcut method for the implementation below: aggregates the whole result into a single list
private static <T> List<T> parseToListLenient(final JsonReader jsonReader, final IMapper<? super JsonReader, ? extends T> mapper)
        throws IOException {
    final List<T> list = new ArrayList<>();
    parseLenient(jsonReader, in -> list.add(mapper.map(in)));
    return list;
}

// A convenient strategy-accepting method that makes a JsonReader instance lenient and reads from it
// The consumer defines the strategy: what to do with the current JsonReader token
private static void parseLenient(final JsonReader jsonReader, final IConsumer<? super JsonReader> consumer)
        throws IOException {
    // Remember the original leniency so it can be restored afterwards
    final boolean isLenient = jsonReader.isLenient();
    try {
        jsonReader.setLenient(true);
        while ( jsonReader.peek() != END_DOCUMENT ) {
            consumer.accept(jsonReader);
        }
    } finally {
        jsonReader.setLenient(isLenient);
    }
}

// Needed because the Java 8 Consumer interface does not allow checked exceptions to be thrown
private interface IConsumer<T> {
    void accept(T value)
            throws IOException;
}

private interface IMapper<T, R> {
    R map(T value)
            throws IOException;
}
With these in place, simple reading really is simple; we just use the methods above:
final Gson gson = new Gson();
final TypeToken<Map<String, Integer>> typeToken = new TypeToken<Map<String, Integer>>() {
};
final TypeAdapter<Map<String, Integer>> typeAdapter = gson.getAdapter(typeToken);
try ( final JsonReader jsonReader = getPackageResourceJsonReader(Q43528208.class, "non-standard.json") ) {
    final List<Map<String, Integer>> maps = parseToListLenient(jsonReader, typeAdapter::read);
    System.out.println(maps);
}
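The snippets call getPackageResourceReader and getPackageResourceJsonReader without showing them. A minimal sketch of what the first one might look like follows, assuming it resolves the resource relative to the given class's package and decodes it as UTF-8; the class name PackageResourceSketch, the UTF-8 choice, and the error handling are assumptions, not the answer's actual code.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the helper that the answer does not show
public final class PackageResourceSketch {

    private PackageResourceSketch() {
    }

    // Opens a classpath resource located next to the given class (a relative name resolves against its package)
    public static Reader getPackageResourceReader(final Class<?> packageClass, final String resourceName)
            throws IOException {
        final InputStream inputStream = packageClass.getResourceAsStream(resourceName);
        if ( inputStream == null ) {
            throw new FileNotFoundException(resourceName + " not found next to " + packageClass.getName());
        }
        // Assumption: the resource is UTF-8 encoded
        return new InputStreamReader(inputStream, StandardCharsets.UTF_8);
    }
}
```

Under the same assumption, getPackageResourceJsonReader would presumably just wrap the returned reader in new JsonReader(...).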
Deserializing directly through Gson requires a more complicated implementation:
// This is just a marker not meant to be instantiated but to create a sort of "gateway" to dispatch types in Gson
@SuppressWarnings("unused")
private static final class LenientListMarker<T> {
    private LenientListMarker() {
        throw new AssertionError("must not be instantiated");
    }
}

private static void doDeserialize()
        throws IOException {
    final Gson gson = new GsonBuilder()
            .registerTypeAdapterFactory(new TypeAdapterFactory() {
                @Override
                public <T> TypeAdapter<T> create(final Gson gson, final TypeToken<T> typeToken) {
                    // Check if the given type is the lenient list marker class
                    if ( !LenientListMarker.class.isAssignableFrom(typeToken.getRawType()) ) {
                        // Not the case? Just delegate the job to Gson
                        return null;
                    }
                    final Type listElementType = getTypeParameter0(typeToken.getType());
                    final TypeAdapter<?> listElementAdapter = gson.getAdapter(TypeToken.get(listElementType));
                    @SuppressWarnings("unchecked")
                    final TypeToken<List<?>> listTypeToken = (TypeToken<List<?>>) TypeToken.getParameterized(List.class, listElementType);
                    final TypeAdapter<List<?>> listAdapter = gson.getAdapter(listTypeToken);
                    final TypeAdapter<List<?>> typeAdapter = new TypeAdapter<List<?>>() {
                        @Override
                        public void write(final JsonWriter out, final List<?> value)
                                throws IOException {
                            // Always write a well-formed list
                            listAdapter.write(out, value);
                        }

                        @Override
                        public List<?> read(final JsonReader in)
                                throws IOException {
                            // Delegate the job to the reading method - we only have to tell how to obtain the list values
                            return parseToListLenient(in, listElementAdapter::read);
                        }
                    };
                    @SuppressWarnings("unchecked")
                    final TypeAdapter<T> castTypeAdapter = (TypeAdapter<T>) typeAdapter;
                    return castTypeAdapter;
                }

                // A simple method to resolve the actual type parameter
                private Type getTypeParameter0(final Type type) {
                    if ( !(type instanceof ParameterizedType) ) {
                        // Raw type without type arguments: fall back to Object
                        return Object.class;
                    }
                    return ((ParameterizedType) type).getActualTypeArguments()[0];
                }
            })
            .create();
    // This type declares a marker specialization to be used during deserialization
    final Type type = new TypeToken<LenientListMarker<Map<String, Integer>>>() {
    }.getType();
    try ( final JsonReader jsonReader = getPackageResourceJsonReader(Q43528208.class, "non-standard.json") ) {
        // This is where we are sort of cheating:
        // We tell Gson to deserialize LenientListMarker<Map<String, Integer>> but the type adapter above will return a list
        final List<Map<String, Integer>> maps = gson.fromJson(jsonReader, type);
        System.out.println(maps);
    }
}
The output now consists of Map<String, Integer>s rather than JsonElements:
[{foo=1}, {bar=1}]
TypeToken.getParameterized workaround:
@SuppressWarnings("unchecked")
final TypeToken<List<?>> listTypeToken = (TypeToken<List<?>>) TypeToken.get(new ParameterizedType() {
    @Override
    public Type getRawType() {
        return List.class;
    }

    @Override
    public Type[] getActualTypeArguments() {
        return new Type[]{ listElementType };
    }

    @Override
    public Type getOwnerType() {
        return null;
    }
});
We could have one more program that introduces the commas (,) and constructs a well-formed JSON document.
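A minimal sketch of such a preprocessing step, using only the JDK, might look like the following. The class name ConcatenatedJsonToArray and the method toJsonArray are hypothetical, and the sketch assumes every top-level value is an object or an array and that the input is otherwise balanced; it does no validation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical preprocessor: wraps concatenated top-level JSON values into one JSON array
public final class ConcatenatedJsonToArray {

    private ConcatenatedJsonToArray() {
    }

    // Assumption: each top-level value is an object or an array (no bare scalars), and brackets are balanced
    public static String toJsonArray(final String input) {
        final List<String> values = new ArrayList<>();
        int depth = 0;            // current nesting depth outside of string literals
        int start = -1;           // index where the current top-level value began
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash inside a string
        for ( int i = 0; i < input.length(); i++ ) {
            final char ch = input.charAt(i);
            if ( inString ) {
                // Brackets inside strings must not affect the depth
                if ( escaped ) {
                    escaped = false;
                } else if ( ch == '\\' ) {
                    escaped = true;
                } else if ( ch == '"' ) {
                    inString = false;
                }
                continue;
            }
            switch ( ch ) {
            case '"':
                inString = true;
                break;
            case '{':
            case '[':
                if ( depth++ == 0 ) {
                    start = i;
                }
                break;
            case '}':
            case ']':
                if ( --depth == 0 ) {
                    values.add(input.substring(start, i + 1));
                }
                break;
            default:
                break;
            }
        }
        return "[" + String.join(",", values) + "]";
    }

    public static void main(final String... args) {
        final String input = "{\n\"foo\": 1\n}\n{\n\"bar\": 1\n}\n";
        System.out.println(toJsonArray(input));
    }
}
```

The result can then be parsed by any strict JSON parser as an ordinary array. The trade-off is that the whole input is held in memory, unlike the streaming approaches above.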
With Spark 2, we can pass multiline as a read option.
spark.read.option("multiline", "true").json("data.json")