
Reading a huge csv file and converting to JSON with Java 8

I am trying to read a CSV file with many columns. The first row is always the header. I would like to convert the CSV data into JSON. I can read each line as a String and convert it into JSON, but I am not able to assign the headers to it.

For example, the input CSV looks like:

first_name,last_name
A,A1
B,B1
C,C1

Stream<String> stream = Files.lines(Paths.get("sample.csv"));
List<String[]> readall = stream.map(l -> l.split(",")).collect(Collectors.toList());

or

List<String> test1 = readall.stream().skip(0).map(row -> row[1]).collect(Collectors.toList());

And using com.fasterxml.jackson.databind.ObjectMapper's writeValueAsString only creates JSON with no headers.

I would like the output in a format like:

{
[{"first_name":"A","last_name":"A1"},{"first_name":"B"....

How do I use streams in Java to prepare this JSON format?

Please help.

I'd tackle this problem in two steps: first read the headers, then read the rest of the lines:

static String[] headers(String path) throws IOException {

    try (BufferedReader br = new BufferedReader(new FileReader(path))) {
        return br.readLine().split(",");
    }
}

Now, you can use the method above as follows:

String path = "sample.csv";

// Read headers
String[] headers = headers(path);

List<Map<String, String>> result = null;

// Read data
try (Stream<String> stream = Files.lines(Paths.get(path))) {
    result = stream
        .skip(1) // skip headers
        .map(line -> line.split(","))
        .map(data -> {
            Map<String, String> map = new HashMap<>();
            for (int i = 0; i < data.length; i++) {
               map.put(headers[i], data[i]);
            }
            return map;
        })
        .collect(Collectors.toList());
}

You can also replace the for loop inside the second map operation with an IntStream:

try (Stream<String> stream = Files.lines(Paths.get(path))) {
    result = stream
        .skip(1) // skip headers
        .map(line -> line.split(","))
        .map(data -> IntStream.range(0, data.length)
            .boxed()
            .collect(Collectors.toMap(i -> headers[i], i -> data[i])))
        .collect(Collectors.toList());
}

EDIT: If, instead of collecting to a list, you want to perform an action on the map read from each line, you can do it as follows:

try (Stream<String> stream = Files.lines(Paths.get(path))) {
    stream
        .skip(1) // skip headers
        .map(line -> line.split(","))
        .map(data -> IntStream.range(0, data.length)
            .boxed()
            .collect(Collectors.toMap(i -> headers[i], i -> data[i])))
        .forEach(System.out::println);
}

(Here the action is to print each map.)
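Since the goal is JSON, the per-line action could also render each map as a JSON object instead of printing the map itself, which avoids holding every row in memory for a huge file. The following is only a dependency-free sketch: the class `RowJson` and its methods are made-up names, and the escaping is hand-written and minimal. In practice a JSON library such as Jackson would normally do this step.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class RowJson {

    // Hypothetical helper: wraps a string in double quotes, escaping quotes,
    // backslashes and control characters (a minimal hand-written escaper).
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                // control characters become a six-character hex escape
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Renders one row map as a single JSON object with string values.
    public static String toJsonObject(Map<String, String> row) {
        return row.entrySet().stream()
            .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
            .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("first_name", "A");
        row.put("last_name", "A1");
        System.out.println(toJsonObject(row)); // prints {"first_name":"A","last_name":"A1"}
    }
}
```

In the stream above, this would replace the final step with .map(RowJson::toJsonObject).forEach(System.out::println), which emits one JSON object per line.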

This version can be improved: it boxes the stream of ints and then unboxes each one again to use it as the index into the headers and data arrays. Readability can also be improved by extracting the creation of each map into a private method.
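One way to act on both suggestions, as a sketch rather than the definitive version: the hypothetical helper `toRow` below builds the map with a plain indexed loop, so no int is boxed. It makes two assumptions that the code above does not: it uses a `LinkedHashMap` so the keys keep the column order, and it ignores any cells beyond the shorter of the two arrays.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RowMapper {

    // Hypothetical helper: pairs each header with the cell at the same index.
    public static Map<String, String> toRow(String[] headers, String[] data) {
        // LinkedHashMap keeps keys in column order (an assumption; HashMap also works)
        Map<String, String> row = new LinkedHashMap<>();
        // Stop at the shorter array so a ragged line cannot cause an index error
        int n = Math.min(headers.length, data.length);
        for (int i = 0; i < n; i++) {
            row.put(headers[i], data[i]);
        }
        return row;
    }
}
```

With it, the second map step becomes .map(data -> RowMapper.toRow(headers, data)).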

Notes: Reading the file twice may not be the best approach performance-wise, but the code is simple and expressive. Apart from this, null handling, data transformation (e.g. to numbers or dates) and edge cases (e.g. no headers, no data lines, or data arrays of different lengths) are left as an exercise for the reader ;)
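If reading the file twice is a concern, the header can be taken from the same reader that supplies the data lines. The following is a minimal sketch under stated assumptions, not a replacement for a real CSV parser: the names `SinglePassCsv` and `readRows` are made up for illustration, fields are assumed to contain no quoted commas, blank lines are skipped, and an empty input yields an empty list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SinglePassCsv {

    // Hypothetical helper: reads the header and the data lines in one pass.
    public static List<Map<String, String>> readRows(BufferedReader reader) {
        String headerLine;
        try {
            headerLine = reader.readLine();
        } catch (IOException e) {
            // same exception type that reader.lines() uses for I/O failures
            throw new UncheckedIOException(e);
        }
        if (headerLine == null) {
            return Collections.emptyList(); // empty input: no header, no rows
        }
        String[] headers = headerLine.split(",");
        // lines() continues from the current position, i.e. after the header
        return reader.lines()
            .filter(line -> !line.isEmpty())
            .map(line -> line.split(","))
            .map(data -> {
                Map<String, String> row = new LinkedHashMap<>();
                // stop at the shorter array so ragged lines do not throw
                for (int i = 0; i < Math.min(headers.length, data.length); i++) {
                    row.put(headers[i], data[i]);
                }
                return row;
            })
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        String csv = "first_name,last_name\nA,A1\nB,B1\nC,C1\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            System.out.println(readRows(reader));
        }
    }
}
```

For a real file, the reader would come from Files.newBufferedReader(Paths.get(path)) inside the same kind of try-with-resources block.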

I think this is what you are trying to do:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

public class App {
    public static void main(String[] args) throws JsonProcessingException, IOException {

        List<Map<String, Object>> readall;
        // try-with-resources closes the file handle opened by Files.lines
        try (Stream<String> stream = Files.lines(Paths.get("src/main/resources/test1.csv"))) {
            readall = stream.map(l -> {
                Map<String, Object> map = new HashMap<String, Object>();
                String[] values = l.split(",");

                // The keys are hard-coded; the CSV itself has no header row
                map.put("name", values[0]);
                map.put("age", values[1]);

                return map;
            }).collect(Collectors.toList());
        }

        ObjectMapper mapperObj = new ObjectMapper();
        String jsonResp = mapperObj.writeValueAsString(readall);
        System.out.println(jsonResp);

    }
}

This works with Java 8 streams and uses Jackson to convert the result into JSON. Note that the keys are hard-coded in the code, so the CSV file has no header row. The CSV used:

abc,20
bbc,30

Very simple: don't convert it into a List of Strings. Use Jackson (CsvMapper) to convert the CSV into a List of Maps, and then use the org.json library to convert that list into JSON.

Let the input stream be:

InputStream stream = new FileInputStream(new File("filename.csv"));

Example: converting the CSV to a List of Maps

// CsvMapper and CsvSchema come from jackson-dataformat-csv; MappingIterator from jackson-databind
public List<Map<String, Object>> read(InputStream stream) throws JsonProcessingException, IOException {
    List<Map<String, Object>> response = new LinkedList<Map<String, Object>>();
    CsvMapper mapper = new CsvMapper();
    // Use the first row of the CSV as the column names
    CsvSchema schema = CsvSchema.emptySchema().withHeader();
    // readerFor(Class) replaces the deprecated reader(Class) in newer Jackson 2.x versions
    MappingIterator<Map<String, String>> iterator = mapper.readerFor(Map.class).with(schema).readValues(stream);
    while (iterator.hasNext()) {
        response.add(Collections.<String, Object>unmodifiableMap(iterator.next()));
    }
    return response;
}

To convert the List of Maps to JSON:

JSONArray jsonArray = new JSONArray(response);
System.out.println(jsonArray.toString());

The technical posts on this site follow the CC BY-SA 4.0 license. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 