
Non-Standard Characters: Strange Interactions With CSV Files [Java]

I'm writing a simple function that removes all blank lines from a csv file:

import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;


public class WhiteSpace{

    static List<String> AggAndGroupIndices = new LinkedList<String>();

    public static void main(String[] args){

        System.out.println("Pre-whitespace");
        removeWhiteSpace("myFile");
        System.out.println("Post-whitespace");
    }

//////////////////////////////////////////////////////////////////////////////

    public static void removeWhiteSpace(String csv_filename){
        System.out.println("Whitespace removal activated");
        Scanner file;
        PrintWriter writer;

        try {
            file = new Scanner(new File(csv_filename));
            writer = new PrintWriter("WhiteSpaceRemoval.csv");
            System.out.println(file.hasNext());
            System.out.println("About to enter while loop");
            while (file.hasNext()) {
                System.out.println("In while loop");
                String line = file.nextLine();
                if (!line.isEmpty()) {
                    writer.write(line);
                    writer.write("\n");
                }
            }
            System.out.println("While loop complete");

            file.close();
            writer.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }


}

This should be relatively straightforward. However, it breaks whenever the file contains a non-standard (non-ASCII) character such as é or Ç. For example, the following CSV file:

province,name,month

Quebec,Franois,February

Works perfectly, but

province,name,month

Quebec,François,February

Simply isn't written to WhiteSpaceRemoval.csv at all. Furthermore, according to my debug output, the while loop is never entered, not even for the first valid line. This is baffling to me, and I'm lost as to what could be going wrong. Does anyone know what could cause this? I'm running on Linux.
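One plausible explanation for the symptom described above is a charset mismatch: if the CSV is saved as ISO-8859-1 (an assumption, the question does not state the encoding) while the JVM decodes it as UTF-8, Scanner hits a malformed byte sequence, swallows the resulting IOException, and simply reports that there is no more input. The sketch below is a way to check for that. The class name ScannerCharsetCheck and the helper countLines are hypothetical; Scanner.ioException() is the standard accessor for the last swallowed exception.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

public class ScannerCharsetCheck {

    // Counts the lines Scanner manages to read when decoding with the given
    // charset, and prints any IOException that Scanner swallowed on the way.
    static int countLines(Path file, Charset charset) throws IOException {
        int count = 0;
        try (Scanner in = new Scanner(file, charset.name())) {
            while (in.hasNextLine()) {
                in.nextLine();
                count++;
            }
            // Scanner never throws decoding errors; it only records them.
            IOException swallowed = in.ioException();
            if (swallowed != null) {
                System.out.println("Scanner stopped early: " + swallowed);
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: the question's data, saved as ISO-8859-1.
        Path sample = Files.createTempFile("sample", ".csv");
        String data = "province,name,month\n\nQuebec,Fran\u00e7ois,February\n";
        Files.write(sample, data.getBytes(StandardCharsets.ISO_8859_1));

        // Mismatched charset: the read is cut short by a decoding error.
        System.out.println(countLines(sample, StandardCharsets.UTF_8));
        // Matching charset: every line is read.
        System.out.println(countLines(sample, StandardCharsets.ISO_8859_1));
    }
}
```

If the check reports a MalformedInputException, passing the file's real charset to the Scanner constructor (and to the PrintWriter) should let the loop in removeWhiteSpace run as expected.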

If you just want to filter out empty lines, you can use Files.lines with a stream filter:

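import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public List<String> filterNonEmptyLinesOnFile(String fileName) {
    // Files.lines decodes as UTF-8 unless a Charset is passed as a second argument.
    try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
        return stream.filter(line -> !line.isEmpty())
                     .collect(Collectors.toList());
    } catch (IOException | UncheckedIOException e) {
        // Decoding errors surface as UncheckedIOException while the stream is consumed.
        e.printStackTrace();
        return new ArrayList<>();
    }
}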
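The stream approach above can also be given an explicit charset and made to write the result back out, which is what the question's removeWhiteSpace was trying to do. This is only a sketch: the class name BlankLineFilter and its two helpers are hypothetical, and the caller is assumed to know the file's actual encoding.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BlankLineFilter {

    // Reads the file with the given charset and keeps only non-empty lines.
    static List<String> nonEmptyLines(Path in, Charset charset) throws IOException {
        try (Stream<String> lines = Files.lines(in, charset)) {
            return lines.filter(line -> !line.isEmpty())
                        .collect(Collectors.toList());
        }
    }

    // Writes the non-empty lines of `in` to `out`, using the same charset
    // for decoding and encoding so accented characters survive the round trip.
    static void removeBlankLines(Path in, Path out, Charset charset) throws IOException {
        Files.write(out, nonEmptyLines(in, charset), charset);
    }
}
```

A call might look like BlankLineFilter.removeBlankLines(Paths.get("myFile"), Paths.get("WhiteSpaceRemoval.csv"), StandardCharsets.ISO_8859_1). If the charset is wrong, the failure shows up as an UncheckedIOException while the stream is collected instead of a silently empty output file.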
