Read and analyze text files containing both numbers and letters on the same line

I've been given a task to read data from a text file and save it to a Set. The text file represents an imaginary bill; each line contains an item description, its price, quantity and total sum. I only need each item's name and price. The text file looks like this:

Item_name   Item_price(float value with comma as format symbol)  Quantity(int)  Total(float)
Item_name   Item_price(float value with comma as format symbol)  Quantity(int)  Total(float)

(The text file contains multiple items.) Also, item names sometimes contain numbers, e.g. LG 4k TV 1000U.

I tried to solve it like this:

private void readAndSave(Path file) {
    try (BufferedReader br = new BufferedReader(new InputStreamReader(
             new BufferedInputStream(new FileInputStream(file.toString()))))) {


        Set<Item> items = new TreeSet<>();
        String line;
        while ((line = br.readLine()) != null) {


            float price = 0, numb;
            boolean priceFound = false;
            String name = "";
            String[] lineElements;
            lineElements = line.split(" ");

            for(String temp: lineElements) {
                if((numb = getNumberRepresentation(temp)) != -1) {
                    if(!priceFound) {
                        price = numb;
                        priceFound = true;
                    }
                    break;
                }

                name += temp + " ";
            }
            items.add(new Item(name, price));

        }
    } catch (FileNotFoundException fe) {
        System.out.println("File not found!");
    } catch (IOException e) {
        System.out.println("Error while opening/writing files!");
    }

}

The class Item contains two fields (String, float) representing the name and price of an item, and implements Comparable.

And here is the getNumberRepresentation method:

private float getNumberRepresentation(String temp) {
    try {

        DecimalFormatSymbols symbols = new DecimalFormatSymbols();
        symbols.setDecimalSeparator(',');
        DecimalFormat format = new DecimalFormat("0.##");
        format.setDecimalFormatSymbols(symbols);
        return format.parse(temp).floatValue();

    } catch(Exception e) {
        return -1;
    }
}

My logic is that once a price is found, the name must already be complete, so all remaining Strings on the line can be skipped. The problem is that I sometimes get a number from the item's name as the price (1000U in the previous example). Is there a better and more efficient solution to this problem?
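The behaviour described above can be reproduced in isolation. DecimalFormat.parse(String) is allowed to stop at the first character it cannot parse, so a token such as 1000U yields 1000 instead of throwing. The sketch below shows this and a stricter variant that uses ParsePosition to require that the whole token was consumed. The class name StrictParseDemo and the method parseStrict are hypothetical, and the grouping separator is set explicitly only to keep the example independent of the default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

class StrictParseDemo {

    // Same idea as getNumberRepresentation: comma as the decimal separator.
    static DecimalFormat commaFormat() {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator('.'); // assumption: avoids two roles for ','
        return new DecimalFormat("0.##", symbols);
    }

    // Hypothetical stricter variant: returns -1 unless the WHOLE token is a number.
    static float parseStrict(String token) {
        ParsePosition pos = new ParsePosition(0);
        Number n = commaFormat().parse(token, pos);
        if (n == null || pos.getIndex() != token.length()) {
            return -1;
        }
        return n.floatValue();
    }

    public static void main(String[] args) throws Exception {
        // Lenient: parsing stops at 'U' and the leading digits are returned.
        System.out.println(commaFormat().parse("1000U").floatValue());
        // Strict: the trailing 'U' makes the token invalid.
        System.out.println(parseStrict("1000U"));
        System.out.println(parseStrict("70,00"));
    }
}
```

Even with the strict check, a name token made only of digits would still be taken for the price, so checking tokens one at a time is probably not enough on its own.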

Edit: File sample

Escape from Paradise City 70,00 1135 79450,00
Sony ITC60, TV cabel 111,26 111 12349,86

You should use java.util.regex.Pattern: write a regex that matches the price, and capture everything before that match as the name. I assume nothing in the name looks like ###,## where each # is a digit (a digit is written \d in a regex, or \\d inside a Java string literal).

The tutorial can be found here.

It would look something like this:

Before reading the lines:

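Pattern p = Pattern.compile("(.*?) (\\d+,\\d+) .*");

For each line:

Matcher m = p.matcher(line);
// matches() must consume the whole line, so the pattern ends with " .*"
// to absorb the quantity and total that follow the price.
if (m.matches()) {
    name = m.group(1);                            // everything before the price
    price = getNumberRepresentation(m.group(2));  // first token shaped like digits,digits
} else {
    // line doesn't match the pattern, handle the error!
}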
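A fuller sketch of the same idea, assuming every line ends with exactly three numeric fields (price, quantity, total) as in the file sample: matching the whole line means that numbers inside the name cannot be mistaken for the price. The class BillLineParser and its methods are hypothetical names, and converting the price by replacing the comma with a dot is a simplification chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BillLineParser {

    // Groups: 1 = name, 2 = price, 3 = quantity, 4 = total.
    // The name is matched lazily; the three trailing fields pin down where it ends.
    private static final Pattern LINE = Pattern.compile(
            "(.+?)\\s+(\\d+,\\d+)\\s+(\\d+)\\s+(\\d+,\\d+)\\s*");

    // Returns {name, price text}, or null if the line does not have the assumed shape.
    static String[] parseLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Converts a price such as "111,26" to a float.
    static float toFloat(String price) {
        return Float.parseFloat(price.replace(',', '.'));
    }

    public static void main(String[] args) {
        String[] lines = {
            "Escape from Paradise City 70,00 1135 79450,00",
            "Sony ITC60, TV cabel 111,26 111 12349,86",
            "LG 4k TV 1000U 999,99 1 999,99" // made-up line for illustration
        };
        for (String line : lines) {
            String[] parts = parseLine(line);
            if (parts == null) {
                System.out.println("Skipping malformed line: " + line);
            } else {
                System.out.println(parts[0] + " -> " + toFloat(parts[1]));
            }
        }
    }
}
```

Anchoring on the end of the line also copes with names that themselves contain something like 1,5, because the regex engine only accepts a split in which quantity and total follow the price.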
