
How to parse an SQL query in Java?

I have a text file called File.txt that contains some data, for instance:

55 90 
10 45
33 23
10 500
5  2

Here the first column is called C1 and the second C2.

And then I have another file called Input.txt with two SQL queries:

SELECT *
FROM File 
WHERE C2 > 60; 

SELECT C1 
FROM File;

What is one way to parse this file and produce output that looks like what I would get from a real DBMS?

I've tried this so far:

// 1. Read the file.  
Main obj = new Main();
URL url = obj.getClass().getResource("File.txt");
File file = new File(url.toURI());
FileReader fileReader = new FileReader(file);
BufferedReader bufferReader = new BufferedReader(fileReader);
StringBuffer stringBuffer = new StringBuffer();
String line;
while ((line = bufferReader.readLine()) != null) {
    stringBuffer.append(line);
    stringBuffer.append("\n");
}
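bufferReader.close(); // closing the BufferedReader also closes the wrapped FileReader
String data = stringBuffer.toString(); // this contains the data from File.txt
String[] list = data.trim().split("\\s+"); // split on any whitespace (spaces and newlines) into an array of values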

// 2. Read the input file. 
Main input = new Main();
URL urlInput = input.getClass().getResource("Input.txt");
File inputFile = new File(urlInput.toURI());
FileReader fileReaderInput = new FileReader(inputFile);
BufferedReader bufferedReaderInput = new BufferedReader(fileReaderInput);
StringBuffer stringBufferInput = new StringBuffer();
String lineInput;
while ((lineInput = bufferedReaderInput.readLine()) != null) {
    stringBufferInput.append(lineInput);
    stringBufferInput.append("\n");
} 

But I get lost here... I don't know how to parse the query. My program manages to read both files, but when it comes to processing the query in the input file, I can't seem to figure out the logic for it.

You are looking for a SQL JDBC driver for CSV files. If you are free to change the delimiter from a space to a comma, I would use a library for this. The following code works with CsvJdbc. The library is open source, so you can read the code and adapt it if something is not right, and at least you don't have to start from scratch. I didn't find an obvious way to change the delimiter, so I tested with a file like the one below (the queries refer to it as sample, so it is saved as sample.csv in the directory given in the connection URL):

C1,C2
55,90
10,45
33,23
10,500
5,2

Code (download csvjdbc-1.0-23.jar and put it on your classpath):

public static void main(String[] args)
{
    try
    {
        // Load the driver.
        Class.forName("org.relique.jdbc.csv.CsvDriver");

        Properties props = new Properties();
        props.put("headerline", "C1,C2");
        props.put("columnTypes", "Int,Int");
        Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + "/home/vinodshukla/tmp", props);

        // Create a Statement object to execute the query with.
        // A Statement is not thread-safe.
        Statement stmt = conn.createStatement();

        // Select columns C1 and C2 from sample.csv where C2 is greater than 60
        ResultSet results = stmt.executeQuery("SELECT C1,C2 FROM sample where C2 > 60");
        // Print the results to standard output in CSV format
        // using the CsvJdbc helper function
        boolean append = true;
        CsvDriver.writeToCsv(results, System.out, append);

        System.out.println("------------");
        results = stmt.executeQuery("SELECT C1 FROM sample");
        // Print the results to standard output in CSV format
        // using the CsvJdbc helper function
        append = true;
        CsvDriver.writeToCsv(results, System.out, append);

        // Clean up
        conn.close();
    }
    catch(Exception e)
    {
        e.printStackTrace();
    }
}

Output:

C1,C2
10,500
------------
C1
55
10
33
10
5

First, I suggest representing your data as a collection of rows. That's how a DBMS treats the data, and it will make the rest of the logic easier. You can create your own object type to store the values of C1 and C2. Loop through the data file and build this collection of rows (maybe a List<Row>).
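As a minimal sketch of that first step, the block below parses whitespace-separated lines into row objects. The class names Row and RowLoader and the method parse are hypothetical, chosen only for illustration, and the sketch assumes every non-blank line holds exactly two integers.

```java
import java.util.ArrayList;
import java.util.List;

/** One line of File.txt: two integer columns, C1 and C2 (hypothetical type). */
final class Row {
    final int c1;
    final int c2;

    Row(int c1, int c2) {
        this.c1 = c1;
        this.c2 = c2;
    }
}

/** Hypothetical helper that turns the lines of the data file into rows. */
final class RowLoader {
    /** Parses whitespace-separated lines such as "55 90" into rows. */
    static List<Row> parse(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // "\\s+" tolerates runs of spaces, as in "5  2"
            String[] parts = trimmed.split("\\s+");
            rows.add(new Row(Integer.parseInt(parts[0]), Integer.parseInt(parts[1])));
        }
        return rows;
    }
}
```

The lines themselves could come from something like Files.readAllLines(Paths.get("File.txt")), or from the BufferedReader loop in the question.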

Now to "parse" the SQL. You'll need to tokenize it to get the pieces you'll use for the logic later. Java's built-in string-splitting functions are enough to separate the clauses of each query.
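One possible way to do this splitting is sketched below. It assumes the queries are separated by semicolons and always have the shape SELECT ... FROM ... with an optional WHERE, as in the question's Input.txt. The class SimpleQuery and its fields are hypothetical names, and the sketch uses a regular expression on top of String.split rather than splitting alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for the clauses of one very simple SELECT statement. */
final class SimpleQuery {
    final String select; // text between SELECT and FROM, e.g. "*" or "C1"
    final String from;   // table name, e.g. "File"
    final String where;  // condition text, or null when there is no WHERE clause

    SimpleQuery(String select, String from, String where) {
        this.select = select;
        this.from = from;
        this.where = where;
    }

    // Only matches "SELECT <list> FROM <table> [WHERE <condition>]".
    private static final Pattern SHAPE = Pattern.compile(
            "^SELECT\\s+(.+?)\\s+FROM\\s+(\\S+)(?:\\s+WHERE\\s+(.+))?$",
            Pattern.CASE_INSENSITIVE);

    /** Splits the input on ';' and extracts the clauses of each statement. */
    static List<SimpleQuery> parseAll(String text) {
        List<SimpleQuery> queries = new ArrayList<>();
        for (String statement : text.split(";")) {
            // Collapse newlines and repeated spaces so the pattern stays simple.
            String sql = statement.trim().replaceAll("\\s+", " ");
            if (sql.isEmpty()) {
                continue; // trailing whitespace after the last ';'
            }
            Matcher m = SHAPE.matcher(sql);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unsupported query: " + sql);
            }
            queries.add(new SimpleQuery(m.group(1), m.group(2), m.group(3)));
        }
        return queries;
    }
}
```

This is not a general SQL parser: joins, subqueries, quoted strings containing semicolons, and so on are out of its reach.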

I like to think of getting the specific rows (as determined by the WHERE clause) first. Then you can worry about which data from each row to return for the SELECT.

I'm assuming that the FROM clause won't change, since you only have one data file. But if it did, you'd use this clause to do something like choose the actual data source (the file name, maybe?).

For any SQL without a WHERE clause, all of your rows are valid and you can return the entire collection of rows. Otherwise you'll need to figure out how to turn the text after the WHERE keyword into a predicate that Java can evaluate (you might want to search for this part separately, since it's an entirely separate problem and outside the scope of my answer). Then you just loop through your data rows and return each row that passes the predicate.
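For the narrow case in the question (one column compared with one integer literal), the predicate step could look roughly like the sketch below. The class WhereClause and the method compile are hypothetical names, each row is assumed to be a Map from column name to integer value, and conditions joined with AND or OR are deliberately not handled.

```java
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: turns "C2 > 60" into a predicate over a row. */
final class WhereClause {
    // <column> <operator> <integer>, e.g. "C2 > 60" or "C1<=10"
    private static final Pattern COMPARISON =
            Pattern.compile("^(\\w+)\\s*(<=|>=|<>|!=|=|<|>)\\s*(-?\\d+)$");

    static Predicate<Map<String, Integer>> compile(String where) {
        if (where == null || where.trim().isEmpty()) {
            return row -> true; // no WHERE clause: every row passes
        }
        Matcher m = COMPARISON.matcher(where.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported condition: " + where);
        }
        String column = m.group(1);
        String op = m.group(2);
        int value = Integer.parseInt(m.group(3));
        return row -> {
            Integer actual = row.get(column);
            if (actual == null) {
                throw new IllegalArgumentException("Unknown column: " + column);
            }
            switch (op) {
                case ">":  return actual > value;
                case "<":  return actual < value;
                case ">=": return actual >= value;
                case "<=": return actual <= value;
                case "=":  return actual == value;
                default:   return actual != value; // "<>" or "!="
            }
        };
    }
}
```

Filtering is then a loop (or a stream filter) that keeps the rows for which test returns true.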

The SELECT clause determines which column(s) to include. Use logic like String.contains to check which column names are listed. A * should select all columns. Since you already have the collection of valid rows, just loop through them and pick out the data you actually need from each row. For example, you could concatenate the selected values of each row (as determined by String.contains) into a string terminated by a newline character.
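A rough sketch of the projection step follows. Instead of String.contains it splits the select list on commas, because contains would also match C1 inside a name such as C10; that swap is my choice, not part of the outline above. The class Projection, its methods, and the space-separated output with a header line are all assumptions made for illustration, and rows are again assumed to be Maps from column name to integer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for the SELECT list: picks columns and formats rows. */
final class Projection {
    /** Expands "*" to every column, otherwise returns the listed column names. */
    static List<String> resolveColumns(String selectList, List<String> allColumns) {
        if (selectList.trim().equals("*")) {
            return new ArrayList<>(allColumns);
        }
        List<String> chosen = new ArrayList<>();
        for (String name : selectList.split(",")) {
            String column = name.trim();
            if (!allColumns.contains(column)) {
                throw new IllegalArgumentException("Unknown column: " + column);
            }
            chosen.add(column);
        }
        return chosen;
    }

    /** Builds a header line followed by one space-separated line per row. */
    static String format(List<Map<String, Integer>> rows, List<String> columns) {
        StringBuilder out = new StringBuilder();
        out.append(String.join(" ", columns)).append('\n');
        for (Map<String, Integer> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String column : columns) {
                cells.add(String.valueOf(row.get(column)));
            }
            out.append(String.join(" ", cells)).append('\n');
        }
        return out.toString();
    }
}
```

Calling resolveColumns with the text between SELECT and FROM, and then format with the rows that passed the WHERE predicate, gives text that resembles a DBMS result listing.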

That should work for your requirement. Sorry for not including any actual code, but this outline should help.

