How to parse an SQL query in Java?

Suppose I have a text file called File.txt that contains some data. For instance:

55 90 
10 45
33 23
10 500
5  2

The first column is called C1 and the second C2.

I also have another file called Input.txt containing two SQL queries:

SELECT *
FROM File 
WHERE C2 > 60; 

SELECT C1 
FROM File;

What is one way to parse this file and produce output that looks like what I would get from a real DBMS?

I've tried this so far:

// 1. Read the file.  
Main obj = new Main();
URL url = obj.getClass().getResource("File.txt");
File file = new File(url.toURI());
FileReader fileReader = new FileReader(file);
BufferedReader bufferReader = new BufferedReader(fileReader);
StringBuffer stringBuffer = new StringBuffer();
String line;
while ((line = bufferReader.readLine()) != null) {
    stringBuffer.append(line);
    stringBuffer.append("\n");
}
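bufferReader.close(); // closing the BufferedReader also closes the wrapped FileReader
String data = stringBuffer.toString(); // this contains the data from File.txt
String[] list = data.split("\\s+"); // split on any whitespace (spaces and newlines) into tokens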

// 2. Read the input file. 
Main input = new Main();
URL urlInput = input.getClass().getResource("Input.txt");
File inputFile = new File(urlInput.toURI());
FileReader fileReaderInput = new FileReader(inputFile);
BufferedReader bufferedReaderInput = new BufferedReader(fileReaderInput);
StringBuffer stringBufferInput = new StringBuffer();
String lineInput;
while ((lineInput = bufferedReaderInput.readLine()) != null) {
    stringBufferInput.append(lineInput);
    stringBufferInput.append("\n");
}
bufferedReaderInput.close();

But I get lost here: I don't know how to parse the query. My program manages to read both files, but when it comes to processing the queries in the input file, I can't figure out the logic.

You are looking for a SQL JDBC driver for CSV files. If you are free to change the delimiter from space to comma, I would use a library for this. The following code works with CsvJdbc. The library is open source, so you can read the code and adapt it if something doesn't fit, and at least you don't have to start from scratch. I didn't find a straightforward way to change the delimiter, so I tested with a file (sample.csv, matching the table name in the queries) like the one below:

C1,C2
55,90
10,45
33,23
10,500
5,2
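
If the data starts out space-separated, as in the question's File.txt, a conversion along the following lines could produce the comma-separated form. This is only a sketch: the class name SpaceToCsv and the method toCsv are made up for illustration, and the input lines are hard-coded rather than read from disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CsvJdbc.
class SpaceToCsv {
    // Prepends the header line and rewrites each whitespace-separated line as comma-separated.
    static List<String> toCsv(String header, List<String> lines) {
        List<String> csv = new ArrayList<>();
        csv.add(header);
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                csv.add(String.join(",", trimmed.split("\\s+")));
            }
        }
        return csv;
    }

    public static void main(String[] args) {
        // In a real program these lines could come from Files.readAllLines on File.txt.
        List<String> lines = Arrays.asList("55 90 ", "10 45", "33 23", "10 500", "5  2");
        for (String row : toCsv("C1,C2", lines)) {
            System.out.println(row);
        }
    }
}
```

The resulting lines can then be written to sample.csv in the directory named in the JDBC URL.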

Code (download csvjdbc-1.0-23.jar and put it on your classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

import org.relique.jdbc.csv.CsvDriver;

public class Main
{
    public static void main(String[] args)
    {
        try
        {
            // Load the driver.
            Class.forName("org.relique.jdbc.csv.CsvDriver");

            Properties props = new Properties();
            props.put("headerline", "C1,C2");
            props.put("columnTypes", "Int,Int");
            // The URL points at the directory that contains sample.csv.
            Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + "/home/vinodshukla/tmp", props);

            // Create a Statement object to execute the query with.
            // A Statement is not thread-safe.
            Statement stmt = conn.createStatement();

            // Select the C1 and C2 columns from sample.csv where C2 > 60
            ResultSet results = stmt.executeQuery("SELECT C1,C2 FROM sample where C2 > 60");
            // Write the results to System.out in CSV format
            // using the CsvJdbc helper function
            boolean append = true;
            CsvDriver.writeToCsv(results, System.out, append);

            System.out.println("------------");
            results = stmt.executeQuery("SELECT C1 FROM sample");
            // Write the second result set to System.out in the same way
            append = true;
            CsvDriver.writeToCsv(results, System.out, append);

            // Clean up
            stmt.close();
            conn.close();
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}

Output:

C1,C2
10,500
------------
C1
55
10
33
10
5

First, I suggest representing your data as a collection of rows. That's how a DBMS treats the data, and it will make the rest of the logic easier. You can create your own object type to store the values of C1 and C2. Loop through the data file and build this collection of rows (perhaps a List<Row>).
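
As a rough sketch of that first step, the data could be loaded like this. The names Row, RowLoader and fromLines are invented for illustration, and the code assumes every non-blank line holds exactly two whitespace-separated integers, as in the question's File.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowLoader {
    // Turns whitespace-separated lines into rows; blank lines are skipped.
    static List<Row> fromLines(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            rows.add(new Row(Integer.parseInt(parts[0]), Integer.parseInt(parts[1])));
        }
        return rows;
    }

    public static void main(String[] args) {
        // In a real program these lines could come from Files.readAllLines on File.txt.
        List<String> lines = Arrays.asList("55 90 ", "10 45", "33 23", "10 500", "5  2");
        List<Row> rows = fromLines(lines);
        System.out.println(rows.size() + " rows, first = " + rows.get(0));
    }
}

// Hypothetical row type: one field per column of the data file.
final class Row {
    final int c1;
    final int c2;

    Row(int c1, int c2) {
        this.c1 = c1;
        this.c2 = c2;
    }

    @Override
    public String toString() {
        return c1 + " " + c2;
    }
}
```

A fixed two-field class is enough for this file; a table with a variable number of columns would need a more general row type, such as a map keyed by column name.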

Now to "parse" the SQL. You'll need to tokenize the SQL to get the pieces you'll use for the logic later. Java's built-in string-splitting functions are enough to extract the individual clauses of the query.
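
One possible way to do that splitting is sketched below. It uses String.split to separate statements at each semicolon and, as a substitution for plain splitting, a regular expression to pull out the clauses. The class QuerySplitter and its methods are hypothetical, and the pattern only accepts the simple SELECT ... FROM ... [WHERE ...] shape used in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuerySplitter {
    // Accepts only: SELECT <columns> FROM <table> [WHERE <condition>]
    private static final Pattern QUERY = Pattern.compile(
            "SELECT\\s+(.+?)\\s+FROM\\s+(\\S+)(?:\\s+WHERE\\s+(.+))?",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Splits the text of Input.txt into statements at each ';', dropping blank pieces.
    static List<String> statements(String input) {
        List<String> result = new ArrayList<>();
        for (String piece : input.split(";")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // Returns {select list, table name, where text}; the where text is null if there is no WHERE.
    static String[] clauses(String statement) {
        Matcher m = QUERY.matcher(statement.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported query: " + statement);
        }
        String where = m.group(3);
        return new String[] { m.group(1).trim(), m.group(2), where == null ? null : where.trim() };
    }

    public static void main(String[] args) {
        String input = "SELECT *\nFROM File \nWHERE C2 > 60; \n\nSELECT C1 \nFROM File;\n";
        for (String statement : statements(input)) {
            String[] c = clauses(statement);
            System.out.println("select=" + c[0] + " from=" + c[1] + " where=" + c[2]);
        }
    }
}
```

Anything beyond this shape (joins, quoted strings containing semicolons, nested queries) would need a real SQL parser.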

I like to think of getting the matching rows (as determined by the WHERE clause) first. Then you can worry about which data to return from each row, as determined by the SELECT.

I'm assuming that the FROM clause won't change, since you only have one data file. If it did, you'd use this clause to choose the actual data source (the file name, perhaps).

For any SQL without a WHERE clause, all of your rows are valid and you can return the entire collection of rows. Otherwise, you'll need to figure out how to turn the text after the WHERE keyword into a predicate that Java can evaluate (you might want to search for this part separately, since it's an entirely separate problem and outside the scope of my answer). Then you just loop through your data rows and return each row that passes the predicate.
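
For the simplest case, a single comparison such as C2 > 60, the predicate step might look like the sketch below. It is not a general solution: the class WhereFilter is hypothetical, rows are modelled as a Map from upper-case column name to integer value purely for this example, and only conditions of the form column, operator, integer are accepted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhereFilter {
    // Accepts only: <column> <operator> <integer>, for example "C2 > 60".
    private static final Pattern CONDITION =
            Pattern.compile("(\\w+)\\s*(>=|<=|<>|!=|=|>|<)\\s*(-?\\d+)");

    // A null whereText means the query had no WHERE clause, so every row passes.
    static Predicate<Map<String, Integer>> parse(String whereText) {
        if (whereText == null) {
            return row -> true;
        }
        Matcher m = CONDITION.matcher(whereText.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported condition: " + whereText);
        }
        String column = m.group(1).toUpperCase();
        String op = m.group(2);
        int value = Integer.parseInt(m.group(3));
        return row -> {
            Integer cell = row.get(column);
            if (cell == null) {
                throw new IllegalArgumentException("Unknown column: " + column);
            }
            switch (op) {
                case ">":  return cell > value;
                case "<":  return cell < value;
                case ">=": return cell >= value;
                case "<=": return cell <= value;
                case "=":  return cell == value;
                default:   return cell != value; // "<>" and "!="
            }
        };
    }

    // Builds one row of the two-column table from the question.
    static Map<String, Integer> row(int c1, int c2) {
        Map<String, Integer> r = new LinkedHashMap<>();
        r.put("C1", c1);
        r.put("C2", c2);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows = new ArrayList<>();
        rows.add(row(55, 90));
        rows.add(row(10, 45));
        rows.add(row(10, 500));
        Predicate<Map<String, Integer>> where = parse("C2 > 60");
        for (Map<String, Integer> r : rows) {
            if (where.test(r)) {
                System.out.println(r);
            }
        }
    }
}
```

Conditions combined with AND or OR, or comparisons between two columns, would need a proper expression parser rather than a single regular expression.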

The SELECT clause determines which column(s) to include. Use logic like string.contains to check which column names are included. A * should select all columns. Since you already have the collection of valid rows, just loop through them and get the data you actually need from each row. For example, you could concatenate the selected values (as determined by the string.contains check) into one string per row, terminated by a newline character.

That should work for your requirement. Sorry for not including any actual code, but this outline should help.
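
The projection step could be sketched as follows. Instead of string.contains, this version splits the select list on commas, because a substring check can misfire when one column name is a prefix of another (C1 and C10, for instance). The class Projection, the Map-based row representation, and the comma-separated output with a header line are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Projection {
    // Resolves the select list ("*" or "C1, C2") against the table's columns, in query order.
    static List<String> columns(String selectList, List<String> tableColumns) {
        if (selectList.trim().equals("*")) {
            return tableColumns;
        }
        List<String> chosen = new ArrayList<>();
        for (String name : selectList.split(",")) {
            String column = name.trim().toUpperCase();
            if (!tableColumns.contains(column)) {
                throw new IllegalArgumentException("Unknown column: " + name.trim());
            }
            chosen.add(column);
        }
        return chosen;
    }

    // Renders a header line followed by one comma-separated line per row.
    static String render(List<String> columns, List<Map<String, Integer>> rows) {
        StringBuilder out = new StringBuilder(String.join(",", columns)).append('\n');
        for (Map<String, Integer> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String column : columns) {
                cells.add(String.valueOf(row.get(column)));
            }
            out.append(String.join(",", cells)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> row = new LinkedHashMap<>();
        row.put("C1", 10);
        row.put("C2", 500);
        List<Map<String, Integer>> rows = new ArrayList<>();
        rows.add(row);
        List<String> table = Arrays.asList("C1", "C2");
        System.out.print(render(columns("*", table), rows));
        System.out.print(render(columns("C1", table), rows));
    }
}
```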
