
Execute SELECT SQL by randomly picking tables

I am working on a project in which I have two tables in different databases with different schemas. That means I need two different sets of connection parameters to connect to those two tables using JDBC.

Suppose the config.properties file looks like this:

TABLES: table1 table2

#For Table1
table1.url: jdbc:mysql://localhost:3306/garden
table1.user: gardener
table1.password: shavel
table1.driver: jdbc-driver
table1.percentage: 80



#For Table2
table2.url: jdbc:mysql://otherhost:3306/forest
table2.user: forester
table2.password: axe
table2.driver: jdbc-driver
table2.percentage: 20

The method below reads the above config.properties file and creates a ReadTableConnectionInfo object for each table.

private static final Properties prop = new Properties();
private static List<String> tableNames;
private static HashMap<String, ReadTableConnectionInfo> tableList = new HashMap<String, ReadTableConnectionInfo>();

private static void readPropertyFile() throws IOException {

    prop.load(Read.class.getClassLoader().getResourceAsStream("config.properties"));

    tableNames = Arrays.asList(prop.getProperty("TABLES").split(" "));

    for (String arg : tableNames) {

        ReadTableConnectionInfo ci = new ReadTableConnectionInfo();

        String url = prop.getProperty(arg + ".url");
        String user = prop.getProperty(arg + ".user");
        String password = prop.getProperty(arg + ".password");
        String driver = prop.getProperty(arg + ".driver");
        double percentage = Double.parseDouble(prop.getProperty(arg + ".percentage"));

        ci.setUrl(url);
        ci.setUser(user);
        ci.setPassword(password);
        ci.setDriver(driver);
        ci.setPercentage(percentage);

        tableList.put(arg, ci);
    }

}

Below is the ReadTableConnectionInfo class, which holds all the connection info for a particular table.

public class ReadTableConnectionInfo {

    public String url;
    public String user;
    public String password;
    public String driver;
    public double percentage;

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }

    public String getUser() {
        return user;
    }

    public void setUser(String user) {
        this.user = user;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    public String getDriver() {
        return driver;
    }

    public void setDriver(String driver) {
        this.driver = driver;
    }

    public double getPercentage() {
        return percentage;
    }

    public void setPercentage(double percentage) {
        this.percentage = percentage;
    }
}

Now I create an ExecutorService with a specified number of threads and pass this tableList object to the constructor of the ReadTask class:

        // create thread pool with given size
        ExecutorService service = Executors.newFixedThreadPool(10);

        for (int i = 0; i < 10; i++) {
            service.submit(new ReadTask(tableList));
        }

Below is my ReadTask, which implements the Runnable interface; each thread is supposed to open a connection for each table.

class ReadTask implements Runnable {

    private final HashMap<String, ReadTableConnectionInfo> tableLists;
    private Connection[] dbConnection;
    private Statement[] statement;

    public ReadTask(HashMap<String, ReadTableConnectionInfo> tableList) {
        this.tableLists = tableList;
    }


    @Override
    public void run() {

        int j = 0;
        dbConnection = new Connection[tableLists.size()];
        statement = new Statement[tableLists.size()];

        // loop over the map values and open one connection and statement per table
        for (ReadTableConnectionInfo ci : tableLists.values()) {

            dbConnection[j] = getDBConnection(ci.getUrl(), ci.getUser(), ci.getPassword(), ci.getDriver());
            statement[j] = dbConnection[j].createStatement();

            j++;
        }

        // keep querying for 60 minutes
        long endTime = System.currentTimeMillis() + 60 * 60 * 1000L;

        while (System.currentTimeMillis() <= endTime) {

            /* Generate random number and check to see whether that random number
             * falls between 1 and 80, if yes, then choose table1
             * and then use table1 connection and statement that I made above and do a SELECT * on that table.
             * If that random number falls between 81 and 100 then choose table2
             * and then use table2 connection and statement and do a SELECT * on that table
             */

            // what_table_statement is the index I do not know how to choose
            ResultSet rs = statement[what_table_statement].executeQuery(selectTableSQL);

        }
    }
}

Currently I have two tables, which means each thread will make two connections for each table and then use that particular table's connection to do a SELECT * on that table, depending on the randomly generated number.

Algorithm:

  1. Generate a random number between 1 and 100.
  2. If that random number is less than or equal to table1.getPercentage(), choose table1 and use the table1 statement object to make a SELECT SQL call to that database.
  3. Otherwise, choose table2 and use the table2 statement object to make a SELECT SQL call to that database.

My question:

I am having a hard time figuring out how to apply the above algorithm: how should I compare the random number with each table's percentage, decide which table to use, and then work out which table connection and statement to use for the SELECT SQL call?

So that means I need to check the getPercentage() method of each table and compare it with the random number.

Right now I have only two tables; in the future I may have three, with a percentage distribution such as 80 10 10.

UPDATE:

class ReadTask implements Runnable {

    private final LinkedHashMap<String, ReadTableConnectionInfo> tableLists;
    private final Random random = new SecureRandom();
    private Connection[] dbConnection = null;
    private ConcurrentHashMap<ReadTableConnectionInfo, Connection> tableStatement = new ConcurrentHashMap<ReadTableConnectionInfo, Connection>();

    public ReadTask(LinkedHashMap<String, ReadTableConnectionInfo> tableList) {
        this.tableLists = tableList;
    }


    @Override
    public void run() {

        int j = 0;
        dbConnection = new Connection[tableLists.size()];

        // loop over the map values and open one connection per table
        for (ReadTableConnectionInfo ci : tableLists.values()) {

            dbConnection[j] = getDBConnection(ci.getUrl(), ci.getUser(), ci.getPassword(), ci.getDriver());
            tableStatement.putIfAbsent(ci, dbConnection[j]);

            j++;
        }

        // keep querying for 60 minutes
        long endTime = System.currentTimeMillis() + 60 * 60 * 1000L;

        while (System.currentTimeMillis() < endTime) {

            double randomNumber = random.nextDouble() * 100.0;
            ReadTableConnectionInfo table = selectRandomConnection(randomNumber);

            for (Map.Entry<ReadTableConnectionInfo, Connection> entry : tableStatement.entrySet()) {

                if (entry.getKey().getTableName().equals(table.getTableName())) {

                    final String id = generateRandomId(random);
                    final String selectSql = generateRandomSQL(table);

                    // a plain SELECT is prepared with prepareStatement, not prepareCall
                    PreparedStatement preparedStatement = entry.getValue().prepareStatement(selectSql);
                    preparedStatement.setString(1, id);

                    ResultSet rs = preparedStatement.executeQuery();
                }
            }
        }
    }



    private String generateRandomSQL(ReadTableConnectionInfo table) {

        int rNumber = random.nextInt(table.getColumns().size());

        List<String> shuffledColumns = new ArrayList<String>(table.getColumns());
        Collections.shuffle(shuffledColumns);

        String columnsList = "";

        for (int i = 0; i < rNumber; i++) {
            columnsList += ("," + shuffledColumns.get(i));
        }

        final String sql = "SELECT ID" + columnsList + "  from "
                + table.getTableName() + " where id = ?";

        return sql;
    }


    private ReadTableConnectionInfo selectRandomConnection(double randomNumber) {

        double limit = 0;
        for (ReadTableConnectionInfo ci : tableLists.values()) {
            limit += ci.getPercentage();
            // compare the number that was passed in, not a fresh random value
            if (randomNumber < limit) {
                return ci;
            }
        }
        // only reached if randomNumber >= sum of all percentages
        throw new IllegalStateException();
    }
}

You could think of it as a loop over the available connections, something like the following:

public void run() {
  ...
  Random random = new SecureRandom();
  long endTime = System.currentTimeMillis() + 60 * 60 * 1000L; // run for 60 minutes

  while (System.currentTimeMillis() < endTime) {
    double randomNumber = random.nextDouble() * 100.0;
    ReadTableConnectionInfo tableInfo = selectRandomConnection(randomNumber);

    // do query...
  }
}


private ReadTableConnectionInfo selectRandomConnection(double randomNumber) {
  double limit = 0;
  for (ReadTableConnectionInfo ci : tableLists.values()) {
    limit += ci.getPercentage();
    if (randomNumber < limit) {
      return ci;
    }
  }
  throw new IllegalStateException();
}

As long as randomNumber is always less than sum(percentage), that'll do the job.
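The cumulative-percentage loop can be tried in isolation with a minimal sketch like the one below. It is only an illustration: the class name WeightedTablePicker is hypothetical, and plain table names stand in for the ReadTableConnectionInfo objects so that no database is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-alone version of the cumulative-percentage loop;
// plain table names stand in for ReadTableConnectionInfo.
class WeightedTablePicker {

    // Returns the first table whose running total exceeds randomNumber.
    // Assumes 0 <= randomNumber < sum of all percentages.
    static String pick(Map<String, Double> percentages, double randomNumber) {
        double limit = 0;
        for (Map.Entry<String, Double> e : percentages.entrySet()) {
            limit += e.getValue();
            if (randomNumber < limit) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("randomNumber >= sum of percentages");
    }

    public static void main(String[] args) {
        Map<String, Double> percentages = new LinkedHashMap<>();
        percentages.put("table1", 80.0);
        percentages.put("table2", 10.0);
        percentages.put("table3", 10.0);

        Random random = new Random();
        Map<String, Integer> hits = new LinkedHashMap<>();
        for (int i = 0; i < 100_000; i++) {
            // nextDouble() is in [0.0, 1.0), so the product is always below 100
            String table = pick(percentages, random.nextDouble() * 100.0);
            hits.merge(table, 1, Integer::sum);
        }
        System.out.println(hits); // counts should be roughly in an 80:10:10 ratio
    }
}
```

Because the comparison is a strict less-than against a running total, a value of exactly 80.0 falls into the second table's range, and the exception can only be reached if the random number is at or above the total.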

One other thing I thought of: if you end up having so many possible queries that a looping lookup becomes an issue, you could build a lookup table: create an array whose total size is large enough that the relative weightings of the queries can be represented with integers.

For your example of three queries, 80:10:10, have a 10-entry array of ReadTableConnectionInfo with eight references pointing to table1, one to table2, and one to table3. Then simply scale your random number to be 0 <= rand < 10 (e.g. (int)(Math.random() * 10)), and use it to index into your array.
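One way the lookup array might be built is sketched below. The class TableLookupArray and its methods are hypothetical names, table names stand in for ReadTableConnectionInfo references, and the sketch assumes every weight is a positive integer; dividing by the greatest common divisor is just one way to keep the array small.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of the lookup-array idea; table names stand in
// for ReadTableConnectionInfo references.
class TableLookupArray {

    private static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Builds an array in which each table appears (weight / gcd of all weights) times.
    // Assumes all weights are positive.
    static String[] build(Map<String, Integer> weights) {
        int divisor = 0;
        for (int w : weights.values()) {
            divisor = gcd(divisor, w);
        }
        List<String> slots = new ArrayList<>();
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            for (int i = 0; i < e.getValue() / divisor; i++) {
                slots.add(e.getKey());
            }
        }
        return slots.toArray(new String[0]);
    }

    // Picks a slot uniformly at random; no loop over the tables is needed.
    static String pick(String[] slots) {
        return slots[ThreadLocalRandom.current().nextInt(slots.length)];
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("table1", 80);
        weights.put("table2", 10);
        weights.put("table3", 10);

        String[] slots = build(weights); // 10 entries: 8 x table1, 1 x table2, 1 x table3
        System.out.println(slots.length + " slots, picked " + pick(slots));
    }
}
```

The trade-off is memory for speed: a pick is a single array index, but weights such as 33:33:34 have no common divisor and would need all 100 slots.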

Regardless of how many tables you have, their percentages will always add up to 100. The easiest way to conceptualize how you would choose is to think of each table as representing a range of percentages.

For instance, with three tables that have the percentages you mentioned (80%, 10%, 10%), you could conceptualize them as:

Random Number
From      To        == Table ==
0.0000    0.8000    Table_1
0.8000    0.9000    Table_2
0.9000    1.0000    Table_3

So, generate a random number between 0.0000 and 1.0000, then go down the ordered list to see which range it falls in, and therefore which table to use.
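The range idea could also be expressed with a sorted map, as in the rough sketch below. RangeTableLookup is a hypothetical class written for illustration; it stores each range's upper bound ("To") as the key and relies on the standard TreeMap.higherEntry method to find the range containing the random number.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: each key is the upper bound ("To") of a table's range.
class RangeTableLookup {

    private final TreeMap<Double, String> upperBounds = new TreeMap<>();
    private double total = 0.0;

    // Appends a table whose range starts where the previous one ended.
    void add(String table, double fraction) {
        total += fraction;
        upperBounds.put(total, table);
    }

    // Finds the table whose range [from, to) contains r, for 0 <= r < 1.
    String lookup(double r) {
        Map.Entry<Double, String> entry = upperBounds.higherEntry(r);
        if (entry == null) {
            // guards against floating-point rounding in the running total
            entry = upperBounds.lastEntry();
        }
        if (entry == null) {
            throw new IllegalStateException("no tables added");
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RangeTableLookup ranges = new RangeTableLookup();
        ranges.add("Table_1", 0.8);
        ranges.add("Table_2", 0.1);
        ranges.add("Table_3", 0.1);

        double r = Math.random(); // in [0.0, 1.0)
        System.out.println(r + " -> " + ranges.lookup(r));
    }
}
```

With only a handful of tables a linear scan is just as good; the sorted map mainly helps if the list of ranges grows long.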

(BTW: I'm not sure why you have two connections for each table.)

You can build a lookup table which contains each table name and its cumulative weight (the upper bound of its range):

class LookupTable {
    private int[]    weights;
    private String[] tables;
    private int      size = 0;

    public LookupTable(int n) {
        this.weights = new int[n];
        this.tables = new String[n];
    }

    public void addTable(String tableName, int r) {
        this.weights[size] = r;
        this.tables[size] = tableName;
        size++;
    }

    public String lookupTable(int n) {
        for (int i = 0; i < this.size; i++) {
            if (this.weights[i] >= n) {
                return this.tables[i];
            }
        }
        return null;
    }
}

The code to initialize the table:

    LookupTable tr = new LookupTable(3);
    // make sure to add the tables in ascending order of their upper bound!
    tr.addTable("table1", 20);
    tr.addTable("table2", 80);
    tr.addTable("table3", 100);

The test code:

    Random r = new Random(System.currentTimeMillis());
    for (int i = 0; i < 10; i++) {
        // r.nextInt(100) + 1 returns a number in the range [1, 100].
        int n = r.nextInt(100) + 1;
        String tableName = tr.lookupTable(n);
        System.out.println(n + ":" + tableName);
    }
