Make performance better checking if a String exists in database

ps = (PreparedStatement) connection.prepareStatement(
      "SELECT nm.id, nid.key, nm.name, nm.languageCode FROM odds.name as nm JOIN (odds.name_id as nid)\r\n"
    + "ON (nm.id = nid.id) where nm.name like '%' and nid.key not like \"vhc%\" and nid.key not like \"vdr%\" and nid.key not like \"vto%\" and nid.key not like \"vbl%\"\r\n"
    + "and nid.key not like \"vf%\" and nid.key not like \"vfl%\" and nid.key not like \"vsm%\" and nid.key not like \"rgs%\"\r\n"
    + "and nid.key not like \"srrgs%\" and nm.typeId=8 and nm.sourceId=-1 and nm.languageCode = 'en'");

    for(Entry <String, Tag> e : allTags.entrySet()) {
        ResultSet rs = ps.executeQuery();
        while(rs.next()) {
            if(rs.getString("name").equals(e.getValue().getTranslation(Language.EN))) {
                e.getValue().setAlternativeKey(rs.getString("name"));
                break;
            }
        }
    }

Do you have any ideas how I can do this much faster? I'm trying to find a string in the database and add extra information to my object. But I have to do this for 1265 objects, so the program runs for about 80 seconds.
Thanks in advance!

Open the DB client of your choice (e.g. HeidiSQL) and run an

explain [the select statement that is originally executed]

That way MySQL explains what it does when building the result and where the time gets lost.

From there you can go on, e.g. by creating indexes or changing your query to make use of existing ones.

BTW:

nm.name like '%' 

looks strange. Is it meant as a variant of the following?

is not null

The latter might be faster. If the patterns in the other LIKE conditions are always the same, you might get better performance by evaluating them once when the data is inserted: add a column of type int or boolean and store the result of the check there, in addition to the text itself. Checking against a fixed numeric value is much faster than a text search.
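As a rough illustration of that insert-time check, here is a minimal Java sketch. The class name `KeyFlagSketch`, the method `isExcludedKey` and the idea of storing its result in a boolean column (say, `is_excluded`) are assumptions made for this example and are not part of the original schema. The prefixes are copied from the question's query; `vfl` is left out because every key matching `vfl%` already matches `vf%`. The sketch also assumes a case-insensitive collation, which is why it lowercases the key first.

```java
import java.util.Locale;

class KeyFlagSketch {

    // Prefixes taken from the NOT LIKE conditions in the question's query.
    // "vfl" is omitted: any key starting with "vfl" also starts with "vf".
    private static final String[] EXCLUDED_PREFIXES = {
        "vhc", "vdr", "vto", "vbl", "vf", "vsm", "rgs", "srrgs"
    };

    /**
     * Returns true if the query's NOT LIKE conditions would reject this key.
     * The result could be stored in a hypothetical boolean column at insert time.
     */
    static boolean isExcludedKey(String key) {
        if (key == null) {
            // In SQL, NULL NOT LIKE '...' is NULL, so the original query drops
            // such rows; treat a null key as excluded as well.
            return true;
        }
        // Assumption: the column uses a case-insensitive collation.
        String lower = key.toLowerCase(Locale.ROOT);
        for (String prefix : EXCLUDED_PREFIXES) {
            if (lower.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With such a flag stored next to the key, the nine LIKE conditions could be replaced by a single comparison on the new column, which can also be indexed.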

First of all, when tackling performance problems, get yourself a profiling tool that tells you where you're spending the time, how often a given method is called, and so on.

But I think this case is clear enough to give some more specific hints.

You're executing your PreparedStatement over and over again, once for every entry in allTags.entrySet(), always getting the same results, and then you filter out the rows you're interested in inside your Java code. So you're running the same query 1265 times, correct?

And what you're doing inside the while(rs.next()) loop puzzles me. Effectively, your code does the following (after introducing some local variables, moving constant values out of loops, ...):

    for(Entry <String, Tag> e : allTags.entrySet()) {
        Tag tag = e.getValue();
        String translation = tag.getTranslation(Language.EN);
        ResultSet rs = ps.executeQuery();
        while(rs.next()) {
            if(rs.getString("name").equals(translation)) {
                tag.setAlternativeKey(translation);
                break;
            }
        }
    }

So the only role of the query result seems to be to decide whether the alternative key should be set (that is, whether the translation of your Tag shows up as a name in the ResultSet). The value itself is already fixed by the result of the method call getTranslation(Language.EN), independent of any database result.

I'd suggest executing your query once, collecting the name values in a HashSet names, and then running the allTags loop, setting the alternative key if the translation is contained in your names set. That should give the same result as your code, and will probably be much faster.
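A minimal sketch of that approach could look like the following. The nested `Tag` class is a hypothetical stand-in for the question's own class (reduced to an English translation and an alternative key), and the names `AlternativeKeySketch`, `loadNames` and `applyAlternativeKeys` are made up for this example. The database access is kept in its own method so the matching step can be run on plain in-memory data.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AlternativeKeySketch {

    /** Hypothetical stand-in for the question's Tag class. */
    static class Tag {
        private final String translation;
        private String alternativeKey;

        Tag(String translation) { this.translation = translation; }

        String getTranslation() { return translation; }
        String getAlternativeKey() { return alternativeKey; }
        void setAlternativeKey(String key) { this.alternativeKey = key; }
    }

    /** Runs the query exactly once and collects every "name" value into a set. */
    static Set<String> loadNames(PreparedStatement ps) throws SQLException {
        Set<String> names = new HashSet<>();
        try (ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                names.add(rs.getString("name"));
            }
        }
        return names;
    }

    /**
     * Sets the alternative key on every tag whose translation is in the set.
     * Returns the number of tags that were matched.
     */
    static int applyAlternativeKeys(Map<String, Tag> allTags, Set<String> names) {
        int matched = 0;
        for (Tag tag : allTags.values()) {
            String translation = tag.getTranslation();
            // HashSet lookup is a hash probe instead of a scan over all rows.
            if (translation != null && names.contains(translation)) {
                tag.setAlternativeKey(translation);
                matched++;
            }
        }
        return matched;
    }
}
```

In the question's code this would amount to calling `loadNames(ps)` once before the loop and then `applyAlternativeKeys(allTags, names)`, so the database is queried a single time instead of 1265 times.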
