UniData UniQuery - two WITH

Alright, I have little to no knowledge of SQL, and I'm wondering what could make a query with two WITH clauses so much slower than a query with one WITH clause in UniData.

The database has around 1 million rows.

I.e.:

SELECT somewhere WITH Column1 = "str" AND WITH Column2 = "Int"    (over 5 minutes)

compared to:

SELECT somewhere WITH Column1 = "str"    (~1 second)

somewhere is indexed (to my knowledge).

So is there anything I'm doing wrong?

If more information is required, just ask; I'm not sure what to supply.

Also, what's the difference between WITH and WHERE?

This isn't SQL, it is UniQuery.

To clarify, you can't index the file (somewhere, in this case), only the columns of the file. You might find Column1 is indexed and Column2 is not. Type LIST.INDEX somewhere to find out which columns have been indexed.

As for your question, you have only compared selecting on Column1 against selecting on Column1 and Column2, and assumed the vastly slower response is purely because you selected on two columns. Your next test should have been to select only on Column2 and see how slow that was.

There are many possible reasons for the difference in response, aside from indexing. In UniData, columns are defined as 'dictionary items', and there are different types of dictionary item. The most basic is the D-type dictionary item, which is just a direct reference to a field in the record. Another type is the I-type or V-type, which is a derived field. A derived field can be as simple as returning a constant, or as complex as performing the equivalent of a JOIN with another file and/or some form of complex calculation. From this it should be simple to see that different columns can take vastly different amounts of processing to handle.
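To make the cost difference concrete, here is a rough Python toy model (not UniData itself). The files, field layout and function names are made up for illustration: `d_type` mimics a direct field reference, while `derived_customer_name` mimics a derived field that has to read a second file for every record it evaluates.

```python
# Toy model only: plain Python data standing in for UniData files.
orders = {                      # key -> record (list of fields)
    "O1": ["C1", 5],
    "O2": ["C2", 7],
    "O3": ["C1", 2],
}
customers = {"C1": ["Acme"], "C2": ["Globex"]}   # a second, hypothetical file

extra_reads = 0                 # reads made against the second file


def d_type(record, field_no):
    """D-type style: return field number field_no (1-based) as stored."""
    return record[field_no - 1]


def derived_customer_name(record):
    """I/V-type style: derive a value by reading another file."""
    global extra_reads
    extra_reads += 1
    return customers[record[0]][0]


# Selecting on the stored field needs no extra work per record.
direct = [k for k, rec in orders.items() if d_type(rec, 1) == "C1"]

# Selecting on the derived field costs one extra read per record.
derived = [k for k, rec in orders.items() if derived_customer_name(rec) == "Acme"]

print(direct, derived, extra_reads)   # ['O1', 'O3'] ['O1', 'O3'] 3
```

Both selections return the same keys, but the derived one did an extra read for every record in the file, which is the kind of hidden cost a column can carry.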

Other reasons are how deep in the record the column is (first field references will be faster than fields later in the record) as well as potential query caching that can affect the timings of your SELECTs.

For more information, check out the database's manuals at Rocket Software.

A single-column SELECT on an indexed field will not even require that any data file records are read. If you look under the hood, you'll see that the index file is a normal hash file, and the single-column SELECT will simply mean that the index file record with the key "str" is read. This could return thousands and thousands of keys in less than a second.

Once you add the second column, you are probably forcing the system to read all of those thousands and thousands of records, EVEN IF THE SECOND COLUMN IS INDEXED. This is going to take measurably more time.
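A rough way to picture this is the Python toy model below. It is only a sketch: dicts stand in for the data file and the index, the record counts are invented, and it assumes the second condition is checked by reading each candidate record, which is the behaviour described above rather than something confirmed for any particular UniData release.

```python
# Toy model only: Python dicts stand in for the data file and its index.
records = {}
for i in range(10_000):
    col1 = "str" if i % 10 < 6 else "other"   # 6,000 records match "str"
    col2 = "Int" if i % 500 == 0 else "x"     # 20 records match "Int"
    records[f"K{i}"] = (col1, col2)

# Index on Column1: one entry per value, holding the matching record keys.
# (Built straight from the dict; the build cost is not what is being counted.)
index_col1 = {}
for key, (col1, _col2) in records.items():
    index_col1.setdefault(col1, []).append(key)

data_reads = 0  # number of data records fetched


def read_record(key):
    """Fetch one data record and count the read."""
    global data_reads
    data_reads += 1
    return records[key]


# One WITH: a single index entry is read, no data records at all.
keys = index_col1["str"]
reads_one_with = data_reads

# Two WITH: every candidate key has to be read to test the second column.
both = [k for k in keys if read_record(k)[1] == "Int"]
reads_two_with = data_reads

print(len(keys), reads_one_with)   # 6000 0
print(len(both), reads_two_with)   # 20 6000
```

The first select touches no data records; the second one reads all 6,000 candidates to end up with 20 matches.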

In general, an index on a field with a small number of unique values is of dubious use. If the second column contains data that has a large number of possible values, leading to a smaller number of records with each particular index value, then it would be best to arrange the SELECT such that the index used is on the second column. I'm not sure, but it might be possible to simply reverse the order of the columns in the SELECT statement to do this. Otherwise you might need to run two SELECT statements back to back.

As an example, assume that the file has 600,000 records with Column1 = "str", and 2,000 records with Column2 = "int":

>SELECT somewhere WITH Column2 = "int"
>>SELECT somewhere WITH Column1 = "str"

This will read 2,000 records and should return almost instantly.
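The effect of the ordering can be sketched with a small Python toy model (again not UniData, just dicts). The counts are scaled down by a factor of 100 from the example, so 6,000 and 20 records rather than 600,000 and 2,000, and the helper names `build_index` and `indexed_then_filter` are made up for the illustration.

```python
def build_index(records, pos):
    """Map each value found at field position pos to the keys holding it."""
    index = {}
    for key, rec in records.items():
        index.setdefault(rec[pos], []).append(key)
    return index


def indexed_then_filter(records, index, first_value, other_pos, other_value):
    """Take candidate keys from the index, then read each record to test the
    other column. Returns (matching keys, number of data records read)."""
    reads = 0
    hits = []
    for key in index.get(first_value, []):
        rec = records[key]          # one data-record read per candidate
        reads += 1
        if rec[other_pos] == other_value:
            hits.append(key)
    return hits, reads


# 10,000 records: 6,000 have Column1 = "str", 20 have Column2 = "int".
records = {
    f"K{i}": ("str" if i % 10 < 6 else "other", "int" if i % 500 == 0 else "x")
    for i in range(10_000)
}
index_col1 = build_index(records, 0)
index_col2 = build_index(records, 1)

# Column1 first: 6,000 candidates must be read.
hits_a, reads_a = indexed_then_filter(records, index_col1, "str", 1, "int")

# Column2 first: only 20 candidates must be read.
hits_b, reads_b = indexed_then_filter(records, index_col2, "int", 0, "str")

print(reads_a, reads_b, sorted(hits_a) == sorted(hits_b))   # 6000 20 True
```

Both orders return the same records; starting from the more selective column just means far fewer records are read along the way.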

If the combination of Column1 and Column2 is something that you'll be SELECTing on frequently, then you might want to create a new dictionary item that combines the two, and build an index on that.
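The idea behind a combined item can be sketched in Python as follows. This is a toy model only: the `*` separator and the way the two values are joined are assumptions for illustration, and how you would actually define and index such a dictionary item in UniData is not shown here.

```python
# Toy model only: a dict stands in for the data file.
records = {
    f"K{i}": ("str" if i % 10 < 6 else "other", "int" if i % 500 == 0 else "x")
    for i in range(10_000)
}

# Index on a combined value "Column1*Column2" (the separator is an assumption).
combined_index = {}
for key, (col1, col2) in records.items():
    combined_index.setdefault(f"{col1}*{col2}", []).append(key)

# A select on both values is now a single index lookup,
# with no data records read to test the second column.
hits = combined_index.get("str*int", [])

print(len(hits))   # 20
```

The trade-off is one more index to maintain on every write, so it is only worth it for a combination you really do select on often.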

That being said, it shouldn't take a U2 system 5 minutes to run through a file of a million records. There's a very good chance that the file has become badly overflowed and needs to be resized with a larger modulo to improve performance.
