
How to get the positions of rows where a condition is met in MySQL

This is my sample data set:

CREATE TABLE blockhashtable (
    id SERIAL PRIMARY KEY 
    ,pos int
    ,filehash varchar(35)
    ,blockhash varchar(130) 
);    

insert into blockhashtable 
(pos,filehash,blockhash) values 
(1, "randommd51", "randstr1"),
(2, "randommd51", "randstr2"),
(3, "randommd51", "randstr3"),
(1, "randommd52", "randstr2"),
(2, "randommd52", "randstr2"),
(3, "randommd52", "randstr1"),
(4, "randommd52", "randstr7"),
(1, "randommd53", "randstr2"),
(2, "randommd53", "randstr1"),
(3, "randommd53", "randstr2"),
(4, "randommd53", "randstr3"),
(1, "randommd54", "randstr4"),
(2, "randommd54", "randstr55");

...and a fiddle of the same: http://sqlfiddle.com/#!9/e5b201/14

This is my current SQL query and its output:

select pos,filehash,avg( (blockhash in ('randstr1', 'randstr2', 'randstr3') )) as matching_ratio from blockhashtable group by filehash;

pos filehash    matching_ratio
1   randommd51  1
1   randommd52  0.75
1   randommd53  1
1   randommd54  0

My expected output is something like this:

pos       filehash      matching_ratio
1,2       randommd51    1
1,3       randommd52    0.5
1,2,4     randommd53    0.75
0         randommd54    0

The pos in the last row can also be 1; I can remove it later with a custom condition in Python.

Basically, in my Python list, randstr2 appears only once, so I want the SQL query to count at most one match for it. That is why matching_ratio is different in my expected output.

I don't see how your result set corresponds to your data set, but you seem to be after something like this...

SELECT filehash
     , GROUP_CONCAT(pos ORDER BY pos) pos
     , 1-(COUNT(DISTINCT blockhash IN('randstr1','randstr2','randstr3'))/(COUNT(*))) ratio
  FROM blockhashtable
 GROUP
    BY filehash;
+------------+---------+--------+
| filehash   | pos     | ratio  |
+------------+---------+--------+
| randommd51 | 1,2,3   | 0.6667 |
| randommd52 | 1,2,3,4 | 0.5000 |
| randommd53 | 1,2,3,4 | 0.7500 |
| randommd54 | 1,2     | 0.5000 |
+------------+---------+--------+
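
Note that COUNT(DISTINCT ...) over a boolean expression counts the distinct values 0 and 1 (at most two per group), so the ratio above does not measure how many blocks matched. If the goal is to count each listed blockhash at most once per filehash, one possible approach is to take the first position of each matching hash with MIN(pos) and GROUP BY filehash, blockhash, then build the position list and the ratio on the Python side. The following is a minimal sketch under stated assumptions, not a definitive implementation: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL so that it is self-contained, and the function name first_match_positions is hypothetical. The two SELECT statements use only standard SQL and would presumably carry over to MySQL, where the placeholder style depends on the driver.

```python
import sqlite3

# Sample rows copied from the question: (pos, filehash, blockhash).
ROWS = [
    (1, "randommd51", "randstr1"), (2, "randommd51", "randstr2"),
    (3, "randommd51", "randstr3"), (1, "randommd52", "randstr2"),
    (2, "randommd52", "randstr2"), (3, "randommd52", "randstr1"),
    (4, "randommd52", "randstr7"), (1, "randommd53", "randstr2"),
    (2, "randommd53", "randstr1"), (3, "randommd53", "randstr2"),
    (4, "randommd53", "randstr3"), (1, "randommd54", "randstr4"),
    (2, "randommd54", "randstr55"),
]


def first_match_positions(rows, wanted):
    """Return {filehash: (positions, ratio)}, counting each wanted hash once per file."""
    conn = sqlite3.connect(":memory:")  # assumption: stand-in for the MySQL connection
    conn.execute(
        "CREATE TABLE blockhashtable ("
        "id INTEGER PRIMARY KEY, pos INT, filehash TEXT, blockhash TEXT)"
    )
    conn.executemany(
        "INSERT INTO blockhashtable (pos, filehash, blockhash) VALUES (?, ?, ?)",
        rows,
    )
    # Total number of blocks per file: the denominator of the ratio.
    totals = dict(
        conn.execute("SELECT filehash, COUNT(*) FROM blockhashtable GROUP BY filehash")
    )
    # First position of each wanted hash within each file, so repeats are ignored.
    marks = ",".join("?" * len(wanted))
    firsts = conn.execute(
        "SELECT filehash, MIN(pos) FROM blockhashtable "
        f"WHERE blockhash IN ({marks}) GROUP BY filehash, blockhash",
        wanted,
    ).fetchall()
    conn.close()

    # Files without any match keep an empty position list and a ratio of 0.
    positions = {filehash: [] for filehash in totals}
    for filehash, pos in firsts:
        positions[filehash].append(pos)
    return {
        filehash: (sorted(found), len(found) / totals[filehash])
        for filehash, found in positions.items()
    }


result = first_match_positions(ROWS, ["randstr1", "randstr2", "randstr3"])
for filehash in sorted(result):
    print(filehash, result[filehash])
```

On the sample data this gives positions 1,3 with ratio 0.5 for randommd52 and positions 1,2,4 with ratio 0.75 for randommd53, as in the expected output. For randommd51 it gives 1,2,3 with ratio 1, because all three of its blocks match a different listed hash.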

