PostgreSQL and joining using character varying[] fields

I've inherited a PostgreSQL 9.2.4 database, and while I have a fairly extensive background in SQL Server, I'm having trouble wrapping my head around a problem I'm encountering.

I have one table that has three fields (among others) in it: "age_years", "age_months", and "age_days". If someone in the table is 2 months old or younger, they have a value in the "age_days" field for the number of days old they are. If they are older than 2 months but less than 3 years old, they have a value in the "age_months" field. Anyone older than 3 has a value in the "age_years" field.

A given record has a non-zero value in only one of those three fields. There will never be a situation where, for instance, age_days and age_years both have a non-zero value. These records represent hospital visits, and the ages are the age of the individual at the time of the visit.

In another table I have several character varying[] fields with up to 20 values each. They are ref_age_cd, ref_age, ref_clow, and ref_chigh. Here is an example record from that table (with fewer values than the maximum, just for display purposes):

My apologies for the ugly lines below. I can't seem to get them into a very readable format.

ref_age_cd | [D,D,D,M,M,Y,Y,Y]
ref_age    | [1,4,15,2,7,13,18,199]
ref_clow   | [9.1,9.8,5.4,5.5,7.9,5.1,4.8,4.8]
ref_chigh  | [27.1,27.8,16.4,15.8,15.9,11.1,10.8,10.8]

The ref_age_cd field determines what kind of age you're looking at (days, months, or years). ref_age determines the value, and based on those two you get the low and high values from the ref_clow and ref_chigh fields. For example, if someone has a 13 in the age_months field, you would look at ref_age_cd and find the 'M' values in the array, then look at the corresponding ref_age entries and find the largest value that is lower than the value in the age_months field. So the array index would be 5. Then you grab the fifth value in the ref_clow and ref_chigh fields for the low and high values (7.9 and 15.9 respectively).

If someone was 10 days old, the array index to look at would be 2 (ref_age_cd of 'D' and ref_age of 4). This would indicate low and high values of 9.8 and 27.8. If they were 80 years old, the index would be 7 (ref_age_cd of 'Y' and ref_age of 18), giving low and high values of 4.8 and 10.8.
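
To make the index arithmetic concrete: PostgreSQL arrays are 1-based by default, so the three lookups described above can be checked with literal subscripts. This is only an illustration built from the sample record's values, with the indexes hard-coded; the column aliases "example" and "idx" are made up for the illustration, and it is not a solution to the join itself.

```sql
-- Illustration only: the sample record's arrays as literals, with the
-- indexes from the three worked examples hard-coded.
SELECT example, idx,
       (ARRAY['D','D','D','M','M','Y','Y','Y'])[idx]         AS ref_age_cd,
       (ARRAY[1,4,15,2,7,13,18,199])[idx]                     AS ref_age,
       (ARRAY[9.1,9.8,5.4,5.5,7.9,5.1,4.8,4.8])[idx]          AS ref_clow,
       (ARRAY[27.1,27.8,16.4,15.8,15.9,11.1,10.8,10.8])[idx]  AS ref_chigh
FROM (VALUES ('10 days', 2), ('13 months', 5), ('80 years', 7)) AS t(example, idx);
```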

I just can't figure out how to program this so that when I join from table A (with the age_days, age_months, and age_years fields) to the reference table, I can pull the right array index for ref_clow and ref_chigh.

I should also mention that I have no ability to make any changes to this database. I need to make this work with what I've been given.

For a single patient, try something like this:

/* Creating test environment
CREATE TABLE refs (
  id serial NOT NULL,
  ref_age_cd character(1)[],
  ref_age integer[],
  ref_clow double precision[],
  ref_chigh double precision[],
  CONSTRAINT refs_pkey PRIMARY KEY (id)
);
INSERT INTO refs(ref_age_cd, ref_age, ref_clow, ref_chigh)
       VALUES ('{"D","D","D","M","M","Y","Y","Y"}',
               '{1,4,15,2,7,13,18,199}',
               '{9.1,9.8,5.4,5.5,7.9,5.1,4.8,4.8}',
               '{27.1,27.8,16.4,15.8,15.9,11.1,10.8,10.8}');
CREATE TABLE pats (
  id serial NOT NULL,
  name varchar(255) NOT NULL,
  age_years integer,
  age_months integer,
  age_days integer,
  CONSTRAINT pats_pkey PRIMARY KEY (id)
);
INSERT INTO pats
       VALUES (DEFAULT, 'newborn', NULL, NULL, 10),
              (DEFAULT, 'baby', NULL, 13, NULL),
              (DEFAULT, 'adult', 80, NULL, NULL);
*/

-- Replace filters here to select only one row...
WITH tt AS ( SELECT * FROM refs WHERE id = 1 )
SELECT w.*, ref_clow, ref_chigh
FROM ( SELECT row_number() OVER () AS nr, unnest AS ref_age_cd
       FROM UNNEST( (SELECT ref_age_cd FROM tt ) ), tt ) q1
JOIN ( SELECT row_number() OVER () AS nr, unnest AS ref_age
       FROM UNNEST( (SELECT ref_age FROM tt ) ), tt ) q2 USING ( nr )
JOIN ( SELECT row_number() OVER () AS nr, unnest AS ref_clow
       FROM UNNEST( (SELECT ref_clow FROM tt ) ), tt ) q3 USING ( nr )
JOIN ( SELECT row_number() OVER () AS nr, unnest AS ref_chigh
       FROM UNNEST( (SELECT ref_chigh FROM tt ) ), tt ) q4 USING ( nr )
JOIN ( SELECT id, name, age_years, age_months, age_days,
              CASE WHEN age_years IS NOT NULL THEN 'Y'
                   WHEN age_months IS NOT NULL THEN 'M'
                   WHEN age_days IS NOT NULL THEN 'D' END AS ref_age_cd,
              CASE WHEN age_years IS NOT NULL THEN age_years
                   WHEN age_months IS NOT NULL THEN age_months
                   WHEN age_days IS NOT NULL THEN age_days END AS age
       -- Replace filters here to select only one row...
       FROM pats WHERE id = 2
     ) w USING (ref_age_cd)
WHERE ref_age <= age
ORDER BY ref_age DESC
LIMIT 1;

Outputs:

2;"baby";<NULL>;13;<NULL>;"M";13;7.9;15.9
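
If you need every patient at once rather than one row at a time, something along these lines might work as a starting point. It is a sketch against the test tables created above, not a tested solution for the real schema: the aliases "age_cd", "i" and "array_index" are made up here, and it still assumes a single reference row (id = 1). It uses generate_subscripts, which is available in 9.2, to walk the array positions directly instead of numbering unnested rows, and it compares with a strict "<" to follow the question's "largest value that is lower than" wording, whereas the query above uses "<=".

```sql
-- Sketch only: assumes the refs and pats test tables created above.
SELECT DISTINCT ON (p.id)
       p.id, p.name,
       r.i              AS array_index,
       r.ref_clow[r.i]  AS ref_clow,
       r.ref_chigh[r.i] AS ref_chigh
FROM (
    -- Collapse the three age columns into one code and one value.
    -- "> 0" treats both NULL and 0 as "not set".
    SELECT id, name,
           CASE WHEN age_years  > 0 THEN 'Y'
                WHEN age_months > 0 THEN 'M'
                WHEN age_days   > 0 THEN 'D' END AS age_cd,
           COALESCE(NULLIF(age_years, 0),
                    NULLIF(age_months, 0),
                    NULLIF(age_days, 0)) AS age
    FROM pats
) p
JOIN (
    -- One row per array position of the chosen reference record.
    SELECT refs.*, generate_subscripts(ref_age, 1) AS i
    FROM refs
    WHERE id = 1
) r ON r.ref_age_cd[r.i] = p.age_cd
   AND r.ref_age[r.i] < p.age
-- DISTINCT ON keeps the first row per patient: the largest matching ref_age.
ORDER BY p.id, r.ref_age[r.i] DESC;
```

With the test data above this should pick indexes 2, 5 and 7 for the three patients, matching the worked examples in the question. Note that a patient whose age is not above any reference value for its code drops out of the result because of the inner join; switch to a LEFT JOIN if those rows should be kept.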

This ended up doing the trick. Posted so others might be able to use it.

--test data in first two "with" statements
with a AS (
  select 1 AS patient_nr, CAST(2 AS INT) AS age_days, CAST(NULL AS INT) AS age_months, CAST(NULL AS INT) AS age_years
  UNION ALL
  select  2 AS patient_nr, CAST(16 AS INT) AS age_days, CAST(NULL AS INT) AS age_months, CAST(NULL AS INT) AS age_years
  UNION ALL
  select  3 AS patient_nr, CAST(NULL AS INT) AS age_days, CAST(13 AS INT) AS age_months, CAST(NULL AS INT) AS age_years
  UNION ALL
  select  4 AS patient_nr, CAST(10 AS INT) AS age_days, CAST(NULL AS INT) AS age_months, CAST(NULL AS INT) AS age_years
  UNION ALL
  select  5 AS patient_nr, CAST(NULL AS INT) AS age_days, CAST(NULL AS INT) AS age_months, CAST(80 AS INT) AS age_years
), b as (
  SELECT ARRAY['D','D','D','M','M','Y','Y','Y'] AS ref_age_cd
       , ARRAY[1,4,15,2,7,13,18,199] AS ref_age
       , ARRAY[9.1,9.8,5.4,5.5,7.9,5.1,4.8,4.8] AS ref_clow
       , ARRAY[27.1,27.8,16.4,15.8,15.9,11.1,10.8,10.8] AS ref_chigh
), refTable AS (
SELECT unnest(ref_age_cd) ref_age_cd
 , unnest(ref_age) ref_age
 , unnest(ref_clow) ref_clow
 , unnest(ref_chigh) ref_chigh
  FROM b
), res AS (
SELECT A.*, rt.*, ROW_NUMBER() OVER(PARTITION BY patient_nr ORDER BY ref_age DESC) AS rn
  FROM A
  LEFT JOIN refTable rt ON (rt.ref_age_cd = 'D' AND a.age_days > rt.ref_age)
                        OR (rt.ref_age_cd = 'M' AND a.age_months > rt.ref_age)
                        OR (rt.ref_age_cd = 'Y' AND a.age_years > rt.ref_age)
 )
 SELECT * 
   FROM res
  WHERE rn = 1
