Merging records with left joins producing too many records
Merging records with left joins producing too many records
PostgreSQL 10.1
I have the following SQL command that is almost good enough -- but not quite. (The title is the best I could come up with :))
The idea is to have a speciality which can have 0,1, or more offices. Each office can have 0,1, or more consultants (i.e., lastname/firstname).
The goal is to always show the speciality name on each record.
If there is any office for that speciality, then that office needs to appear (without duplication) with that speciality. (It is possible that the same office can be listed with more then one speciality).
If there is any consultant with the office listed, then the consultant should be listed with the office.
An office that has a consultant should always be listed with that consultant.
By the above requirements, the following SQL essentially produces too many records. That is, in studying the output:
How can I write the SQL to correctly produce the desired results? (Or would a program approach be better?)
Thanks in advance for any help.
select s.speciality_name, f.office_name, f.street, f.city, f.state, f.zipcode, f.phone, f.fax, cs.lastname, cs.firstname, 0 as status, s.recid
from speciality s
left join office_speciality os on os.speciality_recid = s.recid
left join office f on f.recid = os.office_recid
left join consultant_office co on co.office_recid = f.recid
left join consultant cs on cs.recid = co.consultant_recid and cs.speciality_recid = s.recid
group by speciality_name, office_name, street, city, state, zipcode, phone, fax, lastname, firstname, s.recid
order by s.speciality_name, f.office_name, cs.lastname, cs.firstname;
The resulting table appears as:
Edit: Using the below code still results in too many partial rows being generated when a complete row exists. Line #12 and Line #14:
with office_speciality_consultants AS (
select distinct on (os.office_recid, os.speciality_recid, cs.recid)
os.office_recid, os.speciality_recid,
f.office_name, f.street, f.city, f.state, f.zipcode, f.phone, f.fax,
cs.lastname, cs.firstname
from office_speciality os
join office f on f.recid = os.office_recid
left join consultant_office co on co.office_recid = f.recid
left join consultant cs on cs.recid = co.consultant_recid
and cs.speciality_recid = os.speciality_recid
order by os.office_recid, os.speciality_recid, cs.recid, f.office_name, cs.lastname, cs.firstname
)
select
s.speciality_name,
ofc.office_name, ofc.street, ofc.city, ofc.state, ofc.zipcode, ofc.phone, ofc.fax,
ofc.lastname, ofc.firstname,
0 as status, s.recid
from speciality s
join office_speciality_consultants ofc ON ofc.speciality_recid = s.recid
order by s.speciality_name, ofc.office_name;
3 Answers
3
If we need all offices, but just one consultant from office then I believe following should work
with office_speciality_consultants AS (
select distinct on (os.office_recid, os.speciality_recid, cs.recid)
os.office_recid, os.speciality_recid,
f.office_name, f.street, f.city, f.state, f.zipcode, f.phone, f.fax,
cs.lastname, cs.firstname
from office_speciality os
join office f on f.recid = os.office_recid
left join consultant_office co on co.office_recid = f.recid
left join consultant cs on cs.recid = co.consultant_recid
and cs.speciality_recid = os.speciality_recid
order by os.office_recid, os.speciality_recid, cs.recid, f.office_name, cs.lastname, cs.firstname
)
select
s.speciality_name,
ofc.office_name, ofc.street, ofc.city, ofc.state, ofc.zipcode, ofc.phone, ofc.fax,
ofc.lastname, ofc.firstname,
0 as status, s.recid
from speciality s
join office_speciality_consultants ofc ON ofc.speciality_recid = s.recid
order by ofc.office_name;
Upon further checking, this fails when there is more then one non-null consultant in an office.
– Alan Wayne
Aug 21 at 22:38
Edited my answer to show all consultants, pay attention to
cs.recid
added to distinct on (..)
and order by
– khoroshevj
Aug 22 at 7:31
cs.recid
distinct on (..)
order by
Great effort. The changes do now correctly capture multiple offices. However, it also restores the null records from the original left joins--back to square one. :)
– Alan Wayne
Aug 22 at 7:47
You mean the case when speciality doesn't have an office? If so, just replace
left join
with join
and that's it. Edited the answer– khoroshevj
Aug 22 at 8:17
left join
join
Change your last LEFT JOIN
With INNER JOIN
LEFT JOIN
INNER JOIN
select s.speciality_name, f.office_name, f.street, f.city, f.state, f.zipcode, f.phone, f.fax, cs.lastname, cs.firstname, 0 as status, s.recid
from speciality s
left join office_speciality os on os.speciality_recid = s.recid
left join office f on f.recid = os.office_recid
left join consultant_office co on co.office_recid = f.recid
inner join consultant cs on cs.recid = co.consultant_recid and cs.speciality_recid = s.recid
group by speciality_name, office_name, street, city, state, zipcode, phone, fax, lastname, firstname, s.recid
order by s.speciality_name, f.office_name, cs.lastname, cs.firstname;
Nope...With the inner join, offices that do not have consultants are also lost. I need to be able to show specialities with offices that do not have consultants as well as offices that do have consultants. :)
– Alan Wayne
Aug 21 at 22:09
Thanks to all that helped. The basic problems come down to:
Using CTE's provides a simple step approach to solving the problem.
with t0 as ( -- get all completed records. This table to be used for lookup of consultant_recid.
select s.recid as speciality_recid, f.recid as office_recid, cs.recid as consultant_recid
from speciality s
join office_speciality os on os.speciality_recid = s.recid
join office f on f.recid = os.office_recid
join consultant_office co on co.office_recid = f.recid
join consultant cs on cs.recid = co.consultant_recid and cs.speciality_recid = s.recid
),
t1 as ( -- get speciality and office without duplication.
-- office_speciality is UNIQUE (office_recid, speciality_recid)
select s.recid as speciality_recid, os.office_recid as office_recid
from speciality s
left join office_speciality os on os.speciality_recid = s.recid
group by s.recid, os.office_recid
),
t2 as ( -- add consultants to record when available.
-- left join used to fill in consultant when available, null otherwise.
select t1.speciality_recid, t1.office_recid, t0.consultant_recid
from t1
left join t0 on t0.speciality_recid = t1.speciality_recid and t0.office_recid = t1.office_recid
)
select s.speciality_name, f.office_name, f.street, f.city, f.state, f.zipcode, f.phone, f.fax, cs.lastname, cs.firstname, 0 as status
from t2
join speciality s on s.recid = t2.speciality_recid
join office f on f.recid = t2.office_recid
left join consultant cs on cs.recid = t2.consultant_recid and cs.speciality_recid = s.recid
order by s.speciality_name, f.office_name, cs.lastname;
By using left joins to extend records with information, when available (null otherwise), and by preventing duplication in t1, all the problems are simply resolved.
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Offices can have 0,1, or many consultants. I need to show all the consultants for an office, but just the office if no consultants are available. :)
– Alan Wayne
Aug 21 at 22:11