Fill gaps of dates and variable within group - PostgreSQL
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
add a comment |
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
add a comment |
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
sql postgresql gaps-and-islands
asked Nov 12 '18 at 3:17
JuanDJuanD
204
204
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date/client/ean tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1 - although normally done via CROSS JOIN) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN back (again) to the Price_Date_Range CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 '18 at 13:06
add a comment |
You can try to use ROW_NUMBER window function by sales_date DESC in a subquery to get max row data for each channel
then use coalesce and MAX window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53255540%2ffill-gaps-of-dates-and-variable-within-group-postgresql%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date/client/ean tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1 - although normally done via CROSS JOIN) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN back (again) to the Price_Date_Range CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 '18 at 13:06
add a comment |
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date/client/ean tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1 - although normally done via CROSS JOIN) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN back (again) to the Price_Date_Range CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 '18 at 13:06
add a comment |
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date/client/ean tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1 - although normally done via CROSS JOIN) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN back (again) to the Price_Date_Range CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date/client/ean tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1 - although normally done via CROSS JOIN) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN back (again) to the Price_Date_Range CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
answered Nov 12 '18 at 5:57
Clockwork-MuseClockwork-Muse
10.2k32142
10.2k32142
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 '18 at 13:06
add a comment |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 '18 at 13:06
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 '18 at 13:06
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 '18 at 13:06
add a comment |
You can try to use ROW_NUMBER window function by sales_date DESC in a subquery to get max row data for each channel
then use coalesce and MAX window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
add a comment |
You can try to use ROW_NUMBER window function by sales_date DESC in a subquery to get max row data for each channel
then use coalesce and MAX window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
add a comment |
You can try to use ROW_NUMBER window function by sales_date DESC in a subquery to get max row data for each channel
then use coalesce and MAX window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
You can try to use ROW_NUMBER window function by sales_date DESC in a subquery to get max row data for each channel
then use coalesce and MAX window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
answered Nov 12 '18 at 3:38
D-ShihD-Shih
25.9k61531
25.9k61531
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53255540%2ffill-gaps-of-dates-and-variable-within-group-postgresql%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown