SQL over clause - dividing partition into numbered sub-partitions
I have a challenge, that I've come across at multiple occasions but never been able to find an efficient solution to. Imagine I have a large table with data regarding e.g. bank accounts and their possible revolving moves from debit to credit:
AccountId DebitCredit AsOfDate
--------- ----------- ----------
aaa d 2018-11-01
aaa d 2018-11-02
aaa c 2018-11-03
aaa c 2018-11-04
aaa c 2018-11-05
bbb d 2018-11-02
ccc c 2018-11-01
ccc d 2018-11-02
ccc d 2018-11-03
ccc c 2018-11-04
ccc d 2018-11-05
ccc c 2018-11-06
In the example above I would like to assign sub-partition numbers to the combination of AccountId and DebitCredit where the partition number is incremented each time DebitCredit shifts. In other words in the example above I would like this result:
AccountId DebitCredit AsOfDate PartNo
--------- ----------- ---------- ------
aaa d 2018-11-01 1
aaa d 2018-11-02 1
aaa c 2018-11-03 2
aaa c 2018-11-04 2
aaa c 2018-11-05 2
bbb d 2018-11-02 1
ccc c 2018-11-01 1
ccc d 2018-11-02 2
ccc d 2018-11-03 2
ccc c 2018-11-04 3
ccc d 2018-11-05 4
ccc c 2018-11-06 5
I cannot really figure out how to do it quickly and efficiently. The operation has to be done daily on a tables with millions of rows.
In this example it is guaranteed that we will have consecutive rows for all accounts. However, of course the customer might open an account the 15th in the month and/or close his account the 26th.
The challenge is to be solved on an MSSQL 2016 server, but a solution that would work on 2012 (and maybe even 2008r2) would be nice.
As you can imagine there's no way of telling whether there will only be debit or credit rows or whether the account will be revolving each day.
sql
add a comment |
I have a challenge, that I've come across at multiple occasions but never been able to find an efficient solution to. Imagine I have a large table with data regarding e.g. bank accounts and their possible revolving moves from debit to credit:
AccountId DebitCredit AsOfDate
--------- ----------- ----------
aaa d 2018-11-01
aaa d 2018-11-02
aaa c 2018-11-03
aaa c 2018-11-04
aaa c 2018-11-05
bbb d 2018-11-02
ccc c 2018-11-01
ccc d 2018-11-02
ccc d 2018-11-03
ccc c 2018-11-04
ccc d 2018-11-05
ccc c 2018-11-06
In the example above I would like to assign sub-partition numbers to the combination of AccountId and DebitCredit where the partition number is incremented each time DebitCredit shifts. In other words in the example above I would like this result:
AccountId DebitCredit AsOfDate PartNo
--------- ----------- ---------- ------
aaa d 2018-11-01 1
aaa d 2018-11-02 1
aaa c 2018-11-03 2
aaa c 2018-11-04 2
aaa c 2018-11-05 2
bbb d 2018-11-02 1
ccc c 2018-11-01 1
ccc d 2018-11-02 2
ccc d 2018-11-03 2
ccc c 2018-11-04 3
ccc d 2018-11-05 4
ccc c 2018-11-06 5
I cannot really figure out how to do it quickly and efficiently. The operation has to be done daily on a tables with millions of rows.
In this example it is guaranteed that we will have consecutive rows for all accounts. However, of course the customer might open an account the 15th in the month and/or close his account the 26th.
The challenge is to be solved on an MSSQL 2016 server, but a solution that would work on 2012 (and maybe even 2008r2) would be nice.
As you can imagine there's no way of telling whether there will only be debit or credit rows or whether the account will be revolving each day.
sql
Is it guaranteed that consecutive rows (for a particular account) are for consecutive days?
– Henning Koehler
Nov 12 '18 at 7:13
your output is contradictory.
– saravanatn
Nov 12 '18 at 7:17
add a comment |
I have a challenge, that I've come across at multiple occasions but never been able to find an efficient solution to. Imagine I have a large table with data regarding e.g. bank accounts and their possible revolving moves from debit to credit:
AccountId DebitCredit AsOfDate
--------- ----------- ----------
aaa d 2018-11-01
aaa d 2018-11-02
aaa c 2018-11-03
aaa c 2018-11-04
aaa c 2018-11-05
bbb d 2018-11-02
ccc c 2018-11-01
ccc d 2018-11-02
ccc d 2018-11-03
ccc c 2018-11-04
ccc d 2018-11-05
ccc c 2018-11-06
In the example above I would like to assign sub-partition numbers to the combination of AccountId and DebitCredit where the partition number is incremented each time DebitCredit shifts. In other words in the example above I would like this result:
AccountId DebitCredit AsOfDate PartNo
--------- ----------- ---------- ------
aaa d 2018-11-01 1
aaa d 2018-11-02 1
aaa c 2018-11-03 2
aaa c 2018-11-04 2
aaa c 2018-11-05 2
bbb d 2018-11-02 1
ccc c 2018-11-01 1
ccc d 2018-11-02 2
ccc d 2018-11-03 2
ccc c 2018-11-04 3
ccc d 2018-11-05 4
ccc c 2018-11-06 5
I cannot really figure out how to do it quickly and efficiently. The operation has to be done daily on a tables with millions of rows.
In this example it is guaranteed that we will have consecutive rows for all accounts. However, of course the customer might open an account the 15th in the month and/or close his account the 26th.
The challenge is to be solved on an MSSQL 2016 server, but a solution that would work on 2012 (and maybe even 2008r2) would be nice.
As you can imagine there's no way of telling whether there will only be debit or credit rows or whether the account will be revolving each day.
sql
I have a challenge, that I've come across at multiple occasions but never been able to find an efficient solution to. Imagine I have a large table with data regarding e.g. bank accounts and their possible revolving moves from debit to credit:
AccountId DebitCredit AsOfDate
--------- ----------- ----------
aaa d 2018-11-01
aaa d 2018-11-02
aaa c 2018-11-03
aaa c 2018-11-04
aaa c 2018-11-05
bbb d 2018-11-02
ccc c 2018-11-01
ccc d 2018-11-02
ccc d 2018-11-03
ccc c 2018-11-04
ccc d 2018-11-05
ccc c 2018-11-06
In the example above I would like to assign sub-partition numbers to the combination of AccountId and DebitCredit where the partition number is incremented each time DebitCredit shifts. In other words in the example above I would like this result:
AccountId DebitCredit AsOfDate PartNo
--------- ----------- ---------- ------
aaa d 2018-11-01 1
aaa d 2018-11-02 1
aaa c 2018-11-03 2
aaa c 2018-11-04 2
aaa c 2018-11-05 2
bbb d 2018-11-02 1
ccc c 2018-11-01 1
ccc d 2018-11-02 2
ccc d 2018-11-03 2
ccc c 2018-11-04 3
ccc d 2018-11-05 4
ccc c 2018-11-06 5
I cannot really figure out how to do it quickly and efficiently. The operation has to be done daily on a tables with millions of rows.
In this example it is guaranteed that we will have consecutive rows for all accounts. However, of course the customer might open an account the 15th in the month and/or close his account the 26th.
The challenge is to be solved on an MSSQL 2016 server, but a solution that would work on 2012 (and maybe even 2008r2) would be nice.
As you can imagine there's no way of telling whether there will only be debit or credit rows or whether the account will be revolving each day.
sql
sql
edited Dec 11 '18 at 10:30
DaveyDaveDave
3,71883757
3,71883757
asked Nov 12 '18 at 7:02
Stanley GadeStanley Gade
686
686
Is it guaranteed that consecutive rows (for a particular account) are for consecutive days?
– Henning Koehler
Nov 12 '18 at 7:13
your output is contradictory.
– saravanatn
Nov 12 '18 at 7:17
add a comment |
Is it guaranteed that consecutive rows (for a particular account) are for consecutive days?
– Henning Koehler
Nov 12 '18 at 7:13
your output is contradictory.
– saravanatn
Nov 12 '18 at 7:17
Is it guaranteed that consecutive rows (for a particular account) are for consecutive days?
– Henning Koehler
Nov 12 '18 at 7:13
Is it guaranteed that consecutive rows (for a particular account) are for consecutive days?
– Henning Koehler
Nov 12 '18 at 7:13
your output is contradictory.
– saravanatn
Nov 12 '18 at 7:17
your output is contradictory.
– saravanatn
Nov 12 '18 at 7:17
add a comment |
3 Answers
3
active
oldest
votes
If you have sql server 2012+, you can use lag() and a window summation to get this:
select *,sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
select *,
case when DebitCredit=lag(DebitCredit,1) over (partition by AccountId order by AsOfDate asc) then 0 else 1 end as PartNoAdd
from t
)t2
order by AccountId asc, AsOfDate asc
At the inner query, PartNoAdd checks if the previous DebitCard for this account is the same. If it is, it returns 0 (we should add nothing), else it returns 1.
Then the outer query sums all the PartNoAdd for this Account.
Fantastic - thanks George! I feel like a new, wiser man!!!
– Stanley Gade
Nov 12 '18 at 9:38
add a comment |
you can do this with a recursive cte
; with
-- the purpose of `cte` is to generate running number in the order of AsOfDate
cte as
(
select AccountId, DebitCredit, AsOfDate, rn = row_number() over (partition by AccountId order by AsOfDate)
from tbl
),
-- this is the recursive CTE
rcte as
(
-- anchor member. Starts with `PartNo 1`
select AccountId, DebitCredit, AsOfDate, rn, PartNo = 1
from cte
where rn = 1
union all
-- recursive member. Incrememt `PartNo` if there is a change in debitcredit
select c.AccountId, c.DebitCredit, c.AsOfDate, c.rn,
PartNo = case when r.DebitCredit = c.DebitCredit then r.PartNo else r.PartNo + 1 end
from rcte r
inner join cte c on r.AccountId = c.AccountId
and r.rn = c.rn - 1
)
select *
from rcte
order by AccountId, AsOfDate
add a comment |
you can use dense rank
select *,dense_rank() over(partition by AccountId order by DebitCredit desc) as PartNo
from t
Thanks for the answer, but as i see it not in this case. In case of account with id 'ccc', 2018-11-01, 2018-11-04, and 2018-11-08 would fall into the same "bucket", even though they belong to different "peroids"
– Stanley Gade
Nov 12 '18 at 7:23
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53257270%2fsql-over-clause-dividing-partition-into-numbered-sub-partitions%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
If you have sql server 2012+, you can use lag() and a window summation to get this:
select *,sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
select *,
case when DebitCredit=lag(DebitCredit,1) over (partition by AccountId order by AsOfDate asc) then 0 else 1 end as PartNoAdd
from t
)t2
order by AccountId asc, AsOfDate asc
At the inner query, PartNoAdd checks if the previous DebitCard for this account is the same. If it is, it returns 0 (we should add nothing), else it returns 1.
Then the outer query sums all the PartNoAdd for this Account.
Fantastic - thanks George! I feel like a new, wiser man!!!
– Stanley Gade
Nov 12 '18 at 9:38
add a comment |
If you have sql server 2012+, you can use lag() and a window summation to get this:
select *,sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
select *,
case when DebitCredit=lag(DebitCredit,1) over (partition by AccountId order by AsOfDate asc) then 0 else 1 end as PartNoAdd
from t
)t2
order by AccountId asc, AsOfDate asc
At the inner query, PartNoAdd checks if the previous DebitCard for this account is the same. If it is, it returns 0 (we should add nothing), else it returns 1.
Then the outer query sums all the PartNoAdd for this Account.
Fantastic - thanks George! I feel like a new, wiser man!!!
– Stanley Gade
Nov 12 '18 at 9:38
add a comment |
If you have sql server 2012+, you can use lag() and a window summation to get this:
select *,sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
select *,
case when DebitCredit=lag(DebitCredit,1) over (partition by AccountId order by AsOfDate asc) then 0 else 1 end as PartNoAdd
from t
)t2
order by AccountId asc, AsOfDate asc
At the inner query, PartNoAdd checks if the previous DebitCard for this account is the same. If it is, it returns 0 (we should add nothing), else it returns 1.
Then the outer query sums all the PartNoAdd for this Account.
If you have sql server 2012+, you can use lag() and a window summation to get this:
select *,sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
select *,
case when DebitCredit=lag(DebitCredit,1) over (partition by AccountId order by AsOfDate asc) then 0 else 1 end as PartNoAdd
from t
)t2
order by AccountId asc, AsOfDate asc
At the inner query, PartNoAdd checks if the previous DebitCard for this account is the same. If it is, it returns 0 (we should add nothing), else it returns 1.
Then the outer query sums all the PartNoAdd for this Account.
answered Nov 12 '18 at 8:17
George MenoutisGeorge Menoutis
2,624420
2,624420
Fantastic - thanks George! I feel like a new, wiser man!!!
– Stanley Gade
Nov 12 '18 at 9:38
add a comment |
Fantastic - thanks George! I feel like a new, wiser man!!!
– Stanley Gade
Nov 12 '18 at 9:38
Fantastic - thanks George! I feel like a new, wiser man!!!
– Stanley Gade
Nov 12 '18 at 9:38
Fantastic - thanks George! I feel like a new, wiser man!!!
– Stanley Gade
Nov 12 '18 at 9:38
add a comment |
you can do this with a recursive cte
; with
-- the purpose of `cte` is to generate running number in the order of AsOfDate
cte as
(
select AccountId, DebitCredit, AsOfDate, rn = row_number() over (partition by AccountId order by AsOfDate)
from tbl
),
-- this is the recursive CTE
rcte as
(
-- anchor member. Starts with `PartNo 1`
select AccountId, DebitCredit, AsOfDate, rn, PartNo = 1
from cte
where rn = 1
union all
-- recursive member. Incrememt `PartNo` if there is a change in debitcredit
select c.AccountId, c.DebitCredit, c.AsOfDate, c.rn,
PartNo = case when r.DebitCredit = c.DebitCredit then r.PartNo else r.PartNo + 1 end
from rcte r
inner join cte c on r.AccountId = c.AccountId
and r.rn = c.rn - 1
)
select *
from rcte
order by AccountId, AsOfDate
add a comment |
you can do this with a recursive cte
; with
-- the purpose of `cte` is to generate running number in the order of AsOfDate
cte as
(
select AccountId, DebitCredit, AsOfDate, rn = row_number() over (partition by AccountId order by AsOfDate)
from tbl
),
-- this is the recursive CTE
rcte as
(
-- anchor member. Starts with `PartNo 1`
select AccountId, DebitCredit, AsOfDate, rn, PartNo = 1
from cte
where rn = 1
union all
-- recursive member. Incrememt `PartNo` if there is a change in debitcredit
select c.AccountId, c.DebitCredit, c.AsOfDate, c.rn,
PartNo = case when r.DebitCredit = c.DebitCredit then r.PartNo else r.PartNo + 1 end
from rcte r
inner join cte c on r.AccountId = c.AccountId
and r.rn = c.rn - 1
)
select *
from rcte
order by AccountId, AsOfDate
add a comment |
you can do this with a recursive cte
; with
-- the purpose of `cte` is to generate running number in the order of AsOfDate
cte as
(
select AccountId, DebitCredit, AsOfDate, rn = row_number() over (partition by AccountId order by AsOfDate)
from tbl
),
-- this is the recursive CTE
rcte as
(
-- anchor member. Starts with `PartNo 1`
select AccountId, DebitCredit, AsOfDate, rn, PartNo = 1
from cte
where rn = 1
union all
-- recursive member. Incrememt `PartNo` if there is a change in debitcredit
select c.AccountId, c.DebitCredit, c.AsOfDate, c.rn,
PartNo = case when r.DebitCredit = c.DebitCredit then r.PartNo else r.PartNo + 1 end
from rcte r
inner join cte c on r.AccountId = c.AccountId
and r.rn = c.rn - 1
)
select *
from rcte
order by AccountId, AsOfDate
you can do this with a recursive cte
; with
-- the purpose of `cte` is to generate running number in the order of AsOfDate
cte as
(
select AccountId, DebitCredit, AsOfDate, rn = row_number() over (partition by AccountId order by AsOfDate)
from tbl
),
-- this is the recursive CTE
rcte as
(
-- anchor member. Starts with `PartNo 1`
select AccountId, DebitCredit, AsOfDate, rn, PartNo = 1
from cte
where rn = 1
union all
-- recursive member. Incrememt `PartNo` if there is a change in debitcredit
select c.AccountId, c.DebitCredit, c.AsOfDate, c.rn,
PartNo = case when r.DebitCredit = c.DebitCredit then r.PartNo else r.PartNo + 1 end
from rcte r
inner join cte c on r.AccountId = c.AccountId
and r.rn = c.rn - 1
)
select *
from rcte
order by AccountId, AsOfDate
answered Nov 12 '18 at 8:16
SquirrelSquirrel
11.8k22127
11.8k22127
add a comment |
add a comment |
you can use dense rank
select *,dense_rank() over(partition by AccountId order by DebitCredit desc) as PartNo
from t
Thanks for the answer, but as i see it not in this case. In case of account with id 'ccc', 2018-11-01, 2018-11-04, and 2018-11-08 would fall into the same "bucket", even though they belong to different "peroids"
– Stanley Gade
Nov 12 '18 at 7:23
add a comment |
you can use dense rank
select *,dense_rank() over(partition by AccountId order by DebitCredit desc) as PartNo
from t
Thanks for the answer, but as i see it not in this case. In case of account with id 'ccc', 2018-11-01, 2018-11-04, and 2018-11-08 would fall into the same "bucket", even though they belong to different "peroids"
– Stanley Gade
Nov 12 '18 at 7:23
add a comment |
you can use dense rank
select *,dense_rank() over(partition by AccountId order by DebitCredit desc) as PartNo
from t
you can use dense rank
select *,dense_rank() over(partition by AccountId order by DebitCredit desc) as PartNo
from t
answered Nov 12 '18 at 7:17
Zaynul Abadin TuhinZaynul Abadin Tuhin
14.4k21032
14.4k21032
Thanks for the answer, but as i see it not in this case. In case of account with id 'ccc', 2018-11-01, 2018-11-04, and 2018-11-08 would fall into the same "bucket", even though they belong to different "peroids"
– Stanley Gade
Nov 12 '18 at 7:23
add a comment |
Thanks for the answer, but as i see it not in this case. In case of account with id 'ccc', 2018-11-01, 2018-11-04, and 2018-11-08 would fall into the same "bucket", even though they belong to different "peroids"
– Stanley Gade
Nov 12 '18 at 7:23
Thanks for the answer, but as i see it not in this case. In case of account with id 'ccc', 2018-11-01, 2018-11-04, and 2018-11-08 would fall into the same "bucket", even though they belong to different "peroids"
– Stanley Gade
Nov 12 '18 at 7:23
Thanks for the answer, but as i see it not in this case. In case of account with id 'ccc', 2018-11-01, 2018-11-04, and 2018-11-08 would fall into the same "bucket", even though they belong to different "peroids"
– Stanley Gade
Nov 12 '18 at 7:23
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53257270%2fsql-over-clause-dividing-partition-into-numbered-sub-partitions%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Is it guaranteed that consecutive rows (for a particular account) are for consecutive days?
– Henning Koehler
Nov 12 '18 at 7:13
your output is contradictory.
– saravanatn
Nov 12 '18 at 7:17