Pandas stack date matrix value



My data format is like:


year month 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 .. 31
1998 1 2.5 1 - - -2.5 - - - - - - - - - - - - - - 1.5
1998 2 2.5 1 - - -4.5 - - - - - - - - - - - - - - 1.5
1998 3 2.5 1 - - -3.5 - - - - - - - - - - - - - - 1.5
1998 4 2.5 1 - - -8.5 - - - - - - - - - - - - - - 1.5
1998 5 2.5 1 - - -1.5 - - - - - - - - - - - - - - 1.5
2001 5 2.5 1 - - -1.5 - - - - - - - - - - - - - - 1.5



Explanation:

"-" means a missing value.
The year column is the year.
The month column is the month.
The columns 1, 2, 3, 4 and so on are the days of the month, so this is a matrix in date format.



Expected output:


date value
1998-01-01 2.5
1998-01-02 2.8
1998-01-03 - # the value is missing but the date exists, so the row is still shown
1998-01-31 -
...
2008-02-28 -
2008-02-29 - # February has 29 days in this year
2008-03-01 3.4
...
2008-04-30 - # missing value, but the date exists
2008-05-01 3.0




1 Answer



What you are asking for is essentially to "un-pivot" your DataFrame. The general way to approach this type of problem is with some version of melt, stack, or unstack. Here is an approach using stack.



Setup


import pandas as pd

missing = ['-'] * 6  # '-' marks a missing value
df = pd.DataFrame({
    'year': [1998, 1998, 1998, 1998, 1998, 2001],
    'month': [1, 2, 3, 4, 5, 5],
    '1': [2.5] * 6,
    '2': [1] * 6,
    '3': missing, '4': missing, '5': missing,
    '6': [2.5, 4.5, 3.5, 8.5, 1.5, 1.5],
    # days 7 through 15 are all missing
    **{str(day): missing for day in range(7, 16)},
})



Using stack:



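# Move the day columns into rows: one row per (year, month, day).
out = df.set_index(['year', 'month']).stack().reset_index()

# Join year, month and day into 'Y-M-D' strings and parse them as dates.
pd.DataFrame({
    'Date': pd.to_datetime(out.iloc[:, :3].astype(str).agg('-'.join, axis=1)),
    'Value': out.iloc[:, 3]
})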




Date Value
0 1998-01-01 2.5
1 1998-01-02 1
2 1998-01-03 -
3 1998-01-04 -
4 1998-01-05 -
5 1998-01-06 2.5
.. ... ...
60 1998-05-01 2.5
61 1998-05-02 1
83 2001-05-09 -
84 2001-05-10 -
85 2001-05-11 -
86 2001-05-12 -
87 2001-05-13 -
88 2001-05-14 -
89 2001-05-15 -
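As an alternative to stack, the same reshaping can be sketched with melt. This is only a minimal illustration: the frame `wide` is a cut-down, hypothetical version of the data with three day columns, and the names `tidy` and `result` are mine, not from the question.

```python
import pandas as pd

# Hypothetical cut-down frame in the same wide layout (three day columns).
wide = pd.DataFrame({
    'year': [1998, 1998],
    'month': [1, 2],
    '1': [2.5, 2.5],
    '2': [1, 1],
    '3': ['-', '-'],
})

# melt turns every day column into one (day, Value) pair per original row.
tidy = wide.melt(id_vars=['year', 'month'], var_name='day', value_name='Value')

# The day labels are strings, so cast the three components to int and let
# pandas assemble dates from the year/month/day columns.
tidy['Date'] = pd.to_datetime(tidy[['year', 'month', 'day']].astype(int))

result = tidy.sort_values('Date')[['Date', 'Value']].reset_index(drop=True)
print(result)
```

melt keeps the identifier columns as regular columns, so no set_index/reset_index round trip is needed, at the cost of an explicit sort to get the rows into date order.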





Thanks for your help. I should mention that this answer does not validate the dates; code like pd.to_datetime(date_index, infer_datetime_format=True, errors='coerce').notna() may help filter out the invalid ones.
– ileadall42
Sep 2 at 2:42
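One way the validation mentioned in this comment might look is sketched below. It is a variation rather than the commenter's exact code: it assembles the dates from separate year/month/day columns with errors='coerce' instead of parsing strings, and the rows in `parts` are made-up examples. It also converts the '-' placeholder to NaN, which is an extra step I am assuming is wanted.

```python
import pandas as pd

# Hypothetical long-format rows; 1998-02-30 is not a real calendar date.
parts = pd.DataFrame({
    'year': [1998, 1998, 1998, 2008],
    'month': [2, 2, 4, 2],
    'day': [28, 30, 30, 29],
    'Value': [2.5, '-', '-', 1.5],
})

# Impossible combinations become NaT instead of raising an error.
dates = pd.to_datetime(parts[['year', 'month', 'day']], errors='coerce')
mask = dates.notna()

# Keep only the rows whose date exists.
valid = parts.loc[mask, ['Value']].copy()
valid.insert(0, 'Date', dates[mask])

# Turn the '-' placeholder into a real missing value (NaN).
valid['Value'] = pd.to_numeric(valid['Value'], errors='coerce')
valid = valid.reset_index(drop=True)
print(valid)
```

Rows for dates that exist but have no measurement (here 1998-04-30) are kept with NaN, while rows for dates that do not exist are dropped.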



