How to flatten a latlong array in bigquery to produce a linestring?

How to flatten a latlong array in bigquery to produce a linestring?



I have a nested table structure, like this:


[

"startTime": "2017-09-02 09:08:00:000",
"endTime": "2017-09-02 09:09:00:000",
"startTimeMillis": "1504343280000",
"endTimeMillis": "1504343340000",
"uuid": "1748750880",
"country": "CI",
"city": "Punta Arenas",
"x": "-70.906904",
"y": "-53.133514"
,

"startTime": "2017-09-02 09:08:00:000",
"endTime": "2017-09-02 09:09:00:000",
"startTimeMillis": "1504343280000",
"endTimeMillis": "1504343340000",
"uuid": "1748750880",
"country": "CI",
"city": "Punta Arenas",
"x": "-70.907353",
"y": "-53.133253"
,

"startTime": "2017-09-02 09:08:00:000",
"endTime": "2017-09-02 09:09:00:000",
"startTimeMillis": "1504343280000",
"endTimeMillis": "1504343340000",
"uuid": "1748750880",
"country": "CI",
"city": "Punta Arenas",
"x": "-70.90771",
"y": "-53.133041"
,

"startTime": "2017-09-02 09:08:00:000",
"endTime": "2017-09-02 09:09:00:000",
"startTimeMillis": "1504343280000",
"endTimeMillis": "1504343340000",
"uuid": "1748750880",
"country": "CI",
"city": "Punta Arenas",
"x": "-70.908979",
"y": "-53.132287"

]



A resulting table is something like this:


Row|startTime|endTime|startTimeMillis|endTimeMillis|uuid|country|city|x|y|
1|2017-09-02 09:08:00:000|2017-09-02 09:09:00:000|1504343280000|1504343340000|1748750880|CI|Punta Arenas|-70.906904|-53.133514|
2|2017-09-02 09:08:00:000|2017-09-02 09:09:00:000|1504343280000|1504343340000|1748750880|CI|Punta Arenas|-70.907353|-53.133253|
3|2017-09-02 09:08:00:000|2017-09-02 09:09:00:000|1504343280000|1504343340000|1748750880|CI|Punta Arenas|-70.90771|-53.133041|
4|2017-09-02 09:08:00:000|2017-09-02 09:09:00:000|1504343280000|1504343340000|1748750880|CI|Punta Arenas|-70.908979|-53.132287|



I'd like to concat the repeated fields x and y to produce a GIS linestring, in a single line, like this:


Row|startTime|endTime|startTimeMillis|endTimeMillis|uuid|country|city|linestring
1|2017-09-02 09:08:00:000|2017-09-02 09:09:00:000|1504343280000|1504343340000|1748750880|CI|Punta Arenas|LINESTRING(-70.906904 -53.133514, -70.907353 -53.133253, -70.90771 -53.133041, -70.908979 -53.132287)



How can I do this? The original x and y values are floats.



Thanks in advanced!






you say I have a nested table structure, like this: but what you show as a schema and example is NOT a nested structure! can you clarify what exactly you have

– Mikhail Berlyant
Sep 7 '17 at 17:19


I have a nested table structure, like this:




4 Answers
4



Below is for BigQuery Standard SQL


#standardSQL
WITH `yourTable` AS (
SELECT '2017-09-02 09:08:00:000' AS startTime, '2017-09-02 09:09:00:000' AS endTime, 1504343280000 AS startTimeMillis, 1504343340000 AS endTimeMillis, 1748750880 AS uuid, 'CI' AS country, 'Punta Arenas' AS city, -70.906904 AS x, -53.133514 AS y UNION ALL
SELECT '2017-09-02 09:08:00:000', '2017-09-02 09:09:00:000', 1504343280000, 1504343340000, 1748750880, 'CI', 'Punta Arenas', -70.907353, -53.133253 UNION ALL
SELECT '2017-09-02 09:08:00:000', '2017-09-02 09:09:00:000', 1504343280000, 1504343340000, 1748750880, 'CI', 'Punta Arenas', -70.90771, -53.133041 UNION ALL
SELECT '2017-09-02 09:08:00:000', '2017-09-02 09:09:00:000', 1504343280000, 1504343340000, 1748750880, 'CI', 'Punta Arenas', -70.908979, -53.132287
)
SELECT startTime, endTime, startTimeMillis, endTimeMillis, uuid, country, city,
STRING_AGG(CONCAT(CAST(x AS STRING), ' ', CAST(y AS STRING)), ',') AS linestring
FROM `yourTable`
GROUP BY startTime, endTime, startTimeMillis, endTimeMillis, uuid, country, city






if answer helped you - also consider voting it up! :o) Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o

– Mikhail Berlyant
Sep 7 '17 at 18:02



You could use the ARRAY_AGG function available in Standard SQL, something like:


#standardSQL
WITH data AS(
SELECT "2017-09-02 09:08:00:000" AS startTime, "2017-09-02 09:09:00:000" endTime, "1504343280000" AS startTimeMillis, "1504343340000" endTimeMillis, "1748750880" AS uuid, "CI" AS country, "Punta Arenas" AS city, "-70.906904" AS x, "-53.133514" AS y UNION ALL
SELECT "2017-09-02 09:08:00:000", "2017-09-02 09:09:00:000", "1504343280000", "1504343340000", "1748750880", "CI", "Punta Arenas", "-70.907353", "-53.133253" UNION ALL
SELECT "2017-09-02 09:08:00:000", "2017-09-02 09:09:00:000", "1504343280000", "1504343340000", "1748750880", "CI", "Punta Arenas", "-70.90771", "-53.133041" UNION ALL
SELECT "2017-09-02 09:08:00:000", "2017-09-02 09:09:00:000", "1504343280000", "1504343340000", "1748750880", "CI", "Punta Arenas", "-70.908979", "-53.132287"
)

SELECT
startTime,
endTime,
startTimeMillis,
endTimeMillis,
uuid,
country,
city,
ARRAY_AGG(STRUCT(x, y)) AS LINESTRING
FROM data
GROUP BY
startTime,
endTime,
startTimeMillis,
endTimeMillis,
uuid,
country,
city



Result:



enter image description here



Even though result is an ARRAY with the elements x and y, notice that they have been structured together as a STRUCT which will allow you to access each field by its respective name.


x


y


STRUCT






Thank you! I found a different solution, but this one is fine!

– user2308155
Sep 7 '17 at 17:57



Thank you all!



I'm using Mikhail Berlyant solution!


SELECT
w.startTime, w.endTime, w.startTimeMillis, w.endTimeMillis,
jams_u.uuid, jams_u.country, jams_u.city, jams_u.street,
jams_u.roadType, jams_u.turnType,
jams_u.type, jams_u.length, jams_u.speed, jams_u.level, jams_u.delay,
jams_u.startNode, jams_u.endNode, jams_u.pubMillis,
TIMESTAMP_MILLIS(jams_u.pubMillis) as pubdatetime_utc,
STRING_AGG(CONCAT(CAST(line_u.x AS STRING),' ',CAST(line_u.y AS STRING))) linestring_4326
FROM
a_import.table w,
UNNEST(jams) jams_u,
UNNEST(line) line_u
GROUP BY
w.startTime, w.endTime, w.startTimeMillis, w.endTimeMillis,
jams_u.uuid, jams_u.country, jams_u.city, jams_u.street,
jams_u.roadType, jams_u.turnType,
jams_u.type, jams_u.length, jams_u.speed, jams_u.level, jams_u.delay,
jams_u.startNode, jams_u.endNode, jams_u.pubMillis,
pubdatetime_utc



One concern with the proposed solutions that use GROUP BY only - without an ORDER BY operator within group, the order of the elements in the GROUP BY group is undefined. So you can get arbitrary order of points in the linestring, which is probably not what you want. Unfortunately, with small inline datasets you get stable results, but this might break once you have real data.


ORDER BY


GROUP BY



To solve this you need to define which attributes define group, and which define order. E.g. if uuid defines a linestring, and start timestamp defines order (they would need to be different, unlike in your sample), your query might group by uuid, and sort by timestamp.



I also prefer to use new Geospatial functions to construct WKT linestring, rather than string concatenation, which gives:


#standardSQL
WITH `yourTable` AS (
SELECT * FROM UNNEST([
STRUCT('2017-09-02 09:08:00:000' AS startTime, '2017-09-02 09:09:00:000' AS endTime, 1504343280002 AS startTimeMillis, 1504343340000 AS endTimeMillis,
1748750880 AS uuid, 'CI' AS country, 'Punta Arenas' AS city, -70.906904 AS x, -53.133514 AS y),
STRUCT('2017-09-02 09:08:00:000', '2017-09-02 09:09:00:000', 1504343280001, 1504343340000, 1748750880, 'CI', 'Punta Arenas', -70.907353, -53.133253),
STRUCT('2017-09-02 09:08:00:000', '2017-09-02 09:09:00:000', 1504343280004, 1504343340000, 1748750880, 'CI', 'Punta Arenas', -70.90771, -53.133041),
STRUCT('2017-09-02 09:08:00:000', '2017-09-02 09:09:00:000', 1504343280003, 1504343340000, 1748750880, 'CI', 'Punta Arenas', -70.908979, -53.132287)])
)
SELECT uuid, MIN(startTime) startTime, MAX(endTime) endTime,
ANY_VALUE(country), ANY_VALUE(city),
ST_MakeLine(ARRAY_AGG(ST_GeogPoint(x, y)
ORDER BY startTime, startTimeMillis)) line
FROM `yourTable`
GROUP BY uuid



Thanks for contributing an answer to Stack Overflow!



But avoid



To learn more, see our tips on writing great answers.



Required, but never shown



Required, but never shown




By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

𛂒𛀶,𛀽𛀑𛂀𛃧𛂓𛀙𛃆𛃑𛃷𛂟𛁡𛀢𛀟𛁤𛂽𛁕𛁪𛂟𛂯,𛁞𛂧𛀴𛁄𛁠𛁼𛂿𛀤 𛂘,𛁺𛂾𛃭𛃭𛃵𛀺,𛂣𛃍𛂖𛃶 𛀸𛃀𛂖𛁶𛁏𛁚 𛂢𛂞 𛁰𛂆𛀔,𛁸𛀽𛁓𛃋𛂇𛃧𛀧𛃣𛂐𛃇,𛂂𛃻𛃲𛁬𛃞𛀧𛃃𛀅 𛂭𛁠𛁡𛃇𛀷𛃓𛁥,𛁙𛁘𛁞𛃸𛁸𛃣𛁜,𛂛,𛃿,𛁯𛂘𛂌𛃛𛁱𛃌𛂈𛂇 𛁊𛃲,𛀕𛃴𛀜 𛀶𛂆𛀶𛃟𛂉𛀣,𛂐𛁞𛁾 𛁷𛂑𛁳𛂯𛀬𛃅,𛃶𛁼

Edmonton

Crossroads (UK TV series)