Python: Take Every First, Second, Third Element in Sublist
I'm using Python 2.7 and have the following:
my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
I'd like to create a 1-d list where the elements are ordered by position in sublist and then by order of sublist. So the correct output for the above list is:
[1, 4, 7, 2, 5, 8, 3, 6, 9]
Here's my (incorrect) attempt:
def reorder_and_flatten(my_list):
    my_list = [item for sublist in my_list for item in sublist]
    result_nums = []
    for i in range(len(my_list)):
        result_nums.extend(my_list[i::3])
    return result_nums
result = reorder_and_flatten(my_list)
This flattens my 2-d list and gives me:
[1, 4, 7, 2, 5, 8, 3, 6, 9, 4, 7, 5, 8, 6, 9, 7, 8, 9]
The first half of this list is correct but the second isn't.
I'd also like my function to be able to handle only 2 sublists. For instance, if given:
[[1, 2, 3], [], [7, 8, 9]]
the correct output is:
[1, 7, 2, 8, 3, 9]
Any thoughts?
Thanks!
4 Answers
You're attempting to flatten, and then reorder, which makes things a lot harder than reordering and then flattening.
First, for your initial problem, that's just "unzip", as explained in the docs for zip:
>>> my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> list(zip(*my_list))
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]
(In Python 2.7, you could just write zip(…) here instead of list(zip(…)), but this way, the same demonstration works identically in both 2.x and 3.x.)
And then, you already know how to flatten that:
>>> [item for sublist in zip(*my_list) for item in sublist]
[1, 4, 7, 2, 5, 8, 3, 6, 9]
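The two steps above can be folded into one helper. This is only a sketch: the function name is borrowed from the question, and itertools.chain.from_iterable is swapped in for the nested comprehension as an equivalent way to flatten.

```python
from itertools import chain


def reorder_and_flatten(my_list):
    # zip(*my_list) groups the first items, then the second items, and so on;
    # chain.from_iterable then flattens those groups into one sequence.
    return list(chain.from_iterable(zip(*my_list)))


print(reorder_and_flatten([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Note that zip stops at the shortest input, so a single empty sublist makes this helper return an empty list.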
But things get a bit more complicated for your second case, where some of the lists may be empty (or maybe just shorter?).
There's no function that's like zip but skips over missing values. You can write one pretty easily. But instead… there is a function that's like zip but fills in missing values with None (or anything else you prefer), izip_longest. So, we can just use that, then filter out the None values as we flatten:
>>> my_list = [[1, 2, 3], [], [7, 8, 9]]
>>> from itertools import izip_longest
>>> list(izip_longest(*my_list))
[(1, None, 7), (2, None, 8), (3, None, 9)]
>>> [item for sublist in izip_longest(*my_list) for item in sublist if item is not None]
[1, 7, 2, 8, 3, 9]
(In Python 3, the function izip_longest is renamed zip_longest.)
It's worth noting that the roundrobin recipe, as covered by ShadowRanger's answer, is an even nicer solution to this problem, and even easier to use (just copy and paste it from the docs, or pip install more_itertools and use it from there). It is a bit harder to understand, but it's worth taking the time to understand it (and asking for help if you get stuck).
@ShadowRanger Yeah, I didn't want to get into that in the answer (since the OP is using all ints), hoping "… (or anything else you prefer) …" would be enough of a hint for future searchers, but it definitely does belong at least in a comment.
– abarnert
Sep 5 '18 at 23:47
result = [l[i] for i in range(max(len(v) for v in my_list)) for l in my_list if l]
i.e.
my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[l[i] for i in range(max(len(v) for v in my_list)) for l in my_list if l]
# => [1, 4, 7, 2, 5, 8, 3, 6, 9]
my_list = [[1, 2, 3], [], [7, 8, 9]]
[l[i] for i in range(max(len(v) for v in my_list)) for l in my_list if l]
# => [1, 7, 2, 8, 3, 9]
Note: If one of the sublists has a different, but non-zero, length from the others, this code will die with an IndexError (because it blithely indexes to the maximum index of any input sublist). – ShadowRanger
Sep 5 '18 at 23:47
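One possible way to avoid the IndexError mentioned in the comment above is to compare the index against each sublist's own length instead of only skipping empty sublists. The sketch below assumes that is the desired behavior for uneven sublists; the name interleave_by_index is made up for illustration.

```python
def interleave_by_index(my_list):
    # Longest sublist length; the "or [0]" keeps max() from failing on an empty outer list.
    longest = max([len(sub) for sub in my_list] or [0])
    # Take position i from every sublist that actually has a position i.
    return [sub[i] for i in range(longest) for sub in my_list if i < len(sub)]


print(interleave_by_index([[1, 2, 3], [4], [7, 8]]))
```

Shorter sublists simply drop out once their items are used up, so empty sublists need no special case.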
The itertools module's recipes section provides a roundrobin recipe that would do exactly what you want. It produces a generator, but your expected behavior would be seen with:
# define roundrobin recipe here
from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

def reorder_and_flatten(my_list):
    return list(roundrobin(*my_list))
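The recipe as quoted targets Python 2, where iterators expose a .next method. For readers on Python 3, a sketch of the same idea using the __next__ method might look like the following. The loop variable is renamed to get_next here (an illustrative choice, to avoid shadowing the built-in next).

```python
from itertools import cycle, islice


def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    pending = len(iterables)
    # Python 3 spelling: bound __next__ methods instead of .next
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for get_next in nexts:
                yield get_next()
        except StopIteration:
            # One iterable ran dry: rebuild the cycle from the remaining ones.
            pending -= 1
            nexts = cycle(islice(nexts, pending))


def reorder_and_flatten(my_list):
    return list(roundrobin(*my_list))


print(reorder_and_flatten([[1, 2, 3], [], [7, 8, 9]]))
```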
Your original code's main issue is that it looped over for i in range(len(my_list)):, extending with my_list[i::3]. Problem is, this ends up duplicating elements from index 3 onwards (index 3 was already selected as the second element of the index 0 slice). There are lots of other small logic errors here, so it's much easier to reuse a recipe.
This will be fairly performant, and generalize better than most hand-rolled solutions (it will round robin correctly even if the sublists are of uneven length, and it doesn't require second pass filtering or special handling of any kind to allow None as a value like zip_longest does).
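To make the slicing problem in the question's attempt concrete, here is a sketch of how the flatten-then-slice approach could be repaired by looping over only the first "row" of start indices and stepping by the sublist width. It assumes every non-empty sublist has the same length, which the recipe does not require. The name reorder_by_slicing is made up for illustration.

```python
def reorder_by_slicing(my_list):
    # Drop empty sublists first so they do not disturb the stride.
    rows = [sublist for sublist in my_list if sublist]
    if not rows:
        return []
    width = len(rows[0])  # assumption: all remaining sublists share this length
    flat = [item for sublist in rows for item in sublist]
    result = []
    # Only start indices 0..width-1 are needed; going further repeats elements.
    for i in range(width):
        result.extend(flat[i::width])
    return result


print(reorder_by_slicing([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```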
You should probably copy-paste the recipe there. Not that it's likely to disappear from the docs any time soon, but still. Also, the OP is using 2.7, so better to link the 2.x docs. But otherwise, yeah, hard to beat this.
– abarnert
Sep 5 '18 at 23:46
@abarnert: Missed it was Py2. Updated link, copied in recipe for posterity (they have deleted recipes before, so your paranoia is not misplaced; the pairwise recipe used to have a generalized window recipe for arbitrary sized pairings that disappeared for some reason). – ShadowRanger
Sep 5 '18 at 23:53
Wow, never noticed that window went away; I thought it just got renamed to windowed (which is the name more_itertools uses, but IIRC, they expanded the original recipe anyway, so it doesn't fail badly if the window size is larger than the iterable). But yeah, it's gone. – abarnert
Sep 6 '18 at 0:17
If you are happy to use a 3rd party library, you can use NumPy and np.ndarray.ravel:
import numpy as np
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
res_a = A.ravel('F') # array([1, 4, 7, 2, 5, 8, 3, 6, 9])
For the case where you have one or more empty lists, you can use filter to remove empty lists:
B = np.array(list(filter(None, [[1, 2, 3], [], [7, 8, 9]])))
res_b = B.ravel('F') # array([1, 7, 2, 8, 3, 9])
Both solutions require non-empty sublists to contain the same number of items. If list conversion is necessary you can use, for example, res_a.tolist().
While these "black box" methods won't teach you much, they will be faster for large arrays than list-based operations. See also What are the advantages of NumPy over regular Python lists?
If None is a legal element, you'd want to make sentinel = object(), and pass sentinel as the fillvalue for zip_longest, as well as testing item is not sentinel, so there is no possibility of dropping any input value. – ShadowRanger
Sep 5 '18 at 23:45
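The sentinel idea from this comment could be sketched roughly as follows. The names _SENTINEL and reorder_and_flatten are illustrative, and the import fallback is an assumption added so the same snippet runs on both Python 2.7 and 3.

```python
try:
    from itertools import zip_longest  # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2.7

# A unique object that cannot compare identical to any real input value.
_SENTINEL = object()


def reorder_and_flatten(my_list):
    # Pad short sublists with the sentinel, then drop only the padding,
    # so legitimate None values in the input are preserved.
    return [item
            for group in zip_longest(*my_list, fillvalue=_SENTINEL)
            for item in group
            if item is not _SENTINEL]


print(reorder_and_flatten([[1, None, 3], [], [7, 8, 9]]))
```

Because the check uses identity (is not) against a private object, no input value can be mistaken for padding.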