Ordered subsets test

Ordered subsets test



I want to test if an ordered set is a subset of a bigger ordered set. I used tuples and itertools.combinations:


itertools.combinations


def subset_test(a, b):
return a in itertools.combinations(b, len(a))



For instance,


>>> subset_test((0, 1, 2), (0, 3, 1, 4, 2))
True
>>> subset_test((0, 1, 2), (0, 3, 2, 4, 1))
False



It works, but is slow when I test big tuples.






Does order matter?

– jamylak
Aug 5 '12 at 22:13






Yes, @jamylak. I updated the question.

– msampaio
Aug 5 '12 at 22:17






@senderle, Could you suggest a better term? I want to test if A is a subset of the ordered set B.

– msampaio
Aug 5 '12 at 22:21






Regarding your original code, there is no need to call list, you can check membership in a generator too and it will save one run through the combinations.

– jamylak
Aug 5 '12 at 22:52


list






Thanks, @jamylak. I updated the question again.

– msampaio
Aug 5 '12 at 23:08




6 Answers
6



You can simply use an iterator to keep track of the position in B


>>> A = (0, 1, 2)
>>> B = (0, 3, 1, 4, 2)
>>> b_iter = iter(B)
>>> all(a in b_iter for a in A)
True






Oh, wow! I see why this works, but it looks like black magic. Is the behavior of in in this case guaranteed?

– senderle
Aug 5 '12 at 23:32



in






Is there anything in the standard that requires in to use linear iteration and to stop the moment a match is found? I can't imagine it would be done any other way, but might still be a problem for the "general case".

– Mad Physicist
Jul 14 '17 at 21:50


in






@MadPhysicist, of course not. in calls the underlying __contains__ method. For example, for a dict, in is certainly not linear search. On the other hand it has to be a linear search for an iterator - how could it mean anything else?

– John La Rooy
Jul 15 '17 at 11:02


in


__contains__


dict


in



Simple way of doing this


>>> a = (0, 1, 2)
>>> b = (0, 3, 1, 4, 2)
>>> filter(set(a).__contains__, b) == a
True



For greater efficiency use itertools


itertools


>>> from itertools import ifilter, imap
>>> from operator import eq
>>> all(imap(eq, ifilter(set(a).__contains__, b), a))
True






or filter(functools.partial(operator.contains, a), b) etc.

– John La Rooy
Aug 5 '12 at 23:49


filter(functools.partial(operator.contains, a), b)






@gnibbler Would you say that it's better to do it like that or is it ok to use special methods in this case?

– jamylak
Aug 6 '12 at 0:20






The special method version seems to run about twice as fast. List comprehension seems to be slowest of all.

– John La Rooy
Aug 6 '12 at 1:29



This should get you started


>>> A = (0, 1, 2)
>>> B = (0, 3, 1, 4, 2)
>>> b_idxs = v:k for k,v in enumerate(B)
>>> idxs = [b_idxs[i] for i in A]
>>> idxs == sorted(idxs)
True



If the list comprehension throws a KeyError, then obviously the answer is also False


KeyError


False



Here's a linear time approach (in the longest set) that doesn't require any hashing. It takes advantage of the fact that, since both sets are ordered, earlier items in the set don't need to be re-checked:


>>> def subset_test(a, b):
... b = iter(b)
... try:
... for i in a:
... j = b.next()
... while j != i:
... j = b.next()
... except StopIteration:
... return False
... return True
...



A few tests:


>>> subset_test((0, 1, 2), (0, 3, 1, 4, 2))
True
>>> subset_test((0, 2, 1), (0, 3, 1, 4, 2))
False
>>> subset_test((0, 1, 5), (0, 3, 1, 4, 2))
False
>>> subset_test((0, 1, 4), (0, 3, 1, 4, 2))
True



I'm pretty sure this is right -- let me know if you see any problems.



This should be pretty quick, but I have a faster one in mind I hope to have down soon:


def is_sorted_subset(A, B):
try:
subset = [B.index(a) for a in A]
return subset == sorted(subset)
except ValueError:
return False



Update: here's the faster one I promised.


def is_sorted_subset(A, B):
max_idx = -1
try:
for val in A:
idx = B[max_idx + 1:].index(val)
if max(idx, max_idx) == max_idx:
return False
max_idx = idx
except ValueError:
return False
return True






Note that this could become slow if B is huge since list.index is O(N)

– jamylak
Aug 5 '12 at 22:43


B


list.index






@jamylak yeah, thanks, the faster solution is posted now.

– kojiro
Aug 5 '12 at 22:57



What about this?


>>> a = (0, 1, 2)
>>> b = (0, 3, 1, 4, 2)
>>> set(a).issubset(set(b))
True



In this example a and b have ordered and unique elements and it checks if a is subset of b. Is this you want?



EDIT:



According to @Marcos da Silva Sampaio: "I want to test if A is a subset of the ordered set B."



It wouldn't be the case of:


>>> a = (2, 0, 1)
>>> b = (0, 3, 1, 4, 2)
>>> set(b).issuperset(a)
True



In this case the order of a doesn't matters.






The elements of a may not be ordered correctly eg. (2, 1, 0) in (0, 3, 1, 4, 2)

– jamylak
Aug 5 '12 at 22:57



a


(2, 1, 0)


(0, 3, 1, 4, 2)






Ops... sorry... my mistake

– rcovre
Aug 5 '12 at 22:58



Thanks for contributing an answer to Stack Overflow!



But avoid



To learn more, see our tips on writing great answers.



Required, but never shown



Required, but never shown




By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

𛂒𛀶,𛀽𛀑𛂀𛃧𛂓𛀙𛃆𛃑𛃷𛂟𛁡𛀢𛀟𛁤𛂽𛁕𛁪𛂟𛂯,𛁞𛂧𛀴𛁄𛁠𛁼𛂿𛀤 𛂘,𛁺𛂾𛃭𛃭𛃵𛀺,𛂣𛃍𛂖𛃶 𛀸𛃀𛂖𛁶𛁏𛁚 𛂢𛂞 𛁰𛂆𛀔,𛁸𛀽𛁓𛃋𛂇𛃧𛀧𛃣𛂐𛃇,𛂂𛃻𛃲𛁬𛃞𛀧𛃃𛀅 𛂭𛁠𛁡𛃇𛀷𛃓𛁥,𛁙𛁘𛁞𛃸𛁸𛃣𛁜,𛂛,𛃿,𛁯𛂘𛂌𛃛𛁱𛃌𛂈𛂇 𛁊𛃲,𛀕𛃴𛀜 𛀶𛂆𛀶𛃟𛂉𛀣,𛂐𛁞𛁾 𛁷𛂑𛁳𛂯𛀬𛃅,𛃶𛁼

ャフサォクコ ケウ,コ,ワ メ,ロスョノ゙,クネ,フムカヤヲニ,エコ゚ツ ウイオン゙ケワサネォキモュキォウイノンコチ゚メヌナイゥフュ,カヒウネェ ネ,ホノケ,ムュキ ッボーミュハ,チ ツス ィ メウイマヤ,゙ウチ ヅ ロ,ォジヌェ ャヌット ェ,マャ,チナエヒネソキツテ トホヲヲミーァ

How do I collapse sections of code in Visual Studio Code for Windows?