Check if Object attribute is present in list of Object




I have an object with different attributes and a list that contains those objects.



Before adding an object to the list, I'd like to check if an attribute of this new object is present in the list.



This attribute is unique, so this is done to make sure that every object in the list is unique.



I would do something like this:


    for post in stream:
        if post.post_id not in post_list:
            post_list.append(post)
        else:
            # Find old post in the list and replace it



But obviously line 2 doesn't work, as I'm comparing the post_id to the objects in the list rather than to their ids.







If you have control over the post class, you could create a __hash__ method (together with a matching __eq__) and use the value of post_id there, then just do post_list = set(stream)
– Peter
Sep 3 at 16:15







@Peter: take into account that that won't preserve order! list(OrderedDict.fromkeys(stream)) would keep the inputs in order of first-seen id.
– Martijn Pieters
Sep 3 at 16:21
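
A minimal sketch of the idea in the two comments above, assuming a hypothetical Post class whose identity is its post_id (the class and its text attribute are illustrative, not from the question). Note that defining __hash__ alone is not enough for a set to treat two posts as duplicates; a matching __eq__ is needed as well.

```python
from collections import OrderedDict


class Post:
    """Hypothetical post class; only post_id decides equality."""

    def __init__(self, post_id, text):
        self.post_id = post_id
        self.text = text

    def __eq__(self, other):
        if not isinstance(other, Post):
            return NotImplemented
        return self.post_id == other.post_id

    def __hash__(self):
        return hash(self.post_id)


stream = [Post(1, "a"), Post(2, "b"), Post(1, "a, edited")]

# A set removes duplicates but does not guarantee any order.
unique_unordered = set(stream)

# OrderedDict.fromkeys keeps the first post seen for each id, in input order.
unique_ordered = list(OrderedDict.fromkeys(stream))
```

Both variants keep the first post seen for an id, so they fit when later duplicates should be ignored rather than replace the earlier post.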







Ah yeah, to be fair I assumed from the way he'd worded it order didn't matter, hadn't really given it much thought though :P
– Peter
Sep 3 at 17:01




2 Answers



Keep a separate set to which you add the attribute, and against which you can then test the next value:


    post_list = []
    ids_seen = set()  # ids of the posts already in post_list
    for post in stream:
        if post.post_id not in ids_seen:
            post_list.append(post)
            ids_seen.add(post.post_id)
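
The question's else branch also wants to replace the old post. One way to sketch that, assuming the posts expose a post_id attribute, is to map each id to its position in post_list instead of keeping a plain set. The Post namedtuple and the index_by_id name are illustrative only.

```python
from collections import namedtuple

# Hypothetical stand-in for the real post objects.
Post = namedtuple("Post", ["post_id", "text"])

stream = [Post(1, "a"), Post(2, "b"), Post(1, "a, edited")]

post_list = []
index_by_id = {}  # post_id -> position of that post in post_list

for post in stream:
    position = index_by_id.get(post.post_id)
    if position is None:
        # First time this id is seen: append and remember where it went.
        index_by_id[post.post_id] = len(post_list)
        post_list.append(post)
    else:
        # Id already present: overwrite the old post in place.
        post_list[position] = post
```

The dictionary lookup is still a constant-time check, so the loop stays linear in the length of the stream.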



Another option is to create an ordered dict first, with the ids as keys:


    from collections import OrderedDict

    posts = OrderedDict((post.post_id, post) for post in stream)
    post_list = list(posts.values())



This will keep the most recently seen post reference for a given id, but you'll still have unique ids only.
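
To make that behaviour concrete, here is a small sketch using a hypothetical Post namedtuple (an assumption for illustration; any object with a post_id attribute would do). A repeated id keeps the position where it first appeared, while the stored value is the later post.

```python
from collections import OrderedDict, namedtuple

Post = namedtuple("Post", ["post_id", "text"])  # hypothetical post type

stream = [Post(1, "a"), Post(2, "b"), Post(1, "a, edited")]

posts = OrderedDict((post.post_id, post) for post in stream)
post_list = list(posts.values())

# Id 1 keeps its original position, but the value is the later post.
first_post = post_list[0]
```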





If ordering isn't important, just use a regular dictionary comprehension:


    posts = {post.post_id: post for post in stream}
    post_list = list(posts.values())



If you are using Python 3.6 or newer, then the order will be preserved anyway as the CPython implementation was updated to retain input order, and in Python 3.7 this feature became part of the language specification.
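
As a quick check of that ordering guarantee, the sketch below assumes Python 3.7 or newer and a hypothetical Post namedtuple. The ids come out in the order in which each id first appeared in the stream.

```python
from collections import namedtuple

Post = namedtuple("Post", ["post_id", "text"])  # hypothetical post type

stream = [Post(3, "c"), Post(1, "a"), Post(3, "c, edited"), Post(2, "b")]

# Plain dict comprehension; insertion order is preserved on Python 3.7+.
posts = {post.post_id: post for post in stream}
post_list = list(posts.values())
ids_in_order = [post.post_id for post in post_list]
```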



Whatever you do, don't use a separate list to test post.post_id against, as that takes O(N) time each time you check whether the id is present, where N is the number of items in your stream in the end. Combined with O(N) such checks, that approach would take O(N**2) quadratic time, meaning that for every 10-fold increase in the number of input items, processing them all takes 100 times longer.





But when using a set or dictionary, testing if the id is already there only takes O(1) constant time, so checks are cheap. That makes a full processing loop take O(N) linear time, meaning that it'll take time directly proportional to how many input items you have.
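
The difference can be measured with a rough sketch like the one below. It deduplicates plain integer ids rather than post objects, and the function names are made up for illustration. Actual timings depend on the machine, so none are claimed here.

```python
import timeit


def dedupe_with_list(ids):
    """Quadratic: each membership test scans the whole list."""
    seen = []
    for item in ids:
        if item not in seen:
            seen.append(item)
    return seen


def dedupe_with_set(ids):
    """Linear: each membership test is a hash lookup."""
    seen = set()
    result = []
    for item in ids:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


if __name__ == "__main__":
    for n in (500, 1_000, 2_000):
        ids = list(range(n))  # all unique: worst case for the list scan
        t_list = timeit.timeit(lambda: dedupe_with_list(ids), number=1)
        t_set = timeit.timeit(lambda: dedupe_with_set(ids), number=1)
        print(f"n={n}: list {t_list:.4f}s, set {t_set:.4f}s")
```

Doubling n should roughly quadruple the list-based time while only doubling the set-based time.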



This should work:


    for post in stream:
        if post.post_id not in [post.post_id for post in post_list]:
            post_list.append(post)





This is incredibly inefficient, as not in has to do a full scan of the list you generated. You also re-generate the post_id list each iteration. The combination is a killer, making this an O(N**2) quadratic approach where only O(N) linear time is ever needed. For 1000 items, your approach will take on the order of 1 million steps where an O(N) linear approach takes about 1000.
– Martijn Pieters
Sep 3 at 16:17








So yes, it should work, if you have the patience. This rapidly sucks up time as the input sequence becomes longer (10k items take on the order of 100 million steps, 100k items 10 billion, etc.).
– Martijn Pieters
Sep 3 at 16:20


