Iterating through tuples in a list




Let's say I have a list made up of tuples:


stList = [('NJ', 'Burlington County', '12/21/2017'),
          ('NJ', 'Burlington County', '12/21/2017'),
          ('NJ', 'Burlington County', '12/21/2017'),
          ('VA', 'Frederick County', '2/13/2018'),
          ('MD', 'Montgomery County', '8/7/2017'),
          ('NJ', 'Burlington County', '12/21/2017'),
          ('NC', 'Lee County', '1/14/2018'),
          ('NC', 'Alamance County', '11/28/2017')]



I want to iterate through each item (tuple) and, if it already exists elsewhere in the list, remove it from stList.




for item in stList:
    if item in stList:
        stList.remove(item)



This doesn't exactly work. Basically, when I run this, if any item in the tuple is also in the list, it removes that item, so I get this:


[('NJ', 'Burlington County', '12/21/2017'),
('VA', 'Frederick County', '2/13/2018'),
('NJ', 'Burlington County', '12/21/2017'),
('NC', 'Alamance County', '11/28/2017')]



What is a better way to approach this?




1 Answer



Tuples with all entries matching will be considered equal.


>>> ('NJ', 'Burlington County', '12/21/2017') == ('NJ', 'Burlington County', '12/21/2017')
True

>>> ('NJ', 'Burlington County', '12/21/2017') == ('NJ', 'Burlington County', '1/21/2017')
False



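Removing items from a list while iterating over it can give unexpected results: each removal shifts the remaining items one position to the left, so the loop skips the item that slides into the current position. It is safer to build a new list instead.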
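
To see why the loop in the question misbehaves, here is a minimal sketch with a made-up four-item list (the names `items` and `visited` are illustrative, not from the question). Each call to `remove` shifts the remaining items left while the loop's internal index keeps advancing, so every other item is skipped:

```python
# Hypothetical small list: 'a' is duplicated, 'b' and 'c' are unique.
items = ['a', 'a', 'b', 'c']
visited = []

for item in items:
    visited.append(item)  # record what the loop actually sees
    items.remove(item)    # removes the first equal element; the rest shift left

print(visited)  # ['a', 'b']
print(items)    # ['a', 'c']
```

The unique item 'b' is removed while one copy of 'a' survives, which is the same pattern as the output shown in the question.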



Here are a few options.


seen = set()
result = []
for item in stList:
    # Tuples are hashable, so each one can be checked directly against those in `seen`.
    if item not in seen:
        seen.add(item)
        result.append(item)

stList = result



Another possibility is


seen = set()
# Use a list to preserve ordering. Change to set if that does not matter.
first_seen = []
for i, item in enumerate(stList):
    if item not in seen:
        seen.add(item)
        first_seen.append(i)

stList = [stList[i] for i in first_seen]



Edit
On second thought, the second option is no better than the first unless you need the indices for some other task. In the first option, result stores references to the tuples, not copies, so it uses more or less the same memory as storing indices into stList.
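
The point about references can be checked with the `is` operator. This is a small sketch using a shortened, made-up version of the list (the name `st_list` and its contents are illustrative):

```python
st_list = [('NJ', 'Burlington County', '12/21/2017'),
           ('NJ', 'Burlington County', '12/21/2017'),
           ('VA', 'Frederick County', '2/13/2018')]

seen = set()
result = []
for item in st_list:
    if item not in seen:
        seen.add(item)
        result.append(item)

# `result` holds the very same tuple objects as `st_list`, not copies.
print(result[0] is st_list[0])  # True
print(len(result))              # 2
```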


If the order of the items does not matter, you can also convert the list to a set and back:


stList = list(set(stList))



If you just want an iterable and have no need to index stList, then you can even keep it as a set object.
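
Note that a set does not keep the original order. If order matters and a one-liner is still wanted, one alternative that is not among the options above is `dict.fromkeys`, which relies on dictionaries preserving insertion order (guaranteed from Python 3.7). A sketch with a shortened, illustrative list (`st_list` and `deduped` are made-up names):

```python
st_list = [('NJ', 'Burlington County', '12/21/2017'),
           ('VA', 'Frederick County', '2/13/2018'),
           ('NJ', 'Burlington County', '12/21/2017'),
           ('MD', 'Montgomery County', '8/7/2017')]

# Dict keys are unique and keep first-seen order; the values (None) are unused.
deduped = list(dict.fromkeys(st_list))

# Prints the three unique tuples in first-seen order: NJ, VA, MD.
print(deduped)
```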








This worked very well. I agree now that trying to remove items from a list that I am iterating through is a bad idea. So, the seen is just a temporary holding spot for the comparison. Very nice; thanks!

– gwydion93
Sep 14 '18 at 0:50








@gwydion93 Yes. seen is a set that tracks the items encountered so far. If you have more knowledge of or control over the data structure, this step can usually be improved, for example by working at the bit level or writing your own custom hash function. If the problem is unconstrained and has no specific structure, it is generally hard to do away with something like the seen variable here.

– lightalchemist
Sep 14 '18 at 1:01


