Iterating through tuples in a list
Let's say I have a list made up of tuples:
stList = [('NJ', 'Burlington County', '12/21/2017'),
('NJ', 'Burlington County', '12/21/2017'),
('NJ', 'Burlington County', '12/21/2017'),
('VA', 'Frederick County', '2/13/2018'),
('MD', 'Montgomery County', '8/7/2017'),
('NJ', 'Burlington County', '12/21/2017'),
('NC', 'Lee County', '1/14/2018'),
('NC', 'Alamance County', '11/28/2017'),]
I want to iterate through each item (tuple) and, if it already exists in the list, remove it from stList.
for item in stList:
    if item in stList:
        stList.remove(item)
This doesn't exactly work. Basically, when I run this, if any item in the tuple is also in the list, it removes that item, so I get this:
[('NJ', 'Burlington County', '12/21/2017'),
('VA', 'Frederick County', '2/13/2018'),
('NJ', 'Burlington County', '12/21/2017'),
('NC', 'Alamance County', '11/28/2017')]
What is a better way to approach this?
1 Answer
Tuples with all entries matching will be considered equal.
>>> ('NJ', 'Burlington County', '12/21/2017') == ('NJ', 'Burlington County', '12/21/2017')
True
>>> ('NJ', 'Burlington County', '12/21/2017') == ('NJ', 'Burlington County', '1/21/2017')
False
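Removing items from a list while you are iterating over it can give unexpected behavior unless you know exactly how the removal is done and do it carefully. The loop tracks its position by index, so every removal shifts the remaining items one place to the left and the item that follows is skipped. Note also that the test item in stList is always true for an item that was just taken from stList, so it never filters anything.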
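As a minimal sketch of what goes wrong when removing during iteration (using a hypothetical five-element list of strings rather than the question's tuples; the names items and visited are made up for illustration):

```python
# Hypothetical stand-in for stList: three duplicates followed by two unique items.
items = ['a', 'a', 'a', 'b', 'c']

visited = []
for item in items:
    visited.append(item)
    # remove() deletes the first equal element, so everything after it
    # shifts one position to the left while the loop index keeps advancing.
    items.remove(item)

# visited is ['a', 'a', 'c']: 'b' was never looked at.
# items is ['a', 'b']: a duplicate-free result was never the goal of this loop.
print(visited)
print(items)
```

The loop stops early because the list shrinks underneath it, which is the same pattern that produces the four-item result in the question.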
Here are a few options.
seen = set()
result = []
for item in stList:
    # A tuple can be compared directly to the other tuples in `seen`.
    if item not in seen:
        seen.add(item)
        result.append(item)
stList = result
Another possibility is
seen = set()
# Use a list to preserve ordering. Change to set if that does not matter.
first_seen = []
for i, item in enumerate(stList):
    if item not in seen:
        seen.add(item)
        first_seen.append(i)
stList = [stList[i] for i in first_seen]
Edit
On second thought, the second option is not as good as the first unless you need the indices for some reason (e.g., they can be reused for some other task). In the first case, result stores references to the tuples, not copies, so it uses more or less the same memory as storing indices into stList.
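To illustrate both points, here is a small sketch under stated assumptions: st_list is a shortened copy of the question's data, and labels is a hypothetical parallel list invented here to show one way the indices might be reused.

```python
# Shortened version of the question's data.
st_list = [
    ('NJ', 'Burlington County', '12/21/2017'),
    ('NJ', 'Burlington County', '12/21/2017'),
    ('VA', 'Frederick County', '2/13/2018'),
]
# Hypothetical parallel list: one label per row of st_list.
labels = ['row 0', 'row 1', 'row 2']

seen = set()
first_seen = []
for i, item in enumerate(st_list):
    if item not in seen:
        seen.add(item)
        first_seen.append(i)

# The same indices filter both lists consistently.
unique_rows = [st_list[i] for i in first_seen]
unique_labels = [labels[i] for i in first_seen]

# The deduplicated list refers to the very same tuple objects, not copies.
same_object = unique_rows[0] is st_list[0]
print(first_seen, unique_labels, same_object)
```

If nothing else depends on the positions, the first option remains the simpler choice.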
If you do not need to preserve the original order, you can simply do:
stList = list(set(stList))
If you just want an iterable and have no need to index stList, then you can even keep it as a set object.
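For comparison, a short sketch of the set approach next to an order-preserving one-liner. The dict.fromkeys variant is not part of the original answer; it relies on dictionaries preserving insertion order, which the language guarantees from Python 3.7 onward. The data is a shortened version of the question's list.

```python
st_list = [
    ('NJ', 'Burlington County', '12/21/2017'),
    ('VA', 'Frederick County', '2/13/2018'),
    ('NJ', 'Burlington County', '12/21/2017'),
    ('MD', 'Montgomery County', '8/7/2017'),
]

# Set-based: duplicates are gone, but the resulting order is arbitrary.
unordered = list(set(st_list))

# dict.fromkeys keeps the first occurrence of each tuple, in original order.
ordered = list(dict.fromkeys(st_list))

print(ordered)
```

Both variants require the tuples to be hashable, which holds here because every element is a string.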
@gwydion93 Yes. seen is a set used to track the items encountered so far. If you have more knowledge of, or control over, the data structure, this step can usually be improved, for example by working at the bit level or writing your own custom hash function. If the problem is unconstrained and has no specific structure, it is in general hard to do away with something like the seen variable here. – lightalchemist
Sep 14 '18 at 1:01
This worked very well. I agree now that trying to remove items from a list that I am iterating through is a bad idea. So, seen is just a temporary holding spot for the comparison. Very nice; thanks! – gwydion93
Sep 14 '18 at 0:50