How to find the first unique item in a list of values in a dict python 3.4
Hello all, I have a dict:
dat = {
    '2018-01':['jack', 'jhon','mary','mary','jack'],
    '2018-02':['Oliver', 'Connor','mary','Liam','jack','Oliver'],
    '2018-03':['Jacob', 'jhon','Reece','mary','jack'],
    '2018-04':['George', 'jhon','mary','Alexander','Richard'],
}
I want the output like this:
Output = {
    '2018-01':['jack','jhon','mary'],
    '2018-02':['Oliver', 'Connor','Liam'],
    '2018-03':['Jacob','Reece'],
    '2018-04':['George','Alexander','Richard']
}
My current code is a nested for loop that appends the results to a list:
lis = []
for key, value in dat.iteritems():
    for va in value:
        if va not in lis:
            val = key, va
            lis.append(val)
But the lists in my dict "dat" contain a very large number of items. How can I do this without a nested for loop? It is consuming a lot of time.
Thanks in advance
EDIT: Judging from the iteritems, you are using Python 2.7, so what you are trying to do cannot be done in a reproducible way. – Ev. Kounis
Sep 6 '18 at 11:21
@Ev.Kounis OrderedDict ?
– DeepSpace
Sep 6 '18 at 11:23
The casting of dat to OrderedDict would also be arbitrary. One would have to start with an OrderedDict. – Ev. Kounis
Sep 6 '18 at 11:24
Also, why is Oliver twice in the desired output? – Ev. Kounis
Sep 6 '18 at 11:25
8 Answers
What you are trying to do is this:
dat = {
    '2018-01':['jack', 'jhon','mary','mary','jack'],
    '2018-02':['Oliver', 'Connor','mary','Liam','jack','Oliver'],
    '2018-03':['Jacob', 'jhon','Reece','mary','jack'],
    '2018-04':['George', 'jhon','mary','Alexander','Richard'],
}
unique = set()  # every name seen so far, across all keys
res = {}
for key, values in dat.items():
    res[key] = []
    for value in values:
        if value not in unique:
            res[key].append(value)
            unique.add(value)
which produces:
{
    '2018-01': ['jack', 'jhon', 'mary'],
    '2018-02': ['Oliver', 'Connor', 'Liam'],
    '2018-03': ['Jacob', 'Reece'],
    '2018-04': ['George', 'Alexander', 'Richard']
}
BUT the order of dictionaries was not guaranteed prior to Python 3.7, and this makes the above code dangerous: with the same input you might end up with several different outputs.
To understand why, take a look at this:
list1 = ['foo', 'bar', 'foobar']
list2 = ['bar']
If I use list1 to eliminate all duplicates I would end up with:
list1 = ['foo', 'bar', 'foobar']
list2 = []
If I use list2 to eliminate all duplicates I would end up with:
list1 = ['foo', 'foobar']
list2 = ['bar']
So depending on what I start with, I end up with different results. With the dict from your example, which list you start with is anyone's guess.
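The order dependence described above can be reproduced with a small sketch. The helper name `drop_seen` is hypothetical and not part of the answer; it simply drops from each list any item that already appeared in an earlier list, so the result depends on which list is processed first.

```python
def drop_seen(lists):
    """Return copies of the lists, dropping items already seen in an earlier list."""
    seen = set()
    result = []
    for items in lists:
        kept = []
        for item in items:
            if item not in seen:
                kept.append(item)
                seen.add(item)
        result.append(kept)
    return result


list1 = ['foo', 'bar', 'foobar']
list2 = ['bar']

# Processing list1 first leaves nothing in list2 ...
first_order = drop_seen([list1, list2])
# ... while processing list2 first removes 'bar' from list1.
second_order = drop_seen([list2, list1])

print(first_order)   # [['foo', 'bar', 'foobar'], []]
print(second_order)  # [['bar'], ['foo', 'foobar']]
```

The same two inputs give two different answers, which is what happens with a plain dict whenever its iteration order is not fixed.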
There is still hope, however, because you can start with an OrderedDict (from collections):
from collections import OrderedDict

dat = OrderedDict([('2018-01', ['jack', 'jhon', 'mary', 'mary', 'jack']),
                   ('2018-02', ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver']),
                   ('2018-03', ['Jacob', 'jhon', 'Reece', 'mary', 'jack']),
                   ('2018-04', ['George', 'jhon', 'mary', 'Alexander', 'Richard'])])
and then continue with the rest of the code as before.
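Put together, the two pieces might look like the sketch below. The function name `first_unique_per_key` is an assumption introduced here for illustration; the logic is the same single pass with a set of already-seen names, applied to an OrderedDict so that the key order is fixed on any Python version.

```python
from collections import OrderedDict


def first_unique_per_key(ordered_data):
    """Keep each name only under the first key (in iteration order) where it appears."""
    seen = set()
    result = OrderedDict()
    for key, values in ordered_data.items():
        kept = []
        for value in values:
            if value not in seen:
                kept.append(value)
                seen.add(value)
        result[key] = kept
    return result


dat = OrderedDict([
    ('2018-01', ['jack', 'jhon', 'mary', 'mary', 'jack']),
    ('2018-02', ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver']),
    ('2018-03', ['Jacob', 'jhon', 'Reece', 'mary', 'jack']),
    ('2018-04', ['George', 'jhon', 'mary', 'Alexander', 'Richard']),
])

res = first_unique_per_key(dat)
for key, names in res.items():
    print(key, names)
```

There are still two loops, but each name is visited once and the membership test against a set takes constant time on average, so the total work grows linearly with the number of names.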
Hi Thanks for the answer I'm using python 3.4 x
– user9538877
Sep 6 '18 at 11:30
Then the code you posted in your question (which, by the way, returns a list, not a dict) does not run, as dicts have no iteritems in Python 3 ... – SpghttCd
Sep 6 '18 at 11:38
I mean I just pasted a snippet, but the end output I was looking for is a dict
– user9538877
Sep 6 '18 at 11:39
Thanks I learned a lot !!!
– user9538877
Sep 6 '18 at 11:45
And if you don't want to change the format of dat, you can write ordered_dat = OrderedDict(sorted(dat.items())) and then use for key, values in ordered_dat.items(): as the first line of the loop. That way you don't need to change dat manually if you have it already written. – Hadi Farah
Sep 6 '18 at 11:48
Another take on @Ev. Kounis's approach using sets and OrderedDict (and pprint for the sake of pretty printing):
import pprint
from collections import OrderedDict

# Built from a list of pairs so the key order is fixed on Python < 3.7 as well.
dat = OrderedDict([
    ('2018-01', ['jack', 'jhon', 'mary', 'mary', 'jack']),
    ('2018-02', ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver']),
    ('2018-03', ['Jacob', 'jhon', 'Reece', 'mary', 'jack']),
    ('2018-04', ['George', 'jhon', 'mary', 'Alexander', 'Richard']),
])
exist = set()
output = OrderedDict()
for k, v in dat.items():
    output[k] = set(v) - exist
    exist.update(v)
pprint.pprint(output)
# OrderedDict([('2018-01', {'mary', 'jack', 'jhon'}),
#              ('2018-02', {'Connor', 'Oliver', 'Liam'}),
#              ('2018-03', {'Jacob', 'Reece'}),
#              ('2018-04', {'George', 'Alexander', 'Richard'})])
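One side effect of the set difference above is that the values come back as sets, so the order of the names within each key is lost. If that order matters, a possible variation (a sketch, not part of the answer) is to deduplicate each list with `OrderedDict.fromkeys`, which keeps first occurrences in their original order, and then filter against the names seen so far.

```python
from collections import OrderedDict

dat = OrderedDict([
    ('2018-01', ['jack', 'jhon', 'mary', 'mary', 'jack']),
    ('2018-02', ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver']),
    ('2018-03', ['Jacob', 'jhon', 'Reece', 'mary', 'jack']),
    ('2018-04', ['George', 'jhon', 'mary', 'Alexander', 'Richard']),
])

exist = set()
output = OrderedDict()
for k, v in dat.items():
    # OrderedDict.fromkeys(v) drops repeats within v but keeps first-seen order.
    output[k] = [name for name in OrderedDict.fromkeys(v) if name not in exist]
    exist.update(v)

for k, names in output.items():
    print(k, names)
```

The values are now lists in the order the names first appeared, at the cost of building one temporary OrderedDict per key.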
Should I convert my dict into an ordered dict ?
– user9538877
Sep 6 '18 at 11:35
@user9538877 If you want reproducible and predictable results.
– DeepSpace
Sep 6 '18 at 11:36
Does looping over the sorted keys of the dictionary have the same effect?
– Daniel Mesejo
Sep 6 '18 at 11:38
@DanielMesejo if it results in the order you want to have them, yes.
– Ev. Kounis
Sep 6 '18 at 11:41
Thanks man it helped!!
– user9538877
Sep 6 '18 at 11:44
You can do something like this:
l = []
for k, v in dat.items():
    dat[k] = list(set([i for i in v if i not in l]))
    l = l + v
Now dat will be:
{
    '2018-01': ['jhon', 'mary', 'jack'],
    '2018-02': ['Oliver', 'Liam', 'Connor'],
    '2018-03': ['Jacob', 'Reece'],
    '2018-04': ['George', 'Alexander', 'Richard']
}
This works, but the order is not guaranteed: sometimes it starts with 2018-04 and other times with 2018-01, so the output varies constantly, but it will print unique names for sure. – Hadi Farah
Sep 6 '18 at 11:52
Yes, this just needs dat to be changed to an OrderedDict. – Mehrdad Pedramfar
Sep 6 '18 at 11:53
If you don't mind about the order in the list of values, this can be a solution.
Note that the output of this solution may differ depending on the version of Python. Dicts are guaranteed to preserve insertion order only from Python 3.7 (in CPython 3.6 this is an implementation detail).
dat = {
    '2018-01': ['jack', 'jhon', 'mary', 'mary', 'jack'],
    '2018-02': ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver'],
    '2018-03': ['Jacob', 'jhon', 'Reece', 'mary', 'jack'],
    '2018-04': ['George', 'jhon', 'mary', 'Alexander', 'Richard'],
}
s = set()
d = {}
for k, v in dat.items():
    d[k] = list(set(v) - s)
    s.update(d[k])
# {'2018-01': ['jack', 'jhon', 'mary'], '2018-02': ['Connor', 'Oliver', 'Liam'], '2018-03': ['Reece', 'Jacob'], '2018-04': ['Richard', 'Alexander', 'George']}
I think this is what you need; I just edited your code:
dat = {
    '2018-01':['jack', 'jhon','mary','mary','jack'],
    '2018-02':['Oliver', 'Connor','mary','Liam','jack','Oliver'],
    '2018-03':['Jacob', 'jhon','Reece','mary','jack'],
    '2018-04':['George', 'jhon','mary','Alexander','Richard'],
}
lis = dat.values()
# Flatten all values into one list of distinct names.
lis = list(set([item for sublist in lis for item in sublist]))
out_val = []
for key, value in dat.items():  # items() works on both Python 2 and 3
    res = []
    for i in value:
        if i in lis:
            res.append(i)
            lis.remove(i)
    out_val.append(res)
your_output = dict(zip(dat.keys(), out_val))
Output:
{
    '2018-01': ['jack', 'jhon', 'mary'],
    '2018-03': ['Jacob', 'Reece'],
    '2018-02': ['Oliver', 'Connor', 'Liam'],
    '2018-04': ['George', 'Alexander', 'Richard']
}
Assuming the order is by the keys ['2018-01', '2018-02', '2018-03', '2018-04'], you could loop over the keys in that order, like this:
d = {'2018-01': ['jack', 'jhon', 'mary', 'mary', 'jack'],
     '2018-02': ['Oliver', 'Connor', 'mary', 'Liam', 'jack', 'Oliver'],
     '2018-03': ['Jacob', 'jhon', 'Reece', 'mary', 'jack'],
     '2018-04': ['George', 'jhon', 'mary', 'Alexander', 'Richard']}
result = {}
found = set()
for i in sorted(d):
    result[i] = list(set(d[i]).difference(found))
    found.update(d[i])
for i in sorted(result):
    print(i, result[i])
Output
2018-01 ['mary', 'jhon', 'jack']
2018-02 ['Oliver', 'Liam', 'Connor']
2018-03 ['Reece', 'Jacob']
2018-04 ['Alexander', 'Richard', 'George']
Try this.
tmp_list1 = []
for key, value in dat.items():
    tmp_list2 = []
    dat[key] = list(set(value))
    for val in dat[key]:
        if val not in tmp_list1:
            tmp_list2.append(val)
    dat[key] = tmp_list2
    tmp_list1 = tmp_list1 + tmp_list2
print(dat)
did you try running that?
– Ev. Kounis
Sep 6 '18 at 11:40
Yes it worked as expected!
– Vijay Lingam
Sep 6 '18 at 11:47
Then you were not expecting the correct thing.
– Ev. Kounis
Sep 6 '18 at 11:53
import itertools

for i in d:
    d[i].sort()
    # groupby on the sorted list yields each distinct value once.
    d[i] = list(name for name, _ in itertools.groupby(d[i]))
# Print the dict containing unique lists for keys.
for i in d:
    print(i, "->", d[i])
This cannot be done in a unique way in Python versions prior to 3.7, the first version in which the order of dictionaries is guaranteed. What version are you using? Are you assuming order in your dict?
– Ev. Kounis
Sep 6 '18 at 11:20