Find first value not in std::set min-max
Find first value not in std::set<int> min-max
How do I find the first value h
that is not in std::set<int> ids
such that result is clamped to [0, *ids.rbegin() + 1]
range.
h
std::set<int> ids
[0, *ids.rbegin() + 1]
I see this is rather simple problem, but I didn't find any matching question yet. Basically I want the inverted set of ids
so that I can use them.
ids
So far I have following:
#incldue <set>
std::set<int> ids;
int h = 0;
for(; !ids.empty() && (h <= *ids.rbegin() + 1); ++h)
if(!ids.count(h))
break;
// h is now smallest value not in ids.
I suspect this be improved further such that e.g the loop is not required?
@edit: to clarify what values are in the set: In my use case the value generated by the algorithm is inserted into the set. I should have really said std::set<unsigned int>
. I'm happy to so much discussion done on this question!
std::set<unsigned int>
std::vector
std::set
Btw, you can iterate on the
std::set
instead of doing linear look-up.– Jarod42
Sep 6 '18 at 20:04
std::set
Don't you just need
int h = ids.begin() - 1;
?– NathanOliver
Sep 6 '18 at 20:07
int h = ids.begin() - 1;
@NathanOliver: Wouldn't that result in
-1
if 0
was in the set?– Fred Larson
Sep 6 '18 at 20:14
-1
0
@NathanOliver: this would not detect
3
in a set 0,1,2,4,5
– Stephan Lechner
Sep 6 '18 at 20:18
3
0,1,2,4,5
7 Answers
7
Since a std::set
's elements are sorted, you can use std::adjacent_find
.
std::set
std::adjacent_find
std::adjacent_find(set.begin(), set.end(), (int a, int b) return a+1 != b; );
This will return an iterator to the first element, a
, which is not followed by the value a+1
. Or set.end()
if there is no such value.
a
a+1
set.end()
Sample usage:
std::set<int> ids -2, -1, 0, 1, 2, 4, 5, 6 ;
// This code assumes a non-empty set, since the range in the question requires same
if ( *ids.begin() > 0 )
// Special case for [0
std::cout << 0;
else
auto res = std::adjacent_find(ids.begin(),
ids.end(),
(int a, int b) return a+1 != b; );
if ( res == ids.end() )
// Special case for *ids.rbegin() + 1]
std::cout << *ids.rbegin() + 1;
else
// Print the value that should have followed this one.
std::cout << *res + 1;
Output:
3
Elegant approach.
– Stephan Lechner
Sep 6 '18 at 20:22
Thank you, @StephanLechner. That particular algo seems to be one that gets lost in the back of the drawer and forgotten.
– Drew Dormann
Sep 6 '18 at 20:30
For
1, 2, 3, 5
, you return 4
instead of 0
.– Jarod42
Sep 6 '18 at 20:41
1, 2, 3, 5
4
0
From range
[0, *ids.rbegin() + 1]
or simply [0, ids.size()]
, it is guarantied that set
has at least one missing number.– Jarod42
Sep 6 '18 at 20:53
[0, *ids.rbegin() + 1]
[0, ids.size()]
set
Not sure negative numbers have to be handled, but if it is
-3, 0, 1, 2, 4
return -2 instead of 3
(as I understand expectation). And if it is not, your sample is misleading.– Jarod42
Sep 6 '18 at 20:59
-3, 0, 1, 2, 4
3
std::set<int> ids;
int h = 0;
for( auto id : ids )
if( id != h )
break;
h++;
Results with 0 when -1 and 0 are in set.
– Öö Tiib
Sep 7 '18 at 4:48
std::set
is not optimized for this problem.
std::set
Naive approaches both give you O(n) performance and bad O(n) performance because you are walking a node based datastructure.
What you want is a sorted "leapable" datastructure (be it some kind of tree, or a skip list) where the size of the leaps is recorded.
Then you can track the delta-id and the size of the leap; if the leap is smaller than the id difference, there is an empty id in there. If not, there is no empty id in there.
None of the associative containers in std keep track of the information you need, and retrofitting them isn't practical as you are not given access to the "leap" based iteration or structure.
With such a datastructure, insert and remove is O(lgn) as is "find first unused id". Without it, find first unused id is O(n).
That involves some heavy codesmithing. We can do almost as good, with higher constant costs, by simply storing a set of ranges and wrapping it to guarantee that the ranges don't overlap.
struct range
int base = 0;
int length = 0;
friend bool operator<( range lhs, range rhs )
if (lhs.base != rhs.base) return lhs.base < rhs.base;
return lhs.length < rhs.length;
bool operator()(int x) const
ERROR( if (length < 0) std::cout << "bad lengthn"; )
ERROR( if (base < 0) std::cout << "bad basen"; )
return (x>=base) && (x<(base+length));
;
range
is a half-open interval going from [base, base+length)
. Thus [x,0)
is an empty range for all x
and [x,1)
just contains x
.
range
[base, base+length)
[x,0)
x
[x,1)
x
It is ordered by base
. If you ask for the lower bound of an ordered collection on [x,0)
base
[x,0)
Now we make a std::set<range>
and wrap it up:
std::set<range>
struct id_set
bool taken(int x) const;
int find_unused() const;
void take(int x);
void recycle(int x);
int take_unused()
auto r = find_unused();
take(r);
return r;
std::size_t count() const
std::size_t r = 0;
for (auto e:state)
r += e.length;
ERROR( if (r!=counter) std::cout << "Counter failuren"; )
return r;
private:
std::set<range> state;
using iterator = std::set<range>::iterator;
iterator get_interval(int x) const;
std::size_t counter = 0;
;
id_set::iterator id_set::get_interval(int x) const
auto start = state.lower_bound( x,0 );
if (start != state.end())
if ((*start)(x))
return start;
if (start == state.begin() )
return state.end();
auto prev = std::prev(start);
if ((*prev)(x))
return prev;
return state.end();
bool id_set::taken(int x) const
return get_interval(x) != state.end();
int id_set::find_unused() const
auto it = state.begin();
if (it == state.end()) return 0;
auto r = it->base + it->length; // we ensure these intervals are never adjacent; thus the element after the first interval is untaken
ERROR( if (taken(r)) std::cout << "find_unused failedn"; )
return r;
void id_set::take(int x)
// return an id:
void id_set::recycle(int x)
auto it = get_interval(x);
if (it == state.end()) return; // nothing to do
--counter;
// create two intervals, one before and one after:
auto lhs = *it;
lhs.length = x-lhs.base;
auto rhs = *it;
rhs.base = x+1;
rhs.length -= lhs.length+1;
// remove this interval:
state.erase(it);
// insert either non-zero length interval:
if (lhs.length > 0)
state.insert(lhs);
if (rhs.length > 0)
state.insert(rhs);
ERROR( if (taken(x)) std::cout << "recycle failedn"; )
there are probably typos above. But the core idea is that take
and recycle
are both O(lgn) operations, as is find_unused
. Thus take_unused
is also O(lgn).
take
recycle
find_unused
take_unused
Live example
The elements in the set are sorted. Hence, when you iterate the elements, you just have to detect the first "leak" (e.g. if the value of the set is larger than a linear increasing integral value). You can define where to start, e.g. you could write h=1
if you want to ignore value 0
, and also the corner case of a set of continuous numbers is covered (result is then max+1):
h=1
0
int main()
std::set<int> mySet = 5,4,6,2,1,0;
int h=0;
for(auto val : mySet)
if (val != h)
break;
h++;
cout << "not in mySet:" << h << endl;
return 0;
Your version is O(n log(n))
.
You can have it in linear time (O(n)
):
O(n log(n))
O(n)
int h = 0;
for (const int e : ids)
if (e != h)
return h;
++h;
return h;
If ids
can contains negative numbers, then change the loop to:
ids
int h = 0;
for (auto it = ids.lower_bound(0); it != ids.end(); ++it)
if (*it != h)
return h;
++h;
return h;
With sorted vector, you can even reduce complexity to O(log(n))
.
O(log(n))
If you insist on using std::set
to store a set of integers, then unfortunately linear traversal (suggested in other answers) is the best option.
std::set
With a small modification to your search tree you can achieve O(log n)
search for the smallest missing integer while retaining all the other properties - you just need to store the amount of entries smaller than the key for each tree node. Not sure whether you can implement it without rewriting the whole std::set
though.
UPD: turns out you can - check the answer by @Yakk - Adam Nevraumont
O(log n)
std::set
In fact, by caching the smallest value after each operation on your container, you can get it to "find" your number in O(1)
, which is asymptotically unbeatable. Of course, it will not gain you any actual performance improvement.
O(1)
Finally, if there are any additional limitations placed upon your integers, you can sometimes come up with some clever trick (usually happens in coding tasks from the training websites) - e.g. if you know that there is exactly one number missing in a range, you can calculate it arithmetically in O(1)
.
O(1)
I already accepted @Drew Dormann's answer. However I modified the algo such that I can to allow inserting the values into std::list<int>
keeping the list sorted.
I also made it a generic template function:
std::list<int>
#include <algorithm>
#include <list>
#include <iterator>
#include <iostream>
// Make smallest sequential val that is not in c.
// returns [val, *c.rbegin() + 1] and suitable iterator to insert val into c.
template<typename Cont>
void make_minimal_id(const Cont & c, typename Cont::value_type & val,
typename Cont::const_iterator & x)
if( c.empty())
x = c.begin();
else if ( *c.begin() > val )
x = c.begin();
val = *x - 1;
else
// "one algo that gets lost in the back of the drawer and forgotten."
x = std::adjacent_find(c.begin(), c.end(),
(typename Cont::value_type a, typename Cont::value_type b)
return a + 1 != b; );
if ( x == c.end() )
val = *c.rbegin() + 1;
else
val = *x++ + 1;
// pretty print container
template<typename C>
void print_cont(const C & c)
std::cout << std::accumulate(std::next(c.begin()), c.end(),
std::to_string(c.front()), // start with first element
(std::string a, int b)
return a + ", " + std::to_string(b); ) << std::endl;
int main()
std::list<unsigned int> ids = 3, 4, 5, 6, 9 ;
print_cont(ids);
for(int i = 0; i < 6; ++i)
unsigned int id = 0;
std::list<unsigned int>::iterator x;
make_minimal_id(ids, id, x);
ids.insert(x, id);
print_cont(ids);
// prints:
//3, 4, 5, 6, 9
//2, 3, 4, 5, 6, 9
//1, 2, 3, 4, 5, 6, 9
//0, 1, 2, 3, 4, 5, 6, 9
//0, 1, 2, 3, 4, 5, 6, 7, 9
//0, 1, 2, 3, 4, 5, 6, 7, 8, 9
//0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
return 0;
make_minimal_id()
now works for any sorted container and value. :-)
make_minimal_id()
Thanks for contributing an answer to Stack Overflow!
But avoid …
To learn more, see our tips on writing great answers.
Required, but never shown
Required, but never shown
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
With sorted
std::vector
, you would have random access and can check middle element and then check only left or right side. but you cannot access random element from astd::set
...– Jarod42
Sep 6 '18 at 20:02