What does std::vector look like in memory?

What does std::vector look like in memory?



I read that std::vector should be contiguous. My understanding is, that its elements should be stored together, not spread out across the memory. I have simply accepted the fact and used this knowledge when for example using its data() method to get the underlying contiguous piece of memory.


std::vector


data()



However, I came across a situation, where the vector's memory behaves in a strange way:


std::vector<int> numbers;
std::vector<int*> ptr_numbers;
for (int i = 0; i < 8; i++)
numbers.push_back(i);
ptr_numbers.push_back(&numbers.back());



I expected this to give me a vector of some numbers and a vector of pointers to these numbers. However, when listing the contents of the ptr_numbers pointers, there are different and seemingly random numbers, as though I am accessing wrong parts of memory.


ptr_numbers



I have tried to check the contents every step:


for (int i = 0; i < 8; i++)
numbers.push_back(i);
ptr_numbers.push_back(&numbers.back());
for (auto ptr_number : ptr_numbers)
std::cout << *ptr_number << std::endl;
std::cout << std::endl;



The result looks roughly like this:


1

some random number
2

some random number
some random number
3



So it seems as though when I push_back() to the numbers vector, its older elements change their location.


push_back()


numbers



So what does it exactly mean, that std::vector is a contiguous container and why do its elements move? Does it maybe store them together, but moves them all together, when more space is needed?


std::vector



Edit: Is std::vector contiguous only since C++17? (Just to keep the comments on my previous claim relevant to future readers.)


std::vector






The vector has to keep its promise to have elements contiguous, which means if vector has to move the elements to a bigger space, it does. -- I am aware, that std::vector is a contiguous container only since C++17 -- It has been contiguous since 1998.

– PaulMcKenzie
Sep 14 '18 at 10:32







That is what I read on cppreference: std::vector (for T other than bool) meets the requirements of Container, AllocatorAwareContainer, SequenceContainer , ContiguousContainer (since C++17) and ReversibleContainer.

– McSim
Sep 14 '18 at 10:34






@McSim Those are just the official names that are used now. It was always guaranteed std::vector stored its items in contiguous memory. Unofficially it was contiguous, because there was a defect in the standard not stating it was contiguous. Then (if I recall), the defect was corrected in the 03 standard.

– PaulMcKenzie
Sep 14 '18 at 10:36



std::vector






If you want to observe the effect of contiguous memory then use reserve in advance.

– macroland
Sep 14 '18 at 10:36


reserve






@PaulMcKenzie Thanks for the explanation.

– McSim
Sep 14 '18 at 10:41




5 Answers
5



It roughly looks like this (excuse my MS Paint masterpiece):



vector memory layout



The std::vector instance you have on the stack is a small object containing a pointer to a heap-allocated buffer, plus some extra variables to keep track of the size and and capacity of the vector.


std::vector



So it seems as though when I push_back() to the numbers vector, its older elements change their location.


push_back()


numbers



The heap-allocated buffer has a fixed capacity. When you reach the end of the buffer, a new buffer will be allocated somewhere else on the heap and all the previous elements will be moved into the new one. Their addresses will therefore change.



Does it maybe store them together, but moves them all together, when more space is needed?



Roughly, yes. Iterator and address stability of elements is guaranteed with std::vector only if no reallocation takes place.


std::vector



I am aware, that std::vector is a contiguous container only since C++17


std::vector



The memory layout of std::vector hasn't changed since its first appearance in the Standard. ContiguousContainer is just a "concept" that was added to differentiate contiguous containers from others at compile-time.


std::vector


ContiguousContainer






Why is the arrow not exactly vertical in your drawing? Made me hesitate before upvoting.

– lubgr
Sep 14 '18 at 10:41






@lubgr: I intentionally made it skewed so that it is not mistaken for operator->.

– Vittorio Romeo
Sep 14 '18 at 10:43


operator->






"Iterator and address stability of elements is not guaranteed" - it actually is, as long as no reallocation takes place. If you push_back() with size < capacity, address stability is guaranteed

– Fureeish
Sep 14 '18 at 10:57


push_back()


size < capacity






@VittorioRomeo That's ridiculous, operator->() is horizontal. operator↑() on the other hand...

– Yakk - Adam Nevraumont
Sep 14 '18 at 18:36



operator->()


operator↑()






@lubgr "exactly vertical" arrows require hardware support

– Ti Strga
Sep 14 '18 at 20:37



It's a single contiguous storage (a 1d array).
Each time it runs out of capacity it gets reallocated and stored objects are moved to the new larger place — this is why you observe addresses of the stored objects changing.



It has always been this way, not since C++17.


C++17



The storage is growing geometrically to ensure the requirement of the amortized O(1) push_back(). The growth factor is 2 (Capn+1 = Capn + Capn) in most implementations of the C++ Standard Library (GCC, Clang, STLPort) and 1.5 (Capn+1 = Capn + Capn / 2) in the MSVC variant.


O(1)


push_back()



growing std::vector



If you pre-allocate it with vector::reserve(N) and sufficiently large N, then addresses of the stored objects won't be changing when you add new ones.


vector::reserve(N)


N



In most practical applications is usually worth pre-allocating it to at least 32 elements to skip the first few reallocations shortly following one other (0→1→2→4→8→16).



It is also sometimes practical to slow it down, switch to the arithmetic growth policy (Capn+1 = Capn + Const), or stop entirely after some reasonably large size to ensure the application does not waste or grow out of memory.



Lastly, in some practical applications, like column-based object storages, it may be worth giving up the idea of contiguous storage completely in favor of a segmented one (same as what std::deque does but with much larger chunks). This way the data may be stored reasonably well localized for both per-column and per-row queries (though this may need some help from the memory allocator as well).


std::deque






Nitpicking: growth factor depends on implementation. I heard MSVC uses 1.5

– HolyBlackCat
Sep 14 '18 at 10:33






Growth factor is implementation-defined.

– Vittorio Romeo
Sep 14 '18 at 10:34






@bobah In practice it is not 2x in some implementations.

– StaceyGirl
Sep 14 '18 at 10:53






In fact, growth factor 2 is pretty bad and 1.5x actually works better, even though this may seem counter-intuitive. And since this is well-known, I’m somewhat surprised that stdlibc++ and libc++ still use 2x.

– Konrad Rudolph
Sep 14 '18 at 13:38







For those curious and unwilling to follow a link, a growth factor of 2 leads to the walking vector problem; no sum of previous buffers allocated for a vector can ever quite fits its new allocation. So on a system where the only meaningful allocation is a growing std::vector, a 2x growth factor means you consume almost 2x the memory you'd expect to, half of your memory is returned vector buffers sitting fallow. With 1.5, after ... 4? cycles the previous buffers are big enough to allocate a new buffer in.

– Yakk - Adam Nevraumont
Sep 14 '18 at 18:39




std::vector being a contiguous container means exactly what you think it means.


std::vector



However, many operations on a vector can re-locate that entire piece of memory.



One common case is when you add element to it, the vector must grow, it can re-allocate and copy all elements to another contiguous piece of memory.



So what does it exactly mean, that std::vector is a contiguous container and why do its elements move? Does it maybe store them together, but moves them all together, when more space is needed?



That's exactly how it works and why appending elements does indeed invalidate all iterators as well as memory locations when a reallocation takes place¹. This is not only valid since C++17, it has been the case ever since.



There are a couple of benefits from this approach:


data()


push_back


reserve


resize


push_back



These implications can be considered the downside of such a memory layout:


push_front


std::list


std::deque


insert(vec.begin(), element)



¹ Thanks to @FrançoisAndrieux for pointing that out.






I like the pros and cons mentioned, but I will keep the current accepted answer for now as it is more about answering the question than additional notes. It's a pity, I cannot accept two answers though :)

– McSim
Sep 14 '18 at 10:51




In terms of the actual structure, an std::vector looks something like this in memory:


std::vector


struct vector // Simple C struct as example (T is the type supplied by the template)
T *begin; // vector::begin() probably returns this value
T *end; // vector::end() probably returns this value
T *end_capacity; // First non-valid address
// Allocator state might be stored here (most allocators are stateless)
;



Relevant code snippet from the libc++ implementation as used by LLVM


libc++



Printing the raw memory contents of an std::vector:

(Don't do this if you don't know what you're doing!)


std::vector


#include <iostream>
#include <vector>

struct vector
int *begin;
int *end;
int *end_capacity;
;

int main()
union vecunion
std::vector<int> stdvec;
vector myvec;
~vecunion() /* do nothing */
vec = std::vector<int>() ;
union veciterator
std::vector<int>::iterator stditer;
int *myiter;
~veciterator() /* do nothing */
;

vec.stdvec.push_back(1); // Add something so we don't have an empty vector

std::cout
<< "vec.begin = " << vec.myvec.begin << "n"
<< "vec.end = " << vec.myvec.end << "n"
<< "vec.end_capacity = " << vec.myvec.end_capacity << "n"
<< "vec's size = " << vec.myvec.end - vec.myvec.begin << "n"
<< "vec's capacity = " << vec.myvec.end_capacity - vec.myvec.begin << "n"
<< "vector::begin() = " << (veciterator vec.stdvec.begin() ).myiter << "n"
<< "vector::end() = " << (veciterator vec.stdvec.end() ).myiter << "n"
<< "vector::size() = " << vec.stdvec.size() << "n"
<< "vector::capacity() = " << vec.stdvec.capacity() << "n"
;



Thanks for contributing an answer to Stack Overflow!



But avoid



To learn more, see our tips on writing great answers.



Required, but never shown



Required, but never shown




By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

𛂒𛀶,𛀽𛀑𛂀𛃧𛂓𛀙𛃆𛃑𛃷𛂟𛁡𛀢𛀟𛁤𛂽𛁕𛁪𛂟𛂯,𛁞𛂧𛀴𛁄𛁠𛁼𛂿𛀤 𛂘,𛁺𛂾𛃭𛃭𛃵𛀺,𛂣𛃍𛂖𛃶 𛀸𛃀𛂖𛁶𛁏𛁚 𛂢𛂞 𛁰𛂆𛀔,𛁸𛀽𛁓𛃋𛂇𛃧𛀧𛃣𛂐𛃇,𛂂𛃻𛃲𛁬𛃞𛀧𛃃𛀅 𛂭𛁠𛁡𛃇𛀷𛃓𛁥,𛁙𛁘𛁞𛃸𛁸𛃣𛁜,𛂛,𛃿,𛁯𛂘𛂌𛃛𛁱𛃌𛂈𛂇 𛁊𛃲,𛀕𛃴𛀜 𛀶𛂆𛀶𛃟𛂉𛀣,𛂐𛁞𛁾 𛁷𛂑𛁳𛂯𛀬𛃅,𛃶𛁼

How do I collapse sections of code in Visual Studio Code for Windows?

ャフサォクコ ケウ,コ,ワ メ,ロスョノ゙,クネ,フムカヤヲニ,エコ゚ツ ウイオン゙ケワサネォキモュキォウイノンコチ゚メヌナイゥフュ,カヒウネェ ネ,ホノケ,ムュキ ッボーミュハ,チ ツス ィ メウイマヤ,゙ウチ ヅ ロ,ォジヌェ ャヌット ェ,マャ,チナエヒネソキツテ トホヲヲミーァ