mirror of
https://github.com/klzgrad/naiveproxy.git
synced 2024-11-28 08:16:09 +03:00
317 lines
13 KiB
Markdown
317 lines
13 KiB
Markdown
|
# base/containers library
|
|||
|
|
|||
|
## What goes here
|
|||
|
|
|||
|
This directory contains some STL-like containers.
|
|||
|
|
|||
|
Things should be moved here that are generally applicable across the code base.
|
|||
|
Don't add things here just because you need them in one place and think others
|
|||
|
may someday want something similar. You can put specialized containers in
|
|||
|
your component's directory and we can promote them here later if we feel there
|
|||
|
is broad applicability.
|
|||
|
|
|||
|
### Design and naming
|
|||
|
|
|||
|
Containers should adhere as closely to STL as possible. Functions and behaviors
|
|||
|
not present in STL should only be added when they are related to the specific
|
|||
|
data structure implemented by the container.
|
|||
|
|
|||
|
For STL-like containers our policy is that they should use STL-like naming even
|
|||
|
when it may conflict with the style guide. So functions and class names should
|
|||
|
be lower case with underscores. Non-STL-like classes and functions should use
|
|||
|
Google naming. Be sure to use the base namespace.
|
|||
|
|
|||
|
## Map and set selection
|
|||
|
|
|||
|
### Usage advice
|
|||
|
|
|||
|
* Generally avoid **std::unordered\_set** and **std::unordered\_map**. In the
|
|||
|
common case, query performance is unlikely to be sufficiently higher than
|
|||
|
std::map to make a difference, insert performance is slightly worse, and
|
|||
|
the memory overhead is high. This makes sense mostly for large tables where
|
|||
|
you expect a lot of lookups.
|
|||
|
|
|||
|
* Most maps and sets in Chrome are small and contain objects that can be
|
|||
|
moved efficiently. In this case, consider **base::flat\_map** and
|
|||
|
**base::flat\_set**. You need to be aware of the maximum expected size of
|
|||
|
the container since individual inserts and deletes are O(n), giving O(n^2)
|
|||
|
construction time for the entire map. But because it avoids mallocs in most
|
|||
|
cases, inserts are better or comparable to other containers even for
|
|||
|
several dozen items, and efficiently-moved types are unlikely to have
|
|||
|
performance problems for most cases until you have hundreds of items. If
|
|||
|
your container can be constructed in one shot, the constructor from vector
|
|||
|
gives O(n log n) construction times and it should be strictly better than
|
|||
|
a std::map.
|
|||
|
|
|||
|
* **base::small\_map** has better runtime memory usage without the poor
|
|||
|
mutation performance of large containers that base::flat\_map has. But this
|
|||
|
advantage is partially offset by additional code size. Prefer in cases
|
|||
|
where you make<6B>many objects so that the code/heap tradeoff is good.
|
|||
|
|
|||
|
* Use **std::map** and **std::set** if you can't decide. Even if they're not
|
|||
|
great, they're unlikely to be bad or surprising.
|
|||
|
|
|||
|
### Map and set details
|
|||
|
|
|||
|
Sizes are on 64-bit platforms. Stable iterators aren't invalidated when the
|
|||
|
container is mutated.
|
|||
|
|
|||
|
| Container | Empty size | Per-item overhead | Stable iterators? |
|
|||
|
|:---------------------------------------- |:--------------------- |:----------------- |:----------------- |
|
|||
|
| std::map, std::set | 16 bytes | 32 bytes | Yes |
|
|||
|
| std::unordered\_map, std::unordered\_set | 128 bytes | 16-24 bytes | No |
|
|||
|
| base::flat\_map and base::flat\_set | 24 bytes | 0 (see notes) | No |
|
|||
|
| base::small\_map | 24 bytes (see notes) | 32 bytes | No |
|
|||
|
|
|||
|
**Takeaways:** std::unordered\_map and std::unordered\_map have high
|
|||
|
overhead for small container sizes, prefer these only for larger workloads.
|
|||
|
|
|||
|
Code size comparisons for a block of code (see appendix) on Windows using
|
|||
|
strings as keys.
|
|||
|
|
|||
|
| Container | Code size |
|
|||
|
|:------------------- |:---------- |
|
|||
|
| std::unordered\_map | 1646 bytes |
|
|||
|
| std::map | 1759 bytes |
|
|||
|
| base::flat\_map | 1872 bytes |
|
|||
|
| base::small\_map | 2410 bytes |
|
|||
|
|
|||
|
**Takeaways:** base::small\_map generates more code because of the inlining of
|
|||
|
both brute-force and red-black tree searching. This makes it less attractive
|
|||
|
for random one-off uses. But if your code is called frequently, the runtime
|
|||
|
memory benefits will be more important. The code sizes of the other maps are
|
|||
|
close enough it's not worth worrying about.
|
|||
|
|
|||
|
### std::map and std::set
|
|||
|
|
|||
|
A red-black tree. Each inserted item requires the memory allocation of a node
|
|||
|
on the heap. Each node contains a left pointer, a right pointer, a parent
|
|||
|
pointer, and a "color" for the red-black tree (32-bytes per item on 64-bits).
|
|||
|
|
|||
|
### std::unordered\_map and std::unordered\_set
|
|||
|
|
|||
|
A hash table. Implemented on Windows as a std::vector + std::list and in libc++
|
|||
|
as the equivalent of a std::vector + a std::forward\_list. Both implementations
|
|||
|
allocate an 8-entry hash table (containing iterators into the list) on
|
|||
|
initialization, and grow to 64 entries once 8 items are inserted. Above 64
|
|||
|
items, the size doubles every time the load factor exceeds 1.
|
|||
|
|
|||
|
The empty size is sizeof(std::unordered\_map) = 64 +
|
|||
|
the initial hash table size which is 8 pointers. The per-item overhead in the
|
|||
|
table above counts the list node (2 pointers on Windows, 1 pointer in libc++),
|
|||
|
plus amortizes the hash table assuming a 0.5 load factor on average.
|
|||
|
|
|||
|
In a microbenchmark on Windows, inserts of 1M integers into a
|
|||
|
std::unordered\_set took 1.07x the time of std::set, and queries took 0.67x the
|
|||
|
time of std::set. For a typical 4-entry set (the statistical mode of map sizes
|
|||
|
in the browser), query performance is identical to std::set and base::flat\_set.
|
|||
|
On ARM, unordered\_set performance can be worse because integer division to
|
|||
|
compute the bucket is slow, and a few "less than" operations can be faster than
|
|||
|
computing a hash depending on the key type. The takeaway is that you should not
|
|||
|
default to using unordered maps because "they're faster."
|
|||
|
|
|||
|
### base::flat\_map and base::flat\_set
|
|||
|
|
|||
|
A sorted std::vector. Seached via binary search, inserts in the middle require
|
|||
|
moving elements to make room. Good cache locality. For large objects and large
|
|||
|
set sizes, std::vector's doubling-when-full strategy can waste memory.
|
|||
|
|
|||
|
Supports efficient construction from a vector of items which avoids the O(n^2)
|
|||
|
insertion time of each element separately.
|
|||
|
|
|||
|
The per-item overhead will depend on the underlying std::vector's reallocation
|
|||
|
strategy and the memory access pattern. Assuming items are being linearly added,
|
|||
|
one would expect it to be 3/4 full, so per-item overhead will be 0.25 *
|
|||
|
sizeof(T).
|
|||
|
|
|||
|
|
|||
|
flat\_set/flat\_map support a notion of transparent comparisons. Therefore you
|
|||
|
can, for example, lookup base::StringPiece in a set of std::strings without
|
|||
|
constructing a temporary std::string. This functionality is based on C++14
|
|||
|
extensions to std::set/std::map interface.
|
|||
|
|
|||
|
You can find more information about transparent comparisons here:
|
|||
|
http://en.cppreference.com/w/cpp/utility/functional/less_void
|
|||
|
|
|||
|
Example, smart pointer set:
|
|||
|
|
|||
|
```cpp
|
|||
|
// Define a custom comparator.
|
|||
|
struct UniquePtrComparator {
|
|||
|
// Mark your comparison as transparent.
|
|||
|
using is_transparent = int;
|
|||
|
|
|||
|
template <typename T>
|
|||
|
bool operator()(const std::unique_ptr<T>& lhs,
|
|||
|
const std::unique_ptr<T>& rhs) const {
|
|||
|
return lhs < rhs;
|
|||
|
}
|
|||
|
|
|||
|
template <typename T>
|
|||
|
bool operator()(const T* lhs, const std::unique_ptr<T>& rhs) const {
|
|||
|
return lhs < rhs.get();
|
|||
|
}
|
|||
|
|
|||
|
template <typename T>
|
|||
|
bool operator()(const std::unique_ptr<T>& lhs, const T* rhs) const {
|
|||
|
return lhs.get() < rhs;
|
|||
|
}
|
|||
|
};
|
|||
|
|
|||
|
// Declare a typedef.
|
|||
|
template <typename T>
|
|||
|
using UniquePtrSet = base::flat_set<std::unique_ptr<T>, UniquePtrComparator>;
|
|||
|
|
|||
|
// ...
|
|||
|
// Collect data.
|
|||
|
std::vector<std::unique_ptr<int>> ptr_vec;
|
|||
|
ptr_vec.reserve(5);
|
|||
|
std::generate_n(std::back_inserter(ptr_vec), 5, []{
|
|||
|
return std::make_unique<int>(0);
|
|||
|
});
|
|||
|
|
|||
|
// Construct a set.
|
|||
|
UniquePtrSet<int> ptr_set(std::move(ptr_vec), base::KEEP_FIRST_OF_DUPES);
|
|||
|
|
|||
|
// Use raw pointers to lookup keys.
|
|||
|
int* ptr = ptr_set.begin()->get();
|
|||
|
EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin());
|
|||
|
```
|
|||
|
|
|||
|
Example flat_map<std\::string, int>:
|
|||
|
|
|||
|
```cpp
|
|||
|
base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2},{"b", 2}},
|
|||
|
base::KEEP_FIRST_OF_DUPES);
|
|||
|
|
|||
|
// Does not construct temporary strings.
|
|||
|
str_to_int.find("c")->second = 3;
|
|||
|
str_to_int.erase("c");
|
|||
|
EXPECT_EQ(str_to_int.end(), str_to_int.find("c")->second);
|
|||
|
|
|||
|
// NOTE: This does construct a temporary string. This happens since if the
|
|||
|
// item is not in the container, then it needs to be constructed, which is
|
|||
|
// something that transparent comparators don't have to guarantee.
|
|||
|
str_to_int["c"] = 3;
|
|||
|
```
|
|||
|
|
|||
|
### base::small\_map
|
|||
|
|
|||
|
A small inline buffer that is brute-force searched that overflows into a full
|
|||
|
std::map or std::unordered\_map. This gives the memory benefit of
|
|||
|
base::flat\_map for small data sizes without the degenerate insertion
|
|||
|
performance for large container sizes.
|
|||
|
|
|||
|
Since instantiations require both code for a std::map and a brute-force search
|
|||
|
of the inline container, plus a fancy iterator to cover both cases, code size
|
|||
|
is larger.
|
|||
|
|
|||
|
The initial size in the above table is assuming a very small inline table. The
|
|||
|
actual size will be sizeof(int) + min(sizeof(std::map), sizeof(T) *
|
|||
|
inline\_size).
|
|||
|
|
|||
|
# Deque
|
|||
|
|
|||
|
### Usage advice
|
|||
|
|
|||
|
Chromium code should always use `base::circular_deque` or `base::queue` in
|
|||
|
preference to `std::deque` or `std::queue` due to memory usage and platform
|
|||
|
variation.
|
|||
|
|
|||
|
The `base::circular_deque` implementation (and the `base::queue` which uses it)
|
|||
|
provide performance consistent across platforms that better matches most
|
|||
|
programmer's expectations on performance (it doesn't waste as much space as
|
|||
|
libc++ and doesn't do as many heap allocations as MSVC). It also generates less
|
|||
|
code tham `std::queue`: using it across the code base saves several hundred
|
|||
|
kilobytes.
|
|||
|
|
|||
|
Since `base::deque` does not have stable iterators and it will move the objects
|
|||
|
it contains, it may not be appropriate for all uses. If you need these,
|
|||
|
consider using a `std::list` which will provide constant time insert and erase.
|
|||
|
|
|||
|
### std::deque and std::queue
|
|||
|
|
|||
|
The implementation of `std::deque` varies considerably which makes it hard to
|
|||
|
reason about. All implementations use a sequence of data blocks referenced by
|
|||
|
an array of pointers. The standard guarantees random access, amortized
|
|||
|
constant operations at the ends, and linear mutations in the middle.
|
|||
|
|
|||
|
In Microsoft's implementation, each block is the smaller of 16 bytes or the
|
|||
|
size of the contained element. This means in practice that every expansion of
|
|||
|
the deque of non-trivial classes requires a heap allocation. libc++ (on Android
|
|||
|
and Mac) uses 4K blocks which elimiates the problem of many heap allocations,
|
|||
|
but generally wastes a large amount of space (an Android analysis revealed more
|
|||
|
than 2.5MB wasted space from deque alone, resulting in some optimizations).
|
|||
|
libstdc++ uses an intermediate-size 512 byte buffer.
|
|||
|
|
|||
|
Microsoft's implementation never shrinks the deque capacity, so the capacity
|
|||
|
will always be the maximum number of elements ever contained. libstdc++
|
|||
|
deallocates blocks as they are freed. libc++ keeps up to two empty blocks.
|
|||
|
|
|||
|
### base::circular_deque and base::queue
|
|||
|
|
|||
|
A deque implemented as a circular buffer in an array. The underlying array will
|
|||
|
grow like a `std::vector` while the beginning and end of the deque will move
|
|||
|
around. The items will wrap around the underlying buffer so the storage will
|
|||
|
not be contiguous, but fast random access iterators are still possible.
|
|||
|
|
|||
|
When the underlying buffer is filled, it will be reallocated and the constents
|
|||
|
moved (like a `std::vector`). The underlying buffer will be shrunk if there is
|
|||
|
too much wasted space (_unlike_ a `std::vector`). As a result, iterators are
|
|||
|
not stable across mutations.
|
|||
|
|
|||
|
# Stack
|
|||
|
|
|||
|
`std::stack` is like `std::queue` in that it is a wrapper around an underlying
|
|||
|
container. The default container is `std::deque` so everything from the deque
|
|||
|
section applies.
|
|||
|
|
|||
|
Chromium provides `base/containers/stack.h` which defines `base::stack` that
|
|||
|
should be used in preference to std::stack. This changes the underlying
|
|||
|
container to `base::circular_deque`. The result will be very similar to
|
|||
|
manually specifying a `std::vector` for the underlying implementation except
|
|||
|
that the storage will shrink when it gets too empty (vector will never
|
|||
|
reallocate to a smaller size).
|
|||
|
|
|||
|
Watch out: with some stack usage patterns it's easy to depend on unstable
|
|||
|
behavior:
|
|||
|
|
|||
|
```cpp
|
|||
|
base::stack<Foo> stack;
|
|||
|
for (...) {
|
|||
|
Foo& current = stack.top();
|
|||
|
DoStuff(); // May call stack.push(), say if writing a parser.
|
|||
|
current.done = true; // Current may reference deleted item!
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
## Appendix
|
|||
|
|
|||
|
### Code for map code size comparison
|
|||
|
|
|||
|
This just calls insert and query a number of times, with printfs that prevent
|
|||
|
things from being dead-code eliminated.
|
|||
|
|
|||
|
```cpp
|
|||
|
TEST(Foo, Bar) {
|
|||
|
base::small_map<std::map<std::string, Flubber>> foo;
|
|||
|
foo.insert(std::make_pair("foo", Flubber(8, "bar")));
|
|||
|
foo.insert(std::make_pair("bar", Flubber(8, "bar")));
|
|||
|
foo.insert(std::make_pair("foo1", Flubber(8, "bar")));
|
|||
|
foo.insert(std::make_pair("bar1", Flubber(8, "bar")));
|
|||
|
foo.insert(std::make_pair("foo", Flubber(8, "bar")));
|
|||
|
foo.insert(std::make_pair("bar", Flubber(8, "bar")));
|
|||
|
auto found = foo.find("asdf");
|
|||
|
printf("Found is %d\n", (int)(found == foo.end()));
|
|||
|
found = foo.find("foo");
|
|||
|
printf("Found is %d\n", (int)(found == foo.end()));
|
|||
|
found = foo.find("bar");
|
|||
|
printf("Found is %d\n", (int)(found == foo.end()));
|
|||
|
found = foo.find("asdfhf");
|
|||
|
printf("Found is %d\n", (int)(found == foo.end()));
|
|||
|
found = foo.find("bar1");
|
|||
|
printf("Found is %d\n", (int)(found == foo.end()));
|
|||
|
}
|
|||
|
```
|
|||
|
|