In the last two blog posts, I described the challenges I’ve encountered while building a next-generation range library. In this post, I’ll sketch for you my proposed solution: refinements of the range concepts that allow delimited, infinite, and pair-o’-iterator-style ranges to fit comfortably within the concept hierarchy without loss of performance or expressive power and with increased safety. I’ve built a range library around these concepts that subsumes and extends all of the C++98 STL algorithms and the Boost.Range adaptors, so I can say with confidence that these concepts lead to a useful and consistent generic range library.
Recap
At the end of my last post, I summed up the issues of pair-o’-iterators (PoI)-style ranges as follows:
- Delimited and infinite ranges generate poor code
- These range types are sometimes forced to model weaker concepts than they might otherwise
- Use of infinite ranges with some algorithms is unsafe
- Delimited and infinite ranges are harder to implement than they need to be
- Ranges that are possibly infinite can overflow their difference_type
The first issue is particularly hard to swallow, so it’s where I’ll focus my energy in this post.
The Range Concept
Before I go any further, let’s be a little more formal about what “range” means. The C++ standard uses the word “range” all over the place without formally defining it. But we can infer from the section [iterator.range] that a range is something on which you can call begin and end to get back a pair of iterators where the end is reachable from begin. In the language of the current “Concepts Lite” proposal, we can formalize the Range concept as follows:
    using std::begin;
    using std::end;

    template<typename T>
    using Iterator_type = decltype(begin(std::declval<T>()));

    template<typename T>
    concept bool Range =
      requires(T range) {
        { begin(range) } -> Iterator_type<T>;
        { end(range) }   -> Iterator_type<T>;
        requires Iterator<Iterator_type<T>>;
      };
This basically says that you can call begin and end on a range and that you get back iterators. There are refinements of the Range concept (not shown) called InputRange, ForwardRange, etc. that merely require more of their iterators. The refinement hierarchy is shown below. It’s fairly straightforward. (The above syntax was given to me by Andrew Sutton, the author of the Concepts Lite proposal, shortly after the February 2014 standardization committee meeting, so it’s guaranteed fresh. He warns me that the syntax may yet change in the future.)
These concepts are the foundation of the Boost.Range library.
Problem 1: Poor Code Generation
If you recall, to implement delimited and infinite ranges as a pair of iterators, the end iterator must be some sort of sentinel iterator. A sentinel represents a conceptual position rather than a physical one. You can still think of it as the last-plus-one position; the only difference is that you won’t know the physical position until you reach it. Since the sentinel has the same type as the iterator, it requires a runtime test to determine whether a given iterator is the sentinel or not. This leads to slow iterator comparisons and awkward range implementations.
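To make the runtime test concrete, here is a minimal sketch of a same-type sentinel for a null-terminated string. It is not the c_string_range class from the earlier post; the names c_string_iter and count_chars, and the choice of a null pointer to mark the sentinel, are assumptions made for illustration.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical iterator over a null-terminated string. A default-constructed
// iterator (null pointer) plays the role of the end sentinel, so the iterator
// and the sentinel share one type.
class c_string_iter {
    char const* p_;
public:
    c_string_iter() : p_(nullptr) {}                  // the sentinel
    explicit c_string_iter(char const* p) : p_(p) {}  // a real position

    char const& operator*() const { return *p_; }
    c_string_iter& operator++() { ++p_; return *this; }
    bool is_sentinel() const { return p_ == nullptr; }

    // Every comparison must first work out, at runtime, which operand
    // (if any) is the sentinel. These are the branches the text refers to.
    friend bool operator==(c_string_iter const& a, c_string_iter const& b) {
        if (a.is_sentinel()) return b.is_sentinel() || *b.p_ == 0;
        if (b.is_sentinel()) return *a.p_ == 0;
        return a.p_ == b.p_;
    }
    friend bool operator!=(c_string_iter const& a, c_string_iter const& b) {
        return !(a == b);
    }
};

// A plain pair-of-iterators loop: begin and end have the same type.
inline int count_chars(c_string_iter first, c_string_iter last) {
    int n = 0;
    for (; first != last; ++first) ++n;
    return n;
}

} // namespace sketch
```

Calling sketch::count_chars(sketch::c_string_iter("hello"), sketch::c_string_iter()) walks the string, and each trip through the loop pays for the sentinel checks inside operator==.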
The Iterable Concept
Think of the things you do with iterators. You increment them, you dereference them, and you compare them for equality, right? What can you do with a sentinel iterator? Not much. You can’t change its position since it represents a conceptual position, not a physical one. You can’t dereference it, because it always stands in for the last-plus-one position, which is not dereferenceable. But you can compare it to an iterator. In other words, a sentinel is a very weak iterator.
The trouble with delimited and infinite ranges comes from trying to make a sentinel iterator into a regular iterator. It just isn’t one, and making it so causes problems. So just let it be. In other words:
Let range sentinels have different types than their ranges’ iterators.
The Range concept requires that the begin and end iterators have the same type. If I allow the types to differ, I’m talking about something weaker than Range: the Iterable concept. Iterables are just like Ranges except that the begin and end types may differ. Here’s the Iterable concept:
    template<typename T>
    using Sentinel_type = decltype(end(std::declval<T>()));

    template<typename T>
    concept bool Iterable =
      requires(T range) {
        { begin(range) } -> Iterator_type<T>;
        { end(range) }   -> Sentinel_type<T>;
        requires Iterator<Iterator_type<T>>;
        requires EqualityComparable<
            Iterator_type<T>, Sentinel_type<T>>;
      };

    template<typename T>
    concept bool Range =
      Iterable<T> &&
      Same<Iterator_type<T>, Sentinel_type<T>>;
All Ranges are Iterables trivially. That is, the Range concept refines Iterable by adding one additional constraint: that begin and end have the same type. In fact, the Iterable concept hierarchy parallels the Range hierarchy nicely:
This is what the hierarchy looks like when considering Ranges, Iterables, and Iterators, but it’s not necessarily the way we would actually define these concepts in our code. Notice that “rangeyness” (that is, whether begin and end have the same type) is orthogonal to the strength of the begin iterator. When we want to require that a type model RandomAccessRange, we can say requires RandomAccessIterable<T> && Range<T> and do away with the other Range concepts entirely.
The difference between, say, a BidirectionalIterable and a ForwardIterable is in the concept modeled by the Iterable’s begin iterator. If the EqualityComparable constraint in the Iterable concept gives you pause, read on. I justify it below.
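For readers without a Concepts Lite compiler, the Iterable/Range distinction can be approximated with ordinary type traits. The following is only a sketch: is_iterable, is_range, null_sentinel, and c_string_view_sketch are hypothetical names, and the traits check syntax only, not the semantic requirements of the real concepts.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {
using std::begin;
using std::end;

template <typename T>
using iterator_t = decltype(begin(std::declval<T&>()));
template <typename T>
using sentinel_t = decltype(end(std::declval<T&>()));

// Hand-rolled void_t so the sketch does not depend on C++17.
template <typename...> struct voider { using type = void; };
template <typename... Ts> using void_t = typename voider<Ts...>::type;

// Iterable (syntax only): begin and end exist, and the iterator can be
// compared against the sentinel.
template <typename T, typename = void>
struct is_iterable : std::false_type {};
template <typename T>
struct is_iterable<T, void_t<
    iterator_t<T>, sentinel_t<T>,
    decltype(std::declval<iterator_t<T>&>() != std::declval<sentinel_t<T>&>())>>
  : std::true_type {};

// Range: an Iterable whose begin and end have the same type.
template <typename T, typename = void>
struct is_range : std::false_type {};
template <typename T>
struct is_range<T, typename std::enable_if<is_iterable<T>::value>::type>
  : std::is_same<iterator_t<T>, sentinel_t<T>> {};

// A hypothetical Iterable that is not a Range: end() returns a sentinel.
struct null_sentinel {};
inline bool operator==(char const* p, null_sentinel) { return *p == '\0'; }
inline bool operator!=(char const* p, null_sentinel) { return *p != '\0'; }

struct c_string_view_sketch {
    char const* sz;
    char const* begin() const { return sz; }
    null_sentinel end() const { return {}; }
};

static_assert(is_range<std::vector<int>>::value, "vector is a Range");
static_assert(is_iterable<c_string_view_sketch>::value, "Iterable");
static_assert(!is_range<c_string_view_sketch>::value, "but not a Range");

} // namespace sketch
```

Because is_range is defined on top of is_iterable, every type that passes is_range also passes is_iterable, which mirrors the refinement relationship described above.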
Iterables and the STL Algorithms
“But wait,” you say. “No STL algorithms will work with Iterables because they expect begin and end to have the same type!” That’s sadly true. So I went through all the STL algorithms to see which could be re-implemented in terms of the weaker concept. Take std::find for example:
    template<class InputIterator, class Value>
    InputIterator
    find(InputIterator first, InputIterator last,
         Value const & value)
    {
        for (; first != last; ++first)
            if (*first == value)
                break;
        return first;
    }
Today, std::find requires Ranges. But notice how this algorithm never tries to change the position of the end iterator. The find algorithm can very easily be changed to work with Iterables instead of Ranges:
    template<class InputIterator, class Sentinel, class Value>
    InputIterator
    find(InputIterator first, Sentinel last,
         Value const & value)
    {
        for (; first != last; ++first)
            if (*first == value)
                break;
        return first;
    }
That’s it. The change is so minor, you might even have a hard time spotting it!
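Here is one way the relaxed find might be exercised. This is a sketch under assumptions: null_terminator and index_of are hypothetical helpers that do not come from the library, and find lives in a sketch namespace to keep it apart from std::find.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical sentinel type: "equal" to any pointer that points at '\0'.
struct null_terminator {};
inline bool operator==(char const* p, null_terminator) { return *p == '\0'; }
inline bool operator!=(char const* p, null_terminator) { return *p != '\0'; }

// The Iterable-friendly find from the text: the end has its own type.
template <class InputIterator, class Sentinel, class Value>
InputIterator find(InputIterator first, Sentinel last, Value const& value) {
    for (; first != last; ++first)
        if (*first == value)
            break;
    return first;
}

// Returns the index of the first occurrence of c in s, or -1 if absent.
// No call to strlen is needed to compute an end pointer up front.
inline std::ptrdiff_t index_of(char const* s, char c) {
    char const* hit = sketch::find(s, null_terminator{}, c);
    return *hit == '\0' ? -1 : hit - s;
}

} // namespace sketch
```

The design point is that the loop condition first != last resolves at compile time to the pointer/sentinel overload, so the comparison is a single test for the terminator.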
So, which C++98 algorithms can be made to work with Iterables instead of Ranges? Nearly all of them, it turns out. In fact, it’s easier to list the ones that don’t work with Iterables. They are:
- copy_backward
- The heap algorithms (push_heap, pop_heap, make_heap, sort_heap)
- inplace_merge
- nth_element
- partial_sort and partial_sort_copy
- next_permutation and prev_permutation
- random_shuffle
- reverse and reverse_copy
- sort and stable_sort
- stable_partition
For the 50 or so others, getting them to work with Iterables is mostly a mechanical source code transformation. By defining the Iterable concept such that Range refines it, any algorithm implemented in terms of Iterable automatically works with Ranges, which lets us reuse code. And that’s super important. There’s too much code written for iterators to think about picking an incompatible abstraction now.
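The reuse claim can be illustrated with a second algorithm. In this sketch, count_if is written once against an iterator/sentinel pair and then called both with a classic pair of iterators and with a delimiter-style sentinel. The names zero_terminator, count_even_in, count_even_delimited, and demo_delimited are hypothetical.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Hypothetical sentinel for int sequences that are terminated by a 0 value.
struct zero_terminator {};
inline bool operator==(int const* p, zero_terminator) { return *p == 0; }
inline bool operator!=(int const* p, zero_terminator) { return *p != 0; }

// Written once in terms of the weaker Iterable-style interface.
template <class InputIterator, class Sentinel, class Predicate>
int count_if(InputIterator first, Sentinel last, Predicate pred) {
    int n = 0;
    for (; first != last; ++first)
        if (pred(*first))
            ++n;
    return n;
}

inline bool is_even(int i) { return i % 2 == 0; }

// Range use: begin and end have the same type, and nothing special happens.
inline int count_even_in(std::vector<int> const& v) {
    return sketch::count_if(v.begin(), v.end(), is_even);
}

// Iterable use: the end is a sentinel of a different type.
inline int count_even_delimited(int const* zero_terminated) {
    return sketch::count_if(zero_terminated, zero_terminator{}, is_even);
}

// Small demonstration: only the values before the 0 delimiter are visited.
inline int demo_delimited() {
    int const data[] = {1, 2, 4, 6, 0, 8};
    return count_even_delimited(data);
}

} // namespace sketch
```

Both callers share one implementation, since Sentinel is simply deduced to be the iterator type when a Range is passed.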
The Proof is in the Perf
But what do we gain? Let’s revisit our old friend, the C-style null-terminated string. In a previous post, I defined a c_string_range class and found that iterating through the characters generated very bad code. Let’s try again, this time using my range_facade helper to build an Iterable instead of a Range. The code looks like this:
    using namespace ranges;

    struct c_string_iterable
      : range_facade<c_string_iterable>
    {
    private:
        friend range_core_access;
        char const *sz_;
        char const & current() const { return *sz_; }
        void next() { ++sz_; }
        bool done() const { return *sz_ == 0; }
        bool equal(c_string_iterable const &that) const
        { return sz_ == that.sz_; }
    public:
        c_string_iterable(char const *sz)
          : sz_(sz) {}
    };
The first thing we notice is that this code is a lot simpler than the old c_string_range class. The range_facade helper does all the heavy lifting here. The iterator and the sentinel are both implemented in terms of the primitives shown. Gone is the awkward and complicated equality comparison. But how does it perform? To test it, I generated the optimized assembly for the following two functions, one that uses the old c_string_range class and one that uses the new c_string_iterable:
    // Range-based
    int range_strlen(
        c_string_range::iterator begin,
        c_string_range::iterator end)
    {
        int i = 0;
        for(; begin != end; ++begin)
            ++i;
        return i;
    }

    // Iterable-based
    int iterable_strlen(
        range_iterator_t<c_string_iterable> begin,
        range_sentinel_t<c_string_iterable> end)
    {
        int i = 0;
        for(; begin != end; ++begin)
            ++i;
        return i;
    }
Even if you don’t know much about assembly code, the following should speak to you:
Range-based strlen:

        pushl   %ebp
        movl    %esp, %ebp
        pushl   %esi
        leal    8(%ebp), %ecx
        movl    12(%ebp), %esi
        xorl    %eax, %eax
        testl   %esi, %esi
        movl    8(%ebp), %edx
        jne     LBB2_4
        jmp     LBB2_1
        .align  16, 0x90
    LBB2_8:
        incl    %eax
        incl    %edx
        movl    %edx, (%ecx)
    LBB2_4:
        testl   %edx, %edx
        jne     LBB2_5
        cmpb    $0, (%esi)
        jne     LBB2_8
        jmp     LBB2_6
        .align  16, 0x90
    LBB2_5:
        cmpl    %edx, %esi
        jne     LBB2_8
        jmp     LBB2_6
        .align  16, 0x90
    LBB2_3:
        leal    1(%edx,%eax), %esi
        incl    %eax
        movl    %esi, (%ecx)
    LBB2_1:
        movl    %edx, %esi
        addl    %eax, %esi
        je      LBB2_6
        cmpb    $0, (%esi)
        jne     LBB2_3
    LBB2_6:
        popl    %esi
        popl    %ebp
        ret

Iterable-based strlen:

        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %ecx
        xorl    %eax, %eax
        cmpb    $0, (%ecx)
        je      LBB1_4
        leal    8(%ebp), %edx
        .align  16, 0x90
    LBB1_2:
        cmpb    $0, 1(%ecx,%eax)
        leal    1(%eax), %eax
        jne     LBB1_2
        addl    %eax, %ecx
        movl    %ecx, (%edx)
    LBB1_4:
        popl    %ebp
        ret
The code generated from the Iterable algorithm is far superior to that generated from the pair of iterators. In fact, if you check it against the assembly for the raw C-style iteration, you’ll find it’s almost identical.
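Since range_facade hides the iterator and sentinel types, it may help to see roughly what such a pair could look like when written by hand. This is a sketch, not what range_facade actually generates; c_string_iterator, c_string_sentinel, c_string_iterable_sketch, and length_of are hypothetical names.

```cpp
#include <cassert>

namespace sketch {

// Empty tag type: it carries no position, only the idea of "the end".
struct c_string_sentinel {};

class c_string_iterator {
    char const* sz_;
public:
    explicit c_string_iterator(char const* sz) : sz_(sz) {}
    char const& operator*() const { return *sz_; }
    c_string_iterator& operator++() { ++sz_; return *this; }

    // Iterator vs. iterator: plain position comparison.
    friend bool operator==(c_string_iterator const& a, c_string_iterator const& b) {
        return a.sz_ == b.sz_;
    }
    friend bool operator!=(c_string_iterator const& a, c_string_iterator const& b) {
        return a.sz_ != b.sz_;
    }
    // Iterator vs. sentinel: the overload is chosen at compile time, so the
    // comparison is just the "done" test with no extra branching.
    friend bool operator==(c_string_iterator const& it, c_string_sentinel) {
        return *it.sz_ == 0;
    }
    friend bool operator!=(c_string_iterator const& it, c_string_sentinel) {
        return *it.sz_ != 0;
    }
};

struct c_string_iterable_sketch {
    char const* sz_;
    c_string_iterator begin() const { return c_string_iterator(sz_); }
    c_string_sentinel end() const { return {}; }
};

// Same loop shape as iterable_strlen in the text.
inline int iterable_strlen(c_string_iterator begin, c_string_sentinel end) {
    int i = 0;
    for (; begin != end; ++begin)
        ++i;
    return i;
}

inline int length_of(char const* sz) {
    c_string_iterable_sketch s{sz};
    return iterable_strlen(s.begin(), s.end());
}

} // namespace sketch
```

The sketch makes no claim about the exact assembly a given compiler emits for it; it only shows why the comparison in the loop condition no longer needs to ask which operand is the sentinel.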
Iterators, Sentinels, and Equality
But what does it mean to compare two objects of different types for equality? Or put in more formal terms, can the requirement that an Iterable’s iterator and sentinel satisfy the cross-type EqualityComparable concept be satisfied? I believe the answer is yes.
Some background for the uninitiated: N3351 defines precisely when and how cross-type equality comparisons are meaningful. It’s not enough for the syntax “x==y” to be valid and yield a bool. If x and y have different types, the types of both x and y must themselves be EqualityComparable, and there must be a common type to which they can both be converted, and that type must also be EqualityComparable. Think of comparing a char with a short. It works because both char and short are EqualityComparable, and because they can both be converted to an int, which is also EqualityComparable.
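The char/short case can be checked directly. The sketch below uses std::common_type as a stand-in for the common-type requirement, which is an assumption made for illustration; same_value is a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>

// For char and short, the usual arithmetic conversions promote both
// operands to int, and std::common_type reports exactly that.
static_assert(
    std::is_same<std::common_type<char, short>::type, int>::value,
    "char and short share int as their common type");

// The comparison behaves as if both operands were first converted to int.
inline bool same_value(char c, short s) {
    return static_cast<int>(c) == static_cast<int>(s);
}
```

The explicit casts spell out what the built-in c == s already does implicitly.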
Iterators are comparable, and sentinels are trivially comparable (they always compare equal). The tricky part is the common type requirement. Logically, every iterator and sentinel has a common type that can be constructed as follows: assume the existence of a new iterator type I that is a tagged union that contains either an iterator or a sentinel. When an iterator is compared to a sentinel, it behaves semantically as if both the iterator and the sentinel were first converted to two objects of type I (call them lhs and rhs) and then compared according to the following truth table:
lhs is sentinel? | rhs is sentinel? | lhs == rhs? |
---|---|---|
true | true | true |
true | false | done(rhs.iter) |
false | true | done(lhs.iter) |
false | false | lhs.iter == rhs.iter |
If you’ve been following this series, the above truth table should ring a bell. It’s pretty much exactly the table we got when figuring out how c_string_range::iterator’s equality operator should behave, and that’s no coincidence; that was a special case of this more general construction. This construction validates an intuition you might have after seeing the two classes I wrote, c_string_range and c_string_iterable. One is a pair of iterators, the other an iterator/sentinel pair, but they implement equivalent procedures for computing equality. We know they’re the same, and we feel in our guts we could build an equivalent Range out of every Iterable if we’re willing to sacrifice some performance. And now we know that’s true.
Allowing direct comparison of iterators and sentinels lets us use the C++ type system to optimize a large category of iterations by eliminating branches from the equality comparison operator.
Objections
The idea of allowing begin and end iterators to have different types is not new, and it’s not mine. (In fact, many of you who have commented on the first two posts, either here or on reddit.com, have made precisely this suggestion.) I first heard about it from Dave Abrahams years ago. More recently, Dietmar Kuehl floated a similar idea on the Ranges mailing list. Sean Parent raised the following objection in a follow-up message:
Again, I think this is putting too much on iterators. Algorithms that act with a sentinel termination or with a count are different beasts. See copy_n() and copy_sentinel()
http://stlab.adobe.com/copy_8hpp.html
For ranges – I certainly thing you should be able to construct a range with:
- a pair of iterators
- an iterator and a count
- an iterator and a sentinel value
In which case, copy(r, out) can dispatch to the correct algorithm.
If I’m understanding Sean correctly, he is arguing for 3 parallel range concept hierarchies: IteratorRange, CountedRange, and SentinelRange. These hierarchies would have no refinement relationships between them. The copy algorithm would have three underlying implementations, one for each concept hierarchy. There are 50-some-odd algorithms that would need to be triplicated in this way. That’s a lot of code duplication.
In fact, it’s worse than that because some algorithms are specialized to take advantage of more refined concepts. For instance, in libc++, the rotate algorithm dispatches to one of three implementations depending on whether you pass it forward, bidirectional, or random-access iterators. To accommodate Iterator, Counted, and SentinelRanges, we would need a grand total of 9 rotate algorithm implementations! I have nothing but respect for Sean Parent, but that’s madness. With the Iterable concept, Sean’s three separate hierarchies get unified under a single syntax that lets us write general algorithms while preserving performance characteristics. In other words, with Iterables, 3 implementations of rotate suffice.
(Incidentally, the Iterable concept can neatly accommodate counted ranges. If you want to turn an iterator and a count into an Iterable, you can bundle the iterator and the count together into a new iterator type that decrements the count whenever the iterator is incremented. When comparing the iterator to the sentinel, it merely checks whether the count is zero.)
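A minimal sketch of that counted adaptor, assuming hypothetical names (counted_iterator, count_sentinel, sum_first_n, demo_sum) rather than anything taken from the library:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

namespace sketch {

// Empty sentinel: "the count has reached zero".
struct count_sentinel {};

// Bundles an iterator with the number of elements still to be visited.
template <class Iterator>
class counted_iterator {
    Iterator it_;
    std::ptrdiff_t remaining_;
public:
    counted_iterator(Iterator it, std::ptrdiff_t n) : it_(it), remaining_(n) {}

    typename std::iterator_traits<Iterator>::reference operator*() const {
        return *it_;
    }
    // Incrementing the iterator decrements the count.
    counted_iterator& operator++() {
        ++it_;
        --remaining_;
        return *this;
    }
    // Comparing against the sentinel only inspects the count.
    friend bool operator==(counted_iterator const& a, count_sentinel) {
        return a.remaining_ == 0;
    }
    friend bool operator!=(counted_iterator const& a, count_sentinel) {
        return a.remaining_ != 0;
    }
};

// Sums the first n elements starting at data.
inline int sum_first_n(int const* data, std::ptrdiff_t n) {
    int total = 0;
    for (counted_iterator<int const*> it(data, n); it != count_sentinel{}; ++it)
        total += *it;
    return total;
}

inline int demo_sum() {
    int const data[] = {1, 2, 3, 4, 5};
    return sum_first_n(data, 3);  // visits 1, 2, 3
}

} // namespace sketch
```

Any algorithm written against an iterator/sentinel pair can take counted_iterator and count_sentinel as its two arguments, so no separate counted overload is required.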
Summary, For Now…
At the beginning of this post I summarized some of the problems with pair-o’-iterator ranges. I showed how a new concept, Iterable, addresses the performance issues, and touched a bit on the issue of range implementation complexity. I haven’t yet covered how the Iterable concept helps with infinite ranges, or how to address the safety issue of passing an infinite range to an algorithm that can’t handle them. This post has run a bit long, so I’ll stop for now and address the other issues in the fourth and final installment. Hopefully, this has given you a few things to think about until then.
If you want to download and play with the code, you can find it in the range-v3 repository on github. I’m happy to take suggestions and bug reports, but please don’t use this code for anything real. It’s untested and still evolving.
Acknowledgements
I’d like to thank Andrew Sutton for helping with the Concepts Lite syntax and also for explaining the requirements of the cross-type EqualityComparable concept and generally improving and formalizing many of the ideas presented here. The article is immeasurably better for his many contributions.