Channel: std – Eric Niebler

Range Concepts, Part 3 of 4: Introducing Iterables


In the last two blog posts, I described the challenges I’ve encountered while building a next-generation range library. In this post, I’ll sketch for you my proposed solution: refinements of the range concepts that allow delimited, infinite, and pair-o’-iterator-style ranges to fit comfortably within the concept hierarchy without loss of performance or expressive power and with increased safety. I’ve built a range library around these concepts that subsumes and extends all of the C++98 STL algorithms and the Boost.Range adaptors, so I can say with confidence that these concepts lead to a useful and consistent generic range library.

Recap

At the end of my last post, I summed up the issues of pair-o’-iterators (PoI)-style ranges as follows:

  • Delimited and infinite ranges generate poor code
  • These range types are sometimes forced to model weaker concepts than they might otherwise
  • Use of infinite ranges with some algorithms is unsafe
  • Delimited and infinite ranges are harder to implement than they need to be
  • Ranges that are possibly infinite can overflow their difference_type

The first issue is particularly hard to swallow, so it’s where I’ll focus my energy in this post.

The Range Concept

Before I go any further, let’s be a little more formal about what “range” means. The C++ standard uses the word “range” all over the place without formally defining it. But we can infer from the section [iterator.range] that a range is something on which you can call begin and end to get back a pair of iterators where the end is reachable from begin. In the language of the current “Concepts Lite” proposal, we can formalize the Range concept as follows:

using std::begin;
using std::end;

template<typename T>
using Iterator_type =
    decltype(begin(std::declval<T>()));

template<typename T>
concept bool Range =
    requires(T range) {
        { begin(range) } -> Iterator_type<T>;
        { end(range) }  -> Iterator_type<T>;
        requires Iterator<Iterator_type<T>>;
    };

This basically says that you can call begin and end on a range and that you get back iterators. There are refinements of the Range concept (not shown) called InputRange, ForwardRange, etc. that merely require more of their iterators. The refinement hierarchy is shown below. It’s fairly straightforward. (The above syntax was given to me by Andrew Sutton, the author of the Concepts Lite proposal, shortly after the February 2014 standardization committee meeting, so it’s guaranteed fresh. He warns me that the syntax may yet change in the future.)

[Figure: Range Concept Hierarchy]

These concepts are the foundation of the Boost.Range library.

Problem 1: Poor Code Generation

If you recall, to implement delimited and infinite ranges as a pair of iterators, the end iterator must be some sort of sentinel iterator. A sentinel represents a conceptual position rather than a physical one. You can still think of it as the last-plus-one position; the only difference is that you won’t know the physical position until you reach it. Since the sentinel has the same type as the iterator, it requires a runtime test to determine whether a given iterator is the sentinel or not. This leads to slow iterator comparisons and awkward range implementations.
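To make that cost concrete, here is a minimal sketch of a same-type sentinel for a null-terminated string. The names (`cstr_iter`, `make_end`, `cstr_length`) are hypothetical and are not the `c_string_range` code from the earlier post; the sketch assumes a null pointer is used to mark the sentinel.

```cpp
#include <cassert>

// Hypothetical iterator over a C string whose end iterator has the SAME type.
// Assumption: a null pointer means "this object is the sentinel".
struct cstr_iter {
    char const* p;

    char const& operator*() const { return *p; }
    cstr_iter& operator++() { ++p; return *this; }

    // True if this iterator denotes the end, however it is represented.
    bool at_end() const { return p == nullptr || *p == '\0'; }

    friend bool operator==(cstr_iter const& a, cstr_iter const& b) {
        // Runtime test: either side may be a real position or the sentinel.
        if (a.p == nullptr || b.p == nullptr)
            return a.at_end() && b.at_end();
        return a.p == b.p;
    }
    friend bool operator!=(cstr_iter const& a, cstr_iter const& b) {
        return !(a == b);
    }
};

inline cstr_iter make_end() { return cstr_iter{nullptr}; }

// Counts characters; every loop iteration pays for the branches in operator==.
inline int cstr_length(char const* s) {
    int n = 0;
    for (cstr_iter it{s}, last = make_end(); it != last; ++it)
        ++n;
    return n;
}
```

The point of the sketch is the shape of `operator==`: because both operands have one type, the comparison has to discover at run time which of them is the sentinel.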

The Iterable Concept

Think of the things you do with iterators. You increment them, you dereference them, and you compare them for equality, right? What can you do with a sentinel iterator? Not much. You can’t change its position since it represents a conceptual position, not a physical one. You can’t dereference it, because it always stands in for the last-plus-one position, which is not dereferenceable. But you can compare it to an iterator. In other words, a sentinel is a very weak iterator.

The trouble with delimited and infinite ranges comes from trying to make a sentinel iterator into a regular iterator. It just isn’t one, and making it so causes problems. So just let it be. In other words:

Let range sentinels have different types than their ranges’ iterators.

The Range concept requires that the begin and end iterators have the same type. If I allow the types to differ, I’m talking about something weaker than Range: the Iterable concept. Iterables are just like Ranges except that the begin and end types are allowed to differ. Here’s the Iterable concept:

template<typename T>
using Sentinel_type =
    decltype(end(std::declval<T>()));

template<typename T>
concept bool Iterable =
    requires(T range) {
        { begin(range) } -> Iterator_type<T>;
        { end(range) }  -> Sentinel_type<T>;
        requires Iterator<Iterator_type<T>>;
        requires EqualityComparable<
            Iterator_type<T>, Sentinel_type<T>>;
    };

template<typename T>
concept bool Range =
    Iterable<T> &&
    Same<Iterator_type<T>, Sentinel_type<T>>;

All Ranges are Iterables trivially. That is, the Range concept refines Iterable by adding one additional constraint: that begin and end have the same type. In fact, the Iterable concept hierarchy parallels the Range hierarchy nicely:

[Figure: Iterable Concept Hierarchy]

This is what the hierarchy looks like when considering Ranges, Iterables, and Iterators, but it’s not necessarily the way we would actually define these concepts in our code. Notice that “rangeyness” — that is, whether begin and end have the same type — is orthogonal to the strength of the begin iterator. When we want to require that a type model RandomAccessRange, we can say requires RandomAccessIterable<T> && Range<T> and do away with the other Range concepts entirely.

The difference between, say, a BidirectionalIterable and a ForwardIterable is in the concept modeled by the Iterable’s begin iterator. If the EqualityComparable constraint in the Iterable concept gives you pause, read on. I justify it below.

Iterables and the STL Algorithms

“But wait,” you say. “No STL algorithms will work with Iterables because they expect begin and end to have the same type!” That’s sadly true. So I went through all the STL algorithms to see which could be re-implemented in terms of the weaker concept. Take std::find for example:

template<class InputIterator, class Value>
InputIterator
find(InputIterator first, InputIterator last,
     Value const & value)
{
    for (; first != last; ++first)
        if (*first == value)
            break;
    return first;
}

Today, std::find requires Ranges. But notice how this algorithm never tries to change the position of the end iterator. The find algorithm can very easily be changed to work with Iterables instead of Ranges:

template<class InputIterator, class Sentinel, class Value>
InputIterator
find(InputIterator first, Sentinel last,
     Value const & value)
{
    for (; first != last; ++first)
        if (*first == value)
            break;
    return first;
}

That’s it. The change is so minor, you might even have a hard time spotting it!
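As a usage sketch, the sentinel-taking find can search a null-terminated string without first computing its length. The `null_sentinel` type and the `sketch` namespace below are hypothetical; they only illustrate one way an iterator/sentinel pair might be supplied.

```cpp
#include <cassert>

// Hypothetical sentinel: compares equal to a pointer that points at '\0'.
struct null_sentinel {};
inline bool operator==(char const* p, null_sentinel) { return *p == '\0'; }
inline bool operator!=(char const* p, null_sentinel) { return *p != '\0'; }

namespace sketch {

// Same body as the find above; only the type of `last` is generalized.
template <class InputIterator, class Sentinel, class Value>
InputIterator find(InputIterator first, Sentinel last, Value const& value) {
    for (; first != last; ++first)
        if (*first == value)
            break;
    return first;
}

} // namespace sketch
```

Calling `sketch::find(s, null_sentinel{}, 'l')` on a C string `s` walks the string once; if the value is absent, the returned pointer is the one that points at the terminator.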

So, which C++98 algorithms can be made to work with Iterables instead of Ranges? Nearly all of them, it turns out. In fact, it’s easier to list the ones that don’t work with Iterables. They are:

  • copy_backward
  • The heap algorithms (push_heap, pop_heap, make_heap, sort_heap)
  • inplace_merge
  • nth_element
  • partial_sort and partial_sort_copy
  • next_permutation and prev_permutation
  • random_shuffle
  • reverse and reverse_copy
  • sort and stable_sort
  • stable_partition

For the 50 or so others, getting them to work with Iterables is mostly a mechanical source code transformation. By defining the Iterable concept such that Range refines it, any algorithm implemented in terms of Iterable automatically works with Ranges, which lets us reuse code. And that’s super important. There’s too much code written for iterators to think about picking an incompatible abstraction now.
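The same mechanical transformation can be sketched for copy. The code below is an illustration under assumptions, not the range-v3 implementation: `sketch::copy` and `null_sentinel` are hypothetical names. Because the sentinel type is just a template parameter, the one implementation accepts both an iterator/sentinel pair and an ordinary pair of iterators.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sentinel for null-terminated strings.
struct null_sentinel {};
inline bool operator==(char const* p, null_sentinel) { return *p == '\0'; }
inline bool operator!=(char const* p, null_sentinel) { return *p != '\0'; }

namespace sketch {

// copy with the end position generalized from InputIterator to Sentinel.
// Returns the output position one past the last element written.
template <class InputIterator, class Sentinel, class OutputIterator>
OutputIterator copy(InputIterator first, Sentinel last, OutputIterator out) {
    for (; first != last; ++first, ++out)
        *out = *first;
    return out;
}

} // namespace sketch
```

With a `char const*` and a `null_sentinel` the loop stops at the terminator; with `v.begin()` and `v.end()` of a `std::vector` it behaves like the familiar two-iterator copy.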

The Proof is in the Perf

But what do we gain? Let’s revisit our old friend, the C-style null-terminated string. In a previous post, I defined a c_string_range class and found that iterating through the characters generated very bad code. Let’s try again, this time using my range_facade helper to build an Iterable instead of a Range. The code looks like this:

using namespace ranges;
struct c_string_iterable
  : range_facade<c_string_iterable>
{
private:
    friend range_core_access;
    char const *sz_;
    char const & current() const { return *sz_; }
    void next() { ++sz_; }
    bool done() const { return *sz_ == 0; }
    bool equal(c_string_iterable const &that) const
    { return sz_ == that.sz_; }
public:
    c_string_iterable(char const *sz)
        : sz_(sz) {}
};

The first thing we notice is that this code is a lot simpler than the old c_string_range class. The range_facade helper does all the heavy lifting here. The iterator and the sentinel are all implemented in terms of the primitives shown. Gone is the awkward and complicated equality comparison. But how does it perform? To test it, I generated the optimized assembly for the following two functions, one that uses the old c_string_range class and one that uses the new c_string_iterable:

// Range-based
int range_strlen(
    c_string_range::iterator begin,
    c_string_range::iterator end)
{
    int i = 0;
    for(; begin != end; ++begin)
        ++i;
    return i;
}

// Iterable-based
int iterable_strlen(
    range_iterator_t<c_string_iterable> begin,
    range_sentinel_t<c_string_iterable> end)
{
    int i = 0;
    for(; begin != end; ++begin)
        ++i;
    return i;
}

Even if you don’t know much about assembly code, the following should speak to you:

Range-based strlen (the first listing; the Iterable-based strlen listing follows it):

    pushl    %ebp
    movl    %esp, %ebp
    pushl    %esi
    leal    8(%ebp), %ecx
    movl    12(%ebp), %esi
    xorl    %eax, %eax
    testl    %esi, %esi
    movl    8(%ebp), %edx
    jne    LBB2_4
    jmp    LBB2_1
    .align    16, 0x90
LBB2_8:
    incl    %eax
    incl    %edx
    movl    %edx, (%ecx)
LBB2_4:
    testl    %edx, %edx
    jne    LBB2_5
    cmpb    $0, (%esi)
    jne    LBB2_8
    jmp    LBB2_6
    .align    16, 0x90
LBB2_5:
    cmpl    %edx, %esi
    jne    LBB2_8
    jmp    LBB2_6
    .align    16, 0x90
LBB2_3:
    leal    1(%edx,%eax), %esi
    incl    %eax
    movl    %esi, (%ecx)
LBB2_1:
    movl    %edx, %esi
    addl    %eax, %esi
    je    LBB2_6
    cmpb    $0, (%esi)
    jne    LBB2_3
LBB2_6:
    popl    %esi
    popl    %ebp
    ret

Iterable-based strlen:

    pushl    %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %ecx
    xorl    %eax, %eax
    cmpb    $0, (%ecx)
    je    LBB1_4
    leal    8(%ebp), %edx
    .align    16, 0x90
LBB1_2:
    cmpb    $0, 1(%ecx,%eax)
    leal    1(%eax), %eax
    jne    LBB1_2
    addl    %eax, %ecx
    movl    %ecx, (%edx)
LBB1_4:
    popl    %ebp
    ret
        

The code generated from the Iterable algorithm is far superior to that generated from the pair of iterators. In fact, if you check it against the assembly for the raw C-style iteration, you’ll find it’s almost identical.

Iterators, Sentinels, and Equality

But what does it mean to compare two objects of different types for equality? Or put in more formal terms, can the requirement that an Iterable’s iterator and sentinel satisfy the cross-type EqualityComparable concept be satisfied? I believe the answer is yes.

Some background for the uninitiated: N3351 defines precisely when and how cross-type equality comparisons are meaningful. It’s not enough for the syntax “x==y” to be valid and yield a bool. If x and y have different types, the types of both x and y must themselves be EqualityComparable, and there must be a common type to which they can both be converted, and that type must also be EqualityComparable. Think of comparing a char with a short. It works because both char and short are EqualityComparable, and because they can both be converted to an int which is also EqualityComparable.

Iterators are comparable, and sentinels are trivially comparable (they always compare equal). The tricky part is the common type requirement. Logically, every iterator and sentinel has a common type that can be constructed as follows: assume the existence of a new iterator type I that is a tagged union that contains either an iterator or a sentinel. When an iterator is compared to a sentinel, it behaves semantically as if both the iterator and the sentinel were first converted to two objects of type I — call them lhs and rhs — and then compared according to the following truth table:

lhs is sentinel?   rhs is sentinel?   lhs == rhs?
true               true               true
true               false              done(rhs.iter)
false              true               done(lhs.iter)
false              false              lhs.iter == rhs.iter

If you’ve been following this series, the above truth table should ring a bell. It’s pretty much exactly the table we got when figuring out how c_string_range::iterator’s equality operator should behave, and that’s no coincidence; that was a special case of this more general construction. This construction validates an intuition you might have after seeing the two classes I wrote, c_string_range and c_string_iterable. One is a pair of iterators, the other an iterator/sentinel pair, but they implement equivalent procedures for computing equality. We know they’re the same, and we feel in our guts we could build an equivalent Range out of every Iterable if we’re willing to sacrifice some performance. And now we know that’s true.
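The tagged-union construction can be sketched for the C-string case as follows. This is a hypothetical illustration of the truth table, not code from the library: `common_iterator`, `from_iterator`, and `from_sentinel` are invented names, and `done` is assumed to mean "points at the terminator".

```cpp
#include <cassert>

// Hypothetical common type: holds either a real iterator or the sentinel.
struct common_iterator {
    bool is_sentinel;
    char const* iter;  // meaningful only when is_sentinel is false
};

// Assumption: an iterator is "done" when it points at the null terminator.
inline bool done(char const* p) { return *p == '\0'; }

inline common_iterator from_iterator(char const* p) { return {false, p}; }
inline common_iterator from_sentinel() { return {true, nullptr}; }

// One branch per row of the truth table.
inline bool operator==(common_iterator const& lhs, common_iterator const& rhs) {
    if (lhs.is_sentinel && rhs.is_sentinel) return true;
    if (lhs.is_sentinel) return done(rhs.iter);
    if (rhs.is_sentinel) return done(lhs.iter);
    return lhs.iter == rhs.iter;
}
```

Comparing a raw iterator against a distinctly typed sentinel computes the same answer as the third row, but the row is selected by overload resolution at compile time instead of by inspecting a tag at run time.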

Allowing direct comparison of iterators and sentinels lets us use the C++ type system to optimize a large category of iterations by eliminating branches from the equality comparison operator.

Objections

The idea of allowing begin and end iterators to have different types is not new, and it’s not mine. (In fact, many of you who have commented on the first two posts, either here or on reddit.com, have made precisely this suggestion.) I first heard about it from Dave Abrahams years ago. More recently, Dietmar Kuehl floated a similar idea on the Ranges mailing list. Sean Parent raised the following objection in a follow-up message:

Again, I think this is putting too much on iterators. Algorithms that act with a sentinel termination or with a count are different beasts. See copy_n() and copy_sentinel()

http://stlab.adobe.com/copy_8hpp.html

For ranges – I certainly thing you should be able to construct a range with:

  1. a pair of iterators
  2. an iterator and a count
  3. an iterator and a sentinel value

In which case, copy(r, out) can dispatch to the correct algorithm.

If I’m understanding Sean correctly, he is arguing for three parallel range concept hierarchies: IteratorRange, CountedRange, and SentinelRange. These hierarchies would have no refinement relationships between them. The copy algorithm would have three underlying implementations, one for each concept hierarchy. There are 50-some-odd algorithms that would need to be triplicated in this way. That’s a lot of code duplication.

In fact, it’s worse than that because some algorithms are specialized to take advantage of more refined concepts. For instance, in libc++, the rotate algorithm dispatches to one of three implementations depending on whether you pass it forward, bidirectional, or random-access iterators. To accommodate Iterator, Counted, and SentinelRanges, we would need a grand total of 9 rotate algorithm implementations! I have nothing but respect for Sean Parent, but that’s madness. With the Iterable concept, Sean’s three separate hierarchies get unified under a single syntax that lets us write general algorithms while preserving performance characteristics. In other words, with Iterables, 3 implementations of rotate suffice.

(Incidentally, the Iterable concept can neatly accommodate counted ranges. If you want to turn an iterator and a count into an Iterable, you can bundle the iterator and the count together into a new iterator type that decrements the count whenever the iterator is incremented. When comparing the iterator to the sentinel, it merely checks whether the count is zero.)
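A minimal sketch of that counted construction, with hypothetical names (`counted_iter`, `count_sentinel`, `sum`) that are not taken from range-v3:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical iterator that bundles an underlying iterator with a count.
template <class Iterator>
struct counted_iter {
    Iterator it;
    std::ptrdiff_t remaining;

    decltype(auto) operator*() const { return *it; }

    // Incrementing advances the iterator and decrements the count.
    counted_iter& operator++() {
        ++it;
        --remaining;
        return *this;
    }
};

// Hypothetical sentinel: equal to any counted_iter whose count is zero.
struct count_sentinel {};

template <class Iterator>
bool operator==(counted_iter<Iterator> const& i, count_sentinel) {
    return i.remaining == 0;
}
template <class Iterator>
bool operator!=(counted_iter<Iterator> const& i, count_sentinel) {
    return i.remaining != 0;
}

// A generic iterator/sentinel algorithm; it knows nothing about counts.
template <class Iterator, class Sentinel>
int sum(Iterator first, Sentinel last) {
    int total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}
```

Here `sum` is written once against the iterator/sentinel shape, and the counted case is handled entirely by the types passed in.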

Summary, For Now…

At the beginning of this post I summarized some of the problems with pair-o’-iterator ranges. I showed how a new concept, Iterable, addresses the performance issues, and touched a bit on the issue of range implementation complexity. I haven’t yet covered how the Iterable concept helps with infinite ranges, or how to address the safety issue of passing an infinite range to an algorithm that can’t handle them. This post has run a bit long, so I’ll stop for now and address the other issues in the fourth and final installment. Hopefully, this has given you a few things to think about until then.

If you want to download and play with the code, you can find it in the range-v3 repository on GitHub. I’m happy to take suggestions and bug reports, but please don’t use this code for anything real. It’s untested and still evolving.

Acknowledgements

I’d like to thank Andrew Sutton for helping with the Concepts Lite syntax and also for explaining the requirements of the cross-type EqualityComparable concept and generally improving and formalizing many of the ideas presented here. The article is immeasurably better for his many contributions.

