Document Number: | P0574r1 |
Date: | 2017-03-02 |
Author: | Anthony
Williams Just Software Solutions Ltd |
Reply-To: | Anthony
Williams Just Software Solutions Ltd |
Bryce Adelstein Lelbach Lawrence Berkeley National Laboratory | |
Audience: | SG1, LWG |
This paper addresses NB comments CH 10.
The C++17 draft features overloads of a lot of the standard library
algorithms which take an ExecutionPolicy
argument, in order to
allow parallel implementations. However, the requirements of many of the
algorithms are written in such a way as to require a sequential
implementation, or a sub-optimal parallel implementation, thus eliminating
some or all of the benefit in the new overloads.
P0523r1 attempts to address this
problem by a blanket relaxation of the constraints for overloads with
an ExecutionPolicy
. I think this is an incorrect solution. It
simultaneously grants too much leeway for some algorithms, while not fixing
the problems with others. Instead, I think it is better to address each
algorithm individually.
In this paper I have gathered those algorithms where the requirements are specified in such a way as to rule out an efficient parallel implementation, and proposed alternative specifications for those algorithms.
This paper has been updated since P0574r0 to incorporate changes suggested by SG1 at Kona 2017. Notably:
std::execution::sequential_policy
is no longer special;
it is treated as any other ExecutionPolicy
from the point
of view of complexity constraints.exclusive_scan
,
inclusive_scan
, transform_exclusive_scan
and transform_inclusive_scan
.It has also been updated to incorporate changes suggested by LWG at Kona 2017, notably:
BinaryOperation
, etc are function objects.last
iterator.unique_copy
, unique_copy_if
and partition_copy
to P0467r2.adjacent_difference
and move it to P0467r2.Modify the complexity for adjacent_find
paragraph 2 as follows:
2 Complexity: For the overloads with noExecutionPolicy
For a nonempty range, exactlymin((i - first) + 1, (last - first) - 1)
applications of the corresponding predicate, wherei
isadjacent_find
's return value. For the overloads with anExecutionPolicy
, O(last - first
) applications of the corresponding predicate.
Modify the requirements for copy_if
in paragraph 12 as follows:
12 Requires: The ranges[first, last)
and[result, result + (last - first))
shall not overlap. [Note: For the overloads with anExecutionPolicy
, there may be a performance cost ifiterator_traits<ForwardIterator1>::value_type
is notMoveConstructible
. — end note]
Modify the requirements for remove_copy
and remove_copy_if
in paragraph 7 as follows:
7 Requires: The ranges[first, last)
and[result, result + (last - first))
shall not overlap. The expression*result = *first
shall be valid. [Note: For the overloads with anExecutionPolicy
, there may be a performance cost ifiterator_traits<ForwardIterator1>::value_type
is notMoveConstructible
. — end note]
Modify the complexity for nth_element
in paragraph 3 as follows:
3 Complexity: For the overloads with noExecutionPolicy
, lLinear on average. For the overloads with anExecutionPolicy
, O(N) applications of the predicate, and O(N log N) swaps, where N =last - first
.
Modify the complexity for partition
in paragraph 7 as follows:
7 Complexity:IfLet N =ForwardIterator
meets the requirements for aBidirectionalIterator
, at most(last - first) / 2
swaps are done; otherwise at mostlast - first
swaps are done. Exactlylast - first
applications of the predicate are done.last - first
:
- For the overloads with no
ExecutionPolicy
, exactly N applications of the predicate. At most N / 2 swaps ifForwardIterator
meets theBidirectionalIterator
requirements and at most N swaps otherwise.- For the overloads with an
ExecutionPolicy
, O(N log N) swaps and O(N) applications of the predicate.
Modify the complexity for stable_partition
in paragraph 11 as follows:
11 Complexity:At most N log(N) swaps, where N =Let N =last - first
, but only O(N) swaps if there is enough extra memory. Exactlylast - first
applications of the predicate.last - first
:
- For the overloads with no
ExecutionPolicy
, at most N log N swaps, but only O(N) swaps if there is enough extra memory. Exactly N applications of the predicate.- For the overloads with an
ExecutionPolicy
, O(N log N) swaps and O(N) applications of the predicate.
Modify the complexity for merge
in paragraph 4 as follows:
4 Complexity: For the overloads with noExecutionPolicy
, aAt most(last1 - first1) + (last2 - first2) - 1
comparisons. For the overloads with anExecutionPolicy
, O((last1 - first1) + (last2 - first2)
) comparisons.
Modify the complexity for inplace_merge
in paragraph 8 as follows:
8 Complexity:When enough additional memory is available,Let N =(last - first) - 1
comparisons. If no additional memory is available, an algorithm with complexity N log(N) may be used, where N =last - first
.last - first
:
- For the overloads with no
ExecutionPolicy
, if enough additional memory is available, exactly N- 1
comparisons.- For the overloads with no
ExecutionPolicy
if no additional memory is available, and for the overloads with anExecutionPolicy
, O(N log N) comparisons.
<numeric>
synopsis [numeric.ops.overview]Add the following paragraph after paragraph 1:
1 The requirements on the types of algorithms' arguments that are described in the introduction to Clause 25 also apply to the following algorithms.
2 Throughout this Clause, the
UnaryOperation
,BinaryOperation
,BinaryOperation1
andBinaryOperation2
parameters are used whenever an algorithm expects a function object (20.14).
Modify the requirements for reduce
in paragraph 5 as follows:
5 Requires:
T
shall beMoveConstructible
(Table 23).- All of
binary_op(init, *first)
,binary_op(*first, init)
,binary_op(init, init)
, andbinary_op(*first, *first)
shall be convertible toT
.binary_op
shall neither invalidate iterators or subranges, nor modify elements in the range[first, last
.)]
Modify the requirements for transform_reduce
in paragraph 1 as follows:
1 Requires:
T
shall beMoveConstructible
(Table 23).- All of
binary_op(init, unary_op(*first))
,binary_op(unary_op(*first), init)
,binary_op(init, init)
, andbinary_op(unary_op(*first), unary_op(*first))
shall be convertible toT
.- Neither
unary_op
norbinary_op
shall invalidate subranges, or modify elements in the range[first, last
.)]
Modify the requirements for exclusive_scan
in paragraph 3 as follows:
3 Requires:
T
shall beMoveConstructible
(Table 23).- All of
binary_op(init, *first)
,binary_op(init, init)
, andbinary_op(*first, *first)
shall be convertible toT
.binary_op
shall neither invalidate iterators or subranges, nor modify elements in the ranges[first, last
or)][result, result + (last - first)
.)]
Modify the requirements for inclusive_scan
in paragraph 3 as follows:
3 Requires:
- If
init
is provided,T
shall beMoveConstructible
(Table 23); otherwise,ForwardIterator1
's value type shall beMoveConstructible
(Table 23).- If
init
is provided, all ofbinary_op(init, *first)
,binary_op(init, init)
, andbinary_op(*first, *first)
shall be convertible toT
; otherwise,binary_op(*first, *first)
shall be convertible toForwardIterator1
's value type.binary_op
shall neither invalidate iterators or subranges, nor modify elements in the ranges[first, last
or)][result, result + (last - first)
.)]
Modify the requirements for transform_exclusive_scan
in paragraph 1 as follows:
1 Requires:
T
shall beMoveConstructible
(Table 23).- All of
binary_op(init, unary_op(*first))
,binary_op(init, init)
, andbinary_op(unary_op(*first), unary_op(*first))
shall be convertible toT
.- Neither
unary_op
norbinary_op
shall invalidate subranges, or modify elements in the ranges[first, last
or)][result, result + (last - first)
.)]
Modify the requirements for transform_inclusive_scan
in paragraph 1 as follows:
1 Requires:Direction to the Editor: Add a footnote to each use of fully-closed ranges in the changes above that apply to 26.8 [numeric.ops] that says "The use of fully closed ranges is intentional".
- If
init
is provided,T
shall beMoveConstructible
(Table 23); otherwise,ForwardIterator1
's value type shall beMoveConstructible
(Table 23).- If
init
is provided, all ofbinary_op(init, unary_op(*first))
,binary_op(init, init)
, andbinary_op(unary_op(*first), unary_op(*first))
shall be convertible toT
; otherwise,binary_op(unary_op(*first), unary_op(*first))
shall be convertible toForwardIterator1
's value type.- Neither
unary_op
norbinary_op
shall invalidate iterators or subranges, or modify elements in the ranges[first, last
or)][result, result + (last - first)
.)]