Iterator::fold is a little slow compared to bare loop

## Background

I was writing a [small project to compare the memory impact of iterators](https://github.com/mlodato517/iterator_comparison). I decided to also investigate the runtime impact and see how close Rust came to providing zero-cost abstractions for iteration. I compared three types of iteration:
```rust
fn filter_map_filter(nums: &[u64]) -> Vec<u64> {
  nums.iter().filter(...).map(...).filter(...).collect()
}

fn fold(nums: &[u64]) -> Vec<u64> {
  nums.iter().fold(Vec::new(), ...)
}

fn raw(nums: &[u64]) -> Vec<u64> {
  let mut result = Vec::new();
  for n in nums { ... }
  result
}
```
and saw this Criterion output plot:
<img width="1158" alt="criterion-plot" src="https://user-images.githubusercontent.com/18740355/93150001-fe14fd00-f6c5-11ea-83ab-8e3b5dc82382.png">

The outliers to the right are the timings of the `fold` functions, while the two pairs on the left are the `filter_map_filter` and `raw` versions. Rust did a great job ensuring `.filter().map().filter()` was the same speed as a raw loop, but `.fold()` was noticeably slower.
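
For readers who want something concrete to run, here is a minimal sketch of the three shapes being compared. The closures are elided in the snippet above, so the predicates and the mapping here (keep evens, triple, keep multiples of four) are hypothetical placeholders, not the ones used in the benchmark repo.

```rust
// Hypothetical closures: the real benchmark's predicates are not shown in this issue.

/// Iterator-adapter chain.
fn filter_map_filter(nums: &[u64]) -> Vec<u64> {
    nums.iter()
        .filter(|&&n| n % 2 == 0) // keep even numbers
        .map(|&n| n * 3) // triple them
        .filter(|&n| n % 4 == 0) // keep multiples of four
        .collect()
}

/// Same logic with `fold`: the accumulator is moved into the closure
/// and returned from it on every iteration.
fn fold(nums: &[u64]) -> Vec<u64> {
    nums.iter().fold(Vec::new(), |mut acc, &n| {
        if n % 2 == 0 {
            let tripled = n * 3;
            if tripled % 4 == 0 {
                acc.push(tripled);
            }
        }
        acc
    })
}

/// Same logic with a bare `for` loop.
fn raw(nums: &[u64]) -> Vec<u64> {
    let mut result = Vec::new();
    for &n in nums {
        if n % 2 == 0 {
            let tripled = n * 3;
            if tripled % 4 == 0 {
                result.push(tripled);
            }
        }
    }
    result
}
```

All three should return the same `Vec` for the same input; only the way the accumulator is threaded through the iteration differs.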

## Quick Investigation

Looking at the [source for `.fold`](https://doc.rust-lang.org/src/core/iter/traits/iterator.rs.html#2015) the `accum` is reassigned with the result of each invocation of `f`. I quickly tested if this could be improved with a `&mut` instead in [this PR](https://github.com/mlodato517/iterator_comparison/pull/2). The result was surprising (to me):

<img width="1248" alt="criterion-plot-mut-ref" src="https://user-images.githubusercontent.com/18740355/93150335-db371880-f6c6-11ea-9821-f26cc159d86e.png">

The "custom fold" method was faster than all the other options (which doesn't make a ton of sense to me but that's what y'all are here for!).
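
As a rough illustration of the idea, a fold that hands the closure a `&mut` to the accumulator might look like the sketch below. The free function `fold_mut` and the helper `collect_evens` are hypothetical names used here for illustration; this is not necessarily the code in the linked PR or branch.

```rust
/// Hypothetical `fold_mut`: like `Iterator::fold`, but the closure mutates
/// the accumulator in place instead of taking it by value and returning it.
fn fold_mut<I, B, F>(iter: I, init: B, mut f: F) -> B
where
    I: Iterator,
    F: FnMut(&mut B, I::Item),
{
    let mut accum = init;
    for x in iter {
        // No reassignment of `accum`: the closure only borrows it.
        f(&mut accum, x);
    }
    accum
}

/// Example call site (hypothetical): collect the even numbers.
fn collect_evens(nums: &[u64]) -> Vec<u64> {
    fold_mut(nums.iter(), Vec::new(), |v, &n| {
        if n % 2 == 0 {
            v.push(n);
        }
    })
}
```

The closure no longer has to return the accumulator, which is the main difference from `fold` at the call site. Whether this shape is actually faster would need to be measured per use case.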

## Path Forward

I initially was going to suggest adding some sort of `fold_mut` or some better named method to allow for this faster `fold` iterator. This could be a performance improvement in some areas and could also improve the syntax when the closure couldn't "easily" return the new accumulator:
```rust
iter.fold(Vec::new(), |mut v, x| { // `mut` is required in order to call `push`
  v.push(x);
  v // this line is a little weird
})
```

I made [a branch for this](https://github.com/rust-lang/rust/compare/master...mlodato517:ml-fold-mut?expand=1) if we want to head in that direction. It's only a start: the tests are slim, the benchmarks are probably overkill, the stability attribute is missing, and the docs are probably slim and improperly formatted. Still, I saw some improvements in the benchmarks I added:

![benchmarks](https://user-images.githubusercontent.com/18740355/93152131-dcb70f80-f6cb-11ea-8582-873a8a307e6a.png)


Now I'm not sure if this "`fold_mut`" path is the right way to go - I'm not sure if it's awkward or dangerous. It seems similar to Ruby's [`each_with_object`](https://ruby-doc.org/core-2.4.1/Enumerable.html#method-i-each_with_object) so there's maybe something there. It could also be that with some compiler witchcraft we can just make `fold` a "true" zero cost abstraction.

In any case, I thought I'd post here instead of making a PR so we could decide if there _should_ be any PR. I'm happy to help with whatever path forward we choose!
