Testcase: map-reduce
Rust makes it very easy to parallelise data processing, without many of the headaches traditionally associated with such an attempt.
The standard library provides great threading primitives out of the box. These, combined with Rust's concept of Ownership and aliasing rules, automatically prevent data races.
The aliasing rules (one writable reference XOR many readable references) automatically prevent
you from manipulating state that is visible to other threads. (Where synchronisation is needed,
there are synchronisation
primitives like Mutex
es or Channel
s.)
In this example, we will calculate the sum of all digits in a block of numbers. We will do this by parcelling out chunks of the block into different threads. Each thread will sum its tiny block of digits, and subsequently we will sum the intermediate sums produced by each thread.
Note that, although we're passing references across thread boundaries, Rust understands that we're
only passing read-only references, and that thus no unsafety or data races can occur. Also because
the references we're passing have 'static
lifetimes, Rust understands that our data won't be
destroyed while these threads are still running. (When you need to share non-static
data between
threads, you can use a smart pointer like Arc
to keep the data alive and avoid non-static
lifetimes.)
we're move
-ing the data segments into the thread, Rust will also ensure the data is kept alive
until the threads exit, so no dangling pointers occur.
Assignments
It is not wise to let our number of threads depend on user inputted data. What if the user decides to insert a lot of spaces? Do we really want to spawn 2,000 threads? Modify the program so that the data is always chunked into a limited number of chunks, defined by a static constant at the beginning of the program.
参照
- Threads
- vectors and iterators
- closures, move semantics and
move
closures - destructuring assignments
- turbofish notation to help type inference
- unwrap vs. expect
- enumerate