A Visual Introduction to Rust
This is a short introduction to the Rust programming language, intended for programmers with some C or C++ experience.
The tutorial makes use of an interactive visualization system for Rust code being developed by FP Lab. You should read this tutorial before completing the remaining questions, Q2 and Q3.
Motivation
C and C++ are popular languages for low-level systems programming because they give programmers direct control over memory allocation and deallocation. However, these languages are not memory safe. Programs can crash or exhibit security vulnerabilities due to memory-related bugs, such as use-after-free bugs. Programs can also have memory leaks, which occur when memory is not freed even when it is no longer needed.
Most other popular languages are memory safe, but this comes at the cost of run-time performance: they rely on a run-time garbage collector to automatically free memory when it is no longer being used. The overhead of garbage collection can be significant for performance critical tasks.
Rust is designed to be the best of both worlds: it is memory safe without the need for a garbage collector. Instead, it relies on a compile-time ownership and borrowing system to automatically determine when memory can be freed.
The trade-off is that Rust's ownership and borrowing system can be difficult to learn. The purpose of this tutorial is to help you learn Rust's ownership and borrowing system visually.
For example, this tutorial will help you understand code like the following. Hover over the different components of the visualization to see explanations. Don't worry yet about what is going on in detail—these concepts will be explained in this tutorial.
Research Disclosure
Your exercise answers and logs of your interactions with this tool might be used for research purposes. All data used for research purposes will be anonymized: your identity will not be connected to this data. If you wish to opt out, you can contact the instructor (comar@umich.edu) at any time up to seven days after final grades have been issued. Opting out has no impact on your grade.
Rust Basics
Main Function
In every Rust program, the main
function executes first:
fn main() {
    // code here will run first
}
Variables
In Rust, we use let
bindings to introduce variables. Variables are immutable
or mutable.
Immutable Variables
By default, variables are immutable in Rust. This means that once a value is
bound to the variable, the binding cannot be changed. We use let
bindings to
introduce immutable variables as follows:
fn main() { let x = 5; }
In this example, we introduce a variable x
of type i32
(a 32-bit signed
integer type) and bind the value 5
to it.
You cannot assign to an immutable variable. So the following example causes a compiler error:
fn main() {
    let x = 5;
    x = 6; // ERROR: cannot assign twice to immutable variable x
}
Mutable Variables
If you want to be able to assign to a variable, it must be marked as mutable
with let mut
:
fn main() {
    let mut x = 5;
    x = 6; // OK
}
Copies
For simple types like integers, binding and assignment create a copy.
For example, we can bind the value 5
to x
and then bind y
with a copy of x
:
fn main() { let x = 5; let y = x; }
Copying occurs only for simple types like i32
and other types that
have been marked as copyable (they implement the Copy
trait -- we will not
discuss traits here).
We will discuss how more interesting data
structures that are not copyable behave differently in later sections
of the tutorial.
Functions
Besides main
, we can define additional functions. In the following example, we
define a function called plus_one
which takes an i32
as input and returns an
i32
value that is one more than the input:
fn main() {
    let six = plus_one(5);
}

fn plus_one(x: i32) -> i32 {
    x + 1
}
Notice how there is no explicit return. In Rust, if the last expression in the
function body does not end in a semicolon, it is the return value. (Rust also
has a return
keyword, but we do not use it here.)
Printing to the Terminal
In Rust, we can print to the terminal using println!
:
fn main() { println!("Hello, world!") }
This code prints Hello, world!
to the terminal, followed by a newline
character.
We can also use curly brackets in the input string of println!
as a
placeholder for subsequent arguments:
fn main() {
    let x = 1;
    let y = 2;
    println!("x = {} and y = {}", x, y);
}
This prints x = 1 and y = 2
.
Note that the !
at the end of println!
indicates that it is a macro, not a
function. It behaves slightly differently from normal functions, but you do not
need to worry about the details here.
Ownership
In the previous section, we considered only simple values, like integers. However, in real-world Rust programs, we work with more complex data structures that allocate resources on the heap. When we allocate resources, we need a strategy for de-allocating these resources. Most programming languages use one of two strategies:
- Manual Deallocation (C, C++): The programmer is responsible for explicitly deallocating memory, e.g. using free in C or delete in C++. This is performant but can result in critical memory safety issues such as use-after-free bugs, double-free bugs, and memory leaks. These can cause crashes, memory corruption, and security vulnerabilities. In fact, about 70% of security bugs in major software products like Windows and Chrome are due to memory safety issues.
- Garbage Collection (OCaml, Java, Python, etc.): The programmer does not have to explicitly deallocate memory. Instead, a garbage collector frees (deallocates) memory by doing a dynamic analysis that detects when no further references to the data remain live. This prevents memory safety bugs. However, a garbage collector can sometimes incur substantial run-time performance overhead.
Rust uses a third strategy—a static (i.e. compile-time) ownership system. Because this is a purely compile-time mechanism, it achieves memory safety without the performance overhead of garbage collection!
The key idea is that each resource in memory has a unique owner, which controls access to that resource. When the owner's lifetime ends (it "dies"), e.g. by going out of scope, the resource is deallocated (in Rust, we say that the resource is dropped.)
Heap-Allocated Strings
For example, heap-allocated strings, of type String
, are managed by Rust's ownership system.
Consider the following example, which constructs a heap-allocated string and
prints it out.
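A minimal program matching this description might look like the following sketch. The exact code is an assumption based on the surrounding prose (the variable name s and the literal "hello"); the comments mark the line the text refers to.

```rust
fn main() {
    let s = String::from("hello"); // Line 2: ownership of the new String moves to s
    println!("{}", s); // printing does not move ownership
} // s goes out of scope here, so the string is dropped
```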
This code prints hello
.
The String::from
function allocates a String
on the heap. The String
is
initialized from a provided string literal (string literals themselves have a
more primitive type, &str
, but that detail is not important here.) Ownership
of this string resource is moved to the variable s
(of type String
) when
String::from
returns on Line 2.
The println!
macro does not cause a change in ownership (we say more about
println!
later.)
At the end of the main
function, the variable s
goes out of scope. It has
ownership of the string resource, so Rust will drop, i.e. deallocate, the
resource at this point. We do not need an explicit free
or delete
like we
would in C or C++, nor is there any run-time garbage collection overhead.
Hover over the lines and arrows in the visualization next to the code example above to see a description of the events that occur on each line of code.
Moves
In the example above, we saw that ownership of the heap-allocated string moved
to the caller when String::from
returned. This is one of several ways in which
ownership of a resource can move. We will now consider each situation in
more detail.
Binding
Ownership can be moved when initializing a binding with a variable.
In the following example, we define a variable x
that owns a String
resource. Then, we define another variable, y
, initialized with x
. This
causes ownership of the string resource to be moved from x
to y
. Note that
this behavior is different from the copying behavior for simple types like
integers that we discussed in the previous section.
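A sketch of such a program, assuming the variable names x and y from the description and the literal "hello":

```rust
fn main() {
    let x = String::from("hello"); // x owns the string
    let y = x; // ownership moves from x to y; x can no longer be used
    println!("{}", y);
} // y goes out of scope and the string is dropped; x owns nothing
```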
This code prints hello
.
At the end of the function, both x
and y
go out of scope (their lifetimes
have ended). x
does not own a resource anymore, so nothing special happens.
y
does own a resource, so its resource is dropped. Hover over the
visualization to see how this works.
Each resource must have a unique owner, so x
will no longer own the String
resource after it is moved to y
. This means that access to the resource
through x
is no longer possible. Think of it like handing a resource to
another person: you no longer have access to it once it has moved. For
example, the following generates a compiler error:
fn main() {
    let x = String::from("hello");
    let y = x;
    println!("{}", x) // ERROR: x does not own a resource
}
The compiler error actually says borrow of moved value: x
(we will discuss what
borrow means in the next section.)
If we move to a variable that has a different scope, e.g. due to curly braces,
then you can see by
hovering over the visualization that the resource is dropped at the end of y
's
scope rather than at the end of x
's scope.
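One way to write such a program is sketched below. The inner block and the second println! are assumptions chosen to match the output described next.

```rust
fn main() {
    let x = String::from("hello");
    {
        let y = x; // ownership moves into the inner scope
        println!("{}", y);
    } // y's scope ends here, so the string is dropped now
    println!("Hello, world!"); // the string no longer exists at this point
}
```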
This code prints hello
on one line and Hello, world!
on the next.
Assignment
As with binding, ownership can be moved by assignment to a mutable variable,
e.g. y
in the following example.
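A sketch consistent with the line numbers referenced below. The literal "test" and the final println! are assumptions added for illustration.

```rust
fn main() {
    let x = String::from("hello");
    let mut y = String::from("test"); // Line 3: y acquires its first resource
    y = x; // Line 4: the "test" string is dropped; y now owns "hello"
    println!("{}", y);
}
```

The compiler may warn that the initial value of y is never read; that warning is expected here and does not affect the ownership behavior.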
When y
acquires ownership over x
's resource on Line 4, the resource it
previously acquired (on Line 3) no longer has an owner, so it is dropped.
Function Call
Ownership is moved into a function when it is called with a resource argument.
As an example,
below we see that ownership of the string resource in main
is moved from s
to the takes_ownership
function. Consequently, when s
goes out of scope at
the end of main
, there is no owned string resource to be dropped.
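A sketch of this program, using the names s, takes_ownership, and some_string from the text (the function body is an assumption chosen to match the described output):

```rust
fn main() {
    let s = String::from("hello");
    takes_ownership(s); // ownership moves from s into the function
    // s cannot be used here: it no longer owns the string
}

fn takes_ownership(some_string: String) {
    println!("{}", some_string);
} // some_string goes out of scope, so the string is dropped here
```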
This code prints hello
.
From the perspective of takes_ownership
, it can be assumed that the argument
variable some_string
will receive ownership of a String
resource from the
caller (each time it is called). The argument variable some_string
goes out of
scope at the end of the function, so the resource that it owns is dropped at
that point.
Return
Finally, ownership can be returned from a function.
In the following example, f
allocates a String
and returns it to the
caller. Ownership is moved from x
to the caller, so there is no owned resource
to be dropped at the end of f
. Instead, the resource is dropped when the new
owner, s
, goes out of scope at the end of main
. (If the String
were
dropped at the end of f
, there would be a use-after-free bug in main
on Line
9!)
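A sketch laid out so that the use of s in main falls on Line 9, as the text describes (the placeholder comment inside f is an assumption used to keep that numbering):

```rust
fn f() -> String {
    let x = String::from("hello");
    // ... other work with x could go here
    x // no semicolon: ownership of the string moves to the caller
}

fn main() {
    let s = f(); // s becomes the new owner
    println!("{}", s); // Line 9: still valid, because f did not drop the string
}
```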
This code prints hello
.
Borrowing
In the previous section, we learned that each resource has a unique owner. Ownership can be moved—for example, into a function.
In many situations, however, we do not want to permanently move a resource into a function. Instead, we want to retain ownership but allow the function to temporarily access the resource while it executes.
We could accomplish this by having each function agree to return resources of this
sort. For
example, take_and_return_ownership
below takes ownership of a string
resource and returns ownership of that exact same resource. The caller, main
,
assigns the returned resource to the same variable, s
.
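A sketch of this pattern. The parameter name some_string and the function body are assumptions; only the function name and s come from the text.

```rust
fn main() {
    let mut s = String::from("hello");
    s = take_and_return_ownership(s); // move in, then move back out
    println!("{}", s);
}

fn take_and_return_ownership(some_string: String) -> String {
    println!("{}", some_string);
    some_string // hand ownership back to the caller
}
```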
This code prints hello
twice.
The type of
take_and_return_ownership
does not guarantee that the returned resource is the
same as the provided resource. Instead, the programmer has to trust that it returns
the same resource.
As code becomes more complex, this pattern of returning all of the provided resources explicitly becomes both syntactically and semantically unwieldy.
Fortunately, Rust offers a powerful solution: passing in arguments via a reference. Taking a reference does not change the owner of a resource. Instead, the reference simply borrows access to the resource temporarily. Rust's borrow checker requires that references to resources do not outlive their owner, to avoid the possibility of there being references to resources that the ownership system has decided can be dropped.
There are two kinds of borrows in Rust, immutable borrows and mutable borrows. These differ in how much access to the resource they provide.
Immutable Borrows
In the following example, we define a function, f
, that takes an immutable
reference to a String
, which has type &String
, as input. It then de-references
the immutable reference, written *s
, in order to print it.
When the main
function calls f
, it must provide a reference to a String
as
an argument. Here, we do so by taking a reference to the let-bound variable x
on Line 3, written &x
. Taking a reference does not cause a change in
ownership, so x
still owns the string resource in the remainder of main
and it can, for example, print x
on Line 4. The resource will be dropped when
x
goes out of scope at the end of main
as we discussed previously. Because f
takes a reference, it is only borrowing access to the resource that the
reference points to. It does not need to explicitly return the resource because
it does not own it. Rust knows that the borrow does not outlive the owner
because the borrow is no longer accessible after f
returns.
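A sketch matching the description above, with the call to f on Line 3 and the second print on Line 4 (the literal "hello" is assumed from the stated output):

```rust
fn main() {
    let x = String::from("hello");
    f(&x); // Line 3: lend x to f; ownership stays with x
    println!("{}", x); // Line 4: x is still usable
}

fn f(s: &String) {
    println!("{}", *s); // explicit dereference of the immutable reference
}
```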
This code prints hello
twice.
Note: you do not actually need to dereference s
to pass it to println!
in Rust:
it is a macro, so it will automatically dereference or borrow as needed
to ensure that a move is not needed. Indeed, Rust does a lot of implicit
borrowing and dereferencing to make its syntax simple, as we will see in other examples
below.
Methods of the String
type, like len
for computing the length, typically
take their arguments by reference. You can call a method explicitly with a
reference, e.g. String::len(&s)
. As shorthand, you can use dot notation to
call a method, e.g. s.len()
. This implicitly takes a reference to s
.
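The two call styles can be compared side by side in a sketch like this one (the variable names len1 and len2 are taken from the output described next; the rest is assumed):

```rust
fn main() {
    let s = String::from("hello");
    let len1 = String::len(&s); // explicit reference
    let len2 = s.len(); // dot notation borrows s implicitly
    println!("len1 = {} = len2 = {}", len1, len2);
}
```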
This code prints len1 = 5 = len2 = 5
.
You can keep multiple immutable borrows live at the same time, e.g. y
and z
in the following example are both live as shown in the visualization. For this
reason, immutable borrows are also sometimes called shared borrows: each
immutable reference shares access to the resource with the owner and with any
other immutable references that might be live.
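A sketch with two immutable borrows live at once. The helper f and its parameter names are hypothetical; y and z come from the text.

```rust
fn main() {
    let x = String::from("hello");
    let y = &x; // first shared borrow
    let z = &x; // second shared borrow, allowed while y is live
    f(y, z);
}

fn f(s1: &String, s2: &String) {
    println!("{} and {}", s1, s2);
}
```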
This code prints hello and hello
.
Ownership of a resource cannot be moved while it is borrowed. For example, the following is erroneous:
fn main() {
    let s = String::from("hello");
    let x = &s;
    let s2 = s; // ERROR: cannot move s while a borrow is live
    println!("{}", String::len(x));
}
The compiler error here is: cannot move out of s because it is borrowed
.
Mutable Borrows
Unlike immutable borrows, Rust's mutable borrows allow you to mutate the
borrowed resource. In the example below, we push (copy) the contents of a String
s2
to the end of the heap-allocated String
s1
twice, first by explicitly calling
the String::push_str
method, and then using the equivalent shorthand method
call syntax. In both cases, the method takes a mutable reference to s1
,
written explicitly &mut s1
.
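A sketch of this example, assuming s1 starts as "Hello" and s2 holds ", world" (values inferred from the output stated next):

```rust
fn main() {
    let mut s1 = String::from("Hello");
    let s2 = String::from(", world");
    String::push_str(&mut s1, &s2); // explicit mutable borrow of s1
    s1.push_str(&s2); // shorthand: s1 is mutably borrowed implicitly
    println!("{}", s1);
}
```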
This code prints Hello, world, world
.
Code that does a lot of mutation is notoriously difficult to reason about, so in Rust, mutation is much more carefully controlled than in other imperative languages.
First, you can only take a mutable borrow from a mutable variable, i.e. one
bound using let mut
like s1
in the example above. Immutability is the
default in Rust because it is considered easier to reason about.
Second, mutable borrows are unique—you cannot take a borrow, mutable or immutable, if any mutable borrow is live. This means that you can be certain that no other code will be mutating a resource when you have mutably borrowed it. For this reason, mutable borrows are also sometimes called unique borrows.
For example, the following code is erroneous because a mutable borrow, y
, is
live.
fn main() {
    let mut x = String::from("hello");
    let y = &mut x;
    f(&x); // ERROR: y is still live
    String::push_str(y, ", world");
}

fn f(x: &String) {
    println!("{}", x);
}
The compiler error here is: cannot borrow x as immutable because it is also borrowed as mutable
.
The following code is erroneous for the same reason.
fn main() {
    let mut x = String::from("Hello");
    let y = &mut x;
    let z = &mut x; // ERROR: y is still live
    String::push_str(y, ", world");
    String::push_str(z, ", friend");
    println!("{}", x);
}
The compiler error here is: cannot borrow x as mutable more than once at a time
.
Optional: Threading in Rust
In the example above, the two calls to push_str
are sequenced. However, if we
wanted to execute them concurrently, we could do so by spawning a thread as
follows. Here, || { e }
is Rust's notation for an anonymous function taking
unit input.
use std::thread;

fn main() {
    let mut x = String::from("Hello");
    let y = &mut x;
    let z = &mut x; // NOT OK: y is still live
    thread::spawn(|| {
        String::push_str(y, ", world");
    });
    String::push_str(z, ", friend");
    println!("{}", x);
}
If the borrow checker did not stop us, this program would have a race
condition—it could print either Hello, world, friend
or Hello, friend, world
depending on the interleaving of the main thread and the newly spawned thread.
By tightly controlling mutation, Rust prevents races mediated by shared mutable state.
(The topic of parallelism and concurrency in Rust will be explored further in A9!)
Non-Lexical Lifetimes
Above, we use the phrase "live borrow". A borrow is live if it is in scope and there remain future uses of the borrow. A borrow dies as soon as it is no longer needed. So the following code works, even though there are two mutable borrows in the same scope:
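A sketch of such a program. The helper function world and the variable names are assumptions, chosen so that the output matches the one stated next.

```rust
fn main() {
    let mut x = String::from("Hello");
    let y = &mut x;
    world(y); // last use of y, so this borrow dies here
    let z = &mut x; // OK: y is no longer live
    world(z); // last use of z
    x.push_str("!!"); // OK: neither y nor z is live
    println!("{}", x);
}

// Hypothetical helper that appends to the borrowed string.
fn world(s: &mut String) {
    s.push_str(", world");
}
```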
This code prints Hello, world, world!!
.
Vectors in Rust
The previous sections cover everything you need to know about ownership and borrowing in Rust! This section introduces another interesting data structure: vectors.
As in other languages, the Rust standard library contains many useful
collection types. One of the most useful and common is the vector. Vectors
have type Vec<T>
, where T
is the type that the vector holds.
Vectors are heap-allocated, mutable collections that store multiple values of
the same type contiguously in memory. In many ways, they are similar to C++
vector
s and serve similar purposes.
Vectors are implemented with generics, which allow them to hold any type.
For example, we can have Vec<i32>
and Vec<String>
which are the types of
i32
vectors and String
vectors, respectively. Vectors can hold any
struct
or enum
type as well.
Creating A Vector
Empty Vectors
To make a new empty vector, we can use the Vec::new()
function as follows:
fn main() { let v: Vec<i32> = Vec::new(); }
Here, Vec::new()
creates an empty vector of i32
s and moves ownership to v
.
Note that we included a type annotation on v
. This is necessary here because
otherwise, Rust won't know which type of vector to create.
Creating Vectors from Initial Values
We can also create new vectors with initial values using the vec!
macro:
fn main() { let v = vec![1, 2, 3]; }
Here, we create a new Vec<i32>
containing the values 1
, 2
, and 3
in
that order. Note that in this case, we did not need to include a type annotation
for v
. This is because we are creating the vector with initial values of a
specific type, so Rust can figure out the type of v
in this case.
Reading Elements of Vectors
Accessing an Element at a Particular Index
We can use the indexing syntax or the get()
method to get the value at a
particular index of the vector:
fn main() {
    let v = vec![1, 2, 3];

    let third: &i32 = &v[2];
    println!("The third element is {}", third);

    match v.get(2) {
        Some(third) => println!("The third element is {}", third),
        None => println!("There is no third element."),
    }
}
Here, we use both ways of getting a particular element. The first way is using
the indexing syntax (square brackets), which gives us an immutable reference to
the element. The second way is using the get()
method, which returns an
Option
type.
With the indexing syntax, if we performed an out-of-bounds access in the vector,
the program would panic (i.e. cause an unrecoverable error.) With the get()
method, an out-of-bounds access would result in the method returning None
.
With the get()
method, we can handle out-of-bounds accesses gracefully rather
than causing the program to crash.
Iterating over Elements
We can iterate over elements in a vector with a for
loop to read the values:
fn main() {
    let v = vec![1, 2, 3];
    for i in &v {
        println!("{}", i);
    }
}
Here, we simply read the values of the vector and print them to the terminal.
Note that the for
loop is immutably borrowing v
, as shown by the &v
.
Mutating Vectors
Push
We can add elements to the back of a vector using the push()
method:
fn main() {
    let mut v = Vec::new();
    v.push(1);
    v.push(2);
    v.push(3);
}
This creates an empty vector and adds the values 1
, 2
, and 3
to the back
of the vector in that order. In this case, we did not need a type annotation
because the type is inferred from the values we pushed to it. Note that we made
v
a mutable variable here. If we didn't, the borrow checker would not allow
us to make calls to push()
.
Writing Elements at a Particular Index
We can also write to elements at a particular index in a similar way to how
we read elements at a particular index. We can use the indexing syntax or the
get_mut()
method:
fn main() {
    let mut v = vec![1, 2, 3];

    let second: &mut i32 = &mut v[1];
    *second = 3;

    match v.get_mut(2) {
        Some(third) => *third = 9,
        None => println!("There is no third element."),
    }
}
Here, we use the indexing syntax to get a mutable reference to the second
element and change its value to 3
. We then use the get_mut()
method to get
a mutable reference to the third element and change its value to 9
.
As with the example for reading elements at a particular index, an out-of-bounds
access with the indexing syntax causes a panic
while an out-of-bounds
access with the get_mut()
method returns None
.
Iterating Over Elements
We can iterate over elements in a vector with a for
loop to mutate the values:
fn main() {
    let mut v = vec![1, 2, 3];
    for i in &mut v {
        *i += 1;
    }
}
Here, we add 1
to each of the values in the vector. Note that the for
loop
is mutably borrowing v
, as shown by the &mut v
.
Optional: Structs in Rust
Creating a struct
To define a struct, we enter the keyword struct
and name the entire struct. A struct’s name should describe the significance of the pieces of data being grouped together. Then, inside curly brackets, we define the names and types of the pieces of data, which we call fields. Here is an example showing a struct that stores information about a user account.
struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}
We create an instance by stating the name of the struct and then add curly brackets containing key: value
pairs, where the keys are the names of the fields and the values are the data we want to store in those fields. We can then use dot notation, e.g. user1.email, to read a field or, if the instance is mutable, to assign to it.
fn main() {
    // Assumes the User struct defined above is in scope.
    let mut user1 = User {
        email: String::from("someone@example.com"),
        username: String::from("someusername123"),
        active: true,
        sign_in_count: 1,
    };

    user1.email = String::from("anotheremail@example.com");
}
Each field in a struct can be referenced independently. Here's an example of defining a struct, creating an instance of it, passing it to a function, and referencing the field r.h
.
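A sketch of such an example. The struct Rect, its field w, and the function area are hypothetical; only the instance name r and the field h come from the text.

```rust
struct Rect {
    w: u32,
    h: u32,
}

// Borrows the struct immutably, so the caller keeps ownership.
fn area(rect: &Rect) -> u32 {
    rect.w * rect.h
}

fn main() {
    let r = Rect { w: 30, h: 50 };
    println!("The area of the rectangle is {} square pixels.", area(&r));
    println!("The height is {}.", r.h); // r is still usable after the borrow
}
```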
Calling a method in a struct
A struct can also have methods, which are defined in an impl block for it. To call a method on a struct value, we write object.something() or (&object).something(), which are equivalent. Whether object is a value, a &, or a &mut, we always use . and never need -> as in C or C++, because Rust automatically adds &, &mut, or * so that object matches the signature of the method.
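As an illustration, the following sketch defines a method in an impl block and calls it in several equivalent ways. The struct Rect and the method area are hypothetical names, not taken from the tutorial's visualization.

```rust
struct Rect {
    w: u32,
    h: u32,
}

impl Rect {
    // The method takes self by immutable reference.
    fn area(&self) -> u32 {
        self.w * self.h
    }
}

fn main() {
    let r = Rect { w: 30, h: 50 };
    let a1 = r.area(); // Rust borrows r automatically
    let a2 = (&r).area(); // explicit borrow, same call
    let a3 = Rect::area(&r); // fully explicit form
    println!("{} {} {}", a1, a2, a3);
}
```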
Ownership of struct data
When a struct instance owns all of its fields, i.e. the struct contains no references, ownership works much the same as it does for data outside of a struct. Fields of a struct can themselves own resources. Here's an example in which one of the fields, y, owns a String resource.
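A sketch of this case. The struct name Foo, the field x, and the literal "bar" are assumptions; only the field y owning a string comes from the text.

```rust
struct Foo {
    x: i32,
    y: String,
}

fn main() {
    let s = String::from("bar");
    let f = Foo { x: 5, y: s }; // ownership of the string moves into f.y
    println!("{} {}", f.x, f.y);
} // f goes out of scope, and the string owned by f.y is dropped with it
```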
When a field is a reference, i.e. the data is not owned by the struct, the struct definition needs a lifetime parameter. The annotation ensures that the referenced resource lives at least as long as the struct that refers to it.
Here is an example of using a lifetime annotation <'a> in a struct definition to allow a reference to a string, &p, in a struct Excerpt.
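A sketch of such a definition, assuming that p is a field of type &'a str and that the excerpt is taken from a String owned by main (both are assumptions; the text names only Excerpt, p, and 'a):

```rust
struct Excerpt<'a> {
    p: &'a str, // borrowed, not owned: the data must outlive the struct
}

fn main() {
    let n = String::from("Ok. I'm fine.");
    // Borrow the text before the first '.' from n.
    let first = n.split('.').next().expect("no '.' found");
    let e = Excerpt { p: first };
    println!("{}", e.p);
} // e is dropped before n, so the reference never dangles
```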
Additional Resources
This tutorial introduced you to the basics of Rust and its ownership and borrowing system. If you are interested in diving deeper into Rust, here are some helpful resources:
- The Rust Programming Language (a.k.a. "the book"): An overview of Rust from first principles
- Rustlings: Small exercises in Rust
- Rust By Example: Introduces Rust concepts through examples, as the name suggests
- Learn Rust: Contains links to many different Rust learning resources including the ones above