A Visual Introduction to Rust

This is a short introduction to the Rust programming language, intended for programmers with some C or C++ experience.

The tutorial makes use of an interactive visualization system for Rust code being developed by FP Lab. You should read this tutorial before completing the remaining questions, Q2 and Q3.

Motivation

C and C++ are popular languages for low-level systems programming because they give programmers direct control over memory allocation and deallocation. However, these languages are not memory safe. Programs can crash or exhibit security vulnerabilities due to memory-related bugs, such as use-after-free bugs. Programs can also have memory leaks, which occur when memory is not freed even when it is no longer needed.

Most other popular languages are memory safe, but this comes at the cost of run-time performance: they rely on a run-time garbage collector to automatically free memory when it is no longer being used. The overhead of garbage collection can be significant for performance critical tasks.

Rust is designed to be the best of both worlds: it is memory safe without the need for a garbage collector. Instead, it relies on a compile-time ownership and borrowing system to automatically determine when memory can be freed.

The trade-off is that Rust's ownership and borrowing system can be difficult to learn. The purpose of this tutorial is to help you learn Rust's ownership and borrowing system visually.

For example, this tutorial will help you understand code like the following. Hover over the different components of the visualization to see explanations. Don't worry yet about what is going on in detail—these concepts will be explained in this tutorial.

Research Disclosure

Your exercise answers and logs of your interactions with this tool might be used for research purposes. All data used for research purposes will be anonymized: your identity will not be connected to this data. If you wish to opt out, you can contact the instructor (comar@umich.edu) at any time up to seven days after final grades have been issued. Opting out has no impact on your grade.

Rust Basics

Main Function

In every Rust program, the main function executes first:

fn main() {
    // code here will run first
}

Variables

In Rust, we use let bindings to introduce variables. Variables are immutable or mutable.

Immutable Variables

By default, variables are immutable in Rust. This means that once a value is bound to the variable, the binding cannot be changed. We use let bindings to introduce immutable variables as follows:

fn main() {
    let x = 5;
}

In this example, we introduce a variable x of type i32 (a 32-bit signed integer type) and bind the value 5 to it.

You cannot assign to an immutable variable. So the following example causes a compiler error:

fn main() {
    let x = 5;
    x = 6; // ERROR: cannot assign twice to immutable variable x
}

Mutable Variables

If you want to be able to assign to a variable, it must be marked as mutable with let mut:

fn main() {
    let mut x = 5;
    x = 6; // OK
}

Copies

For simple types like integers, binding and assignment create a copy. For example, we can bind the value 5 to x and then bind y to a copy of x:

fn main() {
    let x = 5;
    let y = x;
}

Copying occurs only for simple types like i32 and other types that have been marked as copyable (they implement the Copy trait; we will not discuss traits here). Later sections of the tutorial discuss how more interesting data structures, which are not copyable, behave differently.

Functions

Besides main, we can define additional functions. In the following example, we define a function called plus_one which takes an i32 as input and returns an i32 value that is one more than the input:

fn main() {
    let six = plus_one(5);
}

fn plus_one(x: i32) -> i32 {
    x + 1
}

Notice how there is no explicit return. In Rust, if the last expression in the function body does not end in a semicolon, it is the return value. (Rust also has a return keyword, but we do not use it here.)

Printing to the Terminal

In Rust, we can print to the terminal using println!:

fn main() {
    println!("Hello, world!");
}

This code prints Hello, world! to the terminal, followed by a newline character.

We can also use curly brackets in the input string of println! as a placeholder for subsequent arguments:

fn main() {
    let x = 1;
    let y = 2;
    println!("x = {} and y = {}", x, y);
}

This prints x = 1 and y = 2.

Note that the ! at the end of println! indicates that it is a macro, not a function. It behaves slightly differently from normal functions, but you do not need to worry about the details here.

Ownership

In the previous section, we considered only simple values, like integers. However, in real-world Rust programs, we work with more complex data structures that allocate resources on the heap. When we allocate resources, we need a strategy for de-allocating these resources. Most programming languages use one of two strategies:

  1. Manual Deallocation (C, C++): The programmer is responsible for explicitly deallocating memory, e.g. using free in C or delete in C++. This is performant but can result in critical memory safety issues such as use-after-free bugs, double-free bugs, and memory leaks. These can cause crashes, memory corruption, and security vulnerabilities. In fact, about 70% of security bugs in major software products like Windows and Chrome are due to memory safety issues.

  2. Garbage Collection (OCaml, Java, Python, etc.): The programmer does not have to explicitly deallocate memory. Instead, a garbage collector frees (deallocates) memory by doing a dynamic analysis that detects when no further references to the data remain live. This prevents memory safety bugs. However, a garbage collector can sometimes incur substantial run-time performance overhead.

Rust uses a third strategy—a static (i.e. compile-time) ownership system. Because this is a purely compile-time mechanism, it achieves memory safety without the performance overhead of garbage collection!

The key idea is that each resource in memory has a unique owner, which controls access to that resource. When the owner's lifetime ends (it "dies"), e.g. by going out of scope, the resource is deallocated. In Rust, we say that the resource is dropped.

Heap-Allocated Strings

For example, heap-allocated strings, of type String, are managed by Rust's ownership system. Consider the following example, which constructs a heap-allocated string and prints it out.
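
The interactive example is not reproduced in this text. A minimal sketch consistent with the description that follows, assuming the variable name s and the line layout (the allocation on Line 2), might look like this:

```rust
fn main() {
    let s = String::from("hello"); // s takes ownership of the heap-allocated string
    println!("{}", s);             // printing does not change ownership
} // s goes out of scope here, so the string resource is dropped
```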

This code prints hello.

The String::from function allocates a String on the heap. The String is initialized from a provided string literal (string literals themselves have a more primitive type, &str, but that detail is not important here). Ownership of this string resource is moved to the variable s (of type String) when String::from returns on Line 2.

The println! macro does not cause a change in ownership (we say more about println! later).

At the end of the main function, the variable s goes out of scope. It has ownership of the string resource, so Rust will drop, i.e. deallocate, the resource at this point. We do not need an explicit free or delete like we would in C or C++, nor is there any run-time garbage collection overhead.

Hover over the lines and arrows in the visualization next to the code example above to see a description of the events that occur on each line of code.

Moves

In the example above, we saw that ownership of the heap-allocated string moved to the caller when String::from returned. This is one of several ways in which ownership of a resource can move. We will now consider each situation in more detail.

Binding

Ownership can be moved when initializing a binding with a variable.

In the following example, we define a variable x that owns a String resource. Then, we define another variable, y, initialized with x. This causes ownership of the string resource to be moved from x to y. Note that this behavior is different from the copying behavior for simple types like integers that we discussed in the previous section.
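
The original example is shown in the interactive visualization. A sketch that matches the description, assuming the string contents "hello", could be:

```rust
fn main() {
    let x = String::from("hello"); // x owns the string resource
    let y = x;                     // ownership moves from x to y
    println!("{}", y);             // only y can be used from here on
} // y is dropped here; x no longer owns anything
```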

This code prints hello.

At the end of the function, both x and y go out of scope (their lifetimes have ended). x does not own a resource anymore, so nothing special happens. y does own a resource, so its resource is dropped. Hover over the visualization to see how this works.

Each resource must have a unique owner, so x will no longer own the String resource after it is moved to y. This means that access to the resource through x is no longer possible. Think of it like handing a resource to another person: you no longer have access to it once it has moved. For example, the following generates a compiler error:

fn main() {
    let x = String::from("hello");
    let y = x;
    println!("{}", x) // ERROR: x does not own a resource
}

The compiler error actually says borrow of moved value: x (we will discuss what borrow means in the next section).

If we move to a variable that has a different scope, e.g. due to curly braces, then you can see by hovering over the visualization that the resource is dropped at the end of y's scope rather than at the end of x's scope.
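
As a rough illustration of this scoping behavior (the exact code from the visualization is not reproduced here, so the layout below is an assumption), consider:

```rust
fn main() {
    let x = String::from("hello");
    {
        let y = x;         // ownership moves from x into the inner scope
        println!("{}", y);
    } // y goes out of scope here, so the string is dropped at this point
    println!("Hello, world!"); // x is still in scope but owns nothing
}
```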

This code prints hello on one line and Hello, world! on the next.

Assignment

As with binding, ownership can be moved by assignment to a mutable variable, e.g. y in the following example.
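
A possible version of this example is sketched below. The string contents and the final println! are assumptions; the layout is chosen so that y is first bound on Line 3 and reassigned on Line 4. The compiler may warn that the initial value of y is never read, which is expected here.

```rust
fn main() {
    let x = String::from("hello");
    let mut y = String::from("test"); // y owns a first resource
    y = x;                            // the "test" string is dropped; y now owns "hello"
    println!("{}", y);
}
```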

When y acquires ownership over x's resource on Line 4, the resource it previously acquired (on Line 3) no longer has an owner, so it is dropped.

Function Call

Ownership is moved into a function when it is called with a resource argument. As an example, below we see that ownership of the string resource in main is moved from s to the takes_ownership function. Consequently, when s goes out of scope at the end of main, there is no owned string resource to be dropped.
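
The names s, takes_ownership, and some_string come from the text; the function body is an assumption. A minimal sketch:

```rust
fn main() {
    let s = String::from("hello");
    takes_ownership(s); // ownership moves from s into the function
    // s can no longer be used here
} // nothing to drop: s does not own a resource anymore

fn takes_ownership(some_string: String) {
    println!("{}", some_string);
} // some_string goes out of scope, so the string is dropped here
```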

This code prints hello.

From the perspective of takes_ownership, it can be assumed that the argument variable some_string will receive ownership of a String resource from the caller (each time it is called). The argument variable some_string goes out of scope at the end of the function, so the resource that it owns is dropped at that point.

Return

Finally, ownership can be returned from a function.

In the following example, f allocates a String and returns it to the caller. Ownership is moved from x to the caller, so there is no owned resource to be dropped at the end of f. Instead, the resource is dropped when the new owner, s, goes out of scope at the end of main. (If the String were dropped at the end of f, there would be a use-after-free bug in main on Line 9!)
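
One way this example might be laid out, with the use of s in main landing on Line 9 as the text states (the body of f is an assumption):

```rust
fn f() -> String {
    let x = String::from("hello");
    // x owns the string until it is returned
    x // ownership moves to the caller; nothing is dropped here
}

fn main() {
    let s = f();       // s becomes the new owner
    println!("{}", s); // safe: the string is still alive
} // s goes out of scope, so the string is dropped here
```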

This code prints hello.

Borrowing

In the previous section, we learned that each resource has a unique owner. Ownership can be moved—for example, into a function.

In many situations, however, we do not want to permanently move a resource into a function. Instead, we want to retain ownership but allow the function to temporarily access the resource while it executes.

We could accomplish this by having each function agree to return resources of this sort. For example, take_and_return_ownership below takes ownership of a string resource and returns ownership of that exact same resource. The caller, main, assigns the returned resource to the same variable, s.
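
A sketch of this pattern follows. The function name is from the text; the parameter name and body are assumptions.

```rust
fn take_and_return_ownership(some_string: String) -> String {
    println!("{}", some_string);
    some_string // hand the same resource back to the caller
}

fn main() {
    let mut s = String::from("hello");
    s = take_and_return_ownership(s); // s is moved out, then re-assigned
    println!("{}", s);
}
```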

This code prints hello twice.

The type of take_and_return_ownership does not guarantee that the returned resource is the same as the provided resource. Instead, the programmer has to trust that it returns the same resource.

As code becomes more complex, this pattern of returning all of the provided resources explicitly becomes both syntactically and semantically unwieldy.

Fortunately, Rust offers a powerful solution: passing in arguments via a reference. Taking a reference does not change the owner of a resource. Instead, the reference simply borrows access to the resource temporarily. Rust's borrow checker requires that references to resources do not outlive their owner, to avoid the possibility of there being references to resources that the ownership system has decided can be dropped.

There are two kinds of borrows in Rust, immutable borrows and mutable borrows. These differ in how much access to the resource they provide.

Immutable Borrows

In the following example, we define a function, f, that takes an immutable reference to a String, which has type &String, as input. It then de-references the immutable reference, written *s, in order to print it.

When the main function calls f, it must provide a reference to a String as an argument. Here, we do so by taking a reference to the let-bound variable x on Line 3, written &x. Taking a reference does not cause a change in ownership, so x still owns the string resource in the remainder of main and it can, for example, print x on Line 4. The resource will be dropped when x goes out of scope at the end of main as we discussed previously. Because f takes a reference, it is only borrowing access to the resource that the reference points to. It does not need to explicitly return the resource because it does not own it. Rust knows that the borrow does not outlive the owner because the borrow is no longer accessible after f returns.
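
The description above can be sketched as follows, assuming the line layout it mentions (the reference is taken on Line 3 and x is printed on Line 4):

```rust
fn main() {
    let x = String::from("hello");
    f(&x);             // lend x to f; ownership stays with x
    println!("{}", x); // x is still usable
} // x goes out of scope, so the string is dropped here

fn f(s: &String) {
    println!("{}", *s); // dereference the immutable reference to print it
}
```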

This code prints hello twice.

Note: you do not actually need to dereference s to pass it to println! in Rust: it is a macro, so it will automatically dereference or borrow as needed to ensure that a move is not needed. Indeed, Rust does a lot of implicit borrowing and dereferencing to make its syntax simple, as we will see in other examples below.

Methods of the String type, like len for computing the length, typically take their arguments by reference. You can call a method explicitly with a reference, e.g. String::len(&s). As shorthand, you can use dot notation to call a method, e.g. s.len(). This implicitly takes a reference to s.
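
For instance, the two call styles might be compared like this (the variable names len1 and len2 are taken from the output described next; the rest is an assumption):

```rust
fn main() {
    let s = String::from("hello");
    let len1 = String::len(&s); // explicit reference
    let len2 = s.len();         // dot notation borrows s implicitly
    println!("len1 = {} = len2 = {}", len1, len2);
}
```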

This code prints len1 = 5 = len2 = 5.

You can keep multiple immutable borrows live at the same time, e.g. y and z in the following example are both live as shown in the visualization. For this reason, immutable borrows are also sometimes called shared borrows: each immutable reference shares access to the resource with the owner and with any other immutable references that might be live.
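
A small sketch of two shared borrows that are live at once, using the names y and z from the text (the owner's name x is an assumption):

```rust
fn main() {
    let x = String::from("hello");
    let y = &x; // first immutable borrow
    let z = &x; // second immutable borrow; both are live
    println!("{} and {}", y, z);
}
```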

This code prints hello and hello.

Ownership of a resource cannot be moved while it is borrowed. For example, the following is erroneous:

fn main() {
  let s = String::from("hello");
  let x = &s;
  let s2 = s; // ERROR: cannot move s while a borrow is live
  println!("{}", String::len(x));
}

The compiler error here is: cannot move out of s because it is borrowed.

Mutable Borrows

Unlike immutable borrows, Rust's mutable borrows allow you to mutate the borrowed resource. In the example below, we push (copy) the contents of a String s2 to the end of the heap-allocated String s1 twice, first by explicitly calling the String::push_str method, and then using the equivalent shorthand method call syntax. In both cases, the method takes a mutable reference to s1, written explicitly &mut s1.
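
The visualization's code is not included in this text. Under the assumption that s1 starts as "Hello" and s2 holds ", world", the example could be written as:

```rust
fn main() {
    let mut s1 = String::from("Hello");
    let s2 = String::from(", world");
    String::push_str(&mut s1, &s2); // explicit mutable borrow of s1
    s1.push_str(&s2);               // shorthand; borrows s1 mutably as well
    println!("{}", s1);
}
```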

This code prints Hello, world, world.

Code that does a lot of mutation is notoriously difficult to reason about, so in Rust, mutation is much more carefully controlled than in other imperative languages.

First, you can only take a mutable borrow from a mutable variable, i.e. one bound using let mut like s1 in the example above. Immutability is the default in Rust because it is considered easier to reason about.

Second, mutable borrows are unique—you cannot take a borrow, mutable or immutable, if any mutable borrow is live. This means that you can be certain that no other code will be mutating a resource when you have mutably borrowed it. For this reason, mutable borrows are also sometimes called unique borrows.

For example, the following code is erroneous because a mutable borrow, y, is live.

fn main() {
  let mut x = String::from("hello");
  let y = &mut x;
  f(&x); // ERROR: y is still live
  String::push_str(y, ", world");
}

fn f(x: &String) {
  println!("{}", x);
}

The compiler error here is: cannot borrow x as immutable because it is also borrowed as mutable.

The following code is erroneous for the same reason.

fn main() {
    let mut x = String::from("Hello");
    let y = &mut x; 
    let z = &mut x; // ERROR: y is still live
    String::push_str(y, ", world");
    String::push_str(z, ", friend");
    println!("{}", x);
}

The compiler error here is: cannot borrow x as mutable more than once at a time.

Optional: Threading in Rust

In the example above, the two calls to push_str are sequenced. However, if we wanted to execute them concurrently, we could do so by spawning a thread as follows. Here, || { e } is Rust's notation for an anonymous function taking unit input.

use std::thread;

fn main() {
    let mut x = String::from("Hello");
    let y = &mut x; 
    let z = &mut x; // NOT OK: y is still live
    thread::spawn(|| { String::push_str(y, ", world"); });
    String::push_str(z, ", friend");
    println!("{}", x);
}

If the borrow checker did not stop us, this program would have a race condition—it could print either Hello, world, friend or Hello, friend, world depending on the interleaving of the main thread and the newly spawned thread. By tightly controlling mutation, Rust prevents races mediated by shared mutable state. (The topic of parallelism and concurrency in Rust will be explored further in A9!)

Non-Lexical Lifetimes

Above, we use the phrase "live borrow". A borrow is live if it is in scope and there remain future uses of the borrow. A borrow dies as soon as it is no longer needed. So the following code works, even though there are two mutable borrows in the same scope:
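
A sketch of such a program is shown below. The helper function world is a hypothetical name introduced for illustration; the strings are chosen to match the output described next.

```rust
fn main() {
    let mut x = String::from("Hello");
    let y = &mut x;
    world(y);          // last use of y, so its borrow ends here
    let z = &mut x;    // OK: y is no longer live
    world(z);          // last use of z
    x.push_str("!!");  // OK: neither y nor z is live
    println!("{}", x);
}

// Hypothetical helper: appends ", world" through a mutable reference.
fn world(s: &mut String) {
    s.push_str(", world");
}
```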

This code prints Hello, world, world!!.

Vectors in Rust

The previous sections cover everything you need to know about ownership and borrowing in Rust! This section introduces another interesting data structure: vectors.

As in other languages, the Rust standard library contains many useful collection types. One of the most useful and common is the vector, which has type Vec<T>, where T is the type of the elements that the vector holds.

Vectors are heap-allocated, mutable collections that store multiple values of the same type contiguously in memory. In many ways, they are similar to C++ vectors and serve similar purposes.

Vectors are implemented with generics, which allow them to hold any type. For example, we can have Vec<i32> and Vec<String> which are the types of i32 vectors and String vectors, respectively. Vectors can hold any struct or enum type as well.

Creating A Vector

Empty Vectors

To make a new empty vector, we can use the Vec::new() function as follows:

fn main() {
    let v: Vec<i32> = Vec::new();
}

Here, Vec::new() creates an empty vector of i32s and moves ownership to v. Note that we included a type annotation on v. This is necessary here because nothing else in the program tells Rust which type of vector to create.

Creating Vectors from Initial Values

We can also create new vectors with initial values using the vec! macro:

fn main() {
    let v = vec![1, 2, 3];
}

Here, we create a new Vec<i32> containing the values 1, 2, and 3 in that order. Note that in this case, we did not need to include a type annotation for v. This is because we are creating the vector with initial values of a specific type, so Rust can figure out the type of v in this case.

Reading Elements of Vectors

Accessing an Element at a Particular Index

We can use the indexing syntax or the get() method to get the value at a particular index of the vector:

fn main() {
    let v = vec![1, 2, 3];

    let third: &i32 = &v[2];
    println!("The third element is {}", third);

    match v.get(2) {
        Some(third) => println!("The third element is {}", third),
        None => println!("There is no third element."),
    }
}

Here, we use both ways of getting a particular element. The first way is using the indexing syntax (square brackets), which gives us an immutable reference to the element. The second way is using the get() method, which returns an Option type.

With the indexing syntax, an out-of-bounds access causes the program to panic (i.e. to stop with an unrecoverable error). With the get() method, an out-of-bounds access returns None instead, so we can handle it gracefully rather than crashing the program.

Iterating over Elements

We can iterate over elements in a vector with a for loop to read the values:

fn main() {
    let v = vec![1, 2, 3];
    for i in &v {
        println!("{}", i);
    }
}

Here, we simply read the values of the vector and print them to the terminal. Note that the for loop is immutably borrowing v, as shown by the &v.

Mutating Vectors

Push

We can add elements to the back of a vector using the push() method:

fn main() {
    let mut v = Vec::new();
    v.push(1);
    v.push(2);
    v.push(3);
}

This creates an empty vector and adds the values 1, 2, and 3 to the back of the vector in that order. In this case, we did not need a type annotation because the type is inferred from the values we pushed to it. Note that we made v a mutable variable here. If we didn't, the borrow checker would not allow us to make calls to push().

Writing Elements at a Particular Index

We can also write to elements at a particular index in a similar way to how we read elements at a particular index. We can use the indexing syntax or the get_mut() method:

fn main() {
    let mut v = vec![1, 2, 3];

    let second: &mut i32 = &mut v[1];
    *second = 3;

    match v.get_mut(2) {
        Some(third) => *third = 9,
        None => println!("There is no third element."),
    }
}

Here, we use the indexing syntax to get a mutable reference to the second element and change its value to 3. We then use the get_mut() method to get a mutable reference to the third element and change its value to 9.

As with the example for reading elements at a particular index, an out-of-bounds access with the indexing syntax causes a panic, while an out-of-bounds access with the get_mut() method returns None.

Iterating Over Elements

We can iterate over elements in a vector with a for loop to mutate the values:

fn main() {
    let mut v = vec![1, 2, 3];
    for i in &mut v {
        *i = *i + 1;
    }
}

Here, we add 1 to each of the values in the vector. Note that the for loop is mutably borrowing v, as shown by the &mut v.

Optional: Structs in Rust

Creating a struct

To define a struct, we enter the keyword struct and name the entire struct. A struct's name should describe the significance of the pieces of data being grouped together. Then, inside curly brackets, we define the names and types of the pieces of data, which we call fields. Here is an example showing a struct that stores information about a user account.

struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}

We create an instance by stating the name of the struct followed by curly brackets containing key: value pairs, where the keys are the names of the fields and the values are the data we want to store in those fields. We can then use dot notation to read a field, or to assign to it if the instance is mutable.

struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}

fn main() {
    let mut user1 = User {
        email: String::from("someone@example.com"),
        username: String::from("someusername123"),
        active: true,
        sign_in_count: 1,
    };

    // user1 is mutable, so its fields can be assigned using dot notation
    user1.email = String::from("anotheremail@example.com");
}

Each field in a struct can be referenced independently. Here is an example of defining a struct, creating an instance of it, passing it to a function, and referencing the field r.h.
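
The original example lives in the visualization. The sketch below is an assumption built around the field access r.h mentioned above; the struct name Rect, the field w, and the function area are hypothetical.

```rust
// Hypothetical struct with two independently accessible fields.
struct Rect {
    w: u32,
    h: u32,
}

// Borrows the struct immutably and reads both fields.
fn area(rect: &Rect) -> u32 {
    rect.w * rect.h
}

fn main() {
    let r = Rect { w: 30, h: 50 };
    println!("area = {}", area(&r)); // r is only borrowed, so it stays usable
    println!("height = {}", r.h);    // reference a single field
}
```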

Calling a method in a struct

Structs can also have methods, which are defined in an impl block for the struct. To call a method, we write object.something(), which is equivalent to (&object).something(). Whether object is a value, a & reference, or a &mut reference, we always use the . operator and never need a separate -> operator, because Rust automatically adds &, &mut, or * so that object matches the signature of the method.
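
To make the automatic referencing concrete, here is a small sketch. The Rect struct and its methods area and widen are hypothetical names chosen for illustration.

```rust
struct Rect {
    w: u32,
    h: u32,
}

impl Rect {
    // Takes an immutable reference to the instance.
    fn area(&self) -> u32 {
        self.w * self.h
    }

    // Takes a mutable reference to the instance.
    fn widen(&mut self, extra: u32) {
        self.w += extra;
    }
}

fn main() {
    let mut r = Rect { w: 30, h: 50 };
    let a1 = r.area();    // Rust inserts &r automatically
    let a2 = (&r).area(); // the same call with the reference written out
    r.widen(10);          // Rust inserts &mut r automatically
    let rr = &r;
    println!("{} {} {}", a1, a2, rr.area()); // . also works through a reference
}
```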

Ownership of struct data

When an instance of a struct owns all of its fields, i.e. the struct contains no references, ownership works essentially the same way as it does for data outside of a struct. Fields of a struct can also own resources. Here is an example in which one field, y, owns a String resource.
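
A possible shape for this example is sketched below. Only the field name y is from the text; the struct name Foo, the field x, and the helper describe are assumptions.

```rust
// Hypothetical struct: x is a plain integer, y owns a heap-allocated String.
struct Foo {
    x: i32,
    y: String,
}

// Borrows the struct and formats both fields.
fn describe(f: &Foo) -> String {
    format!("{} {}", f.x, f.y)
}

fn main() {
    let s = String::from("bar");
    let f = Foo { x: 5, y: s }; // ownership of the string moves from s into f.y
    println!("{}", describe(&f));
} // f goes out of scope; the string owned by f.y is dropped with it
```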

When a field is a reference to data that the struct does not own, the struct definition needs a lifetime annotation on that reference. The annotation lets the borrow checker ensure that the referenced resource lives at least as long as the struct that refers to it.

Here is an example that uses the lifetime annotation <'a> in a struct definition so that the struct Excerpt can hold a string reference, p.
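
The names Excerpt and p are from the text. Everything else in this sketch, including the helper first_sentence and the sample string, is an assumption.

```rust
// The struct borrows string data for the lifetime 'a.
struct Excerpt<'a> {
    p: &'a str,
}

// Hypothetical helper: the returned Excerpt borrows from `text`,
// so it cannot outlive the string it points into.
fn first_sentence<'a>(text: &'a str) -> Excerpt<'a> {
    let p = text.split('.').next().unwrap_or("");
    Excerpt { p }
}

fn main() {
    let n = String::from("Ok. I'm fine.");
    let e = first_sentence(&n); // e holds a reference into n
    println!("{}", e.p);
} // e and n go out of scope together; n owns the data and drops it
```

If n were dropped while e was still in use, the borrow checker would reject the program.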

Additional Resources

This tutorial introduced you to the basics of Rust and its ownership and borrowing system. If you are interested in diving deeper into Rust, here are some helpful resources: