Path Trimming In Nightly Rust

Posted September 4, 2020 ‐ 5 min read

As of yesterday, a Rust PR that I worked on has been merged into Rust nightly, and it has wide implications for compiler errors.

In this post I describe the change and what to expect from it.

The problem with full paths in errors

A simple program such as the following results in a type error.

fn main() {
    let a = vec![vec![String::from("a")]];
    let b = vec![String::from("b")];
    a == b;
}

The type error can be described as follows: values of the types Vec<Vec<String>> and Vec<String> cannot be compared. Before the changes in the PR, the first line of the error message said almost exactly this, and the rest of the message gave more details about the traits involved:

old output
error[E0277]: can't compare `std::vec::Vec<std::string::String>` with `std::string::String`
 --> example.rs:5:7
  |
5 |     a == b;
  |       ^^ no implementation for `std::vec::Vec<std::string::String> == std::string::String`
  |
  = help: the trait `std::cmp::PartialEq<std::string::String>` is not implemented for `std::vec::Vec<std::string::String>`
  = note: required because of the requirements on the impl of `std::cmp::PartialEq<std::vec::Vec<std::string::String>>` for `std::vec::Vec<std::vec::Vec<std::string::String>>`

In the error above, the greatest contribution to cognitive burden clearly comes from the fully qualified paths (e.g. std::vec::Vec) of types and traits. For many people, these paths make a significant difference to readability.

Enter path trimming

In the large majority of cases, there is only one Vec symbol and one String symbol importable across the entire program being linked, counting all available crates. There certainly are crates that define items named Vec, but they are rare, and so is the situation where the user defines a Vec of their own.

Given the observation that the overlap between module namespaces is rather minimal, we can run a uniqueness check, i.e., verify that Vec and String are each the only item with that name in the compilation. If a name is not unique, compilation still succeeds without any new warning; its paths are simply printed in full. For the unique symbols, however, we don't have to print the entire path in warnings and errors, and can thus trim it to the last component: the symbol itself.
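The uniqueness check can be illustrated with a small standalone sketch. This is not the code from the PR: the function names trim_paths and last_segment, and the representation of items as plain path strings, are assumptions made for illustration only. It counts how many importable paths share each last component and trims only the ones that are alone.

```rust
use std::collections::HashMap;

/// Returns the last `::`-separated component of a path.
fn last_segment(path: &str) -> &str {
    path.rsplit("::").next().unwrap_or(path)
}

/// Hypothetical helper: maps each full path to the name a diagnostic
/// would show, i.e. the last segment if unique, else the full path.
fn trim_paths(paths: &[&str]) -> HashMap<String, String> {
    // Count how many importable items share each last segment.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for &path in paths {
        *counts.entry(last_segment(path)).or_insert(0) += 1;
    }

    let mut shown = HashMap::new();
    for &path in paths {
        let last = last_segment(path);
        // Only a name that occurs exactly once may be trimmed.
        let name = if counts[last] == 1 { last } else { path };
        shown.insert(path.to_string(), name.to_string());
    }
    shown
}

fn main() {
    let paths = [
        "std::vec::Vec",
        "std::string::String",
        "std::io::Result",
        "std::result::Result",
    ];
    let shown = trim_paths(&paths);
    for path in paths {
        println!("{} is printed as {}", path, shown[path]);
    }
}
```

In this toy model, Vec and String are trimmed, while the two Result paths stay fully qualified because they share a last component.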

With trimming according to uniqueness, the following error is printed instead:

new output
error[E0277]: can't compare `Vec<String>` with `String`
 --> example.rs:5:7
  |
5 |     a == b;
  |       ^^ no implementation for `Vec<String> == String`
  |
  = help: the trait `PartialEq<String>` is not implemented for `Vec<String>`
  = note: required because of the requirements on the impl of `PartialEq<Vec<String>>` for `Vec<Vec<String>>`

This behavior can be controlled with a new debug option, -Z trim-diagnostic-paths (pass -Z trim-diagnostic-paths=false to turn it off), and it is enabled by default only for rustc itself.

As for the toll it takes on the compiler, the cost is similar to that of the algorithm that computes 'use suggestions' for errors caused by undefined identifiers: it iterates over all importable symbols of the entire program or library being linked. Since this may be heavy, we made sure it is only done when the compiler actually emits warnings or errors. If trimmed paths are ever computed without a diagnostic being emitted, it's a bug, and you'd see a panic related to trimmed paths.
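The "only pay when a diagnostic is printed" idea can be sketched with a lazily initialized table. The Session type, its fields, and emit_error below are hypothetical stand-ins, not rustc's actual data structures; the sketch also assumes Rust 1.70 or later for std::cell::OnceCell.

```rust
use std::cell::{Cell, OnceCell};
use std::collections::HashMap;

/// Returns the last `::`-separated component of a path.
fn last_segment(path: &str) -> &str {
    path.rsplit("::").next().unwrap_or(path)
}

/// Hypothetical stand-in for a compilation session.
struct Session {
    importable: Vec<String>,
    /// Filled in on first use only.
    trimmed: OnceCell<HashMap<String, String>>,
    /// Counts how often the expensive scan ran (for demonstration).
    scans: Cell<u32>,
}

impl Session {
    fn new(importable: &[&str]) -> Self {
        Session {
            importable: importable.iter().map(|p| p.to_string()).collect(),
            trimmed: OnceCell::new(),
            scans: Cell::new(0),
        }
    }

    /// Runs the scan over all importable symbols at most once.
    fn trimmed_paths(&self) -> &HashMap<String, String> {
        self.trimmed.get_or_init(|| {
            self.scans.set(self.scans.get() + 1);
            let mut counts: HashMap<&str, u32> = HashMap::new();
            for path in &self.importable {
                *counts.entry(last_segment(path)).or_insert(0) += 1;
            }
            let mut map = HashMap::new();
            for path in &self.importable {
                let last = last_segment(path);
                if counts[last] == 1 {
                    map.insert(path.clone(), last.to_string());
                }
            }
            map
        })
    }

    /// Formatting a diagnostic is the only thing that triggers the scan.
    fn emit_error(&self, path: &str) -> String {
        let shown = self
            .trimmed_paths()
            .get(path)
            .map(String::as_str)
            .unwrap_or(path);
        format!("error: mismatched type `{}`", shown)
    }
}

fn main() {
    let session = Session::new(&["std::vec::Vec", "std::string::String"]);
    println!("scans before any error: {}", session.scans.get());
    println!("{}", session.emit_error("std::vec::Vec"));
    println!("{}", session.emit_error("std::string::String"));
    println!("scans after two errors: {}", session.scans.get());
}
```

A build that emits no diagnostics never calls trimmed_paths in this model, so it never pays for the scan.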

Trimming considerations

Trimming is done only relative to the crate currently being built, so:

  • All the local definitions in the built crate are considered, regardless of whether they are exported from it or not. This is different from how external crates are treated, where only the externally visible and importable definitions are taken into account.

  • Uniqueness is checked across all crates, including the one being built. So if you define a Vec type anywhere in your crate, the name Vec will no longer be considered unique, because another Vec can be imported from std::vec, and the full paths of both types will be printed just as before.

  • Because several glob imports (i.e. use foo::*;) can appear in a single place, it wouldn't be clear which items each of them brings in if we trimmed the paths of those items. Thus, glob imports cancel out the uniqueness of the symbols that they import.
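To make the second point concrete, here is a small crate that defines its own Vec. The module and function names (geometry, corners) are made up for illustration. The program compiles and runs, but according to the rule above, any diagnostic in this crate that mentions either Vec would show its full path, because the name is no longer unique.

```rust
mod geometry {
    /// A crate-local 2D vector that happens to be named `Vec`.
    /// It counts for uniqueness even if it is never exported.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Vec {
        pub x: f32,
        pub y: f32,
    }
}

/// Because `geometry::Vec` exists in this crate, the name `Vec` is
/// ambiguous with `std::vec::Vec`, so neither path would be trimmed.
fn corners() -> std::vec::Vec<geometry::Vec> {
    vec![
        geometry::Vec { x: 0.0, y: 0.0 },
        geometry::Vec { x: 1.0, y: 1.0 },
    ]
}

fn main() {
    let points = corners();
    println!("{} points, first = {:?}", points.len(), points[0]);
}
```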

What's next

This change in behavior will probably undergo some refinements and more testing before it reaches stable Rust. There are expected follow-ups, for instance allowing some ambiguity, as not all items are equal: for example, between the Result type alias in std::io and the Result type itself from std::result.
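The Result case is a natural candidate for such leniency because std::io::Result<T> is only an alias for std::result::Result<T, std::io::Error>. The short program below, with made-up function names, shows that the two spellings are interchangeable. How a follow-up would actually treat this in diagnostics is not specified here.

```rust
use std::io;

/// Spelled with the alias from `std::io`.
fn read_via_alias() -> io::Result<u8> {
    Ok(7)
}

/// Spelled with the underlying type from `std::result`.
fn read_via_full_type() -> std::result::Result<u8, io::Error> {
    Ok(7)
}

fn main() {
    // These assignments compile only because both spellings
    // denote the same type.
    let a: std::result::Result<u8, io::Error> = read_via_alias();
    let b: io::Result<u8> = read_via_full_type();
    println!("{:?} {:?}", a, b);
}
```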

Thanks

The change has been hard to maintain as a PR, as it affected more than 1000 unit tests. It has gone through several revisions until the implementation was good.

However, even though I am not a frequent Rust compiler contributor and most of the code involved was new to me, relying on long-term members of the Rust compiler team has been greatly instructive. Several people were crucial in reviewing, so I'd like to thank them: Vadim Petrochenkov, Eduard Burtescu, and Esteban Kuber. Thanks also to other contributors, Aaron Hill and luzato, for their help.

