Path Trimming In Nightly Rust
Posted September 4, 2020 ‐ 5 min read
As of yesterday, the Rust PR that I worked on has been merged into Rust nightly, and it has wide implications for how compiler errors are printed.
In this post I describe the change and what to expect from it.
The problem with full paths in errors
A simple program that compares two values of mismatched types with == results in a type error.
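A minimal sketch of a program of this kind is shown here. Only the names a and b and the expression a == b come from the error output in this post; the values and everything else are assumptions. The offending line is commented out so that the sketch itself compiles.

```rust
fn main() {
    // Hypothetical values; only the two types matter for the error.
    let a: Vec<Vec<String>> = vec![vec![String::from("x")]];
    let b: Vec<String> = vec![String::from("x")];

    // Uncommenting the next line is expected to fail with E0277, because
    // comparing the vectors element-wise would need `Vec<String>: PartialEq<String>`.
    // a == b;

    // One level down, both sides are `Vec<String>`, so this comparison type-checks.
    assert!(a[0] == b);
}
```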
The type error can be described as follows: values of the types Vec<Vec<String>> and Vec<String> cannot be compared. Before the changes in the PR, the first line of the error message said nearly this, and the rest of the message gave more details about the traits involved:
Old output:
error[E0277]: can't compare `std::vec::Vec<std::string::String>` with `std::string::String`
 --> example.rs:5:7
  |
5 |     a == b;
  |       ^^ no implementation for `std::vec::Vec<std::string::String> == std::string::String`
  |
  = help: the trait `std::cmp::PartialEq<std::string::String>` is not implemented for `std::vec::Vec<std::string::String>`
  = note: required because of the requirements on the impl of `std::cmp::PartialEq<std::vec::Vec<std::string::String>>` for `std::vec::Vec<std::vec::Vec<std::string::String>>`
In the above error, the greatest contribution to cognitive burden is clearly the fully qualified paths (e.g. std::vec::Vec) of types and traits. For many people, these paths make errors significantly harder to read.
Enter path trimming
In the large majority of cases there is only one Vec symbol and one String symbol importable across the entire program being linked, among all the crates that are available. There certainly are crates that define items named Vec, but they are rare, and it is just as rare for the user to define Vec.
Given that the overlap between module namespaces is rather minimal, we can run a uniqueness check, i.e. verify that Vec and String are each unique among the items defined in the compilation. If they are not, the compilation still succeeds without any new warning. For the unique symbols, however, we don't have to print the entire path in warnings and errors, and can trim it to its last component: the symbol itself.
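The idea can be sketched outside the compiler as follows. This is only an illustration over plain path strings, not the implementation in rustc; the helper names last_segment and trim_paths are hypothetical, and the sketch assumes each item appears once in the input.

```rust
use std::collections::HashMap;

/// Returns the last `::`-separated component of a path.
fn last_segment(path: &str) -> &str {
    path.rsplit("::").next().unwrap_or(path)
}

/// Maps every full path to the text a diagnostic would print for it:
/// the bare name when that name is unique, the full path otherwise.
/// Assumes each item is listed exactly once in `paths`.
fn trim_paths<'a>(paths: &[&'a str]) -> HashMap<&'a str, &'a str> {
    // Count how many items share each final component.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for &path in paths {
        *counts.entry(last_segment(path)).or_insert(0) += 1;
    }

    paths
        .iter()
        .map(|&path| {
            let name = last_segment(path);
            let printed = if counts[name] == 1 { name } else { path };
            (path, printed)
        })
        .collect()
}

fn main() {
    let paths = [
        "std::vec::Vec",
        "std::string::String",
        "std::io::Result",
        "std::result::Result",
    ];
    let printed = trim_paths(&paths);
    for path in paths.iter() {
        println!("{} -> {}", path, printed[*path]);
    }
}
```

In this sketch Vec and String are trimmed, while the two Result paths stay fully qualified because they share a final component.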
With trimming according to uniqueness, the following error is printed instead:
New output:
error[E0277]: can't compare `Vec<String>` with `String`
 --> example.rs:5:7
  |
5 |     a == b;
  |       ^^ no implementation for `Vec<String> == String`
  |
  = help: the trait `PartialEq<String>` is not implemented for `Vec<String>`
  = note: required because of the requirements on the impl of `PartialEq<Vec<String>>` for `Vec<Vec<String>>`
This behavior can be turned off using a new debug option, -Z trim-diagnostic-paths=false, and it is enabled by default only for rustc itself.
As for the toll it takes on the compiler, the work is similar to the algorithm that computes 'use suggestions' for errors caused by undefined identifiers. This means iterating over all importable symbols of the entire program or library being linked. Since this may be heavy, we made sure it is done only when the compiler emits warnings or errors. If that assertion is violated, it's a bug, and you'd see a panic related to trimmed paths.
Trimming considerations
Trimming is done only relative to what the currently built crate does, so:
- All the local definitions in the built crate are considered, regardless of whether they are exported from it or not. This is different from how external crates are treated, where only the externally visible and importable definitions are taken into account.
- Uniqueness is considered across all crates including the one being built, so if you define a Vec type anywhere in your crate, the name Vec will no longer be considered unique because another Vec can be imported from std::vec too, and thus the full paths of both types will be printed just as before.
- Because several glob imports (i.e. use foo::*;) can happen in a single place, it wouldn't be clear which items they bring in if we trimmed the paths related to these items. Thus, glob imports cancel out the uniqueness of the symbols that they import.
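To illustrate the case of a crate-local Vec, here is a small sketch of a crate that defines its own Vec next to the one from std::vec. The module name geometry, the fields, and the helper length are hypothetical. Under the rule described above, neither Vec would count as unique in such a crate, so diagnostics for it would be expected to keep printing both paths in full.

```rust
// Hypothetical module: a crate-local `Vec` that shares its name with `std::vec::Vec`.
mod geometry {
    pub struct Vec {
        pub x: f32,
        pub y: f32,
    }
}

/// Euclidean length of the crate-local vector type.
fn length(v: &geometry::Vec) -> f32 {
    (v.x * v.x + v.y * v.y).sqrt()
}

fn main() {
    let local = geometry::Vec { x: 3.0, y: 4.0 };
    // The standard `Vec` is still in scope here through the prelude,
    // because `geometry::Vec` has not been imported into this scope.
    let standard: Vec<f32> = vec![local.x, local.y];
    println!("{} components, length {}", standard.len(), length(&local));
}
```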
What's next
This change in behavior will probably undergo some refinements and more testing before it reaches stable Rust. There are expected follow-ups, for instance allowing some ambiguity, as not all items are treated equal. One example is the ambiguity between the Result type alias in std::io and the Result type itself from std::result.
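The two Result names are closely related: std::io::Result<T> is an alias for std::result::Result<T, std::io::Error>. The short sketch below shows that a value of one can be used where the other is expected; the function read_len is hypothetical and exists only for the illustration.

```rust
use std::io;

// Hypothetical helper that returns the `std::io` alias.
fn read_len(input: &str) -> io::Result<usize> {
    if input.is_empty() {
        Err(io::Error::new(io::ErrorKind::InvalidInput, "empty input"))
    } else {
        Ok(input.len())
    }
}

fn main() {
    // Spelling out the underlying type compiles, because the alias and
    // `std::result::Result<usize, io::Error>` denote the same type.
    let r: std::result::Result<usize, io::Error> = read_len("abc");
    assert_eq!(r.unwrap(), 3);
}
```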
Thanks
The change was hard to maintain as a PR, as it affected more than 1000 unit tests. It went through several revisions until the implementation was good.
However, although I am not a frequent Rust compiler contributor and most of the code involved was new to me, it has been greatly instructive to rely on long-term members of the Rust compiler team. Some folks were crucial in reviewing, so I'd like to thank them: Vadim Petrochenkov, Eduard Burtescu, and Esteban Kuber. Thanks also to other contributors, Aaron Hill and luzato, for their help.