From 4185dca09501ae8b941457563c5658d4d5d98d5b Mon Sep 17 00:00:00 2001 From: Boxy Date: Fri, 6 Jun 2025 20:20:06 +0100 Subject: [PATCH 1/4] Stub chapter and consolidate under `/hir/` --- src/SUMMARY.md | 5 +++-- src/diagnostics.md | 2 +- src/hir.md | 2 +- src/hir/ambig-unambig-ty-and-consts.md | 1 + src/{hir-debugging.md => hir/debugging.md} | 0 src/{ast-lowering.md => hir/lowering.md} | 2 +- src/overview.md | 2 +- 7 files changed, 8 insertions(+), 6 deletions(-) create mode 100644 src/hir/ambig-unambig-ty-and-consts.md rename src/{hir-debugging.md => hir/debugging.md} (100%) rename src/{ast-lowering.md => hir/lowering.md} (97%) diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 2acc3c219..50a3f44ad 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -121,8 +121,9 @@ - [Feature gate checking](./feature-gate-ck.md) - [Lang Items](./lang-items.md) - [The HIR (High-level IR)](./hir.md) - - [Lowering AST to HIR](./ast-lowering.md) - - [Debugging](./hir-debugging.md) + - [Lowering AST to HIR](./hir/lowering.md) + - [Ambig/Unambig Types and Consts](./hir/ambig-unambig-ty-and-consts.md) + - [Debugging](./hir/debugging.md) - [The THIR (Typed High-level IR)](./thir.md) - [The MIR (Mid-level IR)](./mir/index.md) - [MIR construction](./mir/construction.md) diff --git a/src/diagnostics.md b/src/diagnostics.md index 01e59c919..33f5441d3 100644 --- a/src/diagnostics.md +++ b/src/diagnostics.md @@ -553,7 +553,7 @@ compiler](#linting-early-in-the-compiler). [AST nodes]: the-parser.md -[AST lowering]: ast-lowering.md +[AST lowering]: ./hir/lowering.md [HIR nodes]: hir.md [MIR nodes]: mir/index.md [macro expansion]: macro-expansion.md diff --git a/src/hir.md b/src/hir.md index 0c1c99415..72fb10701 100644 --- a/src/hir.md +++ b/src/hir.md @@ -5,7 +5,7 @@ The HIR – "High-Level Intermediate Representation" – is the primary IR used in most of rustc. 
It is a compiler-friendly representation of the abstract syntax tree (AST) that is generated after parsing, macro expansion, and name -resolution (see [Lowering](./ast-lowering.html) for how the HIR is created). +resolution (see [Lowering](./hir/lowering.md) for how the HIR is created). Many parts of HIR resemble Rust surface syntax quite closely, with the exception that some of Rust's expression forms have been desugared away. For example, `for` loops are converted into a `loop` and do not appear in diff --git a/src/hir/ambig-unambig-ty-and-consts.md b/src/hir/ambig-unambig-ty-and-consts.md new file mode 100644 index 000000000..b5458d71b --- /dev/null +++ b/src/hir/ambig-unambig-ty-and-consts.md @@ -0,0 +1 @@ +# Ambig/Unambig Types and Consts \ No newline at end of file diff --git a/src/hir-debugging.md b/src/hir/debugging.md similarity index 100% rename from src/hir-debugging.md rename to src/hir/debugging.md diff --git a/src/ast-lowering.md b/src/hir/lowering.md similarity index 97% rename from src/ast-lowering.md rename to src/hir/lowering.md index 033fd4b76..02c69b860 100644 --- a/src/ast-lowering.md +++ b/src/hir/lowering.md @@ -1,6 +1,6 @@ # AST lowering -The AST lowering step converts AST to [HIR](hir.html). +The AST lowering step converts AST to [HIR](../hir.md). This means many structures are removed if they are irrelevant for type analysis or similar syntax agnostic analyses. 
Examples of such structures include but are not limited to diff --git a/src/overview.md b/src/overview.md index 92d0c7b0c..8a1a22fad 100644 --- a/src/overview.md +++ b/src/overview.md @@ -410,7 +410,7 @@ For more details on bootstrapping, see - Guide: [The HIR](hir.md) - Guide: [Identifiers in the HIR](hir.md#identifiers-in-the-hir) - Guide: [The `HIR` Map](hir.md#the-hir-map) - - Guide: [Lowering `AST` to `HIR`](ast-lowering.md) + - Guide: [Lowering `AST` to `HIR`](./hir/lowering.md) - How to view `HIR` representation for your code `cargo rustc -- -Z unpretty=hir-tree` - Rustc `HIR` definition: [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html) - Main entry point: **TODO** From a02af2f1353429ee996575d8a956e3b1e471ebe9 Mon Sep 17 00:00:00 2001 From: Boxy Date: Tue, 17 Jun 2025 15:41:07 +0100 Subject: [PATCH 2/4] Write chapter on Unambig vs Ambig Types/Consts --- src/hir/ambig-unambig-ty-and-consts.md | 54 +++++++++++++++++++++++++- 1 file changed, 53 insertions(+), 1 deletion(-) diff --git a/src/hir/ambig-unambig-ty-and-consts.md b/src/hir/ambig-unambig-ty-and-consts.md index b5458d71b..4d9a2d081 100644 --- a/src/hir/ambig-unambig-ty-and-consts.md +++ b/src/hir/ambig-unambig-ty-and-consts.md @@ -1 +1,53 @@ -# Ambig/Unambig Types and Consts \ No newline at end of file +# Ambig/Unambig Types and Consts + +Types and Consts args in the HIR can be in two kinds of positions "ambig" or "unambig". Ambig positions are where +it would be valid to parse either a type or a const, unambig positions are where only one kind would be valid to +parse. + +```rust +fn func(arg: T) { + // ^ Unambig type position + let a: _ = arg; + // ^ Unambig type position + + func::(arg); + // ^ ^ + // ^^^^ Ambig position + + let _: [u8; 10]; + // ^^ ^^ Unambig const position + // ^^ Unambig type position +} + +``` + +Most types/consts in ambig positions are able to be disambiguated as either a type or const during either parsing or ast-lowering. 
+Currently the only exception to this is inferred generic arguments in path segments. In `Foo<_>` it is not clear whether the `_` argument is an +inferred type argument, or an inferred const argument. + +In unambig positions, inferred arguments are represented with `hir::TyKind::Infer` or `hir::ConstArgKind::Infer` depending on whether it is a type or const position respectively. +In ambig positions, inferred arguments are represented with `hir::GenericArg::Infer`. + +A naive implementation of this structure would result in there being potentially 5 places where an inferred type/const could be found in the HIR if you just looked at the types: +- In unambig type position as a `hir::TyKind::Infer` +- In unambig const arg position as a `hir::ConstArgKind::Infer` +- In an ambig position as a `GenericArg::Ty(TyKind::Infer)` +- In an ambig position as a `GenericArg::Const(ConstArgKind::Infer)` +- In an ambig position as a `GenericArg::Infer` + +This has a few failure modes: +- People may write visitors which check for `GenericArg::Infer` but forget to check for `hir::TyKind/ConstArgKind::Infer`, only handling infers in ambig positions by accident. +- People may write visitors which check for `hir::TyKind/ConstArgKind::Infer` but forget to check for `GenericArg::Infer`, only handling infers in unambig positions by accident. +- People may write visitors which check for `GenericArg::Ty/Const(TyKind/ConstArgKind::Infer)` and `GenericArg::Infer`, not realising that we never represent inferred types/consts in ambig positions as a `GenericArg::Ty/Const`. +- People may write visitors which check for *only* `TyKind::Infer` and not `ConstArgKind::Infer` forgetting that there are also inferred const arguments (and vice versa). + +To make writing HIR visitors less error prone when caring about inferred types/consts we have a relatively complex system: + +1. We have different types in the compiler for when a type or const is in an unambig or ambig position, `hir::Ty<AmbigArg>` and `hir::Ty<()>`. 
`AmbigArg` is an uninhabited type which we use in the `Infer` variant of `TyKind` and `ConstArgKind` to selectively "disable" it if we are in an ambig position. + +2. The `visit_ty` and `visit_const_arg` methods on HIR visitors only accept the ambig position versions of types/consts. Unambig types/consts are implicitly converted to ambig types/consts during the visiting process, with the `Infer` variant handled by a dedicated `visit_infer` method. + +This has a number of benefits: +- It's clear that `GenericArg::Ty/Const` cannot represent inferred type/const arguments +- Implementors of `visit_ty` and `visit_const_arg` will never encounter inferred types/consts making it impossible to write a visitor that seems to work right but handles edge cases wrong +- The `visit_infer` method handles *all* cases of inferred type/consts in the HIR making it easy for visitors to handle inferred type/consts in one dedicated place and not forget cases \ No newline at end of file From c963b4ad93245859912ee073f3f23a03f227e956 Mon Sep 17 00:00:00 2001 From: Boxy Date: Tue, 17 Jun 2025 15:48:15 +0100 Subject: [PATCH 3/4] Add links --- src/hir/ambig-unambig-ty-and-consts.md | 28 +++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/src/hir/ambig-unambig-ty-and-consts.md b/src/hir/ambig-unambig-ty-and-consts.md index 4d9a2d081..a43ad8b49 100644 --- a/src/hir/ambig-unambig-ty-and-consts.md +++ b/src/hir/ambig-unambig-ty-and-consts.md @@ -25,29 +25,39 @@ Most types/consts in ambig positions are able to be disambiguated as either a ty Currently the only exception to this is inferred generic arguments in path segments. In `Foo<_>` it is not clear whether the `_` argument is an inferred type argument, or an inferred const argument. -In unambig positions, inferred arguments are represented with `hir::TyKind::Infer` or `hir::ConstArgKind::Infer` depending on whether it is a type or const position respectively. 
+In unambig positions, inferred arguments are represented with [`hir::TyKind::Infer`][ty_infer] or [`hir::ConstArgKind::Infer`][const_infer] depending on whether it is a type or const position respectively. In ambig positions, inferred arguments are represented with `hir::GenericArg::Infer`. A naive implementation of this structure would result in there being potentially 5 places where an inferred type/const could be found in the HIR if you just looked at the types: - In unambig type position as a `hir::TyKind::Infer` - In unambig const arg position as a `hir::ConstArgKind::Infer` -- In an ambig position as a `GenericArg::Ty(TyKind::Infer)` -- In an ambig position as a `GenericArg::Const(ConstArgKind::Infer)` -- In an ambig position as a `GenericArg::Infer` +- In an ambig position as a [`GenericArg::Type(TyKind::Infer)`][generic_arg_ty] +- In an ambig position as a [`GenericArg::Const(ConstArgKind::Infer)`][generic_arg_const] +- In an ambig position as a [`GenericArg::Infer`][generic_arg_infer] This has a few failure modes: - People may write visitors which check for `GenericArg::Infer` but forget to check for `hir::TyKind/ConstArgKind::Infer`, only handling infers in ambig positions by accident. - People may write visitors which check for `hir::TyKind/ConstArgKind::Infer` but forget to check for `GenericArg::Infer`, only handling infers in unambig positions by accident. -- People may write visitors which check for `GenericArg::Ty/Const(TyKind/ConstArgKind::Infer)` and `GenericArg::Infer`, not realising that we never represent inferred types/consts in ambig positions as a `GenericArg::Ty/Const`. +- People may write visitors which check for `GenericArg::Type/Const(TyKind/ConstArgKind::Infer)` and `GenericArg::Infer`, not realising that we never represent inferred types/consts in ambig positions as a `GenericArg::Type/Const`. 
- People may write visitors which check for *only* `TyKind::Infer` and not `ConstArgKind::Infer` forgetting that there are also inferred const arguments (and vice versa). To make writing HIR visitors less error prone when caring about inferred types/consts we have a relatively complex system: -1. We have different types in the compiler for when a type or const is in an unambig or ambig position, `hir::Ty<AmbigArg>` and `hir::Ty<()>`. `AmbigArg` is an uninhabited type which we use in the `Infer` variant of `TyKind` and `ConstArgKind` to selectively "disable" it if we are in an ambig position. +1. We have different types in the compiler for when a type or const is in an unambig or ambig position, `hir::Ty<AmbigArg>` and `hir::Ty<()>`. [`AmbigArg`][ambig_arg] is an uninhabited type which we use in the `Infer` variant of `TyKind` and `ConstArgKind` to selectively "disable" it if we are in an ambig position. -2. The `visit_ty` and `visit_const_arg` methods on HIR visitors only accept the ambig position versions of types/consts. Unambig types/consts are implicitly converted to ambig types/consts during the visiting process, with the `Infer` variant handled by a dedicated `visit_infer` method. +2. The [`visit_ty`][visit_infer] and [`visit_const_arg`][visit_const_arg] methods on HIR visitors only accept the ambig position versions of types/consts. Unambig types/consts are implicitly converted to ambig types/consts during the visiting process, with the `Infer` variant handled by a dedicated [`visit_infer`][visit_infer] method. 
This has a number of benefits: -- It's clear that `GenericArg::Ty/Const` cannot represent inferred type/const arguments +- It's clear that `GenericArg::Type/Const` cannot represent inferred type/const arguments - Implementors of `visit_ty` and `visit_const_arg` will never encounter inferred types/consts making it impossible to write a visitor that seems to work right but handles edge cases wrong -- The `visit_infer` method handles *all* cases of inferred type/consts in the HIR making it easy for visitors to handle inferred type/consts in one dedicated place and not forget cases \ No newline at end of file +- The `visit_infer` method handles *all* cases of inferred type/consts in the HIR making it easy for visitors to handle inferred type/consts in one dedicated place and not forget cases + +[ty_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.TyKind.html#variant.Infer +[const_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.ConstArgKind.html#variant.Infer +[generic_arg_ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Type +[generic_arg_const]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Const +[generic_arg_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.GenericArg.html#variant.Infer +[ambig_arg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.AmbigArg.html +[visit_ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_ty +[visit_const_arg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_const_arg +[visit_infer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/intravisit/trait.Visitor.html#method.visit_infer \ No newline at end of file From 9d7ba8573d4dbfba8ac459f1c15329010f8c3f29 Mon Sep 17 00:00:00 2001 From: Boxy Date: Wed, 18 Jun 2025 15:26:18 +0100 Subject: 
[PATCH 4/4] Reviews --- src/hir/ambig-unambig-ty-and-consts.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/src/hir/ambig-unambig-ty-and-consts.md b/src/hir/ambig-unambig-ty-and-consts.md index a43ad8b49..709027883 100644 --- a/src/hir/ambig-unambig-ty-and-consts.md +++ b/src/hir/ambig-unambig-ty-and-consts.md @@ -1,11 +1,11 @@ # Ambig/Unambig Types and Consts -Types and Consts args in the HIR can be in two kinds of positions "ambig" or "unambig". Ambig positions are where +Types and Consts args in the HIR can be in two kinds of positions ambiguous (ambig) or unambiguous (unambig). Ambig positions are where it would be valid to parse either a type or a const, unambig positions are where only one kind would be valid to parse. ```rust -fn func(arg: T) { +fn func(arg: T) { // ^ Unambig type position let a: _ = arg; // ^ Unambig type position @@ -21,19 +21,19 @@ fn func(arg: T) { ``` -Most types/consts in ambig positions are able to be disambiguated as either a type or const during either parsing or ast-lowering. -Currently the only exception to this is inferred generic arguments in path segments. In `Foo<_>` it is not clear whether the `_` argument is an -inferred type argument, or an inferred const argument. +Most types/consts in ambig positions are able to be disambiguated as either a type or const during parsing. Single segment paths are always represented as types in the AST but may get resolved to a const parameter during name resolution, then lowered to a const argument during ast-lowering. The only generic arguments which remain ambiguous after lowering are inferred generic arguments (`_`) in path segments. For example, in `Foo<_>` it is not clear whether the `_` argument is an inferred type argument, or an inferred const argument. 
In unambig positions, inferred arguments are represented with [`hir::TyKind::Infer`][ty_infer] or [`hir::ConstArgKind::Infer`][const_infer] depending on whether it is a type or const position respectively. In ambig positions, inferred arguments are represented with `hir::GenericArg::Infer`. -A naive implementation of this structure would result in there being potentially 5 places where an inferred type/const could be found in the HIR if you just looked at the types: -- In unambig type position as a `hir::TyKind::Infer` -- In unambig const arg position as a `hir::ConstArgKind::Infer` -- In an ambig position as a [`GenericArg::Type(TyKind::Infer)`][generic_arg_ty] -- In an ambig position as a [`GenericArg::Const(ConstArgKind::Infer)`][generic_arg_const] -- In an ambig position as a [`GenericArg::Infer`][generic_arg_infer] +A naive implementation of this would result in there being potentially 5 places where you might think an inferred type/const could be found in the HIR from looking at the structure of the HIR: +1. In unambig type position as a `hir::TyKind::Infer` +2. In unambig const arg position as a `hir::ConstArgKind::Infer` +3. In an ambig position as a [`GenericArg::Type(TyKind::Infer)`][generic_arg_ty] +4. In an ambig position as a [`GenericArg::Const(ConstArgKind::Infer)`][generic_arg_const] +5. In an ambig position as a [`GenericArg::Infer`][generic_arg_infer] + +Note that places 3 and 4 would never actually be possible to encounter as we always lower to `GenericArg::Infer` in generic arg position. This has a few failure modes: - People may write visitors which check for `GenericArg::Infer` but forget to check for `hir::TyKind/ConstArgKind::Infer`, only handling infers in ambig positions by accident. - People may write visitors which check for `hir::TyKind/ConstArgKind::Infer` but forget to check for `GenericArg::Infer`, only handling infers in unambig positions by accident. @@ -45,7 +45,7 @@ To make writing HIR visitors less error prone when caring about inferred types/c 1. We have different types in the compiler for when a type or const is in an unambig or ambig position, `hir::Ty<AmbigArg>` and `hir::Ty<()>`. 
[`AmbigArg`][ambig_arg] is an uninhabited type which we use in the `Infer` variant of `TyKind` and `ConstArgKind` to selectively "disable" it if we are in an ambig position. -2. The [`visit_ty`][visit_infer] and [`visit_const_arg`][visit_const_arg] methods on HIR visitors only accept the ambig position versions of types/consts. Unambig types/consts are implicitly converted to ambig types/consts during the visiting process, with the `Infer` variant handled by a dedicated [`visit_infer`][visit_infer] method. +2. The [`visit_ty`][visit_ty] and [`visit_const_arg`][visit_const_arg] methods on HIR visitors only accept the ambig position versions of types/consts. Unambig types/consts are implicitly converted to ambig types/consts during the visiting process, with the `Infer` variant handled by a dedicated [`visit_infer`][visit_infer] method. This has a number of benefits: - It's clear that `GenericArg::Type/Const` cannot represent inferred type/const arguments
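
The scheme this chapter describes can be hard to picture from prose alone, so here is a minimal, self-contained sketch of the idea. It is a simplified model and not the actual `rustc_hir` definitions: `Ty`, `GenericArg`, `AmbigArg`, `visit_ty` and `visit_infer` only borrow the chapter's names, and `walk_unambig_ty`, `walk_generic_arg`, `Collector` and `demo` are hypothetical helpers invented for illustration. The sketch also rebuilds values when moving from the unambig form to the ambig form, which is an assumption made to keep the example free of `unsafe`; it says nothing about how the compiler performs that conversion.

```rust
/// Uninhabited marker type: it has no variants, so no value of it can exist.
#[derive(Debug)]
enum AmbigArg {}

/// Simplified stand-in for a HIR type (not the real `hir::Ty`).
/// `U = ()` models the unambig form; with `U = AmbigArg` the `Infer`
/// variant cannot be constructed, which "disables" it.
#[derive(Debug)]
enum Ty<U = ()> {
    Named(&'static str),
    Infer(U),
}

/// Simplified generic argument: an inferred argument gets its own variant.
#[derive(Debug)]
enum GenericArg {
    Type(Ty<AmbigArg>),
    Infer,
}

trait Visitor {
    /// Only ever receives types that are not `_`.
    fn visit_ty(&mut self, ty: &Ty<AmbigArg>);
    /// Receives every `_`, whichever position it was written in.
    fn visit_infer(&mut self);
}

/// Hypothetical walker for a type in an unambig position.
fn walk_unambig_ty<V: Visitor>(v: &mut V, ty: &Ty) {
    match ty {
        Ty::Infer(()) => v.visit_infer(),
        // Rebuild the value in its ambig form (this sketch copies, it does not cast).
        Ty::Named(name) => v.visit_ty(&Ty::Named(*name)),
    }
}

/// Hypothetical walker for a generic argument (an ambig position).
fn walk_generic_arg<V: Visitor>(v: &mut V, arg: &GenericArg) {
    match arg {
        GenericArg::Type(ty) => v.visit_ty(ty),
        GenericArg::Infer => v.visit_infer(),
    }
}

/// Example visitor: records named types and counts inferred ones.
#[derive(Default)]
struct Collector {
    named: Vec<&'static str>,
    infers: usize,
}

impl Visitor for Collector {
    fn visit_ty(&mut self, ty: &Ty<AmbigArg>) {
        match ty {
            Ty::Named(name) => self.named.push(*name),
            // The payload is uninhabited, so this arm can never run.
            Ty::Infer(never) => match *never {},
        }
    }

    fn visit_infer(&mut self) {
        self.infers += 1;
    }
}

/// Visits two unambig types and two generic arguments.
fn demo() -> (Vec<&'static str>, usize) {
    let mut c = Collector::default();
    walk_unambig_ty(&mut c, &Ty::Named("u8"));
    walk_unambig_ty(&mut c, &Ty::Infer(()));
    for arg in &[GenericArg::Type(Ty::Named("T")), GenericArg::Infer] {
        walk_generic_arg(&mut c, arg);
    }
    (c.named, c.infers)
}
```

Calling `demo()` from a `main` function exercises both walkers. The point of the design is that `Collector::visit_ty` has no way to receive an inferred type, so every `_` is counted in `visit_infer` regardless of the position it came from.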