GPU Web 2026‐03‐10 WGSL
Chair: JB
Scribes: DS, DN
Time: Tuesday 7-8pm Paris, 2-3pm Toronto, 11am-noon Americas/Los Angeles
- Google
- Alan Baker
- dan sinclair
- David Neto
- James Price
- Natalie Chouinard
- Peter McNeeley
- Microsoft
- Rafael Cintron
- Mozilla
- Jim Blandy
- WESL
- Lee Mighdoll
- APAC-timed meeting in the first week of April
- WebXR-themed meetings coming up (mostly API)
Shader Symposium Talks
- DS: There was a talk from the WESL folks and one from the Dawn/Tint folks. Do check out the videos.
- LM: And more talks tomorrow at the Khronos GDC event.
- At GDC, Khronos is hosting 3D on the Web, including WebGPU content
- Date: March 11, 2026
- Time: 5:30 PM to 9:30 PM PDT
- https://www.khronos.org/events/3d-on-the-web-2026
builtins with a 3D variant should all have 1D variants (global_invocation_index, workgroup_index) (WGSL proposed language change) #5154
- JB: Alan has posted a PR: #5554
- CTS: #4595
- JB: Discussion on the issue is mostly editorial/bikeshedding. Got approval from Oguz and from dneto, so it just needs approvals from Mozilla/Apple and then can land. Will add Mozilla's approval. Just need a thumbs up from Apple. Will ping him on the issue.
Subgroup Size Control proposal, #5545
- JB: Jiawei has posted a PR: #5578
- JB: Proposal has been up for a long time, PR for a few weeks. I haven't had a chance to look through the language yet. Not able to give a thumbs up yet. Looks pretty uncontroversial.
- DN: There was discussion on #5578 that we're adding a second set of min/max bounds to subgroup sizes, and that's a little confusing to a newcomer. Why would I have two mins/maxes? Alan posted content last week; that's the biggest thing that remains. My understanding is the compiler might advertise a wide range for anti-fingerprinting, but the compiler will only give you one or two. We can narrow the range, so it's fine. With compute it may have a different range than fragment shaders. The compute shader can only do a couple and the frag can only do 4.
- AB: On D3D on Intel, some will give you a size that isn't in the listed range. But everything in the attribute is validated against the range. So, on the frag shader you could get a size of 8 but it lists a size of 16. In Tint we have special handling for those devices to say it's 8. But that wouldn't be valid in the attribute because D3D would reject it. You can't have a wider range for this, only a narrower range. Has to be valid. So, unclear if you can successfully avoid fingerprinting in this way unless every device has 32, which I don't know is true.
- JB: You can avoid fingerprinting if you assume capabilities and enables give you a lattice. Fingerprinting gives you a lattice and you pick a point. If you have a flat lattice, then you can't bucket it. In general, if there is any subsumption relationship, where something is more powerful than something else, it should be possible.
- AB: Devices should do 32, but Intel may not.
- JB: We don't need a universal size.
- AB: You have 32 buckets and use as you wish, or you can choose not to expose on a device if needed.
- DN: But it's the oddball ones that need this anyway
- AB: It becomes a usability question. If you didn't have this for NV devices, which you know will be 32, or Apple silicon, which you know is 32, do you need it or can we just say it's 32? This is mostly from Vulkan as an Intel feature, as they do more subgroup sizes. AMD does 32 or 64 wave size on RDNA.
- JB: Can you explain the situation where D3D12 advertises a range but in practice you get a smaller subgroup?
- AB: I think it's a bug. Can describe it from Vulkan. The way it was worded in the property, you can pick a value and just return a value, and it didn't matter if you matched to that. So, a lot of Intel devices just returned 1 value even though they could take multiple. The min wasn't the min. But then a feature was introduced to allow a range. In the first version it was just a subgroup size value, which doesn't make sense for devices with multiple values. Similar issue with D3D.
- JB: So, the workaround would be to detect the case when the impl gives a smaller subgroup and then not offer this feature?
- AB: No, that's why they split into new properties. On those adapters all of the things in the new range are valid values for the attribute, and you could produce a shader in one of those values. But if you didn't pick a specific size you might get something out of that range (Intel frag shader case). So, for the subgroup extension property values, Tint does special handling because the min size is 16 and we get 8. There was also errata about the max, where in some drivers it couldn't be used yet. In SM6.6 can't use max, so we don't. For the wave size feature it's needed and validated at runtime.
- JB: So, when one is forcing a specific size there is one range. But if not setting a specific size, the implementation may choose a size outside that range, though within a range we report in a different way. And that's why this proposal has a separate set of property values. I see replies to all of the comments, so AB, are you ready to approve or are there more changes needed?
- AB: Need to double check, but convinced myself I don't see a good way around this issue. You don't need this on NV, but you need it on Intel as they have the largest variance in subgroup size.
- JP: I think it's one specific class of GPU that does this
- AB: Yes, but it's a widely used class of GPU. Could be a point of discussion. Feel like this should be discussed in the API side meeting. Gut says to take mostly as is, needs another round of review, but come to terms with 2 limits. Maybe drop compute in the future, but pretty minor.
- JP: No CTS yet, so that blocks landing the PR.
- JB: CTS and 2 impls, but we have been going with 1 and CTS. Next steps, another round of review and approval from other browser vendors.
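Illustrative sketch (not from the meeting) of why two sets of bounds can coexist. The field names below are hypothetical and are not the names used in the proposal; the numbers loosely follow the Intel fragment-shader case AB described (an unforced size of 8 while the requestable minimum is 16), and the maximum of 32 is an assumption. The power-of-two check mirrors the Vulkan requirement on required subgroup sizes.

```typescript
// Hypothetical model of the two ranges; names are illustrative only.
interface SubgroupLimits {
  // Sizes the implementation may pick when the shader does not force one.
  observedMin: number;
  observedMax: number;
  // Sizes accepted when the shader explicitly requests one.
  requestableMin: number;
  requestableMax: number;
}

function isPowerOfTwo(n: number): boolean {
  return Number.isInteger(n) && n > 0 && (n & (n - 1)) === 0;
}

// A requested size is validated against the requestable range only.
function canRequestSubgroupSize(limits: SubgroupLimits, size: number): boolean {
  return (
    isPowerOfTwo(size) &&
    size >= limits.requestableMin &&
    size <= limits.requestableMax
  );
}

// Assumed example values: unforced shaders may run at 8,
// but only 16 and 32 can be requested explicitly.
const intelLike: SubgroupLimits = {
  observedMin: 8,
  observedMax: 32,
  requestableMin: 16,
  requestableMax: 32,
};

const requestable: number[] = [4, 8, 16, 32, 64].filter((size) =>
  canRequestSubgroupSize(intelLike, size),
);
```

Under these assumed values, a shader that does not force a size may still observe 8 at runtime, while forcing 8 would be rejected; that gap is what the second pair of limits is meant to describe.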
[WGSL] Native Barycentric Coordinates Built-in in WGSL (SV_Barycentrics / gl_BaryCoord) #5566
- JB: What milestone is appropriate for this? This milestone, or put off? M3 is stuff after bindless. It sounds like we're exposing something already there on some platforms. If it's an easy thing to implement, makes sense to offer it.
- DS: When we looked on Vulkan, it wasn’t on a lot of devices (Android). So we stopped looking at it. I think we implemented it behind a Chromium experimental flag.
- DN: Spec is small, so then it's just: are there bugs, and what's the reach? Shouldn't slow us down in the committee. Not a big stake between M2 or M3; just someone needs to do the work.
- JB: Call it M2 for now and if folks want new features we can get to it but otherwise we'll prioritize below bindless. Resolved: M2 but after bindless
Clarification around aliasing rules? #5576
- JB: Looks like this might be confusion around reading spec? Are there remaining spec issues here or can we close?
- DN: Think we can close, reviewed and had nothing to add.
- JB: Ok, looks the same to me. Resolved: no change to spec
[buffer-view] Viewing the content of a storage buffer as different types. #5338
- JB: This is the one with 3 different proposals for taking a storage buffer and viewing portions as a different type. At different times I've had different angles. First, thought that if we were going to do bufferView, which only goes from buffer to a specific type, we should just do any casting. Now that I've thought about it longer, I've stopped feeling like we should go that far, and the original proposal is the most appealing to me at this point. Hopefully not too much harm in going around in circles. Good territory to explore. Mozilla's current preference (not set in stone) is that the bufferView proposal, the original proposal, is closest to what we think is best for WGSL because it's the most restrictive, in that it only does 1 type transition, from untyped buffer to a typed view. The things you can do with the typed views are what you already have in the language. As the least powerful proposal it's the least constraining. It's pretty simple and allows folks to do what they need to do. Suggested adding a bufferArrayView builtin to give the number of elements and create a runtime-sized array, and I think with that it seems like a pretty nice and sufficient addition.
- DN: Works for me. Haven't looked at array view yet, but trust you've been thinking about it.
- JB: AB, in your comment you said it really only makes sense for unsized arrays; just to be clear about what you're contrasting. I'm thinking of both a struct with a runtime-sized tail and the case where you just get an array.
- AB: More that it doesn't make sense for a sized array, as you have dynamic parameters.
- JB: Yes, if T has a pipeline-creation-fixed footprint it wouldn't make sense to apply bufferArrayView.
- AB: That was my point, if we want to allow anything with an unsized tail I think that's ok.
- JB: Ok.
- JP: For bufferArrayView, why tie it to buffer? If I have a pointer to an array, why can't I just have an arrayView that takes a pointer to an array and does this? You aren't changing the type, you're just subslicing an array.
- JB: We treat runtime-sized and fixed-size arrays as different types.
- JP: Just talking about the input: right now it's a buffer, could it just be a ptr to a runtime-sized array?
- AB: That's where the tail question comes in. What JB is describing is the output could be a struct with a runtime-array at the end. What JP is showing is array to array
- JP: Thought JB was just array output
- JB: No, the idea is that right now we can have a runtime-sized global, and the length of the variable portion of that, whether it's a struct or flat array, is determined by the size of the binding. What we need for bindless is: we have a gigantic storage buffer with lots of things at various locations, and we want to say this portion of my buffer is a struct with a dynamically sized tail, and you could apply arrayLength to that tail just like with a global. There isn't any other way to get that except with this bufferArrayView. It would give a way to create these.
- JP: Looking at the bufferArrayView that you added, it takes a T and produces ptr<array<T>>, something more general.
- JB: It's a typo; no, it's a pointer to a buffer.
- JP: Takes pointer to buffer but output is pointer to array
- JB: Second comment below expands that thought to cover things with tails. The full generalization is pointer to T where T is a non-fixed-footprint type.
- JP: Could we just add a size param to bufferView to achieve the same thing?
- DN: That would avoid some fussiness with having to calculate (from the shader author's perspective) the fixed size of the struct and dividing and counting.
- JB: We have all the info and they don't, we shouldn't make them redo it.
- DN: So have a bufferView overload with a 3rd arg that takes in the number of bytes to pull in.
- JB: The first overload is fixed footprint sizes and the second with array element count is only for runtime sized types
- AB: Seems unnecessary. You could potentially want the rest of the buffer
- JP: Right, give me rest or give me chunk
- JB: So the first overload, without a count, with a runtime-sized type would implicitly mean to the end of the buffer. Is this "to the end of the buffer" something folks want? Again, the idea is that storage buffers are a gigantic mixed-use bag of stuff.
- AB: Thinking of a regular binding: if I want to load data as uint, do something in 16, and then write back, I'd want to use the whole buffer.
- JB: As a way to get functionality of runtime sized global but with multiple interpretations. Seems fine.
- AB: Do we only want non-fixed-footprint types in storage? Or do we allow them in more storage classes, but no variables? Unsized array in workgroup as long as the var is sized? We currently don't allow it as it can't be expressed.
- JB: One of the reasons we don't allow it is that the arrayLength builtin function is a SPIR-V OpArrayLength, whereas this idea means we really pass around a start offset/element count. That's what this point really is. If we go to the trouble, then we'd allow it in whatever address space.
- DN: And I think it solves a code genericity issue folks ask about, where they want more generic code with fixed types, especially array sizes. In the new world, have a workgroup variable sized with a constant or override and make everything else more generic, then fixed to address space and not type. Could write qsort using generic code if you want and we'd do indexing.
- AB: To be clear, the way we're describing it the base variable is a sized buffer not an array.
- DN: Yes, in workgroup.
- DS: If we have an unsized array at the end of a struct, we get those sizes right now from the API side. Does that present a problem in this new situation?
- AB: We make the compiler have to recalculate this and pass it around.
- AB: Feeds into robustness checks etc.
- JB: Implementation question: if compiling down to a target with flexible pointers then no problem, but if going to something like old WGSL, would have to generate a copy of each callee that takes a pointer, depending on which buffer is indexed into. Similar to generalized pointers.
- AB: Since we already run that, it naturally hangs off. We would have been generating function specializations to access specific variables.
- JB: So this introduces the potential for code copying due to specialization. But in general things all get inlined anyway, so possibly not that big of a deal.
- AB: Will still need to do some tracking for stuff like that. As we've talked about, if we allow the buffers to be passed around, with auto conversions from sized to unsized or from sized to a smaller size, then you have to keep track that you reduced the bounds. May want to revisit with bufferArrayView.
- JB: So, sounds like next step is we want to get language into the proposal for bufferArrayView.
- DS: So to be clear, bufferView proposal, remove viewBytes and reinterpret and add bufferArrayView
- JB: Yes, clean all that up and then add bufferArrayView to discuss. Tentative position, need to revisit based on Apple feedback.
- AB: Happy to put some more text into the proposal to cover this stuff.
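Illustrative analogy (not WGSL, and not from the meeting): JavaScript typed arrays over an ArrayBuffer behave much like the "give me the rest or give me a chunk" overloads discussed above. The helper tailElementCount is a hypothetical name; its formula follows how WGSL's arrayLength is defined for a runtime-sized array (bytes after the array's offset divided by the element stride, rounded down).

```typescript
// Stands in for a large mixed-use storage buffer (64 bytes here).
const storage = new ArrayBuffer(64);

// Fixed-footprint view: two f32 values at byte offset 16.
const header = new Float32Array(storage, 16, 2);

// "Give me a chunk": exactly 4 u32 elements starting at byte offset 32.
const chunk = new Uint32Array(storage, 32, 4);

// "Give me the rest": with no count, the view runs to the end of the buffer.
const rest = new Uint32Array(storage, 32);

// Both views alias the same bytes, so a write through one is seen by the other.
chunk[0] = 7;

// Hypothetical helper: element count of a runtime-sized tail inside a view of
// viewBytes bytes, where the tail starts at tailOffset and each element
// occupies stride bytes.
function tailElementCount(
  viewBytes: number,
  tailOffset: number,
  stride: number,
): number {
  return Math.floor((viewBytes - tailOffset) / stride);
}

// A 48-byte view with a 16-byte fixed part and 8-byte tail elements.
const tailCount: number = tailElementCount(48, 16, 8);
```

This is the arithmetic DN and JB would rather keep in the compiler: the implementation already knows the fixed size and stride, so the shader author should not have to redo the division.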
[Explicit BGL params] Relax the function traceback rule? #5357
Part of the WGSL Explicit Bindgroup Layout Parameters proposal, #5353
- DN: One of the issues is that the usages are over pairs of texture/sampler when used in functions. So the analysis is coupled and currently, as far as I know, that's not considered. Second, DS wrote that if you defer the decision of how to declare the param type, then you are waiting long enough that if you have different usages later, either you have action at a distance or you force call site specialization, possibly causing combinatorial blowup. Coupling is a mechanical thing we have to get right; the second is, do we allow action at a distance or blowup? We already have the specialization. The coupling sounds like the work we did for compat to map to combined texture samplers, so it's a cost we've already partially absorbed.
- AB: Despite the spec being out of date, we do this in Dawn, we trace the whole way, so effectively we've done the pairing and checking past function boundaries.
- JB: Discussed earlier with CW. Seems like Tint is doing this: because it works from entry points towards leaf functions, it constructs the Cartesian product of textures/samplers, so Tint is assuming there are more possible pairings. If instead we went leaf to entry point, you could do a much more precise analysis; uniformity and alias analysis do this. We could make it more precise to permit more usages. Tracing already exists, and there are probably better ways to trace that permit more programs, but the WGSL spec doesn't specify these things. I'm assigned an issue to make the WGSL spec properly say how pairings of texture/sampler are generated. Then that could be a concept in WGSL that the API spec can consume. Think that will be a good point to address this issue as well. Was surprised; didn't Kai say there isn't a combinatorial explosion, because the generated code doesn't need to know the pairings, just that they're used correctly? We don't need to specify what they are.
- JP: Yes, we reflect the combinations to Dawn but we don't generate different code based on the combinations.
- AB: As long as the WGSL spec is in terms of pipeline creation, it should be per entry point
- JB: Make a summary of each function which includes the behaviour of its callees. When you get to the EP you get results for that EP. The API can then consult that at pipeline creation time to generate an error.
- JB: Sounds like tentatively someone needs to write text and see what they find out, but tentative resolution is we will be able to relax the trace back rule.
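Illustrative sketch (not from the meeting, not Tint's implementation, and not spec text) of the leaf-to-entry-point summaries JB describes. All names (Operand, Fn, summarize) are hypothetical. Each function's summary lists the texture/sampler pairs it uses in terms of its own parameters or module-scope variables; at each call site the callee's summary is rewritten with the actual arguments. The sketch assumes an acyclic call graph, which holds for WGSL since recursion is not allowed.

```typescript
type Operand =
  | { kind: "global"; name: string } // module-scope texture or sampler
  | { kind: "param"; index: number }; // function parameter by position

type Pair = [Operand, Operand]; // [texture, sampler]

interface Fn {
  uses: Pair[]; // pairs used directly in this function
  calls: { callee: string; args: Operand[] }[]; // call sites with arguments
}

const key = (o: Operand): string =>
  o.kind === "global" ? `g:${o.name}` : `p:${o.index}`;

function summarize(
  program: Record<string, Fn>,
  name: string,
  memo: Map<string, Pair[]> = new Map<string, Pair[]>(),
): Pair[] {
  const cached = memo.get(name);
  if (cached !== undefined) return cached;
  const fn = program[name];
  if (fn === undefined) throw new Error(`unknown function ${name}`);

  const seen = new Map<string, Pair>(); // de-duplicates pairs
  const add = (t: Operand, s: Operand): void => {
    seen.set(`${key(t)}|${key(s)}`, [t, s]);
  };

  for (const [t, s] of fn.uses) add(t, s);
  for (const call of fn.calls) {
    // Rewrite the callee's parameters in terms of this call's arguments.
    const subst = (o: Operand): Operand => {
      if (o.kind !== "param") return o;
      const arg = call.args[o.index];
      if (arg === undefined) throw new Error("missing argument");
      return arg;
    };
    for (const [t, s] of summarize(program, call.callee, memo)) {
      add(subst(t), subst(s));
    }
  }

  const result = Array.from(seen.values());
  memo.set(name, result);
  return result;
}

const g = (name: string): Operand => ({ kind: "global", name });
const p = (index: number): Operand => ({ kind: "param", index });

// One helper that samples its two parameters, called twice from an entry point.
const program: Record<string, Fn> = {
  sample: { uses: [[p(0), p(1)]], calls: [] },
  main: {
    uses: [],
    calls: [
      { callee: "sample", args: [g("texA"), g("sampLinear")] },
      { callee: "sample", args: [g("texB"), g("sampNearest")] },
    ],
  },
};

const pairs: string[] = summarize(program, "main").map(
  ([t, s]) => `${key(t)}+${key(s)}`,
);
```

For this example the entry-point summary holds two pairs (texA with sampLinear, texB with sampNearest), whereas a Cartesian product over everything reaching the helper would report four.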
Please review the following proposals over the course of the next few weeks, and file issues for suggested changes, unclear text, areas where investigation is needed, and so on. These issues can then be discussed in committee as necessary.
While committee members may always request changes at any point, the hope is that once the committee has resolved all the issues raised and a proposal ‘graduates’ from this section, it is reasonable for the proposal’s authors to begin drafting a PR against the spec itself.
[buffer-view] Viewing the content of a storage buffer as different types. #5338
buffer-view.md
reinterpret_view.md (alternative proposal, but let’s keep discussion of both in #5338)
Subgroup Size Control #5545
WGSL Proposal for fragment depth (less, greater, any) #5342