AMD, do support for NV_shader_buffer_load next! Shader Buffer Load brought "Buffer Device Address" / pointers to OpenGL/glsl, long before Vulkan was even a thing. It's the best thing since sliced bread, and easily lets you access all your vertex data with pointers, i.e., you don't need to bind any vertex buffers anymore. Also easily lets you draw the entire scene in a single draw call since vertex shaders can just load data from wherever the pointers lead them, e.g., it makes GLSL vertex shaders look like this:
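Roughly along these lines (a sketch: u_nodes is mentioned below in the thread, but the struct layout, the other uniform names, and the use of gl_DrawID are illustrative assumptions):

    #version 460
    #extension GL_NV_shader_buffer_load : require   // pointers in GLSL
    #extension GL_NV_gpu_shader5 : require          // 64-bit integer support

    struct Vertex { vec3 position; uint color; };

    struct Node {
        Vertex* vertices;   // raw GPU address of this model's vertex data
        mat4    transform;
    };

    // Set from the CPU with a buffer's GPU address (glMakeBufferResidentNV /
    // glGetBufferParameterui64vNV / glUniformui64NV from NV_shader_buffer_load).
    uniform Node* u_nodes;
    uniform mat4  u_viewProj;

    void main() {
        Node   node = u_nodes[gl_DrawID];           // which model this sub-draw belongs to
        Vertex v    = node.vertices[gl_VertexID];   // no vertex buffers bound at all
        gl_Position = u_viewProj * node.transform * vec4(v.position, 1.0);
    }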
This is the real killer feature of Vulkan/DX12: it makes writing a generalized renderer so much easier because you don't need to batch draw calls per vertex layout of individual meshes. Personally I use Buffer Device Address for connecting MultiDraw Indirect calls to mesh definitions and materials as well.
I just wish there was more literature about this, especially about perf implications. Also synchronization is very painful, which may be why this is hard to do on a driver level inside OpenGL
fingerlocks 2 days ago [-]
Maybe I’m missing something, but isn’t this the norm in Metal as well? You can bind buffers individually, or use a single uber-buffer that all vertex shaders can access.
But I haven’t written OpenGL since Metal debuted over a decade ago.
reactordev 2 days ago [-]
This is the way
garaetjjte 2 days ago [-]
You could do a similar thing with SSBOs, I think?
m-schuetz 2 days ago [-]
That is for SSBOs. u_nodes is a pointer to an SSBO in this case. That SSBO then has lots more pointers to various different SSBOs that contain the vertex data.
garaetjjte 2 days ago [-]
I'm thinking of declaring an array of SSBOs that contain arrays of data structs. An address would be represented by the index of the SSBO binding and an offset within that buffer. Though that limits the maximum number of SSBOs used within a draw call to GL_MAX_VERTEX_SHADER_STORAGE_BLOCKS.
m-schuetz 2 days ago [-]
To my knowledge you can't have an array of SSBOs in OpenGL. You could have one SSBO for everything, but that makes other things very difficult, like how to deal with dynamically growing scenes, loading and unloading models, etc.
(3) "Do we allow arrays of shader storage blocks?
RESOLVED: Yes; we already allow arrays of uniform blocks, where each
block instance has an identical layout but is backed by a separate
buffer object. It seems like we should do this here for consistency.
PS: You could also access data through bindless textures, though you would need to deal with ugly wrappers to unpack structs from image formats.
m-schuetz 2 days ago [-]
Do you have an example for that? I can't find any.
Regarding bindless textures, they're really ugly to use. Shader buffer load is so much better, being able to access everything with simple pointers.
I wanted to say that with some compiler hacking it should be possible to lower SPIR-V using GL_EXT_buffer_reference into bindless image loads, but SPIR-V doesn't have standardized bindless texture, duh!
bsder 2 days ago [-]
VK_EXT_descriptor_buffer?
If you are using Slang, then you just access everything as standard pointers to chunks of GPU memory.
And it's mostly Intel and mobile dragging their feet on VK_EXT_descriptor_buffer ...
m-schuetz 2 days ago [-]
I'm talking about OpenGL. Vulkan is too hard for my small mind to understand, so I'm still using OpenGL. And the extension that allows this in OpenGL came out in 2010, so long before Vulkan.
bsder 1 day ago [-]
No one at the big companies is developing OpenGL anymore and their support for the GLSL compiler has dwindled to nothing.
If you want that extension you're going to have better luck convincing the Zink folks:
https://docs.mesa3d.org/drivers/zink.html
"The Zink driver is a Gallium driver that emits Vulkan API calls instead of targeting a specific GPU architecture. This can be used to get full desktop OpenGL support on devices that only support Vulkan."
However, you're still probably going to have to come off of GLSL and use Slang or HLSL. The features you want are simply not going to get developed in the GLSL compiler at this point.
m-schuetz 14 hours ago [-]
> The features you want are simply not going to get developed in the GLSL compiler at this point.
They exist in GLSL on Nvidia devices. If other vendors refuse to implement them, then I will be an Nvidia-only developer. Fine by me. I no longer care about other vendors if they completely ignore massive quality of life features.
zackmorris 2 days ago [-]
A little bit off topic but: GL_LINES doesn't have a performant analog on lots of other platforms, even Unity. Drawing a line properly requires turning the two endpoint vertices into a quad and optionally adding endcaps which are at least triangular but can be polygons. From my understanding, that requires a geometry shader since we're adding virtual/implicit vertices. Does anyone know if mesh shaders could accomplish the same thing?
Also I wish that GL_LINES was open-sourced for other platforms. Maybe it is in the OpenGL spec and I just haven't looked. I've attempted some other techniques like having the fragment shader draw a border around each triangle, but they all have their drawbacks.
kg 2 days ago [-]
To draw lines you can use instancing instead of a geometry shader, since you know how many vertices you need to represent a line segment's bounding box. Have one vertex buffer that just contains N vertices (the actual attribute data doesn't matter, but you can shove UVs or index values in there) and bind it alongside a buffer containing your actual line information (start, end, color, etc). The driver+GPU will replicate the 'line vertex buffer' vertices for every instance in the 'line instance buffer' that you bound.
This works for most other regular shapes too, like a relatively tight bounding box for circles if you're drawing a bunch of them.
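For illustration, the vertex-shader side of that approach could look something like this (a sketch; the attribute/uniform names are made up, the per-instance attributes need glVertexAttribDivisor set to 1, and you'd draw with something like glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, lineCount)):

    #version 330 core
    // Per-vertex: one corner of a unit quad: (0,0), (1,0), (0,1), (1,1)
    layout(location = 0) in vec2 corner;
    // Per-instance: one line segment (divisor = 1)
    layout(location = 1) in vec2 lineStart;
    layout(location = 2) in vec2 lineEnd;
    layout(location = 3) in vec4 lineColor;

    uniform mat4  u_viewProj;
    uniform float u_thickness;

    out vec4 v_color;

    void main() {
        vec2 dir    = normalize(lineEnd - lineStart);
        vec2 normal = vec2(-dir.y, dir.x);
        // corner.x slides along the segment, corner.y offsets sideways
        vec2 pos = mix(lineStart, lineEnd, corner.x)
                 + normal * (corner.y - 0.5) * u_thickness;
        gl_Position = u_viewProj * vec4(pos, 0.0, 1.0);
        v_color = lineColor;
    }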
m-schuetz 2 days ago [-]
In my experience, drawing quads with GL_POINTS in OpenGL was way faster than drawing quads with instancing in DirectX. That was noticeable with the DirectX vs. OpenGL backends for WebGL, where switching between the two resulted in widely different performance.
reactordev 2 days ago [-]
Drawing using GL_LINES is old-school fixed-function pipeline and it's not how modern graphics hardware works. If you want a single line, draw a small rectangle between V1 and V2 using geometry. The thickness is the distance between P1 and P2 / P3 and P4 of the rectangle. A line has no thickness as it's 1-dimensional.
Draw in screen space based on projected points in world space.
set gl_Color to your desired color vec and bam, line.
mandarax8 2 days ago [-]
I'm not sure exactly what you mean, but you can both output line primitives directly from the mesh shader or output mitered/capped extruded lines via triangles.
As far as other platforms, there's VK_EXT_line_rasterization which is a port of opengl line drawing functionality to vulkan.
rezmason 2 days ago [-]
hundredrabbits' game Verreciel uses a reimplementation of webgl-lines, to pretty good effect, if I may say so:
https://github.com/mattdesl/webgl-lines
https://hundredrabbits.itch.io/verreciel
PS— I still play Retro, and dream of resuscitating it :)
nvidium is using GL_NV_mesh_shader, which is only available on Nvidia cards. This mod is the only game/mod I know of that uses mesh shaders and is OpenGL. So the new GL extension will let users of other vendors use the mod if it gets updated to use the new extension.
mellinoe 2 days ago [-]
Presumably because Minecraft is the only application which still uses OpenGL but would use the extension
doubletwoyou 2 days ago [-]
pretty sure the base minecraft rendering engine is still using opengl, and most of the improvement mods also just use opengl, so exposing this extension to them is probably important for a game where it's 50 billion simple cubes being rendered
FrustratedMonky 2 days ago [-]
Is Minecraft the only thing using OpenGL anymore?
What is the current state of OpenGL, I thought it had faded away?
aj_hackman 2 days ago [-]
It's officially deprecated in favor of Vulkan, but it will likely live on for decades to come due to legacy CAD software and a bunch of older games still using it. I don't share the distaste many have for it, it's good to have a cross-platform medium-complexity graphics API for doing the 90% of rendering that isn't cutting-edge AAA gaming.
Cieric 2 days ago [-]
> It's officially deprecated in favor of Vulkan
Can you provide a reference for this? I work in the GPU driver space (not on either of these APIs), but from my understanding Vulkan wasn't meant to replace OpenGL; it was only introduced to give developers the chance to get lower level in the hardware (still agnostic of the hardware, at least compared to compiling PTX/CUDA or against AMD's PAL directly; many still think they failed). I would still highly advocate for developers using OpenGL or DX11 if their game/software doesn't need the capabilities of Vulkan or DX12. And even if you do need them, you might be able to get away with interop and do the small parts with the lower-level API and leave everything else in the higher-level one.
I will admit I don't like the trend of all the fancy new features only getting introduced into Vulkan and dx12, but I'm not sure how to change that trend.
jplusequalt 2 days ago [-]
I think Vulkan was originally called OpenGL Next. Furthermore, Vulkan's verbosity allows for a level of control of the graphics pipeline you simply can't have with OpenGL, on top of having built in support for things like dynamic rendering, bindless descriptors, push constants, etc.
Those are the main reasons IMO why most people say it's deprecated.
ecshafer 2 days ago [-]
I only play with this stuff as a hobbyist. But OpenGL is way simpler than Vulkan, I think. Vulkan is really, really complicated to get some basic stuff going.
marmarama 2 days ago [-]
Which is as-designed. Vulkan (and DX12, and Metal) is a much more low-level API, precisely because that's what professional 3D engine developers asked for.
Closer to the hardware, more control, fewer workarounds because the driver is doing something "clever" hidden behind the scenes. The tradeoff is greater complexity.
Mere mortals are supposed to use a game engine, or a scene graph library (e.g. VulkanSceneGraph), or stick with OpenGL for now.
The long-term future for OpenGL is to be implemented on top of Vulkan (specifically the Mesa Zink driver that the blog post author is the main developer of).
m-schuetz 2 days ago [-]
> Closer to the hardware
To what hardware? Ancient desktop GPUs vs modern desktop GPUs? Ancient smartphones? Modern smartphones? Consoles? Vulkan is an abstraction of a huge set of diverging hardware architectures.
And a pretty bad one, in my opinion. If you need to make an abstraction due to fundamentally different hardware, then at least make an abstraction that isn't terribly overengineered for little to no gain.
MountainTheme12 2 days ago [-]
Closer to AMD and mobile hardware. We got abominations like monolithic pipelines and layout transitions thanks to the former, and render passes thanks to the latter.
Luckily all of these are out or on their way out.
pjmlp 2 days ago [-]
Not really, other than on desktops, because as we all know mobile hardware gets the drivers it gets on release date, and that's it.
Hence why on Android, even with Google nowadays enforcing Vulkan, if you want a less painful experience with driver bugs, you'd better stick with OpenGL ES outside of Pixel and Samsung phones.
MountainTheme12 1 day ago [-]
Trying to fit both mobile and desktop in the same API was just a mistake. Even applications that target both desktop and mobile end up having significantly different render paths despite using the same API.
I fully expect it to be split into Vulkan ES sooner or later.
fingerlocks 2 days ago [-]
100%. Metal is actually self-described as a high-level graphics library for this very reason. I’ve never actually used it on non-Apple hardware, but the abstractions for vendor support are there. And they are definitely abstract. There is no real getting-your-hands-dirty exposure of the underlying hardware.
dagmx 1 day ago [-]
Metal does have to support AMD and Intel GPUs for another year after all, and had to support NVIDIA for a hot minute too.
fingerlocks 14 hours ago [-]
Wow, what a brain fart. So much of metal has improved since M-series, I just forgot it was even the same framework entirely. Even the stack is different now that we have metal cpp and swift++ interop with unified memory access.
flohofwoe 2 days ago [-]
> fewer workarounds because the driver is doing something "clever" hidden behind the scenes.
I would be very surprised if current Vulkan drivers are any different in this regard, and if yes then probably only because Vulkan isn't as popular as D3D for PC games.
Vulkan is in a weird place in that it promised a low-level explicit API close to the hardware, but then still doesn't really match any concrete GPU architecture and still needs to abstract over very different GPU architectures.
At the very least there should have been different APIs for desktop and mobile GPUs (not that the GL vs GLES split was great, but at least that way the requirements for mobile GPUs don't hold back the desktop API).
And then there's the issue that also ruined OpenGL: the vendor extension mess.
The last OpenGL release 4.6 was in 2017... I think that speaks for itself ;)
And at least on macOS, OpenGL is officially deprecated, stuck at 4.1 and is also quickly rotting (despite running on top of Metal now - but I don't think anybody at Apple is doing serious maintenance work on their OpenGL implementation).
fulafel 2 days ago [-]
That's not "OpenGL is officially deprecated".
flohofwoe 2 days ago [-]
In the end, if nobody is maintaining the OpenGL standard, implementations and tooling, it doesn't matter much whether it is officially deprecated or just abandoned.
fulafel 2 days ago [-]
.. but people ARE maintaining the implementations and tooling even if the spec might not be getting new features aside from extensions. There's a difference.
Look at Mesa release notes for example, there's a steady stream of driver feature work and bugfixes for GL: https://docs.mesa3d.org/relnotes/25.2.0.html (search for "gl_")
A slow moving graphics API is a good thing for many uses.
People are writing new OpenGL code all the time. See eg HN story submissions: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
> A slow moving graphics API is a good thing for many uses.
It's not slow moving. It's completely frozen.
The Mesa guys are the only ones actually fixing bugs and improving implementations, but the spec is completely frozen and unmaintained. Apple, Microsoft and Google don't really care if OpenGL works well on their platforms.
jcelerier 2 days ago [-]
> the spec is completely frozen and unmaintained.
but, literally this article is about something new that was added to the OpenGL spec
Delk 2 days ago [-]
Well, not really to the OpenGL spec itself. It's about a new OpenGL extension being added to the extension registry. Vendors may implement it if they wish. AFAIK the core OpenGL spec hasn't been updated in years, so even though new extensions keep getting developed by vendors, the official baseline hasn't moved.
I suppose the same is true of Direct3D 11, though. Only the Direct3D 12 spec has been updated in years from what I can tell. (I'm not a graphics programmer.)
fulafel 2 days ago [-]
A main reason to do new OpenGL releases was to roll developed extensions into the required features of a new OpenGL version, to give application programmers a cohesive target platform. The pace of API extensions has slowed down enough that it's not going to be a problem for a while.
josefx 2 days ago [-]
> but I don't think anybody at Apple is doing serious maintenance work on their OpenGL implementation
In other words, nothing changed. The OpenGL standard had been well past 4.1 for years when Apple released Metal. People working with various 3D tools had to disable system integrity checks to install working drivers from NVIDIA to replace whatever Apple shipped by default.
aj_hackman 2 days ago [-]
I've never been able to successfully create a GL context > version 2.1, or invoke the GLSL compiler.
As a sidenote, I've very much enjoyed your blog, and developed a similar handle system as yours around the same time. Mine uses 32 bits though - 15 for index, 1 for misc stuff, 8 for random key, and 8 for object type :^)
jlokier 2 days ago [-]
Recent versions of macOS will provide either an OpenGL 2.1 context or OpenGL 4.1 context, depending on how you request the context. You have to request a 3.2+ core profile, and not use X11 or the glX* functions.
From macOS 10.7 to 10.9, you'd get an OpenGL 3.2 context. As OpenGL 4.1 is backward compatible to OpenGL 3.2, it's fine that the same code gets OpenGL 4.1 now.
Basically, macOS will provide an "old" API to programs that need it, which is fixed at 2.1, and a "modern" API to programs that know how to ask for it, which has settled at 4.1 and is unlikely to change.
OpenGL 4.1 is harmonised with OpenGL ES 2.0. Almost the same rendering model, features, extensions, etc. On iOS, iPadOS etc you can use OpenGL ES 2.0, and no version of OpenGL (non-ES), so my guess is that's why macOS settled on OpenGL 4.1. Both platforms offer the same OpenGL rendering features, but through slightly different APIs.
But if you request 4.1 over GLX (which uses X11/Xorg/XQuartz), the X11 code only supports OpenGL 2.1. For example, if you're porting some Linux code or other GLX examples over.
Unfortunately, the GLX limitation is probably just due to the Xorg-based XQuartz being open source but only minimally maintained since before OpenGL 3.2 was added to macOS. XQuartz uses Xorg and Mesa, which have all the bindings for 4.1, but some of them are not quite wired up.
aj_hackman 2 days ago [-]
The universal narrative around OpenGL is that it's deprecated, so I assumed that came with a thumbs-up from Khronos. In any case, I'm not holding my breath for GL > 4.6.
fulafel 2 days ago [-]
OpenGL in the form of WebGL is living its best life.
It's the only way to ship portable 3D software across the desktop and mobile platforms without platform specific code paths, thanks to the API fragmentation and proprietary platform antics from our beloved vendors.
In some years WebGPU may mature and start gaining parity (webgl took a looooong time to mature), and after that it'll still take more years for applications to switch given older hardware, the software inertia needed to port all the middleware over etc.
m-schuetz 2 days ago [-]
There is also the problem that WebGPU doesn't really add much except for compute shaders. Older WebGL apps have hardly any reason to port. The other problem is that WebGPU is even more outdated than WebGL was at its release. When WebGL was released, it was maybe 5 years outdated. WebGPU only somewhat came out in major desktop browsers this year, and by now it's something like 15 years behind the state of the art. OpenGL, which got de facto deprecated more than half a decade ago, is orders of magnitude more powerful with respect to hardware capabilities/features than WebGPU.
charlotte-fyi 2 days ago [-]
This comparison is kind of sloppy, though. OpenGL on the desktop needs to be compared to a concrete WebGPU implementation. While it still lags behind state of the art, `wgpu` has many features on desktop that aren't in the standard. For example, they've started working on mesh shaders too: https://github.com/gfx-rs/wgpu/issues/7197. If you stick to only what's compatible with WebGL2 on the desktop you'd face similar limitations.
m-schuetz 2 days ago [-]
I'm of course talking about WebGPU for web browsers, and I'd rather not use a graphics API like wgpu with uncertain support for the latest GPU features. Especially since wgpu went for the same paradigm as Vulkan, so it's not even that much better to use but you sacrifice lots of features. Also Vulkan seems to finally start fixing mistakes like render passes and pipelines, whereas WebGPU (and I guess wgpu?) went all in.
apocalypses 2 days ago [-]
Saying WebGPU “only” adds compute shaders is crazy reductive and misses the point entirely about how valuable an addition this is, from general-purpose compute through to simplification of rendering pipelines through compositing passes etc.
In any case it’s not true anyway. WebGPU also does away with the global state driver, which has always been a productivity headache/source of bugs within OpenGL, and gives better abstractions with pipelines and command buffers.
m-schuetz 2 days ago [-]
I disagree. Yes, the global state is bad, but pipelines, render passes, and worst of all static bind groups and layouts, are by no means better. Why would I need to create bindGroups and bindGroup layouts for storage buffers? They're buffers and references to them, so let me just do the draw call and pass references to the SSBOs as arguments, rather than having to first create expensive bindings, with the need to cache them because they are somehow expensive.
Also, compute could have easily been added to WebGL, making WebGL pretty much on-par with WebGPU, just 7 years earlier. It didn't happen because WebGPU was supposed to be a better replacement, which it never became. It just became something different-but-not-better.
If you had to do in CUDA even half of all the completely unnecessary stuff that Vulkan forces you to do, CUDA would never have become as popular as it is.
charlotte-fyi 2 days ago [-]
I agree with you in that I think there's a better programming model out there. But using a buffer in a CUDA kernel is the simple case. Descriptors exist to bind general purpose work to fixed function hardware. It's much more complicated when we start talking about texture sampling. CUDA isn't exactly great here either. Kernel launches are more heavyweight than calling draw precisely because they're deferring some things like validation to the call site. Making descriptors explicit is verbose and annoying but it makes resource switching more front of mind, which for workloads primarily using those fixed function resources is a big concern. The ultimate solution here is bindless, but that obviously presents its own problems for having a nice general purpose API since you need to know all your resources up front. I do think CUDA is probably ideal for many users but there are trade-offs here still.
pjmlp 2 days ago [-]
It didn't happen because of Google; Intel did the work to make it happen.
pjmlp 2 days ago [-]
Although I tend to bash WebGL and WebGPU for what they offer versus existing hardware, lagging a decade behind, they have a very important quality for me.
They are the only set of 3D APIs that have been adopted in mainstream computing that were designed for managed languages, instead of yet another thing to be consumed by C.
Technically Metal is also used by a managed language, but it was designed for Objective-C/C++ first, with Swift as official binding.
Microsoft gave up on Managed Direct X and XNA, and even with all the safety talks, Direct X team doesn't care to provide official COM bindings to C#.
Thus that leaves us with WebGL and WebGPU for managed-language fans, and even if they lag behind, as PlayCanvas and ShaderToy show, there are enough capabilities in the shader languages that have not yet taken off.
flohofwoe 2 days ago [-]
D3D (up to D3D11 at least) is also a "managed" API since it uses refcounting to keep resources alive for as long as they are used by the GPU; there really isn't much difference from garbage collection.
Metal allows disabling refcounted lifetime management when recording commands since it actually adds significant overhead, and D3D12 and Vulkan removed it entirely.
Unfortunately WebGPU potentially produces even more garbage than WebGL2, and we have yet to see how this turns out. Some drawcall-heavy code actually runs faster on WebGL2 than WebGPU, which really doesn't look great for a modern 3D API (not mainly because of GC, but every little bit of overhead counts).
pjmlp 2 days ago [-]
The point is that those APIs were not designed with anything beyond C and C++ as consumers, and everyone else has to do their due diligence and build language bindings from scratch.
So we end up in an internal cycle that we cannot get rid of.
Metal and the Web 3D APIs had other consumer languages in mind; you also see this in how ANARI is being designed.
Yes, every little bit of performance counts, but it cannot be that APIs get designed as if everyone is still coding in Assembly, and then it is up to whoever cares to actually build proper high-level abstractions on top; that is how we end up with Vulkan.
flohofwoe 2 days ago [-]
> but it cannot be that APIs get designed as if everyone is still coding in Assembly
Why not though? In the end an API call is an API call, and everything is compiled down to machine code no matter what the source language is.
FWIW, the high-level "OOP-isms" of the Metal API are also its biggest downside. Even simple create-option "structs" like MTLRenderPassDescriptor are fully lifetime-managed Objective-C objects where every field access is a method call - that's simply unnecessary overkill.
And ironically, the most binding-friendly API for high-level languages might still be OpenGL. Since it doesn't have any structs or 'objects with methods', but only plain old function calls with primitive-type parameters, and the only usage of pointers is for pointing to unstructured 'bulk data' like vertex-buffer or texture content, it maps very well even to entirely un-C-like languages. And the changes that WebGL made to the GL API (for instance adding 'proper' JS objects for textures and buffers) are arguably a step back compared to native GL, where those resource objects are just opaque handles.
pjmlp 2 days ago [-]
Because not everyone doing 3D graphics is implementing AAA rendering engines on RTX cards.
The ANARI effort was born exactly because of the visualisation industry's refusal to adopt Vulkan as is.
flohofwoe 2 days ago [-]
Looking at the ANARI spec and SDK it looks pretty much like a typical C API to me, implementing an old-school scene-graph system. What am I missing - e.g. what makes it specifically well suited for non-C languages? :)
If anything it looks more like an admission by Khronos that Vulkan wasn't such a great idea (but a 3D API that's based on scene graphs isn't either, so I'm not sure what's so great about ANARI tbh).
pjmlp 1 day ago [-]
Python is part of ANARI's value proposition, and the standard takes this into account.
Dumb question, but is there a way to use WebGL for a desktop app without doing Electron stuff?
adastra22 2 days ago [-]
...OpenGL?
m-schuetz 2 days ago [-]
OpenGL is going to live a long life simply because Vulkan is way more complex and overengineered than it needs to be.
flohofwoe 2 days ago [-]
Vulkan (1.0 at least) being a badly designed API doesn't mean that OpenGL will be maintained unfortunately. Work on OpenGL pretty much stopped in 2017.
m-schuetz 2 days ago [-]
I am sadly aware, but I won't switch until the complexity is fixed. Although I did kind of switch, but to CUDA because the overengineered complexity of Vulkan drove me away. I'm neither smart nor patient enough for that. What should be a malloc is a PhD thesis in Vulkan, what should be a memcpy is another thesis, and what should be a simple kernel launch is insanity.
exDM69 2 days ago [-]
> I am sadly aware, but I won't switch until the complexity is fixed
It pretty much is by now if you can use Vulkan 1.4 (or even 1.3). It's a pretty lean and mean API once you've got it bootstrapped.
There's still a lot of setup code to get off the ground (device enumeration, extensions and features, swapchain setup, pipeline layouts), but beyond that Vulkan is much nicer to work with than OpenGL. Just gotta get past the initial hurdle.
jsheard 2 days ago [-]
It's steadily getting better as they keep walking back aspects which turned out to be needlessly complex, or only needed to be complex for the sake of older hardware that hardly anyone cares about anymore, but yeah there's still a way to go. Those simpler ways of doing things are just grafted onto the side of the existing API surface so just knowing which parts you're supposed to use is a battle in itself. Hopefully they'll eventually do a clean-slate Vulkan 2.0 to tidy up the cruft, but I'm not getting my hopes up.
m-schuetz 2 days ago [-]
Might be getting better but just yesterday I dabbled in Vulkan again, digging through the examples from https://github.com/SaschaWillems/Vulkan, and the complexity is pure insanity. What should be a simple malloc ends up being 40 lines of code, what should be a simple memcpy is another 30 lines of code, and what should be a single-line kernel launch is another 50 lines of bindings, layouts, pipelines, etc.
flohofwoe 2 days ago [-]
Tbf, a lot of the complexity (also in the official Khronos samples) is caused by insane C++ abstraction layers and 'helper frameworks' on top of the actual Vulkan C API.
Just directly talking to the C API in the tutorials/examples instead of custom wrapper code would be a lot more helpful, since you don't need to sift through the custom abstraction layers (even if it would be slightly more code).
E.g. have a look at the code snippets in here and weep in despair ;)
Why should these things be simple? Graphics hardware varies greatly even across generations from the same vendors. Vulkan as an API is trying to offer the most functionality to as much of this hardware as possible. That means you have a lot of dials to tweak.
Trying to obfuscate all the options goes against what Vulkan was created for. Use OpenGL 4.6/WebGPU if you want simplicity.
flohofwoe 2 days ago [-]
A simple vkCreateSystemDefaultDevice() function like on Metal, instead of requiring hundreds of lines of boilerplate, would go a long way to make Vulkan more ergonomic, without having to give up a more verbose fallback path for the handful of Vulkan applications that need to pick a very specific device (and then probably pick the wrong one on exotic hardware configs).
And the rest of the API is full of similar examples of wasting developer time for the common code path.
Metal is a great example of providing both: a convenient 'beaten path' for 90% of use cases but still offering more verbose fallbacks when flexibility is needed.
Arguably, the original idea to provide a low-level explicit API also didn't quite work. Since GPU architectures are still vastly different (especially across desktop and mobile GPUs), a slightly more abstract API would be able to provide more wiggle room for drivers to implement an API feature more efficiently under the hood, and without requiring users to write different code paths for each GPU vendor.
jplusequalt 2 days ago [-]
Metal has the benefit of being developed by Apple for Apple devices. I'd imagine that constraint allows them to simplify code paths in a way Vulkan can't/won't. Again, Metal doesn't have to deal with supporting dozens of different hardware systems like Vulkan does.
flohofwoe 2 days ago [-]
Metal also works for external GPUs like NVIDIA or AMD though (not sure how much effort Apple still puts into those use cases, but Metal itself is flexible enough to deal with non-Apple GPUs).
m-schuetz 2 days ago [-]
CUDA can be complex if you want, but it offers more powerful functionality as an option that you can choose, rather than mandating maximum complexity right from the start. This is where Vulkan absolutely fails. It makes everything maximum effort, rather than making the common things easy.
jplusequalt 2 days ago [-]
I think CUDA and Vulkan are two completely different beasts, so I don't believe this is a good comparison. One is for GPGPU, and the other is a graphics API with compute shaders.
Also, CUDA is targeting a single vendor, whereas Vulkan is targeting as many platforms as possible.
m-schuetz 2 days ago [-]
The point still stands: Vulkan chose to go all-in on mandatory maximum complexity, instead of providing less-complex routes for the common cases. Several extensions in recent years have reduced that burden because it was recognized that this is an actual issue, and it demonstrated that less complexity would have been possible right from the start. Still a long way to go, though.
pjmlp 2 days ago [-]
Yes; a recent example is the board released by Qualcomm after acquiring Arduino.
Between OpenGL ES 3.1 and Vulkan 1.1, I would certainly go with OpenGL ES.
Narishma 2 days ago [-]
Oh I didn't know the new Arduino board had a GPU. Do we know what kind?
I don't doubt OpenGL will live forever. But Vulkan 1.3/1.4 is not as bad as people make it out to be.
m-schuetz 2 days ago [-]
So I've been told, so I'm trying to take another look at it. At least the examples at https://github.com/SaschaWillems/Vulkan, which are probably not 1.3/1.4 yet except for the trianglevulkan13 example, are pure insanity. Coming from CUDA, I can't fathom why what should be simple things like malloc, memcpy and kernel launches end up needing 300x as many lines.
jplusequalt 2 days ago [-]
In part, because Vulkan is a graphics API, not a GPGPU framework like CUDA. They're entirely different beasts.
Vulkan is also trying to expose as many options as possible so as to be extensible on as many platforms as possible. Also, Vulkan isn't even trying to make it more complex than it needs to be--this is just how complex graphics programming is, period. The only reason people think Vulkan/DX12 are overly complicated is that they're used to APIs where the majority of the heavy lifting comes from the drivers.
barchar 2 days ago [-]
No, it is overly complex for modern hardware (unless you use shader objects). Vulkan forces you to statically specify a ton of state that's actually dynamic on modern GPUs. You could cut things down a ton with a new API. Ofc you'd have to require a certain level of hardware support, but imo that will become natural going forward.
Actually, it would be kinda neat to see an API that's fully designed assuming a coherent, cached, shared memory space between device and host. Metal I guess is closest.
bsder 2 days ago [-]
> Vulkan forces you to statically specify a ton of state that's actually dynamic on modern GPUs.
Desktop GPUs. Tiling GPUs are still in use on mobile and you can't use the tiling hardware effectively without baking the description into pipelines.
> You could cut things down a ton with a new API.
VK_KHR_dynamic_rendering is what you are looking for
> Actually, it would be kinda neat to see an API that's fully designed assuming a coherent, cached, shared memory space between device and host.
You can just ask for exactly that--even on Vulkan. If you don't want to support computer systems that don't support RBAR, you can do that.
jplusequalt 7 hours ago [-]
>Ofc you'd have to require a certain level of hardware support
Have you used Vulkan? Specifying required hardware support for your physical device is literally one of the first things you do when setting up Vulkan.
flohofwoe 2 days ago [-]
> In part, because Vulkan is a graphics API, not a GPGPU framework like CUDA. They're entirely different beasts.
Tbf, the distinction between rendering and compute has been disappearing for quite a while now, apart from texture sampling there isn't much reason to have hardware that's dedicated for rendering tasks on GPUs, and when there's hardly any dedicated rendering hardware on GPUs, why still have dedicated rendering APIs?
barchar 2 days ago [-]
And, mesh shading in particular is basically "what if we just deleted all that vertex specification crap and made you write a compute shader"
Note that it's not always better. The task shaders are quite hardware specific and it makes sense to ship defaults inside the driver.
pjmlp 2 days ago [-]
Yes, I predict eventually we will be back at software rendering, with the difference that now it will be hardware accelerated due to running on compute hardware.
jplusequalt 2 days ago [-]
This is not a statement on the hardware, it's a statement on what the APIs are trying to achieve. In this regard, they are remarkably different.
flohofwoe 2 days ago [-]
The point is that a (self-declared) low-level API like Vulkan should just be a thin interface to GPU hardware features. For instance, the entire machinery to define a vertex layout in the PSO is pretty much obsolete today; vertex pulling is much more flexible and requires less API surface, and this is just one example of the "disappearing 3D API".
More traditional rendering APIs can then be built on top of such a "compute-first API", but that shouldn't be the job of Khronos.
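To make "vertex pulling" concrete (a minimal sketch; the buffer layout and names are illustrative): the shader fetches its own vertex data by gl_VertexID from an SSBO, so no vertex-attribute state needs to be declared at all.

    #version 430
    layout(std430, binding = 0) readonly buffer Positions {
        float positions[];   // tightly packed x,y,z triples
    };

    uniform mat4 u_viewProj;

    void main() {
        // No vertex attributes at all: pull the data by gl_VertexID
        vec3 p = vec3(positions[3 * gl_VertexID + 0],
                      positions[3 * gl_VertexID + 1],
                      positions[3 * gl_VertexID + 2]);
        gl_Position = u_viewProj * vec4(p, 1.0);
    }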
pjmlp 2 days ago [-]
Except that you also need to have it available on target systems, good luck on Android.
jplusequalt 7 hours ago [-]
I'm fairly sure Vulkan runs just fine on Android? You won't have access to dynamic rendering, so you'll have to manage renderpasses, but I don't think you're going to have issues running Vulkan on a modern Android device.
I believe the llama.cpp Vulkan backend is inoperable on Adreno GPUs
fulafel 2 days ago [-]
That's OpenCL, not OpenGL.
Sharlin 2 days ago [-]
It's super frequently recommended as a starting point for learners because it's high level enough to get something on the screen in ten lines of code but low level enough to teach you the fundamentals of how the rendering pipeline works (even though GL's abstraction is rather anachronistic and differs from how modern GPUs actually work). Vulkan (requiring literally a thousand LoC worth of initialization to render a single triangle) is emphatically not any sort of replacement for that use case (and honestly not for 95% of hobbyist/indie use cases either unless you use a high-level abstraction on top of it).
The worst thing about OpenGL is probably the hilariously non-typesafe C API.
cubefox 2 days ago [-]
I believe the modern OpenGL replacement would be WebGPU, which is not just made for browsers, and which isn't as low-level as Vulkan or DirectX 12.
kbolino 2 days ago [-]
I don't think any major platform that ever supported OpenGL or OpenGL ES--including desktops, smartphones/tablets, and web browsers--has actually removed it yet. Apple will probably be the first to pull the plug, but they've only aggressively deprecated it so far.
jhasse 2 days ago [-]
How exactly is it aggressive? I'm selling games using OpenGL on iOS, iPadOS, tvOS and macOS; they work with all of their latest hardware. I'm not getting a warning or any sign from them that they will remove support.
kbolino 2 days ago [-]
It was my understanding (which could definitely be wrong) that their OpenGL support is both behind the times--which is impressive since OpenGL has received no major new features AFAIK in the past decade, the topic of this HN post notwithstanding--and won't even get any bugfixes.
charlotte-fyi 2 days ago [-]
The last supported version they ship doesn't support compute, which is a pretty big limitation.
catapart 2 days ago [-]
This sounds pretty cool, but can anyone dumb this down for me? Mesh shaders are good because they are more efficient than the general purpose triangle shaders? Or is this something else entirely?
flohofwoe 2 days ago [-]
It's essentially a replacement for vertex shaders that maps more closely to how GPUs actually process big and complex triangle meshes: as small packets of vertices handled in parallel. The job of splitting a complex triangle mesh into such small packets is done in an offline asset-pipeline step instead of relying too much on 'hardware magic' like vertex caches.
AFAIK mesh shaders also get rid of (the ever troublesome) geometry shaders and hull shaders, but don't quote me on that :)
By far most traditional triangle rendering use cases should only see minimal performance improvements though, it's very much the definition of 'diminishing returns'.
It's definitely more straightforward and 'elegant' though.
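For a feel of the programming model, a bare-bones GL_EXT_mesh_shader mesh shader looks roughly like this (a sketch; the meshlet/SSBO layout is an illustrative assumption): each workgroup reads one precomputed meshlet and emits its vertices and triangles directly, with no fixed-function vertex fetch involved.

    #version 460
    #extension GL_EXT_mesh_shader : require

    layout(local_size_x = 32) in;
    layout(triangles) out;
    layout(max_vertices = 64, max_primitives = 126) out;

    struct Meshlet { uint vertexOffset; uint indexOffset; uint vertexCount; uint triangleCount; };

    layout(std430, binding = 0) readonly buffer Meshlets  { Meshlet meshlets[]; };
    layout(std430, binding = 1) readonly buffer Positions { vec4 positions[]; };
    layout(std430, binding = 2) readonly buffer Indices   { uint indices[]; }; // 3 per triangle, meshlet-local

    uniform mat4 u_viewProj;

    void main() {
        Meshlet m = meshlets[gl_WorkGroupID.x];
        SetMeshOutputsEXT(m.vertexCount, m.triangleCount);

        // The 32 invocations of the workgroup cooperatively write the outputs
        for (uint i = gl_LocalInvocationID.x; i < m.vertexCount; i += 32u) {
            gl_MeshVerticesEXT[i].gl_Position = u_viewProj * positions[m.vertexOffset + i];
        }
        for (uint i = gl_LocalInvocationID.x; i < m.triangleCount; i += 32u) {
            gl_PrimitiveTriangleIndicesEXT[i] = uvec3(indices[m.indexOffset + 3u*i + 0u],
                                                      indices[m.indexOffset + 3u*i + 1u],
                                                      indices[m.indexOffset + 3u*i + 2u]);
        }
    }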
Oh, awesome! Yeah, that's a great introduction. Seems like it introduces a new abstraction that allows a single mesh to be mapped to much smaller groups of vertices so you can take advantage of BVHs and stuff like that on a more granular level, right in the shader code. Very cool stuff! Thanks for the info.
ProofHouse 2 days ago [-]
Fundamentally, for OpenGL, "getting shaders" meant moving from a fixed, built-in set of graphics effects to giving developers custom control over the graphics pipeline.
Imagine you hired a robot artist to draw.
Before Shaders (The Old Way): The robot had a fixed set of instructions. You could only tell it "draw a red circle here" or "draw a blue square there." You could change the colors and basic shapes, but you couldn't change how it drew them. This was called the fixed-function pipeline.
After Shaders (The New Way): You can now give the robot custom, programmable instructions, or shaders. You can write little programs that tell it exactly how to draw things.
The Two Original Shaders
This programmability was primarily split into two types of shaders:
Vertex Shader: This program runs for every single point (vertex) of a 3D model. Its job is to figure out where that point should be positioned on your 2D screen. You could now program custom effects like making a character model jiggle or a flag wave in the wind.
Fragment (or Pixel) Shader: After the shape is positioned, this program runs for every single pixel inside that shape. Its job is to decide the final color of that pixel. This is where you program complex lighting, shadows, reflections, and surface textures like wood grain or rust.
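A minimal example of that pair (a sketch; the uniform and attribute names are illustrative): the vertex shader below even does the "flag wave" kind of trick mentioned above by displacing positions, and the fragment shader just picks a color.

    // Vertex shader: runs once per vertex, decides where it lands on screen
    #version 330 core
    layout(location = 0) in vec3 a_position;
    uniform mat4  u_mvp;    // model-view-projection matrix
    uniform float u_time;
    void main() {
        vec3 p = a_position;
        p.y += 0.1 * sin(u_time + p.x * 4.0);   // simple programmable "wave"
        gl_Position = u_mvp * vec4(p, 1.0);
    }

    // Fragment shader: runs once per covered pixel, decides its final color
    #version 330 core
    uniform vec3 u_tint;
    out vec4 fragColor;
    void main() {
        fragColor = vec4(u_tint, 1.0);
    }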
brcmthrowaway 2 days ago [-]
What about bump mapping, where's that done? That's a texture that changes the geometry.
tiniuclx 1 days ago [-]
That's usually a job for the fragment shader.
jayd16 2 days ago [-]
It doesn't change the geometry, it just changes the lighting to give that appearance.
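For instance, a typical normal/bump-mapped fragment shader only perturbs the shading normal that the lighting uses; the actual geometry and silhouette stay unchanged (a sketch; the tangent-space setup in the vertex shader is omitted and names are illustrative):

    #version 330 core
    in vec2 v_uv;
    in mat3 v_tbn;                   // tangent/bitangent/normal basis from the vertex shader
    uniform sampler2D u_normalMap;
    uniform vec3 u_lightDir;         // world-space light direction
    uniform vec3 u_albedo;
    out vec4 fragColor;

    void main() {
        // Unpack the tangent-space normal stored in the texture ([0,1] -> [-1,1])
        vec3 n = normalize(v_tbn * (texture(u_normalMap, v_uv).xyz * 2.0 - 1.0));
        float diffuse = max(dot(n, normalize(-u_lightDir)), 0.0);
        fragColor = vec4(u_albedo * diffuse, 1.0);   // only the shading changes, not the mesh
    }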
cubefox 2 days ago [-]
As far as I understand, mesh shaders allow you to generate arbitrary geometry on the GPU. That wasn't possible with the traditional vertex pipeline, which only allowed specialized mesh transformations like tesselation.
For example, hair meshes (lots of small strands) are usually generated on the CPU from some basic parameters (basic hairstyle shape, hair color, strand density, curliness, fuzziness etc) and then the generated mesh (which could be quite large) is loaded onto the GPU. But the GPU could do that itself using mesh shaders, saving a lot of memory bandwidth. Here is a paper about this idea: https://www.cemyuksel.com/research/hairmesh_rendering/Real-T...
However, the main application of mesh shaders currently is more restricted: Meshes are chunked into patches (meshlets), which allows for more fine grained occlusion culling of occluded geometry.
Though most of these things, I believe, can already be done with compute shaders, although perhaps not as elegantly, or with some overhead.
SeriousM 2 days ago [-]
It seems to me, as someone not so 3D-savvy, that 3D objects and shaders have a similar connection as HTML structure and CSS.
Nowadays you need a structure of objects, yet the layout, color and behavior comes from CSS.
In this regard, 3D scenes offer the elements, but shaders can design them much more efficiently than an engine ever could.
Is that accurate?
Btw, can objects modified by shaders signal collisions?
flohofwoe 2 days ago [-]
3D scenes (closest thing to the DOM) and materials (closest thing to CSS) are several abstraction layers above what modern 3D APIs provide, this is more 'rendering/game engine' territory.
3D APIs are more on the level of 'draw this list of triangles, and the color of a specific pixel in the triangle is computed like this: (hundreds of lines of pixel shader code)' - but even this is slowly being replaced by even lower-level code which implements completely custom rendering pipelines entirely on the GPU.
adastra22 2 days ago [-]
Shaders are not layout. I don't think there is an HTML/DOM analogy here that works. But if you had to force one, shaders are more like Javascript. It's a terrible analogy though.
Collisions aren't part of a graphics API.
flohofwoe 2 days ago [-]
> Collisions aren't part of a graphics API.
You can do occlusion queries though, which is a form of 2D collision detection similar to what home computer sprite hardware provided ;)
Makes me sad this won't ever get to webgl since 'webgl3' is basically not considered anymore (replaced by webgpu)
Pannoniae 2 days ago [-]
Wow, very nice ;)
OpenGL was a very nice API and even despite its shortcomings, it is quite telling that VK didn't fully replace it 10 years later.
Cross-vendor mesh shader support is great - we had NV_mesh_shader for quite a while but it's great that it's also supported on AMD now. It's good for voxel games like this - the shape of the vertex data is fairly fixed and very compressible, mesh shaders can really cut down on the VRAM usage and help reduce overhead.
Most minecraft optimisation mods generally try to reduce drawcalls by batching chunks (16x16x16) into bigger regions and use more modern OpenGL to reduce API overhead.
This mod does GPU-driven culling for invisible chunk sections (so the hidden chunks aren't rendered but without a roundtrip to the CPU) and also generates the triangles themselves with a mesh shader from the terrain data, which cuts down on the vertex size a lot.
(EDIT: I reworded this section because the mod does only a few drawcalls in total so my wording was inaccurate. Sorry!)
Sadly, optimising the game is a bit tricky due to several reasons - the first big one is translucency sorting, because there are translucent blocks in the game like stained glass, which have to be properly sorted for the blending to work. (the base game doesn't sort correctly either by default....)
The second is that it's quite overengineered, so improving it while also not breaking other mods and accidentally fixing vanilla bugs is quite hard.
There are further improvements possible but honestly, this is day and night compared to the vanilla renderer :)
For us mere mortals (not working at Unity or Unreal), the complexity is just too much. Vulkan tries to abstract desktop and mobile together, but if you're making an indie game, there's no value for you in that. The GL/GLES split was better because each could evolve to its strengths instead of being chained to a fundamentally different design.
The global state in OpenGL is certainly an annoyance, but I do not think that replacing it with fixed pipelines is an improvement, especially considering that most of that state is just a register write in desktop GPUs. Luckily, they eased up on that, but the API is still confusing, the defaults are not sane, and you need vendor-specific advice to know what's usable and what isn't. Ironically, writing Vulkan makes you more vendor-dependent in a sense, because you don't have OpenGL extension hell - you have Vulkan extension hell AND a bunch of incidental complexity around the used formats and layouts and whatnot.
On a more positive note, I seriously hope that OpenGL won't be entirely abandoned in the future, it has been a great API so far and it only really has small issues and driver problems but nothing really unfixable.
flohofwoe 2 days ago [-]
> OpenGL was a very nice API
I think this is an extremely subjective take :) If you haven't been closely following OpenGL development since the late 1990s it is a very confusing API, since it simply stacks new concepts on top of old concepts all the way back to GL 2.0. E.g. if anything good can be said about Vulkan it's that at least it isn't such a hot mess of an API (yet) like OpenGL has become in the last 25 years ;)
Just look at glVertexAttribPointer()... it's an absolute mess of hidden footguns. A call to glVertexAttribPointer() 'captures' the current global vertex buffer binding for that attribute (very common source of bugs when working with vertex-input from different buffers), and the 'pointer' argument isn't a pointer at all, but a byte-offset into a vertex buffer. The entire API is full of such weird "sediment layers", and yes there are more recent vertex specification functions which are cleaner, but the old functions are still part of the new GL versions and just contribute to the confusion for new people trying to understand the API.
Pannoniae 2 days ago [-]
> I think this is an extremely subjective take.
Okay fair but that's all takes on this site :)
Yes, vertexAttribPointer is a footgun (in my project I wrote an analyser to generate a compiler error when you write it down...) but luckily in modern OpenGL it doesn't matter because you have separated vertex format. The names are confusing because it's legacy shit but the functionality is there. It's very much not as clean as other APIs but it gets the job done.
If you stick to the modern functions (glBindVertexBuffer / glVertexAttribFormat / glVertexAttribBinding) and do one VAO per vertex format, it's quite nice. And just forbid using the old ones. ;)
More broadly, I admit it's a subjective thing but I find these issues much smaller than like, broader conceptual issues. You mix the function names up a few times then you learn not to do it. But when an API is just fundamentally unergonomic and inflexible, you can't really get past that. Maybe you get used to it after a while but the pain will always be there....
phkahler 2 days ago [-]
Why GL_EXT_mesh_shader and not GL_mesh_shader?
And why for ES? I thought ES was for less advanced hardware.
OnionBlender 2 days ago [-]
EXT is a Khronos naming convention. It means it is a generic extension, intended to be implemented across multiple vendors' implementations.
I just wish there was more literature about this, especially about perf implications. Also synchronization is very painful, which may be why this is hard to do on a driver level inside OpenGL
But I haven’t written OpenGL since Metal debuted over a decade ago.
Regarding bindless textures, they're really ugly to use. Shader buffer load is so much better, being able to access everything with simple pointers.
I wanted to say that with some compiler hacking it should be possible to lower SPIR-V using GL_EXT_buffer_reference into bindless image loads, but SPIR-V doesn't have standardized bindless texture, duh!
If you are using Slang, then you just access everything as standard pointers to chunks of GPU memory.
And it's mostly Intel and mobile dragging their feet on VK_EXT_descriptor_buffer ...
If you want that extension you're going to have better luck convincing the Zink folks:
https://docs.mesa3d.org/drivers/zink.html "The Zink driver is a Gallium driver that emits Vulkan API calls instead of targeting a specific GPU architecture. This can be used to get full desktop OpenGL support on devices that only support Vulkan."
However, you're still probably going to have to come off of GLSL and use Slang or HLSL. The features you want are simply not going to get developed in the GLSL compiler at this point.
They exist in GLSL on Nvidia devices. If other vendors refuse to implement them, then I will be an Nvidia-only developer. Fine by me. I no longer care about other vendors if they completely ignore massive quality of life features.
Also I wish that GL_LINES was open-sourced for other platforms. Maybe it is in the OpenGL spec and I just haven't looked. I've attempted some other techniques like having the fragment shader draw a border around each triangle, but they all have their drawbacks.
This works for most other regular shapes too, like a relatively tight bounding box for circles if you're drawing a bunch of them.
Draw in screen space based on projected points in world space.
set gl_Color to your desired color vec and bam, line.
As far as other platforms, there's VK_EXT_line_rasterization which is a port of opengl line drawing functionality to vulkan.
https://github.com/mattdesl/webgl-lines
https://hundredrabbits.itch.io/verreciel
PS— I still play Retro, and dream of resuscitating it :)
nvidium is using GL_NV_mesh_shader which is only available for nVIDIA cards. This mod is the only game/mod I know of that uses mesh shaders & is OpenGL. & so the new gl extension will let users of other vendors use the mod if it gets updated to use the new extension.
What is the current state of OpenGL, I thought it had faded away?
Can you provide a reference for this? I work in the GPU driver space (not on either of these apis), but from my understanding Vulkan wasn't meant to replace OpenGL, it was only introduced to give developers the chance at getting lower level in the hardware (still agnostic from the hardware, at least compared to compiling PTX/CUDA or against AMD's PAL directly, many still think they failed.) I would still highly advocate for developers using OpenGL or dx11 if their game/software doesn't need the capabilities of Vulkan or dx12. And even if you did, you might be able to get away with interop and do small parts with the lower api and leave everything else in the higher api.
I will admit I don't like the trend of all the fancy new features only getting introduced into Vulkan and dx12, but I'm not sure how to change that trend.
Those are the main reasons IMO why most people say it's deprecated.
Closer to the hardware, more control, fewer workarounds because the driver is doing something "clever" hidden behind the scenes. The tradeoff is greater complexity.
Mere mortals are supposed to use a game engine, or a scene graph library (e.g. VulkanSceneGraph), or stick with OpenGL for now.
The long-term future for OpenGL is to be implemented on top of Vulkan (specifically the Mesa Zink driver that the blog post author is the main developer of).
To what hardware? Ancient desktop GPUs vs modern desktop GPUs? Ancient smartphones? Modern smartphones? Consoles? Vulkan is an abstraction of a huge set of diverging hardware architectures.
And a pretty bad one, on my opinion. If you need to make an abstraction due to fundamentally different hardware, then at least make an abstraction that isn't terribly overengineered for little to no gain.
Hence why on Android, even with Google nowadays enforcing Vulkan, if you want to deal with a less painful experience in driver bugs, better stick with OpenGL ES, outside Pixel and Samsung phones.
I fully expect it to be split into Vulkan ES sooner or later.
I would be very surprised if current Vulkan drivers are any different in this regard, and if yes then probably only because Vulkan isn't as popular as D3D for PC games.
Vulkan is in a weird place that it promised a low-level explicit API close to the hardware, but then still doesn't really match any concrete GPU architecture and it still needs to abstract over very different GPU architectures.
At the very least there should have been different APIs for desktop and mobile GPUs (not that the GL vs GLES split was great, but at least that way the requirements for mobile GPUs don't hold back the desktop API).
And then there's the issue that also ruined OpenGL: the vendor extension mess.
https://docs.mesa3d.org/drivers/zink.html
The last OpenGL release 4.6 was in 2017... I think that speaks for itself ;)
And at least on macOS, OpenGL is officially deprecated, stuck at 4.1 and is also quickly rotting (despite running on top of Metal now - but I don't think anybody at Apple is doing serious maintenance work on their OpenGL implementation).
Look at Mesa release notes for example, there's a steady stream of driver feature work and bugfixes for GL: https://docs.mesa3d.org/relnotes/25.2.0.html (search for "gl_")
A slow moving graphics API is a good thing for many uses.
People are writing new OpenGL code all the time. See eg HN story sumbmissions: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
It's not slow moving. It's completely frozen. The Mesa guys are the only ones actually fixing bugs and improving implementations, but the spec is completely frozen and unmaintained. Apple, Microsoft and Google don't really care if OpenGL works well on their platforms.
but, literally this article is about something new that was added to the OpenGL spec
I suppose the same is true of Direct3D 11, though. Only the Direct3D 12 spec has been updated in years from what I can tell. (I'm not a graphics programmer.)
In other words nothing changed. The OpengGL standard had been well past 4.1 for years when Apple released Metal. People working with various 3D tools had to disable system integrity checks to install working drivers from NVIDIA to replace whatever Apple shipped by default.
As a sidenote, I've very much enjoyed your blog, and developed a similar handle system as yours around the same time. Mine uses 32 bits though - 15 for index, 1 for misc stuff, 8 for random key, and 8 for object type :^)
From macOS 10.7 to 10.9, you'd get an OpenGL 3.2 context. As OpenGL 4.1 is backward compatible to OpenGL 3.2, it's fine that the same code gets OpenGL 4.1 now.
Basically, macOS will provide an "old" API to programs that need it, which is fixed at 2.1, and a "modern" API to programs that know how to ask for it, which has settled at 4.1 and is unlikely to change.
OpenGL 4.1 is harmonised with OpenGL ES 2.0. Almost the same rendering model, features, extensions, etc. On iOS, iPadOS etc you can use OpenGL ES 2.0, and no version of OpenGL (non-ES), so my guess is that's why macOS settled on OpenGL 4.1. Both platforms offer the same OpenGL rendering features, but through slightly different APIs.
But if you request 4.1 over GLX (which uses X11/Xorg/XQuartz), the X11 code only supports OpenGL 2.1. For example, if you're porting some Linux code or other GLX examples over.
Unfortunately, the GLX limitation is probably just due to the Xorg-based XQuartz being open source but only minimally maintained since before OpenGL 3.2 was added to macOS. XQuartz uses Xorg and Mesa, which have all the bindings for 4.1, but some of them are not quite wired up.
It's the only way to ship portable 3D software across the desktop and mobile platforms without platform specific code paths, thanks to the API fragmentation and proprietary platform antics from our beloved vendors.
In some years WebGPU may mature and start gaining parity (webgl took a looooong time to mature), and after that it'll still take more years for applications to switch given older hardware, the software inertia needed to port all the middleware over etc.
It's not true anyway. WebGPU also does away with the global-state driver model, which has always been a productivity headache and source of bugs in OpenGL, and it gives better abstractions with pipelines and command buffers.
Also, compute could have easily been added to WebGL, making WebGL pretty much on-par with WebGPU, just 7 years earlier. It didn't happen because WebGPU was supposed to be a better replacement, which it never became. It just became something different-but-not-better.
If CUDA forced you to do even half of the completely unnecessary stuff that Vulkan forces you to do, CUDA would never have become as popular as it is.
They are the only set of 3D APIs adopted in mainstream computing that were designed for managed languages, instead of yet another thing to be consumed from C.
Technically Metal is also used by a managed language, but it was designed for Objective-C/C++ first, with Swift as official binding.
Microsoft gave up on Managed DirectX and XNA, and even with all the safety talk, the DirectX team doesn't care to provide official COM bindings for C#.
That leaves WebGL and WebGPU for managed-language fans, and even if they lag behind, as PlayCanvas and ShaderToy show, there are plenty of capabilities in the shading languages that haven't really taken off yet.
Metal allows disabling refcounted lifetime management when recording commands, since it actually adds significant overhead; D3D12 and Vulkan removed it entirely.
Unfortunately WebGPU potentially produces even more garbage than WebGL2, and we have yet to see how that turns out. Some drawcall-heavy code actually runs faster on WebGL2 than WebGPU, which really doesn't look great for a modern 3D API (not mainly because of GC, but every little bit of overhead counts).
So we end up in a vicious cycle that we cannot get rid of.
Metal and the Web 3D APIs were designed with other consumer languages in mind; you also see this in how ANARI is being designed.
Yes, every little bit of performance counts, but APIs cannot keep being designed as if everyone were still coding in Assembly, leaving it up to whoever cares to build proper high-level abstractions on top. That is how we ended up with Vulkan.
Why not though? In the end an API call is an API call, and everything is compiled down to machine code no matter what the source language is.
FWIW, the high-level "OOP-isms" of the Metal API are also its biggest downside. Even simple create-option "structs" like MTLRenderPassDescriptor are fully lifetime-managed Objective-C objects where every field access is a method call - that's simply unnecessary overkill.
And ironically, the most binding-friendly API for high-level languages might still be OpenGL, since it doesn't have any structs or 'objects with methods', only plain old function calls with primitive-type parameters. The only use of pointers is for unstructured 'bulk data' like vertex-buffer or texture content, so it maps very well even to entirely un-C-like languages. And the changes that WebGL made to the GL API (for instance adding 'proper' JS objects for textures and buffers) are arguably a step back compared to native GL, where those resource objects are just opaque handles.
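To illustrate that last point, here's roughly what the "opaque handles plus plain function calls" style looks like in C (just a sketch; `pixels` is a hypothetical pointer to image data):

    GLuint tex = 0;
    glGenTextures(1, &tex);                  // the handle is just an integer
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 256, 256, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels); // the only pointer: bulk pixel data

Every call is a free function taking integers and enums, which is trivial to expose through almost any language's FFI.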
The ANARI effort was born exactly because of the visualisation industry's refusal to adopt Vulkan as-is.
If anything it looks more like an admission by Khronos that Vulkan wasn't such a great idea (but a 3D API that's based on scene graphs isn't either, so I'm not sure what's so great about ANARI tbh).
https://github.com/KhronosGroup/ANARI-SDK/tree/next_release/...
It pretty much is by now if you can use Vulkan 1.4 (or even 1.3). It's a pretty lean and mean API once you've got it bootstrapped.
There's still a lot of setup code to get off the ground (device enumeration, extensions and features, swapchain setup, pipeline layouts), but beyond that Vulkan is much nicer to work with than OpenGL. Just gotta get past the initial hurdle.
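For a sense of scale, here's a hedged sketch of just the physical-device-enumeration part in the raw C API (assumes `instance` was created earlier and <vulkan/vulkan.h>/<stdio.h> are included):

    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, NULL);    // first call: query the count

    VkPhysicalDevice devices[16];
    if (count > 16) count = 16;
    vkEnumeratePhysicalDevices(instance, &count, devices); // second call: fill the array

    for (uint32_t i = 0; i < count; i++) {
        VkPhysicalDeviceProperties props;
        vkGetPhysicalDeviceProperties(devices[i], &props);
        printf("GPU %u: %s\n", i, props.deviceName);
    }

And that's before queue families, logical device creation, the swapchain, and so on.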
Just directly talking to the C API in the tutorials/examples instead of custom wrapper code would be a lot more helpful, since you don't need to sift through the custom abstraction layers (even if it would be slightly more code).
E.g. have a look at the code snippets in here and weep in despair ;)
https://docs.vulkan.org/tutorial/latest/03_Drawing_a_triangl...
Trying to obfuscate all the options goes against what Vulkan was created for. Use OpenGL 4.6/WebGPU if you want simplicity.
And the rest of the API is full of similar examples of wasting developer time for the common code path.
Metal is a great example of providing both: a convenient 'beaten path' for 90% of use cases but still offering more verbose fallbacks when flexibility is needed.
Arguably, the original idea to provide a low-level explicit API also didn't quite work. Since GPU architectures are still vastly different (especially across desktop and mobile GPUs), a slightly more abstract API would be able to provide more wiggle room for drivers to implement an API feature more efficiently under the hood, and without requiring users to write different code paths for each GPU vendor.
Also, CUDA is targeting a single vendor, whereas Vulkan is targeting as many platforms as possible.
Between OpenGL ES 3.1 and Vulkan 1.1, I would certainly go with OpenGL ES.
https://www.qualcomm.com/products/internet-of-things/robotic...
Vulkan is also trying to expose as many options as possible so as to be extensible on as many platforms as possible. Also, Vulkan isn't even trying to make things more complex than they need to be--this is just how complex graphics programming is, period. The only reason people think Vulkan/DX12 are overly complicated is that they're used to APIs where the majority of the heavy lifting comes from the drivers.
Actually, it would be kinda neat to see an API that's fully designed assuming a coherent, cached, shared memory space between device and host. Metal I guess is closest.
Desktop GPUs. Tiling GPUs are still in use on mobile and you can't use the tiling hardware effectively without baking the description into pipelines.
> You could cut things down a ton with a new API.
VK_KHR_dynamic_rendering is what you are looking for
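Roughly, with dynamic rendering (core since Vulkan 1.3) you skip VkRenderPass/VkFramebuffer entirely and describe the attachments at record time. A hedged sketch, assuming `cmd` and `colorView` exist and image layout transitions happen elsewhere:

    VkRenderingAttachmentInfo colorAttachment = {
        .sType       = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO,
        .imageView   = colorView,
        .imageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
        .loadOp      = VK_ATTACHMENT_LOAD_OP_CLEAR,
        .storeOp     = VK_ATTACHMENT_STORE_OP_STORE,
        .clearValue  = { .color = { .float32 = { 0.0f, 0.0f, 0.0f, 1.0f } } },
    };

    VkRenderingInfo renderingInfo = {
        .sType                = VK_STRUCTURE_TYPE_RENDERING_INFO,
        .renderArea           = { .offset = { 0, 0 }, .extent = { 1280, 720 } },
        .layerCount           = 1,
        .colorAttachmentCount = 1,
        .pColorAttachments    = &colorAttachment,
    };

    vkCmdBeginRendering(cmd, &renderingInfo);
    // bind pipeline, draw...
    vkCmdEndRendering(cmd);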
> Actually, it would be kinda neat to see an API that's fully designed assuming a coherent, cached, shared memory space between device and host.
You can just ask for exactly that--even on Vulkan. If you don't want to support computer systems that don't support RBAR, you can do that.
Have you used Vulkan? Specifying required hardware support for your physical device is literally one of the first thing you do when setting up Vulkan.
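For the shared-memory case specifically, a minimal sketch of that check (assumes `physicalDevice` has already been picked) is to look for a memory type that is both device-local and host-visible/coherent, i.e. VRAM the CPU can map directly (ReBAR or unified memory):

    VkPhysicalDeviceMemoryProperties memProps;
    vkGetPhysicalDeviceMemoryProperties(physicalDevice, &memProps);

    const VkMemoryPropertyFlags wanted =
        VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
        VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
        VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;

    int mappableVramType = -1;
    for (uint32_t i = 0; i < memProps.memoryTypeCount; i++) {
        if ((memProps.memoryTypes[i].propertyFlags & wanted) == wanted) {
            mappableVramType = (int)i;
            break;
        }
    }
    if (mappableVramType < 0) {
        // no coherent, host-mappable VRAM: refuse to run, or fall back to staging copies
    }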
Tbf, the distinction between rendering and compute has been disappearing for quite a while now, apart from texture sampling there isn't much reason to have hardware that's dedicated for rendering tasks on GPUs, and when there's hardly any dedicated rendering hardware on GPUs, why still have dedicated rendering APIs?
Note that it's not always better. The task shaders are quite hardware specific and it makes sense to ship defaults inside the driver.
More traditional rendering APIs can then be built on top of such a "compute-first" API, but that shouldn't be the job of Khronos.
I believe the llama.cpp Vulkan backend is inoperable on Adreno GPUs
The worst thing about OpenGL is probably the hilariously non-typesafe C API.
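A quick sketch of what that non-typesafety looks like in practice - every object is just a GLuint, so nothing stops you from handing the wrong kind of handle to a function:

    GLuint texture = 0, buffer = 0;
    glGenTextures(1, &texture);
    glGenBuffers(1, &buffer);

    glBindTexture(GL_TEXTURE_2D, buffer);    // compiles fine, silently wrong
    glBindBuffer(GL_ARRAY_BUFFER, texture);  // same here - at best a GL error at runtime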
AFAIK mesh shaders also get rid of (the ever troublesome) geometry shaders and hull shaders, but don't quote me on that :)
By far most traditional triangle rendering use cases should only see minimal performance improvements though, it's very much the definition of 'diminishing returns'.
It's definitely more straightforward and 'elegant' though.
PS: this is a pretty good introduction I think https://gpuopen.com/learn/mesh_shaders/mesh_shaders-from_ver...
Imagine you hired a robot artist to draw.
Before Shaders (The Old Way): The robot had a fixed set of instructions. You could only tell it "draw a red circle here" or "draw a blue square there." You could change the colors and basic shapes, but you couldn't change how it drew them. This was called the fixed-function pipeline.
After Shaders (The New Way): You can now give the robot custom, programmable instructions, or shaders. You can write little programs that tell it exactly how to draw things.
The Two Original Shaders: This programmability was primarily split into two types of shaders:
Vertex Shader: This program runs for every single point (vertex) of a 3D model. Its job is to figure out where that point should be positioned on your 2D screen. You could now program custom effects like making a character model jiggle or a flag wave in the wind.
Fragment (or Pixel) Shader: After the shape is positioned, this program runs for every single pixel inside that shape. Its job is to decide the final color of that pixel. This is where you program complex lighting, shadows, reflections, and surface textures like wood grain or rust.
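If you're curious what those two little programs actually look like, here's a bare-bones sketch in C with the GLSL embedded as strings (assumes an OpenGL 3.3+ context is already current; error checking omitted):

    const char* vsSource =
        "#version 330 core\n"
        "layout(location = 0) in vec3 position;\n"
        "void main() {\n"
        "    gl_Position = vec4(position, 1.0);\n"   // place the vertex on screen
        "}\n";

    const char* fsSource =
        "#version 330 core\n"
        "out vec4 color;\n"
        "void main() {\n"
        "    color = vec4(1.0, 0.5, 0.2, 1.0);\n"    // decide the pixel's color
        "}\n";

    GLuint vs = glCreateShader(GL_VERTEX_SHADER);
    glShaderSource(vs, 1, &vsSource, NULL);
    glCompileShader(vs);

    GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(fs, 1, &fsSource, NULL);
    glCompileShader(fs);

    GLuint program = glCreateProgram();
    glAttachShader(program, vs);
    glAttachShader(program, fs);
    glLinkProgram(program);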
For example, hair meshes (lots of small strands) are usually generated on the CPU from some basic parameters (basic hairstyle shape, hair color, strand density, curliness, fuzziness etc) and then the generated mesh (which could be quite large) is loaded onto the GPU. But the GPU could do that itself using mesh shaders, saving a lot of memory bandwidth. Here is a paper about this idea: https://www.cemyuksel.com/research/hairmesh_rendering/Real-T...
However, the main application of mesh shaders currently is more restricted: meshes are chunked into patches (meshlets), which allows for more fine-grained culling of occluded geometry.
Though most of these things, I believe, can already be done with compute shaders, although perhaps not as elegantly, or with some overhead.
In this regard, 3D scenes provide the elements, but shaders can shape them much more efficiently than an engine ever could.
Is that accurate?
Btw, can objects modified by shaders signal collisions?
3D APIs are more on the level of "draw this list of triangles, and the color of a specific pixel in the triangle is computed like this: (hundreds of lines of pixel shader code)" - but even this is slowly being replaced by even lower-level code which implements completely custom rendering pipelines entirely on the GPU.
Collisions aren't part of a graphics API.
You can do occlusion queries though, which is a form of 2D collision detection similar to what home computer sprite hardware provided ;)
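For the curious, a rough sketch of such a query in plain GL (drawBoundingBox() is a hypothetical helper that draws cheap proxy geometry):

    GLuint query;
    glGenQueries(1, &query);

    glBeginQuery(GL_SAMPLES_PASSED, query);
    drawBoundingBox();                        // draw the object's proxy geometry
    glEndQuery(GL_SAMPLES_PASSED);

    GLuint samples = 0;
    glGetQueryObjectuiv(query, GL_QUERY_RESULT, &samples); // note: blocks until the GPU result is ready
    if (samples == 0) {
        // the proxy was completely hidden - skip drawing the real object
    }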
- https://alteredqualia.com/css-shaders/article/
- https://developer.mozilla.org/en-US/docs/Web/API/Houdini_API...
OpenGL was a very nice API and even despite its shortcomings, it is quite telling that VK didn't fully replace it 10 years later.
Cross-vendor mesh shader support is great - we had NV_mesh_shader for quite a while but it's great that it's also supported on AMD now. It's good for voxel games like this - the shape of the vertex data is fairly fixed and very compressible, mesh shaders can really cut down on the VRAM usage and help reduce overhead.
Most minecraft optimisation mods generally try to reduce drawcalls by batching chunks (16x16x16) into bigger regions and use more modern OpenGL to reduce API overhead.
This mod does GPU-driven culling for invisible chunk sections (so the hidden chunks aren't rendered but without a roundtrip to the CPU) and also generates the triangles themselves with a mesh shader from the terrain data, which cuts down on the vertex size a lot. (EDIT: I reworded this section because the mod does only a few drawcalls in total so my wording was inaccurate. Sorry!)
Sadly, optimising the game is a bit tricky due to several reasons - the first big one is translucency sorting, because there are translucent blocks in the game like stained glass, which have to be properly sorted for the blending to work. (the base game doesn't sort correctly either by default....)
The second is that it's quite overengineered, so improving it while also not breaking other mods and accidentally fixing vanilla bugs is quite hard.
There are further improvements possible but honestly, this is day and night compared to the vanilla renderer :)
For us mere mortals (not working at Unity or Unreal), the complexity is just too much. Vulkan tries to abstract desktop and mobile together, but if you're making an indie game, there's no value for you in that. The GL/GLES split was better because each could evolve to its strengths instead of being chained to a fundamentally different design.
The global state in OpenGL is certainly an annoyance, but I do not think that replacing it with fixed pipelines is an improvement, especially considering that most of that state is just a register write in desktop GPUs. Luckily, they eased up on that, but the API is still confusing, the defaults are not sane, and you need vendor-specific advice to know what's usable and what isn't. Ironically, writing Vulkan makes you more vendor-dependent in a sense, because you don't have OpenGL extension hell - you have Vulkan extension hell AND a bunch of incidental complexity around the used formats and layouts and whatnot.
On a more positive note, I seriously hope that OpenGL won't be entirely abandoned in the future, it has been a great API so far and it only really has small issues and driver problems but nothing really unfixable.
I think this is an extremely subjective take :) If you haven't been closely following OpenGL development since the late 1990s it is a very confusing API, since it simply stacks new concepts on top of old concepts all the way back to GL 2.0. E.g. if anything good can be said about Vulkan it's that at least it isn't such a hot mess of an API (yet) like OpenGL has become in the last 25 years ;)
Just look at glVertexAttribPointer()... it's an absolute mess of hidden footguns. A call to glVertexAttribPointer() 'captures' the current global vertex buffer binding for that attribute (very common source of bugs when working with vertex-input from different buffers), and the 'pointer' argument isn't a pointer at all, but a byte-offset into a vertex buffer. The entire API is full of such weird "sediment layers", and yes there are more recent vertex specification functions which are cleaner, but the old functions are still part of the new GL versions and just contribute to the confusion for new people trying to understand the API.
Okay fair but that's all takes on this site :)
Yes, vertexAttribPointer is a footgun (in my project I wrote an analyser to generate a compiler error when you write it down...) but luckily in modern OpenGL it doesn't matter because you have separated vertex format. The names are confusing because it's legacy shit but the functionality is there. It's very much not as clean as other APIs but it gets the job done.
If you stick to the modern versions (so bindVertexBuffer / vertexAttribFormat / VertexAttribBinding) and do one VAO per vertex format, it's quite nice. And just forbid using the old ones. ;)
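Roughly like this (a sketch of the separated vertex format, GL 4.3+ / ARB_vertex_attrib_binding, assuming `vbo` holds interleaved position vec3 + uv vec2 data):

    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);

    // 1. Describe the vertex format once, independent of any buffer.
    glEnableVertexAttribArray(0);
    glVertexAttribFormat(0, 3, GL_FLOAT, GL_FALSE, 0);                  // position
    glVertexAttribBinding(0, 0);

    glEnableVertexAttribArray(1);
    glVertexAttribFormat(1, 2, GL_FLOAT, GL_FALSE, 3 * sizeof(float));  // uv
    glVertexAttribBinding(1, 0);

    // 2. Attach whatever buffer you like to binding point 0 - no format state involved.
    glBindVertexBuffer(0, vbo, 0, 5 * sizeof(float));

No hidden capture of the current GL_ARRAY_BUFFER binding, and no fake 'pointer' argument.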
More broadly, I admit it's a subjective thing but I find these issues much smaller than like, broader conceptual issues. You mix the function names up a few times then you learn not to do it. But when an API is just fundamentally unergonomic and inflexible, you can't really get past that. Maybe you get used to it after a while but the pain will always be there....
And why for ES? I thought ES was for less advanced hardware.
https://wikis.khronos.org/opengl/OpenGL_Extension#Extension_...