If you push the shim to JavaScript then it isn't direct DOM access. But you could push the shim into the C++ browser code, providing one DOM API to JavaScript and a slightly smaller shimmed version to WASM. Then you cut out the JavaScript middleman and can call it "direct DOM access".
Of course you couldn't provide idiomatic versions for every language, but the JS shims can't really do that either. Providing something close to idiomatic C would be a huge step up; the language libraries could then either offer a C-like API or build new abstractions on top of it.
> But you could push the shim into the C++ browser code
That's easier said than done because of details such as not being able to build a JS string on the C++ side: the translation from string data on the WASM heap into a JS object needs to happen on the JS side.
But this is how all the Emscripten web API shims work, and they do this quite efficiently, and some of those shims also split their work between the JS and C/C++ side.
So to the programmer using Emscripten it looks like there's direct access to (for instance) the WebGL or WebGPU APIs, but in reality there's still quite a bit of JS code involved in each call. That isn't really a problem as long as no expensive marshalling needs to happen in such a call, since the call overhead from WASM into JS alone is really minimal.
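To make the string marshalling step concrete, here's a minimal sketch of what a JS-side shim has to do before it can hand a string to a web API. It's similar in spirit to Emscripten's `UTF8ToString` helper but is not that implementation. The names `readCString`, `setTitleShim` and `fakeDocument` are made up for illustration, and a plain `Uint8Array` stands in for the WASM linear memory.

```typescript
// Decode a NUL-terminated UTF-8 string that lives at `ptr` in the heap.
// In a real module, `heap` would be a Uint8Array view over the
// instance's exported linear memory.
function readCString(heap: Uint8Array, ptr: number): string {
  let end = ptr;
  // Scan forward to the terminating zero byte (or the end of the heap).
  while (end < heap.length && heap[end] !== 0) end++;
  // This is the step that has to happen on the JS side: raw bytes in,
  // JS string object out.
  return new TextDecoder("utf-8").decode(heap.subarray(ptr, end));
}

// Hypothetical shim that the WASM side would import and call with a pointer.
// `sink` stands in for a DOM object such as `document`.
function setTitleShim(heap: Uint8Array, titlePtr: number, sink: { title: string }): void {
  sink.title = readCString(heap, titlePtr);
}

// Simulate the WASM side having written "hello\0" at offset 16.
const heap = new Uint8Array(64);
heap.set(new TextEncoder().encode("hello\0"), 16);

const fakeDocument = { title: "" };
setTitleShim(heap, 16, fakeDocument);
```

The decode is the marshalling cost being discussed: a call that passes only numbers skips it entirely, which is why those calls stay cheap.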