So even though OpenGL has been expanded dramatically since the days of the original wglCreateContext, this coupling has been retained.
This causes a nasty problem for multi-target output.
With the benefit of hindsight, I think it’s fair to say that this implicit behavior was always a bad idea; it just never caused much of a problem because most early OpenGL usage involved a single thread targeting a single window on a single monitor. Since that trivial scenario involves only one device context, there was never an issue.
However, now that it is typical for modern applications to be multi-threaded and target multiple windows on multiple monitors, the implicit device context targeting becomes a real issue.
Let’s start with the common case.
The most obvious problem the implicit device context target causes is that it is by definition impossible to draw to more than one window from the same OpenGL rendering context without repeatedly calling wglMakeCurrent() to switch its target. Even if all the application wants to do is draw one full-screen image to one monitor and one full-screen image to another, Windows treats this as two separate windows with two separate device contexts, so at least one wglMakeCurrent() per window is required every frame.
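As a sketch of the forced per-frame pattern (the window handles hdcLeft and hdcRight and the function DrawFullScreenImage() are hypothetical stand-ins for the application's actual windows and drawing code):

```c
#include <windows.h>
#include <GL/gl.h>

void DrawFullScreenImage(void);  // hypothetical application draw routine

// The implicit target forces one wglMakeCurrent() per window, every frame,
// even though the same rendering context does all of the drawing.
void RenderBothMonitors(HGLRC glrc, HDC hdcLeft, HDC hdcRight)
{
    wglMakeCurrent(hdcLeft, glrc);   // retarget the context at window 1
    DrawFullScreenImage();
    SwapBuffers(hdcLeft);

    wglMakeCurrent(hdcRight, glrc);  // retarget the same context at window 2
    DrawFullScreenImage();
    SwapBuffers(hdcRight);
}
```

Note that nothing about the drawing itself changes between the two halves; the wglMakeCurrent() calls exist purely to move the implicit target.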
This is bad for two reasons. First, wglMakeCurrent() cannot be assumed to be a lightweight operation in OpenGL drivers, and in general the driver has no way of knowing a priori that all you’re doing with the wglMakeCurrent() is flip-flopping between two consistently targeted windows. Thus, if a driver wants to support this path efficiently, it must introduce yet more “pattern discovery” into the GPU driver as it tries to determine that the user is not actually targeting a fresh window requiring new setup, but is simply switching between two consistent windows in an attempt to update them both.
Second, wglMakeCurrent() is not an OpenGL call per se; it’s a Windows call. It is therefore not platform-independent and cannot be used by the platform-independent part of a graphics layer. While this may seem like a minor issue, it turns out to be rather unfortunate, since operations that target the window back buffer must now thunk to the platform layer and back to ensure that the magic “target” of the operation has been set up before they can continue issuing what would otherwise have been platform-independent OpenGL code.
Things get worse when more threads get involved.
Things don’t get any better when multi-threaded code is involved. In keeping with the original intention of “one GL context per thread”, it is actually well specified in OpenGL for an application to give each of its threads (say, one per CPU core) its own GL context so that streams of commands can be constructed separately. Assuming these threads work off a job queue, whichever thread dequeues some OpenGL work can simply do it on its own context, and everything is fine.
However, all that falls apart once you need to target an output surface. Because each thread’s OpenGL context is implicitly targeting only one device context, a thread that tries to do OpenGL work that targets a specific device context must call wglMakeCurrent() using the HDC of the window in question. But it is illegal for two OpenGL contexts to target the same HDC at the same time, so now the two competing threads must introduce a mutex around the wglMakeCurrent().
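A minimal sketch of the resulting locking, using a Win32 critical section (the job-body function name is a hypothetical stand-in):

```c
#include <windows.h>
#include <GL/gl.h>

void IssueQueuedGLCommands(void);  // hypothetical: the dequeued OpenGL work

CRITICAL_SECTION gTargetLock;  // guards wglMakeCurrent on the shared window HDC

// Any worker thread whose job targets the window must serialize on the lock,
// because two contexts cannot target the same HDC simultaneously.
void DoWindowTargetedJob(HDC windowDC, HGLRC myThreadContext)
{
    EnterCriticalSection(&gTargetLock);
    wglMakeCurrent(windowDC, myThreadContext);  // claim the window's HDC
    IssueQueuedGLCommands();
    wglMakeCurrent(0, 0);                       // release the HDC for other threads
    LeaveCriticalSection(&gTargetLock);
}
```

The lock must be held for the entire duration of the window-targeted work, so the threads end up serialized on the CPU even though their command construction is logically independent.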
While it may seem like this mutex would be necessary in any case, the truth is that no such mutex is required during construction of the frame’s GPU commands. What is actually required here is a fence on GPU command retirement: the streams of commands built by the threads should retire in a specific order on the GPU, regardless of when they were submitted to the driver.
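For comparison, if explicit targeting were possible, that ordering constraint could be expressed with OpenGL’s own sync objects (GL 3.2 / ARB_sync, which are shared between contexts in the same share group) rather than a CPU-side mutex. A sketch, assuming thread A’s commands must retire before thread B’s:

```c
#include <GL/gl.h>

GLsync gOrderFence;  // created by thread A, consumed by thread B

// Thread A, after submitting its command stream:
void ThreadA_FinishStream(void)
{
    gOrderFence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
    glFlush();  // ensure the fence actually reaches the GPU
}

// Thread B, before submitting commands that must retire after A's:
void ThreadB_BeginStream(void)
{
    // GPU-side wait: thread B's CPU does not block, and both threads
    // were free to construct their commands concurrently.
    glWaitSync(gOrderFence, 0, GL_TIMEOUT_IGNORED);
}
```

The point of the sketch is that the synchronization lives on the GPU timeline, not on the CPU threads, so command construction stays fully parallel.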
There is a very simple API change that would solve these problems.
All of these problems come from one root cause: the fact that OpenGL contexts implicitly target a single Windows device context. But there’s no real reason that has to remain true. We could add a single pair of API calls and fix it.
OpenGL already has a robust set of APIs that deal with the concept of framebuffers. This was a necessary addition to the API because developers wanted to render to textures: rather than always having rendering target the implicit device context, they needed a way to target a texture instead. Thus we already have: