Graphics Pipelines for Young Bloods (jeremyong.com)
136 points by ingve on May 22, 2021 | 17 comments


This is a somewhat incomplete analysis, because you can use light tiling with either forward or deferred rendering, and that's pretty common nowadays. The real advantage of deferred is saving on quad overshading and balancing the load on the GPU. By making lighting homogeneous across the scene you get much better utilization of the GPU's resources, since there's no wasted pixel effort during the expensive lighting loop.

The downside, of course, is transparency and the need to dump your full material model into a G-buffer...
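To make the light-tiling idea concrete, here's a toy sketch of binning point lights into screen-space tiles (hypothetical names, grossly simplified versus a real GPU compute pass, which would test lights against per-tile frustums):

```python
# Bin point lights into screen-space tiles so the per-pixel lighting
# loop only visits lights that can affect that tile. Simplified 2D
# sketch: each light is a projected bounding circle in pixel space.
TILE = 16  # tile size in pixels

def cull_lights(lights, width, height):
    """lights: list of (cx, cy, radius) in pixel coordinates."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for i, (cx, cy, r) in enumerate(lights):
        x0 = max(0, int((cx - r) // TILE))
        x1 = min(tiles_x - 1, int((cx + r) // TILE))
        y0 = max(0, int((cy - r) // TILE))
        y1 = min(tiles_y - 1, int((cy + r) // TILE))
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(i)
    return bins

bins = cull_lights([(8, 8, 4), (100, 100, 20)], 128, 128)
# light 0 only touches tile (0, 0); light 1 spans several tiles
```

The shading pass then loops only over `bins[tile]` per pixel, which is why this works with forward and deferred alike.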


Hey, author here :) I actually deliberately kept the tiled lighting section devoid of any consideration of deferred vs. forward, since, as you say, we can (and do) do tiled light culling with both algorithms. Quad overshading and the use of "helper lanes" are worth mentioning in the article, maybe as a footnote. It certainly is another way in which the GPU's silicon can be underutilized, but I'd hesitate to call it the "main reason" for deferred so much as "one of the reasons". I definitely struggled to strike a balance between summarizing the concerns at a coarse level and providing all the gory details. So to your point: definitely not comprehensive; there's RTR4 for that :). "... for young bloods" is in the title, after all.


Sweet. I made an engine to learn OpenGL 2.0 and vertex shaders way back when, but moved on to other things in life. A couple of weeks ago I was curious what’s changed with the latest pipeline techniques. This feels like it will be a great way for me to catch up. Thanks for writing this!


Awesome! Note that there are better writeups for learning the specific implementation details for any given technique, and this article was mainly written to give people a rough “mental model” when assessing a whitepaper or siggraph talk or something. If I have time I’ll go back and add a section with references to the core techniques I see in use today.


> so we’re on the hook for handling things like computing finite-difference gradients for texture sampling, interpolating attributes, and computing barycentric coordinates (either via intrinsic/SV_Barycentrics or manually).

> While we lose a bunch of work that hardware may have done for us, cutting the pipeline here

From what I’ve read, modern GPUs have moved most of this work from fixed function hardware to implicitly inserted shader assembly. So, we’re on the hook for manually writing that code. But, it’s not missing out on as much hardware magic as one might think.

If I understand correctly, there has been so much focus on cramming general compute into GPUs that the fixed-function hardware for pixel work has been stripped down to visibility rasterization, barycentric generation (accessible through the SV_Barycentrics semantic), and texture sampling (given coordinates and derivatives).
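For anyone curious what "computing barycentric coordinates manually" amounts to when you cut the pipeline, here's a minimal 2D sketch using edge functions (hypothetical helper names; real visibility-buffer shaders do this with perspective correction in homogeneous space):

```python
# Manually computing barycentric coordinates from triangle vertices,
# as a visibility-buffer shader must when it doesn't read them from
# the rasterizer. 2D edge-function formulation, no perspective.
def barycentrics(p, a, b, c):
    """Return (u, v, w) such that p = u*a + v*b + w*c."""
    def edge(o, d, q):  # twice the signed area of triangle (o, d, q)
        return (d[0] - o[0]) * (q[1] - o[1]) - (d[1] - o[1]) * (q[0] - o[0])
    area = edge(a, b, c)
    u = edge(b, c, p) / area  # weight of vertex a
    v = edge(c, a, p) / area  # weight of vertex b
    return (u, v, 1.0 - u - v)

u, v, w = barycentrics((0.25, 0.25), (0, 0), (1, 0), (0, 1))
# (0.5, 0.25, 0.25): p sits halfway toward a, a quarter toward b and c
```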


Great points, but I'd say this is only partially true. Yes, you'll see code in the shader assembly to sample a texture, for example, but it uses vertex-interpolation instructions you just don't have access to directly if you cut the pipeline for a visibility buffer technique. As for barycentrics, we can query them from the hardware rasterizer, but the tradeoff there is storage. And if you do this in compute, unless you have access to the recently added compute derivatives, you're still responsible for the bookkeeping needed to compute the finite differences.

The general trend has been towards having general compute yes, but check the ISA and read your shader disassemblies before drawing any conclusions.
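To illustrate that bookkeeping: the hardware normally evaluates each pixel shader across a 2x2 "quad" and takes finite differences between neighboring lanes to get texture-coordinate derivatives. In compute you reconstruct that yourself. A toy sketch (hypothetical layout; real shaders do this with cross-lane intrinsics):

```python
# Manual finite-difference derivatives across a 2x2 pixel quad,
# uv[y][x] = (u, v) texture coordinate at that pixel. The hardware
# does exactly this for ddx/ddy when you use the graphics pipeline.
def quad_derivatives(uv):
    ddx = (uv[0][1][0] - uv[0][0][0], uv[0][1][1] - uv[0][0][1])
    ddy = (uv[1][0][0] - uv[0][0][0], uv[1][0][1] - uv[0][0][1])
    return ddx, ddy

ddx, ddy = quad_derivatives([[(0.0, 0.0), (0.1, 0.0)],
                             [(0.0, 0.2), (0.1, 0.2)]])
# ddx == (0.1, 0.0), ddy == (0.0, 0.2); these feed mip selection
```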


All great points as well. Thanks for putting all this together.


My pleasure! Thanks for reading and your comments


Every time I look at anything about graphics programming, my head implodes.

I just don't get it. I don't get how the matrix operations somehow magically work out. I don't get how it is that light bouncing/reflectivity works itself out in any respectable amount of time. It all just sounds like a bunch of jargon jumbled together, and a pretty arrangement of pixels comes out.

Like, I get the camera being set in a scene. I get that it forms a frustum with everything in the field of view. I get that you have a bunch of geometry, each in its own independent coordinate system, that are then glued together by transformation matrices.

I get UVW unwrap basically taking a 2D texture and mapping it to a 3D model. I get annotating fragments with material properties to aid subsequent processing passes, which use linear trajectories and other voodoo to figure out where reflections, shadows, and brightening should occur. And I get that part of what makes all of that doable is massively parallel co-processing devices dividing and conquering the work, orchestrated from the host system's memory space through driver APIs.

But that's where my internal stack blows out.


All you see at a pixel is an estimate of the reflected light arriving from all the light sources, via surfaces that provide a potential light path from the camera back to a source (bounced accordingly off intermediate surfaces).

In practice, it’s just a ton of hacks to estimate this value. It turns out if you do this reasonably accurately across all pixels and surfaces you get a pretty picture. Anything else is just optimization.

Read PBRT to learn how basic irradiance estimation should work and then rasterization becomes less about understanding the goal and more about understanding the hacks needed to achieve the goal in real time.
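In the spirit of that PBRT-style irradiance estimation, here's a toy Monte Carlo sketch (my own simplified example, not PBRT code): for a constant environment radiance L, irradiance at a surface point is the hemisphere integral of L*cos(theta), which works out to pi*L.

```python
import math
import random

# Toy Monte Carlo irradiance estimate: uniformly sample directions on
# the hemisphere (pdf = 1/(2*pi)), weight each sample's radiance by
# cos(theta), and average. For constant radiance L this converges to
# pi * L, the analytic answer.
def estimate_irradiance(radiance, samples, rng):
    total = 0.0
    for _ in range(samples):
        cos_theta = rng.random()  # cos(theta) is uniform for uniform dirs
        total += radiance * cos_theta / (1.0 / (2.0 * math.pi))
    return total / samples

rng = random.Random(0)
estimate = estimate_irradiance(1.0, 200_000, rng)
# converges toward pi (~3.14159)
```

Real-time rasterizers are, as the parent says, mostly a pile of clever hacks to approximate this integral without actually sampling it per pixel.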


It sounds like you’re quite nearly there actually! One important skill imo is the ability to treat a certain layer of the stack or process like a black box (inputs and outputs). A fully generalized scene pipeline is layer upon layer of boxes, many of which will be opaque when you start off. How are clouds rendered? Fog? Reflections? Shadows? Ambient light? Each of these areas is an active research topic and you definitely couldn’t hope to understand all of it all at once.


You’re hitting a wall moving from mathematical theory to hardware implementation. I hit a similar wall at some point. When writing a rendering pipeline the problem is not the math, but making real world devices do that math.


> I don't get how the matrix operations somehow magically work out.

There are two secrets to this. The first is remembering that what matrices do to vectors boils down to equations of this form:

    x' = M[0] * x  +  M[1] * y  +  M[2] * z
    y' = M[3] * x  +  M[4] * y  +  M[5] * z
    z' = M[6] * x  +  M[7] * y  +  M[8] * z
I won't write the function for matrix/matrix multiplication, but when we invented it, we chose a function that's associative with respect to matrix/vector multiplication. That means that Ma * (Mb * v) = (Ma * Mb) * v. This turns out to be a very good property, because we can then "collapse" every single linear transform we ever want into a single matrix. Stack these however high you want: (Ma * Mb * Mc * Md * Me * Mf * ...) will always collapse down to a constant number of numbers.
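Here's that collapse in runnable form (plain-Python 3x3 matrices, hypothetical helper names, no graphics library):

```python
# Associativity in action: Ma applied after Mb, one at a time, gives
# the same result as pre-multiplying the matrices and applying once.
def mat_vec(M, v):
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))

def mat_mat(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

scale = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]        # uniform scale by 2
swap_xy = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]      # swap x and y axes
v = (1.0, 2.0, 3.0)

one_at_a_time = mat_vec(scale, mat_vec(swap_xy, v))
collapsed = mat_vec(mat_mat(scale, swap_xy), v)
# both give (4.0, 2.0, 6.0)
```

Stack as many transforms as you like on the left; the pre-multiplied matrix stays the same size, which is exactly why the trick matters.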

In computer graphics, we tend to have a few "standard" matrices converting up the chain of spaces like this:

  vProjectiveSpacePosition = mProjectiveFromView * mViewFromWorld * mWorldFromModel * mModelFromBone * vBoneSpacePosition
And you can add your own if you want!

The second thing is that we cheat when it comes to perspective. The "correct" way to think about it is that your 4x4 matrix transforms into a 4-dimensional space known as "projective space" where parallel lines meet at some point.

But that's confusing; the better way to think about it is that we can't elegantly represent perspective with just three coordinates, so we cheat and add a weird fourth one. Don't worry about it too much; the math is mostly handled by higher-level libraries. Once you need to figure things out yourself, you'll be able to learn what you need. But for now, just get comfortable with the above.
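If you want to see the fourth coordinate earn its keep, here's the essential move stripped to a few lines (my simplified sketch; a real projection matrix also remaps depth and field of view):

```python
# A minimal perspective projection: the view-space depth lands in the
# fourth coordinate w, and dividing by w is what makes distant points
# shrink toward the center of the screen.
def project(x, y, z, focal=1.0):
    # clip space: xw = focal*x, yw = focal*y, w = z (camera looks down +z)
    xw, yw, w = focal * x, focal * y, z
    return (xw / w, yw / w)  # the perspective divide

near = project(1.0, 1.0, 2.0)   # (0.5, 0.5)
far = project(1.0, 1.0, 10.0)   # (0.1, 0.1)
# same offset from the axis, but the farther point lands nearer center
```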

> I don't get how it is that light bouncing/reflectivity works itself out in any respectable amount of time

We actually don't do "light bouncing" in real time; we typically emulate it. Think of it like this: if you shoot a bunch of light rays at a thing in the real world and measure what comes back, you'll record a bunch of reflected light (towards a sensor, but it could also be towards your eye).

Now try to curve fit that data to some equation. That's one of the most popular models for object materials, "Trowbridge-Reitz": https://www.desmos.com/calculator/8otf8w37ke . In practice, it's a combination of curve fitting to real-world data and a lot of smart people thinking about how to make it make physical sense. The model is parameterized to work on some known set of materials (the working theory is "microfacets": tiny grooves on an object, each assumed to be a mirror). There are other models you can use, too, but this is the workhorse of the games industry.
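The distribution term of Trowbridge-Reitz is small enough to write out. A sketch, assuming the common Disney/UE4 parameterization where alpha = roughness squared (this is only the D term, not a full BRDF):

```python
import math

# Trowbridge-Reitz (GGX) normal distribution function:
#   D(h) = a^2 / (pi * ((n.h)^2 * (a^2 - 1) + 1)^2),  a = roughness^2
# It answers: what fraction of microfacets point along half-vector h?
def ggx_distribution(n_dot_h, roughness):
    a = roughness * roughness
    a2 = a * a
    denom = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)

# A smooth surface concentrates microfacets tightly around the normal:
peak = ggx_distribution(1.0, 0.1)       # large spike at n == h
off_axis = ggx_distribution(0.5, 0.1)   # falls off fast away from it
```

Plug in a higher roughness and the lobe flattens out, which is exactly the knob artists turn.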

To get reflections, we play tricks: pre-compute the incoming light for a given reflection vector, so it doesn't have to be done at runtime. But this comment is already long enough. Contact info is in profile if you want to ask me more questions.


I hope to one day move from web dev to graphics programming but it seems hard to break in to.

I feel like backend dev has turned my brain to mush and I miss the math we used to do in university.

Seems there a very few jobs in software dev where you actually get to use math/complex algorithms, graphics being one of them.


Don’t be too down on yourself. There’s actually more overlap between graphics programming and other disciplines than you might imagine. Imagine the GPU is your “server” and each CPU core is your “client.” We have a difficult job of feeding tons of data to keep the server occupied. We have to worry about memory hazards, as well as cpu-cpu synchronization and cpu-gpu synchronization. Like server engineers, we think about pipelining, data consistency, caching, compression, and more. The assets themselves need to be streamed from disk/network to memory, so that’s another systems programming concern.

This all happens independently of the actual kernels/shaders we dispatch on the GPU itself (where the 3d math and all that comes into play). There are definitely rendering programmers that specialize more in the stuff above though.


> we think about pipelining, data consistency, caching, compression, and more.

But I think all of these are incidental complexity brought on by the physical world, the practicalities of the hardware, and API calls.

The _actual_ complexity in graphics rendering is the maths required, as well as an understanding of the physics and modelling of the materials, and how to cut down unnecessary computation to achieve the framerate target.


My point is that there are plenty of engineers that work with graphics pipelines that don’t actually think about the 3d stuff that much. It’s one of many available avenues of specialization.



