This was a fun read! I didn't expect to read an article about HAM6 today. Now onto the mandatory quibble/question: my gut feeling is that this kind of problem is combinatorial, and that selecting the palette entries one by one won't necessarily find the solution with the least error globally?
As an analogy: suppose we have a simple gradient from black to white (and that we already have those two shades locked down), and we're tasked with finding a third shade. This type of algo should find the shade at 50% as the one that reduces error the most. If we then find another, it'll be either the one at 25% or the one at 75%. An algo that goes through all combinations of two entries would find that picking 33% and 66% reduces the overall error more.
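That hunch is easy to check numerically. Here's a little sketch (my own toy code, not the article's: a 1-D gradient, absolute error, coarse candidate grid to keep it fast) - greedy locks in the midpoint and then bisects the worst gap, while searching over pairs lands near the thirds with a lower total error:

```python
from itertools import combinations

samples = range(256)           # black-to-white gradient
candidates = range(0, 256, 5)  # coarse grid to keep the pair search quick

def total_error(palette):
    # sum of distances from each gradient sample to its nearest palette entry
    return sum(min(abs(s - p) for p in palette) for s in samples)

# Greedy: lock in {0, 255}, then twice add the single shade that helps most.
greedy = [0, 255]
for _ in range(2):
    greedy.append(min(candidates, key=lambda c: total_error(greedy + [c])))

# Exhaustive: try every *pair* of new shades at once.
pair = min(combinations(candidates, 2),
           key=lambda p: total_error([0, 255, *p]))

print(greedy, total_error(greedy))         # [0, 255, 125, 190] 6018
print(pair, total_error([0, 255, *pair]))  # (85, 170) 5418
```

So at least in this toy setup the one-at-a-time approach really does get stuck with a worse palette than the pairwise search.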
Agreed, you might be able to get better results by brute forcing multiple colors at once at a lower color resolution, then using gradient descent to refine the colors from there.
One other obvious optimization is to use a more sophisticated metric for "closeness" based on human color perception, not just a simple per-channel delta; that could make a huge difference for the skin tones in the final example image.
This does not actually brute force the entire search space - the search space for all possible palettes is 2¹⁹² palettes, 16 entries times 12 bits, modulo 16! permutations and some other things like repeated colors. The algorithm described performs a greedy search, picking the best fit for each palette entry in turn, but it is at the very least not obvious that this will yield the optimal solution - maybe picking a suboptimal color for the first palette entry allows a better fit later than picking the optimal color would. The same applies when choosing the color of the next pixel - making one pixel worse might allow subsequent pixels to be approximated better. Which then also leads to the choice of the error function - what is better, many medium deviations, or many small deviations plus a few big ones?
Without careful analysis it is really hard to tell, at least for me, how a greedy approach compares to the optimal solution. I would guess that one gets pretty close, but I do not think one is guaranteed to find the optimum, and I would also guess that one could construct pathological cases where the greedy approach produces surprisingly bad results.
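For scale, a quick back-of-the-envelope on that search space (same numbers as above: 16 entries of 12 bits each):

```python
from math import factorial, log10

raw = 2 ** (16 * 12)            # every assignment of 16 twelve-bit palette entries
print(round(log10(raw)))        # 58 - roughly 10^58 palettes

# Dividing out the 16! orderings barely dents it:
print(round(log10(raw // factorial(16))))   # 44 - still roughly 10^44
```

So even generous deduplication leaves something hopelessly far beyond an exhaustive search.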
> The same applies when choosing the color of the next pixel - making one pixel worse might allow subsequent pixels to be approximated better.
I was thinking the same thing: what if we looked at the next two pixels and tried to find the values that minimize the total error? Done right, that should essentially give horizontal 2x1 dithering for free.
The entire thing is essentially already a threshold dithering process: pick the closest available color and throw away the error. So one could certainly extend this to a more sophisticated dithering process, and maybe even get rid of the first dithering pass from 24 bit to 12 bit and do it all in one go. This also made me realize that there is no good reason for the author to iterate through all red, green and blue values; one could just compute the closest ones directly.
Huh, that's a really good point: wouldn't using a simple error diffusion dither give decent results here? We basically have 16 palette colors, plus three individual channel-delta ones, so all we have to do is check which of our nineteen options has the lowest error, instead of the normal threshold approach.
The thing is, you do not have the palette in advance - that is kind of the main point of the article. As the author says, one could build a palette in advance, but it is not clear that one of the common algorithms would yield a really good one: they look at the entire image, while in this case we mostly need colors for strong edges that are not handled well by the HAM encoding. So maybe a good solution would be to first encode using only HAM and record all the errors, then build a palette that reduces those errors as much as possible, and then encode again with HAM plus that palette.
The thing is, of course, that errors propagate with HAM: you can only ever adjust one channel, so any change to a pixel will, or at least might, also affect the errors of subsequent pixels. On top of that you might have error propagation from a dithering algorithm. But I think here I have reached the limit of my reasoning without pulling out pen and paper and really digging into it. Final note: I would probably try to find an iterative algorithm that reduces the error over several iterations, as I think jumping directly to the best solution is probably pretty hard.
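A rough sketch of that two-pass idea (toy code on one scanline, 4-bit channels; `ham_only_pass` and `palette_from_errors` are hypothetical names, and a real palette builder would cluster the errors rather than just ranking them):

```python
def ham_only_pass(row, start=(0, 0, 0)):
    """Encode one scanline with HAM-style steps only (change the single
    worst channel per pixel); return the leftover per-pixel errors."""
    cur = list(start)
    errors = []
    for target in row:
        # change the channel with the largest deviation to its exact value
        ch = max(range(3), key=lambda c: abs(target[c] - cur[c]))
        cur[ch] = target[ch]
        errors.append(tuple(t - c for t, c in zip(target, cur)))
    return errors

def palette_from_errors(row, errors, k=4):
    # crude stand-in for a real clustering step: grab the target colors
    # of the pixels that the HAM-only pass handled worst
    ranked = sorted(zip(errors, row), key=lambda er: -sum(abs(e) for e in er[0]))
    palette = []
    for _, color in ranked:
        if color not in palette:
            palette.append(color)
        if len(palette) == k:
            break
    return palette

row = [(0, 0, 0)] * 3 + [(15, 15, 15)] * 3   # a hard black-to-white edge
errs = ham_only_pass(row)
print(errs[3])                               # (0, 15, 15): the edge pixel is still off in G and B
print(palette_from_errors(row, errs, k=2))   # [(15, 15, 15), (0, 0, 0)]
```

The second pass would then get white as a palette entry and nail the edge in one pixel instead of three.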
Scale down the problem and analyze it that way if one still wants to play with brute force. Smaller image and smaller palette. See what can be learned.
I have played with HAM conversion myself[0], but my approach has been to deliberately avoid brute-force. That's not just because it'd be less fun, but also as I'd like to get these algorithms working on the Amiga itself at some point.
My current local code has new strategies and also does dynamic hires, but it's an uncommitted mess.
The original Macintosh reserved something like 22K of the RAM for screen memory. I remember reading a little tech note about page-flipping: where you can pull in another 22K of RAM for a secondary screen buffer, write to it, and then with a single assignment to a 32-bit memory location, point the screen refresh to your secondary screen buffer. If you did it during the vertical screen blanking interval, pop!, instant "page flip".
I was playing around with 68K assembler to test this out: cleared one screen buffer to white, drew a diagonal black line — cleared the other screen buffer to white and also drew a diagonal black line ... a mirror of the first one though.
Watching it page flip between the two screen buffers for every screen refresh caused me to notice something interesting: you could actually get a pseudo 50% gray where a black pixel and white pixel shared the same row and column on the two buffers.
It might have made an interesting demo, but eating another 22K on a computer that was already struggling with 128K (at least in the base model) seemed like a bad design decision. I suspect too that, at 30 FPS, the gray would have been a bit "strobish"?
I definitely remember cheap green-phosphor CRTs that took a second or more to fade from full-brightness to zero-brightness. If you tried to strobe a pixel back and forth it would just look full brightness because it didn't have enough time to visibly fade before being turned on again.
Depending on how much you want to stress it, you could do what I call "time dithering". You have a sequence of b/w images in several buffers. The first buffer represents pixels that are meant to be more than 50% black. The next represents 25% black. Then 12.5% black and so on. At each frame refresh you use the least significant digits in the system clock to decide which buffer to point at the screen. A pixel that should be x% black will be rendered black x% of the time. If it refreshes often enough then you've made whatever bit depth of color you were after.
There are many hacks that can be used to improve this method I just outlined. Eg instead of picking a new buffer every frame, you can decide "switch or stay" from the current buffer. Or even just change which ones are displayed in some predetermined repeating sequence.
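One way to realize that scheme is with bit planes rather than the threshold buffers as described - a toy simulation (my own sketch, 2-bit "blackness", frame counts playing the role of the system clock):

```python
def make_buffers(pixels, bits=2):
    # buffers[b][i] is 1 (black) iff bit b of pixel i's blackness is set
    return [[(p >> b) & 1 for p in pixels] for b in range(bits)]

def frame_sequence(bits=2):
    # show buffer b for 2**b out of every (2**bits - 1) frames,
    # so the MSB plane gets the most screen time
    seq = []
    for b in range(bits):
        seq += [b] * (2 ** b)
    return seq

pixels = [0, 1, 2, 3]            # 2-bit blackness per pixel
buffers = make_buffers(pixels)
frames_black = [sum(buffers[b][i] for b in frame_sequence())
                for i in range(len(pixels))]
print(frames_black)              # [0, 1, 2, 3]: black for p out of every 3 frames
```

The time-average per pixel comes out exactly at the intended shade, at the cost of one extra buffer per bit of depth.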
I did that on my PET 2001 after I added 16K of memory; since the screen was only 1K in size it was easy to rewrite the screen in assembler each time. Problem is, I tried to do more than two frames and the flicker was too much to watch.
Finding the right palette for pictures like that is a good application of k-means clustering[1]. Remember to convert the colors into a perceptually uniform color space first, though.
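A minimal sketch of that (toy Lloyd's iteration with a crude luma-style channel weighting standing in for a real perceptually uniform space like CIELAB or Oklab; `kmeans_palette` is a made-up name, not a library call):

```python
W = (0.299, 0.587, 0.114)   # crude perceptual weighting, NOT a real CIELAB conversion

def dist(a, b):
    return sum(w * (x - y) ** 2 for w, x, y in zip(W, a, b))

def kmeans_palette(colors, k, iters=10):
    # spread the initial centers across the input list
    centers = [colors[i * len(colors) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for c in colors:
            clusters[min(range(k), key=lambda i: dist(c, centers[i]))].append(c)
        centers = [tuple(sum(ch) / len(cl) for ch in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

colors = [(200 + i, 50, 50) for i in range(5)] + [(50, 50, 200 + i) for i in range(5)]
print(kmeans_palette(colors, 2))   # one center near (202, 50, 50), one near (50, 50, 202)
```

In practice you'd run this on all (or a sample of) the image's pixels and round the centroids back to 12-bit palette entries.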
Pretty cool flashback :) But to nitpick a bit: the ternary operator doesn't instruct the compiler to use a conditional move instruction; from the compiler's point of view it's just a condition, no different from a regular if statement.
...any reasonably "modern" C compiler should generate the same code for both functions.
(also I wonder if an even better image quality could be achieved by generating a copper list next to the image data which updates the color palette mid-image?)
> First, I keep testing all 16 possible R changes, G and B. And then, I try to find an even better solution using the current palette.
> It means 64 color distances to compute
Isn't it unnecessary to test all 16 R, G, and B values? I know the title of the post says "brute force", but you already know the desired R, G, or B value, and any other value seems guaranteed to have a larger color distance. (The distance function just subtracts the actual R from the desired R, right?)
Seems like you could compute only 19 color distances instead of 64: the closest R, G, and B plus the 16 palette entries.
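A quick sketch of that 19-candidate selection (hypothetical function names, squared Euclidean distance assumed; values are 4-bit, 0-15):

```python
def color_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_option(prev, target, palette):
    """Pick the cheapest of 19 options: the 16 palette entries, or setting
    exactly one of R, G, B to its exact target value. No need to scan all
    16 levels of a channel - the exact target level is always closest."""
    candidates = [("palette", p) for p in palette]
    for ch in range(3):
        mod = list(prev)
        mod[ch] = target[ch]
        candidates.append(("RGB"[ch], tuple(mod)))
    return min(candidates, key=lambda c: color_dist(c[1], target))

pal = [(0, 0, 0), (15, 15, 15)]
print(best_option((0, 0, 0), (15, 0, 0), pal))    # ('R', (15, 0, 0)): modify beats palette
print(best_option((0, 0, 0), (14, 14, 14), pal))  # ('palette', (15, 15, 15))
```

Same result as scanning all 64 options per pixel, at less than a third of the distance computations.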
It occurs to me that you could do error diffusion during the HAM-conversion stage. This would probably help break up the 3-pixel-wide horizontal bands that occur during a big color change, by causing a different component to switch first in successive scanlines. Specifically, imagine that you diffuse all error _down_. Changing from (0, 2, 3) to (15, 15, 15) you'd change R first, then G, then B. But with a portion of the error (0, -13, -12) added to the white pixel of the 2nd row, you'd change G first, B second, and R third.
After all these years I have an understanding of HAM, haha! I always noticed the color fringing when I was a teenager but never really bothered to find out why, and besides, this was before the Internet and I didn’t have a modem until fairly late with my Amiga. All I knew was that there was some wonky, hacky encoding going on that “made things complicated”.
> As an analogy: suppose we have a simple gradient from black to white (and that we already have those two shades locked down), and we're tasked with finding a third shade. This type of algo should find the shade at 50% as the one that reduces error the most. If we then find another, it'll be either the one at 25% or the one at 75%. An algo that goes through all combinations of two entries would find that picking 33% and 66% reduces the overall error more.
I feel like I'm missing something.