I first tried porting everything to float, but that made a
compare-render run (with all 1520 tests succeeding) 9s slower, so I
decided to keep the existing U8 code.
A side benefit is that saving the diff to PNG continues to produce
U8 PNGs.