July 28, 2010
While VC 2005/2008 targeting x64 generates SSE code for floating point code, fmod() still uses the x87 FPU, and more importantly it assumes that the divide by 0 exception flag is clear going in (meaning if it is set prior to the call, the call will throw an exception or return #.IND regardless of the input). Apparently they assume that since the compiler won't possibly generate code that would cause the divide by 0 floating point exception flag to be set, then it would safe to assume that flag will always be clear. Or it could be a typo. If you use assembly code, or load a module compiled with another compiler that generates x87 code, this can be a huge problem.
Take this example (hi.cpp):
#include <stdio.h> #include <math.h> extern "C" void asmfunc(); int main() { asmfunc(); printf("%f\n",fmod(1.0,2.0)); return 0; }
and hihi.asm (compile with nasm -f win64):
SECTION .text global asmfunc asmfunc: fld1 fldz fdivp fstp st0 ret
Compiling this (cl.exe hi.cpp hihi.obj) and running it does not print 1.0, as it should.
The solution we use is to call 'fclex' after any code that might use the FPU. Or not use fmod(). Or call fclex before fmod() every time. I should note that if you use ICC with VC200x, it doesn't have this problem (it presumably has a faster, correct fmod() implementation).
6 Comments
LICEcap has a nice UI (in that you position/size the window where you want to capture, and can move it around while recording). We support writing to .GIF directly (big thanks/credit/blame to Schwa for getting the palette generation working as well as it does), as well as to a new format called .LCF.
LCF compresses by taking a series of frames, say, 20 frames, and then dividing each frame into slices, approx 128x16px each. Each slice is then compared to the same slice on the previous frame, and (if different) encoded directly after the previous frame. zlib is used to remove redundancy (often slices don't completely change from frame to frame, i.e. scrolls or small updates will compress very well). This is all done in 16bpp, and the end result is quite good compression, and lossless (well, 16bpp lossless) quality. REAPER supports playing back the .LCF files, too. The biggest down side is high memory use during compression/decompression (20 frames of 640x480x16bpp is about 12MB, and for smooth CPU distribution you end up using twice that).
I should mention that the primary reason for us making this tool was the desire to post animated gifs of new features in REAPER with the changelog. Hopefully we'll follow through on that.
On a related note, tomorrow (or soonish), I plan to post my latest additions on how to make OS X applications not perform terribly (new one: avoid avoid AVOID CGBitmapContextCreateImage() like the plague. HOLY CRAP it is bad to use). Apple: please, for the love of God, either make your documentation a Wiki, or hire someone who actually writes
(multi-platform) applications with your APIs to write documentation.
9 Comments