vrijdag 29 september 2017

Floating point numbers made simple

Based on a great post by Fabien Sanglard, over at http://fabiensanglard.net/floating_point_visually_explained/, I figured I should add some more information about floating point numbers. The information is universally applicable to any programming language and comes in handy in many weird cases. I recently helped somebody debug a weird memory corruption where his data was being overwritten by floating point numbers, and this knowledge of recognizing one on sight helped me to be able to do that.

 Basic floats (recap)

Fabien's basic explanation comes down to a floating point number being in a "window" of numbers, and then encoding which window it is, and what offset in that window it is at. In short, you have a window from a given power of two, to the next power of two. For example, a window from 4 to 8, or from 1024 to 2048. Each of these ranges of numbers forms a single window of numbers. Every window is the exact same amount of numbers - 8388608 to be exact, for a 32-bit float.

This has three important implications. First off, not all numbers can be represented. If your window is really big, there will not be enough numbers to go around to represent all the numbers accurately, so (for example) the number 16777217 is not representable as a float, no matter how hard you try. I picked this number specifically because it is 1 higher than 2^24, so it falls in the window from 16777216 to 33554432 - that's a 16.7 million number range, and there are only 8388608 numbers in a window so only half of them get to be representable accurately as a float.

Second, not all common numbers are in a window. The floating point standard defines 5 types of numbers:
  • Normal numbers that follow the rules as set out, and these are by far the most common number to be found
  • Zeroes - and yes, that's plural, because we have positive zero and negative zero.
  • Infinities, again plural because we have positive infinity and negative infinity
  • Not-a-number, the representation that's used when you've made a calculation that results in mathematical nonsense
  • Denormals. I will give a short intro, but it boils down to not being able to use them.
Third, there is a finite amount of windows to go around. In a float there are 8 bits reserved for the window and this allows us to have 256 different positive windows (and 256 different equivalent negative windows). Two of these windows, the first and the last one, are used for the special number encodings. That leaves 254 different windows for normal numbers... these are balanced around the middle point so window #127 ends up mapping from 1 to 2, window #126 maps from 0.5 to 1 and window #128 maps from 2 to 4.

Doing a bit of math shows that window #1, the first one we can use for normal numbers, maps from 1.1754944e-38 to 2.3509887e-38. That first number is 2^-126, and is exactly what FLT_MIN represents - the smallest normal floating-point number. Given that all windows represent the same amount of numbers, we now also know that there are 8 million floating point numbers smaller than 2.3509887e-38 ! We can do the same for window #254 and see it represents numbers from 1.7014118e+38 to 3.4028237e+38 - and that latter number is of course FLT_MAX. We also know that there are only 8 million numbers in this window - so they are really far apart!

Assembling floats

Now that we know how the parts of a float work together, we can assemble one by hand. A float has the window identifier (exponent), the offset in the window (mantissa) and the sign bit (whether it's positive or negative). Quite bluntly, it's literally the three of these concatenated as bits - one bit for the sign, then 8 bits for the exponent, then 23 bits for the mantissa. So for example:

The number positive 1 is a 0-bit for the sign, then 127 for the exponent (as it's the 1st number in window #127). Then we put in 0 for the mantissa (as it's the first number in the window, again) and we get 0x3F800000.
Let's try a harder one - pi. Pi is a positive number (so 0 for the sign), then 128 for the exponent (as it's in the window from 2 to 4). It's slightly over halfway there, so we calculate (pi - 2) / (4 - 2) * 8388608 and get 4788187 (by just copy/pasting the calculation to google - try it yourself!). Assembling this gets us

0x40000000 for the exponent (128 (0x80), the value, shifted left by 23 positions)
0x00490FDB for the mantissa (just converted to hexadecimal)

Assembled that becomes 0x40490FDB. Now let's see what the computer thinks of it:

    int values[] = {
      0x3F800000,   // We expect 1.0
      0x40490FDB    // We expect pi
    int main() {
      float* f = (float*)values;
      printf("%f %f\n", f[0], f[1]);

    $ ./test
    1.000000 3.141593

Looks to be pretty accurate. Try it yourself, but turn off optimization - your compiler can normally assume you're not doing these kinds of hacks, so the optimizer will break this code.

Special numbers

So there are four kinds of special numbers. For all of these we have to make special corner cases in doing maths, because they don't behave like normal numbers. The first two types are pretty simple ones:

- Positive zero is the first number in positive window #0.
- Negative zero is the first number in negative window #0.
- Positive infinity is the first number in positive window #255.
- Negative infinity is the first number in negative window #255.

We treat them special so that anything times zero is a zero, and anything times infinity is infinity. That brings the question - what's infinity divided by zero? zero times infinity? Or infinity divided by infinity? Or zero divided by zero?

All of these are of course nonsense questions - they're more of a mathematical oddity in doing derivatives and integrations. Your computer is not able to reply in any other way to such a question other than giving back a number though, so a number it will be - but it will be the "Not-a-number" reply. That's your computer telling you "This was nonsense and I cannot give you a numerical answer". These are encoded similar to infinity, except that it's not the first number. It doesn't really matter which number it is in window #255 - as long as it's not 0 it's a NaN.


This leaves us with the 5th type of number. A denormal is a number that's not normal, but only *just* outside of the normal range. A long time ago somebody thought about the number representations and noticed that there are 8 million numbers in window 0 that we're not using at all! What if we use this for a special "final window" of numbers from 0 to FLT_MIN, they thought. Then you can encode those numbers as well...

In practice though, this is not used often at all and only happens in very weird corner cases. In those corner cases it's not even that useful, because the precision suffers. There are only 8 million numbers between 0 and FLT_MIN, so unlike the normal numbers where dividing by two will go to the next window, every division by 2 will halve your precision instead. This leads to computer chip makers to make your computer really fast at doing anything that's not a denormal number - and implicitly, to go really slow when it is, because it requires special attention. Some computer makers then allowed you to "turn off" denormal handling, so that it would just treat it as zero - also called DaZ, Denormals-as-Zero. Some computer makers then just flat out don't implement denormal numbers any more, because nobody wants it and people turn it off anyway.

So this is why there are denormal numbers, and why you really shouldn't care. Your computer probably doesn't support them, and if it does it will slow down disproportionally if you use them making it a useless feature.

Further reading

All of these bits of information are from a standard that describes floating point numbers. Nearly all computers nowadays implement IEEE-754 floating point numbers. These are the numbers I described above. The standard also has 64-bit and larger numbers, which do the same basic idea as described above, but will just result in higher and harder to read numbers.

You can also take these insights and read 16-bit floating point numbers - they do the same logical things. Some omit the NaN/Infinity window to get just one more window of normal numbers.

vrijdag 3 februari 2017

How to execute data

Somebody recently inquired about recursion, and I wanted to provide him an example showing that you could do infinite recursion if you use tail recursion. Since there's no way to make a C++ compiler *have* to do tail recursion, I made the example inline bytecode to ensure it was a tail recursion.

#include <sys/mman.h>
#include <stdint.h>

int main() {
    char function[2] = { 0xEB, 0xFE };
    mprotect(((intptr_t)function | 0xFFF) - 0xFFF, 0x1000, PROT_EXEC | PROT_READ | PROT_WRITE);

Now the thing you will notice is mprotect. This is a function to tell your operating system that a given range of memory should have its properties modified. In this case, the function array is on the stack and sane operating systems map that as non-executable, to prevent you from executing your stack contents. The mprotect call circumvents this by explicitly asking it to be executable, and my OS is nice enough to honor such a request. You can try this and you will see it crashes if you omit it.

So then the big question comes with this bit of code

const char function[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 }; 
int main() { 
  return ((int(*)())function)(); 

As you can see, no mprotect call. The array is not modified in any evil way (such as "__attribute__((section(".text")))" to force it to be executable, and it is not. Yet, this runs just fine!

So what's going on? Let's investigate the binary to see what's up.

$ objdump -d test | grep 2[aA]

There's definitely no 2A byte in the code. This would show it if it were code - so it is definitely in the data section. In fact, it'll be in the .rodata section:

$ objdump -h test

test:     file format elf64-x86-64

Idx Name           Size         VMA                LMA       File off Algn
 12 .text       000001c2 0000000000000530  0000000000000530  00000530 2**4
 14 .rodata     0000000a 0000000000000700  0000000000000700  00000700 2**2
 16 .eh_frame   000000f4 0000000000000740  0000000000000740  00000740 2**3
 17 .init_array 00000008 0000000000200de0  0000000000200de0  00000de0 2**3
                CONTENTS, ALLOC, LOAD, DATA
 22 .data       00000010 0000000000201000  0000000000201000  00001000 2**3
                CONTENTS, ALLOC, LOAD, DATA

I've omitted the irrelevant sections for readability. To note is that all READONLY sections are grouped together and mapped to the first page of memory, and the writable sections are grouped up and mapped to the second (2MB) page of memory. Look at the VMA column for the target Virtual Memory Address.

The .text section requires it to be executable - so the page that contains it will be executable. That page runs until beyond the end of .rodata though - so by extension, all of .rodata will be executable! The runtime doesn't even notice that we have data there, as the page granularity of the protection does not allow it to tell the OS. You can remove the "const" from the array; that will move it into .data and by extension a different page, causing it to be guarded by the non-executability of the .data section.

Note that this assumes you have a processor with NX (No-Execute) or XD (eXecute-Disable) bit. If you do not, the code will work regardless as your processor has no way to stop it.