This post begins a series on how sin and cos are calculated in the c language standard math library on a typical Linux system. Because Python floats are really c doubles, what I write here applies to Python too. I will assume you know a little bit about c, a tiny little bit of real analysis, and have some idea how IEEE754 floats work and what an ULP is. You can get everything you need to know about floating point from the links below or by reading a bit of What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg.

Which c code is used for trig functions?

It depends on the specifics of your system. To find out, you can write a piece of c code with a sin or cos call that gcc can’t optimize out and then examine it with the debugger gdb. If you know gcc well you can probably pass the right options to stop it doing this kind of optimization, but I don’t, so here is an example of what I mean:

#include <math.h>
#include <stdlib.h>
#include <time.h>

int main() {
  srand(time(0)); // now gcc can't know the random seed
  int r = rand();
  double s = sin(r); // the sin call we'll break on with gdb
  return 0;
}

Compile with gcc -g filename.c -lm -o sin (the -g option includes debugging symbols, -lm links the math library). Start the debugger with gdb sin and set a breakpoint at the sin call with break 8, 8 being the line number on which sin is used. Run the program with run and when it breaks, hit s then enter to step into the code. You'll see a path like ../sysdeps/ieee754/dbl-64/s_sin.c indicating which source file is used. The ieee754/dbl-64 means that my system is using 64-bit double precision IEEE754 floating point numbers, which I will refer to as doubles. The source file specified won't normally be present on your computer - under Debian you can run apt-get source libc6 to download it into the current directory.

If you keep stepping with gdb you will probably see a lot of file not found errors. This Unix & Linux Stack Exchange question has answers explaining how to fix that, if you want to get rid of them.

Instead of installing the source package, you can look at the official glibc repository on sourceware.org for the most up-to-date code. The file mentioned above is here. There’s a github repository here which says it is a mirror of the sourceware repo updated daily; it might be easier to browse.

This and subsequent posts will explain some of the details of how the calculation in the file s_sin.c mentioned above works and describe some helpful tools in Python and unix for working with floating point numbers and arbitrary precision arithmetic.

What is happening in s_sin.c and why is it so complex?

The exact accuracy claimed for the approximations to sin and cos computed in this code is a bit difficult to pin down. Older versions claimed the results were correctly rounded to nearest, but that claim was removed in 2020 and is most likely wrong - I'll point to a candidate counterexample later - although the file still calls itself an “ultimate sin and cos routine”. Other comments in the source suggest a worst-case error of not much more than half an ULP, but again it isn't clear whether this is an estimate or a proven bound. Considering the huge range of values that can be represented by a double, it's no surprise you need complex code to get this sort of accuracy.

If you need correctly rounded elementary functions, the now-defunct CR-LIBM (source mirror) claimed to provide them along with a proof that they really were correctly rounded. There are Python bindings.

Outline of s_sin.c

The general sin function is double __sin (double x) beginning on line 200.

Two auxiliary functions are used:

  • do_sin(x, dx) line 124 computes an approximation to \(\sin(x+dx)\) when dx is small relative to x. Depending on the magnitude of x it either uses polynomial approximations of degree 6 and a lookup table or a higher degree polynomial approximation computed by a macro.
  • do_cos(x, dx) line 100 always uses a degree 6 polynomial and a lookup table to approximate \(\cos(x+dx)\). A toy version of this table-plus-polynomial idea is sketched just below.
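
To make the table-plus-polynomial idea concrete, here is a toy version of a do_cos-style computation. It is my own sketch, not the glibc code: the grid spacing, the "lookup" (plain libm calls here rather than glibc's real table) and the Taylor coefficients are all stand-ins, chosen only to show the shape of the computation - pick a nearby gridpoint, then correct with short polynomials in the small remainder.

#include <math.h>
#include <stdio.h>

/* Toy do_cos-style computation: pick a gridpoint xi near x, "look up"
   sin(xi) and cos(xi) (here by calling libm instead of a real table),
   and combine with short polynomials in the small remainder d. */
static double toy_do_cos(double x, double dx)
{
  const double step = 1.0 / 128.0;           /* grid spacing, my choice      */
  double xi = step * floor(x / step + 0.5);  /* nearest gridpoint to x       */
  double d = (x - xi) + dx;                  /* small remainder, |d| <= step */
  double s = sin(xi), c = cos(xi);           /* stand-in for the table       */
  double sn = d - d*d*d/6 + d*d*d*d*d/120;               /* approx sin(d)    */
  double cs = 1 - d*d/2 + d*d*d*d/24 - d*d*d*d*d*d/720;  /* approx cos(d)    */
  return c * cs - s * sn;          /* cos(xi + d) by the addition formula    */
}

int main(void)
{
  double x = 1.234567;
  printf("toy_do_cos = %.17g\n", toy_do_cos(x, 0.0));
  printf("libm cos   = %.17g\n", cos(x));
  return 0;
}

The real helper keeps x and dx separate to preserve precision and reads its sine and cosine values from a precomputed table rather than calling libm.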

__sin(double x) computes its result using different techniques according to the magnitude of x; a simplified sketch of the whole dispatch follows the list below.

  • For small enough |x| it just returns x.
  • For larger |x| up to about 0.86, it returns the value of do_sin(x, 0).
  • For larger |x| up to about 2.4, the do_cos function is used to compute an approximation to \(\cos(\pi/2 - \vert x\vert )\).
  • For larger |x| up to about 100 million, x is first reduced modulo \(\pi/2\) using reduce_sincos and either do_sin or do_cos is used.
  • For larger finite |x| a different reduction function is used, followed by either do_sin or do_cos.
  • Finally, infinities and NaNs are dealt with separately.
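
Here is a runnable, simplified sketch of that dispatch. It is not the glibc code: the thresholds are the approximate values quoted above, toy_do_sin and toy_do_cos are crude Taylor stand-ins for the real helpers, and instead of reimplementing the two reduction routines I simply fall back to libm for large arguments.

#include <math.h>
#include <stdio.h>

/* crude stand-ins for do_sin/do_cos: short Taylor polynomials only */
static double toy_do_sin(double x) { return x - x*x*x/6 + x*x*x*x*x/120 - x*x*x*x*x*x*x/5040; }
static double toy_do_cos(double x) { return 1 - x*x/2 + x*x*x*x/24 - x*x*x*x*x*x/720; }

static double sketch_sin(double x)
{
  const double half_pi = 1.57079632679489661923;
  double ax = fabs(x);

  if (isnan(x))
    return x;                     /* NaN in, NaN out                        */
  if (isinf(x))
    return x - x;                 /* sin of an infinity is NaN              */
  if (ax < 0x1p-26)               /* |x| < 2^-26                            */
    return x;                     /* sin(x) rounds to x itself              */
  if (ax < 0.86)                  /* approximate threshold: "do_sin" branch */
    return copysign(toy_do_sin(ax), x);
  if (ax < 2.4)                   /* approximate: sin|x| = cos(pi/2 - |x|)  */
    return copysign(toy_do_cos(half_pi - ax), x);
  /* the real code now reduces modulo pi/2 (reduce_sincos up to about 1e8,
     a slower routine for larger finite x) and calls do_sin or do_cos;
     here we just punt to libm */
  return sin(x);
}

int main(void)
{
  double xs[] = { 1e-9, 0.5, 1.0, 2.0, 10.0, 1e10 };
  for (int i = 0; i < 6; i++)
    printf("x = %-8g  sketch = %.15g  libm = %.15g\n",
           xs[i], sketch_sin(xs[i]), sin(xs[i]));
  return 0;
}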

In the next section and in future posts I’ll examine these cases separately.

How does the branching on \(\vert x\vert\) work?

The main conditional in __sin(double x) starts on line 212, but it isn’t a simple branch on the absolute value of x. Before the conditional we have, omitting a few lines,

mynumber u;
int4 k, m, n;

u.x = x;
m = u.i[HIGH_HALF];
k = 0x7fffffff & m;    /* no sign           */

An int4 is just an int and a mynumber is a c union

typedef union { int4 i[2]; double x; double d; } mynumber;

defined in the mydefs.h header file. It lets you access the two halves of the bit representation of a double x, interpreted as two 32-bit ints stored in an array i. An IEEE754 64-bit double is stored in memory as a sign bit, 11 bits representing the exponent, then 52 bits of significand. If we split one of these into two 32-bit pieces, one contains the sign bit, the exponent, and the most significant 20 bits of the significand, and the other has the least significant 32 bits of the significand. Which of these two pieces ends up as i[0] and which as i[1] (when a mynumber is created from a double) depends on whether your system is big or little endian. For little-endian systems like mine, the sign bit, exponent, and most significant part of the significand will be in i[1]. I don't know where HIGH_HALF is defined, but from context it must be 1 on little-endian systems, so that m is the 32-bit int formed from this part of the bit representation of x.

(To check the endianness of your system, in Python do from sys import byteorder then print(byteorder), or in a terminal lscpu | grep "Byte Order".)
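
To see the union trick in action, here is a small standalone program (my own, mirroring the mynumber definition with a fixed-width type) that prints the two halves of a double:

#include <stdio.h>
#include <stdint.h>

/* same idea as glibc's mynumber, with a fixed-width type for clarity */
typedef union { int32_t i[2]; double x; } mynumber;

int main(void)
{
  mynumber u;
  u.x = -2.0;  /* sign bit 1, exponent field 0x400, significand bits all 0 */
  printf("i[0] = 0x%08x\n", (unsigned) u.i[0]);
  printf("i[1] = 0x%08x\n", (unsigned) u.i[1]);
  /* on a little-endian machine this prints i[0] = 0x00000000 and
     i[1] = 0xc0000000, i.e. the sign bit, the 11 exponent bits and the
     top 20 significand bits live in i[1]; on a big-endian machine the
     two halves are swapped */
  return 0;
}

Whichever index holds the sign and exponent on your machine is the one HIGH_HALF must select.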

The integer constant 0x7fffffff has a bit representation of 0 followed by 31 1s, so anding it with m sets the sign bit of m to 0 if it wasn't already and changes nothing else. So k is the high half of the bit representation of |x|. The branching on the size of |x| is done by branching on k. For example, the first condition (line 216) is

if (k < 0x3e500000)    /* if x->0 =>sin(x)=x */

To see how this translates into an inequality on |x|, note that the 32-bit int 0x3e500000 is

s eee eeee eeee ffff ... ffff
0 011 1110 0101 0000     0000

The labelling on the top row shows how the bits would be interpreted if this were the high half of a double: s means the sign bit, the 11 es are exponent bits, and the 20 fs are the 20 most significant bits of the significand or fraction part. The bits of k are the first 32 bits of |x|. Thus the inequality is equivalent to |x| being less than the double whose bit representation is that of 0x3e500000 followed by 32 zeros. If we treat the bits of the exponent part of a double as an unsigned integer \(e\) then (assuming the double isn't denormal) the number it represents is \((-1)^s 1.\mathbf{f} \times 2^{e-1023}\) where \(s\) is the sign bit and \(\mathbf{f}\) the sequence of significand bits. 0x3e5 is 997 in decimal and \(997-1023=-26\), so the inequality is equivalent to |x| being less than \(2^{-26}\).
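
As a sanity check on this decoding, here is a small program (mine, not glibc's) that builds the double whose high half is 0x3e500000 and low half is zero, and replicates the masked comparison; it assumes a little-endian machine, where the high half is i[1]. Compile with -lm for nextafter.

#include <stdio.h>
#include <stdint.h>
#include <math.h>

typedef union { int32_t i[2]; double x; } mynumber;

/* replicate "k = 0x7fffffff & m; if (k < 0x3e500000)" */
static int takes_small_branch(double x)
{
  mynumber u;
  u.x = x;
  int32_t k = u.i[1] & 0x7fffffff;   /* clear the sign bit */
  return k < 0x3e500000;
}

int main(void)
{
  mynumber t;
  t.i[1] = 0x3e500000;  /* sign 0, exponent 0x3e5 = 997, top fraction bits 0 */
  t.i[0] = 0;           /* low 32 bits of the significand                    */
  printf("threshold = %a\n", t.x);  /* prints 0x1p-26, i.e. 2^-26 */

  printf("%d\n", takes_small_branch(nextafter(0x1p-26, 0.0)));  /* 1: just below   */
  printf("%d\n", takes_small_branch(0x1p-26));                  /* 0: not below    */
  printf("%d\n", takes_small_branch(-0x1p-27));                 /* 1: sign ignored */
  return 0;
}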

This is sufficient for x to be the nearest double to \(\sin(x)\). Using Taylor's theorem with the Lagrange remainder, \(\vert\sin(x) - x\vert \leqslant \vert x\vert^3 / 3!\). If \(\vert x\vert < 2^{-26}\) then \(\vert x\vert^3/3! < \vert x\vert \cdot 2^{-52}/6 < \vert x\vert \cdot 2^{-54}\), while the gap between \(x\) and either of its neighbouring doubles is at least \(\vert x\vert \cdot 2^{-53}\) (the worst case is when \(\vert x\vert\) is exactly a power of 2 and the neighbour below it lies in the binade underneath). So \(\sin(x)\) differs from \(x\) by less than half the gap to either neighbour, and \(x\) is the double nearest to \(\sin(x)\).
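
Consistent with this, you can check that the library sin hands back its argument just below the threshold. This is only a quick check with the system libm, not a proof of correct rounding:

#include <math.h>
#include <stdio.h>

int main(void)
{
  double below = nextafter(0x1p-26, 0.0);  /* largest double below 2^-26 */
  printf("sin(%a) == %a ? %d\n", below, below, sin(below) == below);
  printf("sin(%a) == %a ? %d\n", 0x1p-30, 0x1p-30, sin(0x1p-30) == 0x1p-30);
  return 0;
}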

In the next post I’ll explain how \(\sin(x)\) is calculated in the next branch.