Boolean Bits in C - CodeDromeCodeDrome

The C99 standard introduced the _Bool type as well as stdbool.h which allows you to use bool, true and false. _Bool uses a byte to store true/false, yes/no, on/off or whatever the semantics of your program might be, but of course you only really need 1 bit so 7 bits are wasted. Most of the time this isn't worth worrying about but in some rare cases where you need a lot of Booleans it might be worthwhile looking into ways of being a bit more efficient with memory usage. This article presents my particular approach to doing this.

Solution Outline

For this project I will be using a struct to hold a block of memory, the bits of which will be used as individual Boolean values. The struct will also record the number of bits and the number of bytes being used; the reason for this seeming duplication will become clear later.

We also need a few functions to handle this struct. The main ones are:

create_bits - this will allocate a block of memory large enough for a specified number of bits

set_bit - sets the specified bit to true or false

get_bit - gets the value of the specified bit

free_bits - frees the allocated memory when we have finished with it

The data structure is rather confusing so I have also written two functions to print out the values held so we can see exactly what is going on.

The Data Structure

The data structure used to hold the bits is best explained with a diagram.

It is basically an array of bytes (unsigned chars to be precise, but of course we are not using them to store characters), the number of bytes being the number of bits needed divided by 8, and then rounded up to the nearest multiple of 8. The top row of the diagram represents a straightforward array of 2 bytes indexed 0 and 1.

The individual bits, however, are indexed right to left. This is to simplify the arithmetic and bitwise operations needed to set and get the values of individual bits, as will become clear later. Keep this diagram in mind when working through the code.

Setting and Unsetting Bits

At the core of this project is the ability to set individual bits to 1 or 0. To do this we need a byte with its value calculated by raising 2 to the power of the index of the bit we want to set. This should become clear from the following table.

bit index	7	6	5	4	3	2	1	0
2^{bit index}	128	64	32	16	8	4	2	1

This shows that if you raise 2 to the power of a bit index (0 based, right to left) you get the value represented by that bit. This is the reason the bits are indexed right to left in the data structure diagram.

For example, if we want to set bit 4, we need a byte set to 2⁴ = 16, which has the following bit pattern.

Now we have that byte we can use it with the bitwise OR, NOT and AND operators to set or unset a bit. The following tables show the processes for setting or unsetting bit 0 for the four possible combinations of existing value and new value. In each case we calculate 2⁰ = 1, and then either use OR or a combination of NOT and AND with the current byte to obtain the required bit pattern.

Setting bit 0 from 0 to 1
Current bits	1	1	1	1	0
Required bits	1	1	1	1	1
2⁰ = 1	0	0	0	0	1
OR'ed with current	1	1	1	1	0
=	1	1	1	1	1

Setting bit 0 from 1 to 0
Current bits	0	1	0	1	0	1	0	1
Required bits	0	1	0	1	0	1	0	0
2⁰ = 1	0	0	0	0	0	0	0	1
NOT'ed	1	1	1	1	1	1	1	0
AND'ed with current	0	1	0	1	0	1	0	1
=	0	1	0	1	0	1	0	0

Setting bit 0 from 1 to 1
Current bits	1	1	1	1
Required bits	1	1	1	1
2⁰ = 1	0	0	0	1
OR'ed with current	1	1	1	1
=	1	1	1	1

Setting bit 0 from 0 to 0
Current bits	1	0	1	0	1	0	1	0
Required bits	1	0	1	0	1	0	1	0
2⁰ = 1	0	0	0	0	0	0	0	1
NOT'ed	1	1	1	1	1	1	1	0
AND'ed with current	1	0	1	0	1	0	1	0
=	1	0	1	0	1	0	1	0

These operations will be used later when we get to the set_bit function. You may be thinking that two of the four operations described above are superfluous as they are just setting existing values. However, to check whether a bit is already set to its requested value requires a separate operation each time so it is more efficient to just set the bit to its new value irrespective of its current value.

Getting Bits

The same principles are used to read individual bits within a byte. This time though we use the AND operator to retrieve the value of a bit. This is illustrated by the following table, where we retrieve the value of bit 4.

Bit pattern used to hold Booleans	1	1	1	1	1	1	1	1
Bit pattern for 2⁴ = 16	0	0	0	1	0	0	0	0
Previous 2 bytes AND'ed = 16	0	0	0	1	0	0	0	0

We start off with a byte with all the bits set to 1. We then AND it with 16 which gives us 16, which, not being 0, is "truthy". Of course if the bit were not set the two bytes AND'ed would be 0 or false.

Coding

After all those explanations we can finally get stuck into the coding. Create an empty folder and within it create these files. You can download the source code as a zip or clone/download from Github if you prefer.

booleanbits.h
booleanbits.c
main.c

Source Code Links

ZIP File
GitHub

Open booleanbits.h and enter or paste this code.

booleanbits.h

#include<stdlib.h>
#include<stdbool.h>
#include<stdio.h>
#include<math.h>

//--------------------------------------------------------
// STRUCT bits
//--------------------------------------------------------
typedef struct bits
{
    unsigned char* bits;
    unsigned int bitcount;
    unsigned int bytecount;
} bits;

//--------------------------------------------------------
// FUNCTION PROTOTYPES
//--------------------------------------------------------
bits* create_bits(unsigned int bitcount);
void print_bits_unfriendly(bits* b);
void print_bits_friendly(bits* b);
void set_bit(bits* b, unsigned int bit, bool value);
bool get_bit(bits* b, unsigned int bit);
void free_bits(bits* b);

The struct in the header file contains a char* which will point to malloc'ed memory to hold the actual bits. We also have a couple of integers to hold the size in bits and bytes.

That's all there is to the header file so now let's move on to booleanbits.c.

booleanbits.c part 1 - #includes, create_bits and free_bits

#include<stdlib.h>
#include<stdbool.h>
#include<stdio.h>
#include<string.h>
#include<math.h>

#include"booleanbits.h"

//------------------------------------------------------------
// FUNCTION create_bits
//------------------------------------------------------------
bits* create_bits(unsigned int bitcount)
{
    unsigned int bytecount = ceil(bitcount / 8.0);

    bits* b = malloc(sizeof(bits));

    if(b != NULL)
    {
        *b = (bits){.bits = malloc(bytecount), .bitcount = bitcount, bytecount = bytecount};

        if(b->bits != NULL)
        {
            for(int i = 0; i < bytecount; i++)
            {
                b->bits[i] = 0;
            }

            return b;
        }
        else
        {
            free(b);

            return NULL;
        }
    }
    else
    {
        return NULL;
    }
}

//------------------------------------------------------------
// FUNCTION free_bits
//------------------------------------------------------------
void free_bits(bits* b)
{
    free(b->bits);
    free(b);
}

Firstly we need to calculate the number of bytes needed. Of course we can't just divide bitcount by 8 but also need to round it up to the nearest integer with the ceil function, which incidentally lives in math.h.

The first malloc gives enough memory for a bits struct, which, if successful, is assigned to an actual struct. Within the struct's initialization malloc is called again to get us memory for the bits being stored, and bitcount and bytecount are also set.

Finally, if the second malloc worked we can just set all the bytes in b->bits to 0 which, if you are just thinking of the bytes as a row of Booleans, sets them all to false.

The free_bits couldn't be simpler: it just takes a bits pointer and frees up the memory it uses.

booleanbits.c part 2 - print_bits_unfriendly

//------------------------------------------------------------
// FUNCTION print_bits_unfriendly
//------------------------------------------------------------
void print_bits_unfriendly(bits* b)
{
    printf("bitcount:  %d\n", b->bitcount);
    printf("bytecount: %d\n", b->bytecount);

    char s[(b->bytecount * 8) + 1];
    int si = 0;

    for(int bit = (b->bytecount * 8) - 1; bit >= 0; bit--)
    {
        if(bit >= b->bitcount)
        {
            s[si] = 'X';
        }
        else
        {
            if(get_bit(b, bit) == true)
            {
                s[si] = '1';
            }
            else
            {
                s[si] = '0';
            }
        }

        si++;
    }

    s[si] = '\0';

    printf("%s\n", s);
}

The print_bits_unfriendly function, after printing the sizes in bits and bytes, creates a char array or string large enough for all the bits plus the null string terminator.

The string is populated left to right, ie starting at 0, but as you saw in the data structure diagram above the bits are indexed right to left. Therefore we start the for loop at the highest bit index and decrement that index.

Within the loop we check if the bit is higher than the bitcount, which can happen if the number of bits isn't a multiple of 8 in which case some bits of the first byte are unused. For these bits we print out an X. For used bits we just print out 0 or 1 using the get_bit function which we'll get to later. Finally si, the string index, is incremented. After the loop exits the '\0' character is added to the end of the string which is then printed.

I mentioned earlier that the bits struct holds both bitcount and bytecount. The second of these is redundant as it is a function of the first but bytecount is used several times throughout the code, including three times in print_bits_unfriendly, so calculating it once and storing it makes sense.

booleanbits.c part 3 - print_bits_friendly

//------------------------------------------------------------
// FUNCTION print_bits_friendly
//------------------------------------------------------------
void print_bits_friendly(bits* b)
{
    printf("bitcount:  %d\n", b->bitcount);
    printf("bytecount: %d\n", b->bytecount);

    for(int bit = 0; bit < b->bitcount; bit++)
    {
        printf("bit %2d = %s\n", bit, get_bit(b, bit) == true ? "true" : "false");
    }
}

The print_bits_friendly function is far simpler. After printing out bitcount and bytecount it iterates the bits in a for loop, printing out the bit index and true or false.

The string printed by printf is actually a ternary operator testing the return value of calls to get_bit. This is slightly more complicated than you usually see in printf so some people might like to split that bit of code out to a separate line for increased clarity.

I have saved the "best" functions until last, starting with set_bit.

booleanbits.c part 4 - set_bit

//------------------------------------------------------------
// FUNCTION set_bit
//------------------------------------------------------------
void set_bit(bits* b, unsigned int bit, bool value)
{
    unsigned int byte = abs((floor(bit / 8.0)) - (b->bytecount - 1));

    unsigned char localbit = bit % 8;

    if(value == true)
    {
        b->bits[byte] = b->bits[byte] | (unsigned char)pow(2, localbit);
    }
    else
    {
        b->bits[byte] = b->bits[byte] & ~ (unsigned char)pow(2, localbit);
    }
}

Looking at set_bit, you will probably want to know what this line is all about.

unsigned int byte = abs((floor(bit / 8.0)) - (b->bytecount - 1));

Take a look at the data structure diagram again. To set a bit, say bit 5, we need to know which byte it is in.

Let's plug in bit 8 and a bytecount of 2 into the formula.

abs((floor(5 / 8.0)) - (2 - 1))
= abs((floor(0.625)) - (1))
= abs(0 - 1)
= abs(-1)
= 1

So we have calculated that bit 5 is in byte 1, which can be confirmed with a glance at the table above.

Next we need to know where in the byte a specific bit is located. Let's take bit 12 as an example: looking at the diagram we know it is bit 4 of byte 0, reading right to left from bit 0. This is calculated using the % modulus (remainder) operator. In this example 12 % 8 = 4.

Finally we set the approriate byte using the methods demonstrated in the tables in the Setting and Unsetting Bits section above. If we are setting the bit to true we use OR (the "|" symbol in C), and if we are setting it to false we use the AND and NOT operators (the "&" and "~" characters in C).

We can finish off booleanbits.c with the get_bit function.

booleanbits.c part 5 - get_bit

//------------------------------------------------------------
// FUNCTION get_bit
//------------------------------------------------------------
bool get_bit(bits* b, unsigned int bit)
{
    unsigned int byte = abs((floor(bit / 8.0)) - (b->bytecount - 1));

    unsigned char localbit = bit % 8;

    return b->bits[byte] & (unsigned char)pow(2, localbit);
}

This calculates the byte and localbit variables in the same way as set_bit but then uses the AND operator "&" with the relevant byte to retrieve the value of the individual bit, as explained in the Getting Bits section above.

Now we just need a main function to try everything out.

main.c

#include<stdio.h>
#include<stdlib.h>
#include<stdbool.h>

#include"booleanbits.h"

//--------------------------------------------------------
// FUNCTION main
//--------------------------------------------------------
int main(int argc, char* argv[])
{
puts("-----------------");
puts("| codedrome.com |");
puts("| Boolean Bits |");
puts("-----------------\n");

    bits* b = create_bits(25);

    if(b != NULL)
    {
        puts("Set bit 0 from false to true\n----------------------------");
        print_bits_unfriendly(b);
        set_bit(b, 0, true);
        print_bits_unfriendly(b);
        puts("");

        puts("Set bit 1 from false to false\n-----------------------------");
        print_bits_unfriendly(b);
        set_bit(b, 1, false);
        print_bits_unfriendly(b);
        puts("");

        puts("Set bit 2 from true to false\n----------------------------");
        set_bit(b, 2, true);
        print_bits_unfriendly(b);
        set_bit(b, 2, false);
        print_bits_unfriendly(b);
        puts("");

        puts("Set bit 3 from true to true\n---------------------------");
        set_bit(b, 3, true);
        print_bits_unfriendly(b);
        set_bit(b, 3, true);
        print_bits_unfriendly(b);
        puts("");

        print_bits_friendly(b);

        free_bits(b);
    }

    return EXIT_SUCCESS;
}

Here we call create_bits and then, if the return value is not NULL, sets bits and calls print_bits_unfriendly a few times to demonstrate the four combinations of true/false. After that print_bits_friendly is called to show the final state, followed by free_bits to give back the memory allocated by the create_bits function.

We can now compile and run the program with these commands...

Compile and Run

gcc main.c booleanbits.c -g -std=c11 -lm -o main
./main

...which will give us this output.

Program Output

-----------------
| codedrome.com |
| Boolean Bits |
-----------------

Set bit 0 from false to true
----------------------------
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000000000
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000000001

Set bit 1 from false to false
-----------------------------
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000000001
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000000001

Set bit 2 from true to false
----------------------------
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000000101
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000000001

Set bit 3 from true to true
---------------------------
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000001001
bitcount:  25
bytecount: 4
XXXXXXX0000000000000000000001001

bitcount:  25
bytecount: 4
bit  0 = true
bit  1 = false
bit  2 = false
bit  3 = true
bit  4 = false
bit  5 = false
bit  6 = false
bit  7 = false
bit  8 = false
bit  9 = false
bit 10 = false
bit 11 = false
bit 12 = false
bit 13 = false
bit 14 = false
bit 15 = false
bit 16 = false
bit 17 = false
bit 18 = false
bit 19 = false
bit 20 = false
bit 21 = false
bit 22 = false
bit 23 = false
bit 24 = false

The whole process of writing this code has clearly been fiddly and complex, but looking at main we can see all that complexity is hidden. Anyone using this code just needs to call a few functions to obtain as many memory-efficient Booleans as they need.