Monday, September 22, 2008

Bitflags Part 3: Getting two dimensional

In my previous posts regarding bitflags I have discussed implementing bit flags in small groups and also larger collections in a way that efficiently stores the flags while providing intuitive interfaces and access to them.

Sometimes we need to think of flags in more complex fashions to provide efficient implementation strategies for problems at hand. In a previous post I suggested scanlines in Group 3 Fascimile images are actually just fixed length arrays of bitflags and may be implemented as such. Extending this metaphor suggests any black and white image is just a collection of fixed length arrays of bitflags. An array of arrays of bitflags is certainly a simple way of viewing this metaphor and could be represented by this interface (I've omitted the constructor and other interface elements for clarity):

type
TBWImage = class(TObject)

private
FLines : Array of TBitFlags;

Function GetBitFlag(X, Y : Integer) : Boolean;
Procedure SetBitFlag(X, Y : Integer);

public
property BitFlags[X, Y : Integer] : Boolean
read GetBitFlag write SetBitFlag; Default;
end;

As a simple implementation, FLines is a dynamic array allowing us to flexibly cater for varying sizes of two-dimensional sets of bit flags. The public property permits the caller to conceptually think of the image as an array of booleans, handily hiding the implementation details via the setter & getter methods, and the use of the default modifier that allows the caller to treat TBWImage instances as if they were a direct implemention of the 2D bitflag array without needing to even know the name of the BitFlags property itself.

However, implementation of a 2D bitflag array in this way isn't as efficient as it could be. Construction of this class requires the independent construction of the entire array of bitflags instances where each those allows storage of arbitrarily long series of bitflags. This implies repeated overhead in each TBitFlags instance which is not strictly required when considering a 2D array of bitflags.

So, we still want to provide the same conceptual interface, but need to improve the implementation to improve algorithmic and storage efficiency.

One was of solving this is by allocating a single block of memory that stores N by M bits, meaning ~(N x M) DIV 8 bytes of storage in total and modifying the accessors to access the single block of memory, rather than individual set of bitflags representing each line. This reduces memory overhead and potential fragmentation, and makes construction and destruction of the class simpler too. However, in the context of this example (that of a black and white image) these issues are unlikely cause a problem in typical use cases.

So now we have a class that implements two dimensional sets of bitflags. It may be used for a variety of purposes and is efficient in both CPU and memory requirements.

But it has a weakness! It's rectangular. More to the point, it requires one bit of memory to be allocated for each location within the rectangular N x M coordinate system required. This is not an issue for most problem spaces, but what if the problem space is very large in terms of the N x M coordinate system? What if you need to create and manage many instances of such large 2D arrays of bitflags? Now we need to radically rethink how we implement the storage and access mechanisms for this data.

Which is the topic of the next article in this series :-)

Regards,
Raymond.

No comments: