librangetree

librangetree is a C++ template, implementing a 2d range tree for both counting and reporting queries.

1. Intro

So what’s this and why would I need it?

It’s a solution for a specific type of searches - those done on relatively static data, in two dimensions.

For example:

Give me the names of all people between the ages 35-56 earning 20k-50k
How many cities exist between coordinates 56,67 and 77,89

2. What it looks like

Figure 1. In-memory layout

In this example, we added three points to the tree: (1, 0), (10, 0), and (100, 0).

As can be seen, any kind of range search can be efficiently answered from this tree.

The usual amount of points ranges from a few hundred thousand to a few dozen million. More than that wouldn’t fit into current amounts of RAM.

3. Performance

This was measured using the included benchmark, bench/countspeed, on a Phenom II 2.8GHz, using the latest released version.

These are all counting searches, since the reporting overhead depends on the libc implementation among other things.

The timer accuracy on this computer is about 70 ns, but due to other noise, consider the margins of error to be 5%.

Using 10 (10^1) points, creation took 37 us (0 ms)
        and 1M searches took 51179 us. (0.05 us/search)
Using 100 (10^2) points, creation took 38 us (0 ms)
        and 1M searches took 77785 us. (0.08 us/search)
Using 1000 (10^3) points, creation took 427 us (0 ms)
        and 1M searches took 116971 us. (0.12 us/search)
Using 10000 (10^4) points, creation took 4963 us (4 ms)
        and 1M searches took 241700 us. (0.24 us/search)
Using 100000 (10^5) points, creation took 55503 us (55 ms)
        and 1M searches took 660244 us. (0.66 us/search)
Using 1000000 (10^6) points, creation took 534722 us (534 ms)
        and 1M searches took 1493930 us. (1.49 us/search)
Using 10000000 (10^7) points, creation took 3735129 us (3735 ms)
        and 1M searches took 2286166 us. (2.29 us/search)

From these results, it looks as if the creation time increases sub-linearly, and the search increases logarithmically.

These are much better than what was promised by the theoretical efficiency. The reason for this is assumed to be the optimizations for the real world, such as for better cache usage.

Additional results are welcomed. If you’d like to submit yours, it’s fairly easy:

cd bench
make
./countspeed | tee log

4. Example usage

#include "ranget.h"
...
/* Creating the tree. The first template argument is the coordinate type,
   and the second is the type of your data.

   The coordinate type must be an integer, but how big and whether it is
   signed is up to you.

   Note that the tree stores pointers to the data, not the data itself. */
rangetree<u32, struct topsecret> mytree;

/* The tree's constructor has two optional values for memory allocation
   guidance: rangetree(estimatedTotal, estimatedResult).

   If you have advance knowledge of your dataset, giving these will speed
   up the creation and searches a bit due to more accurate memory
   allocation.

   estimatedTotal is the total number of points the tree will include. It's
   not a limit, but merely a guiding value.

   estimatedResult is the number of points returned in an average search.
   It only affects the search interface that returns a std::vector. */

...

/* Adding points. Very simple, the X coordinate, the Y coordinate, and a
   pointer to the point's data. */
mytree.add(0, 8, ptr);

/* Once all points have been added, the tree needs to be finalized before
   it can be searched. Points cannot be added after this. */
mytree.finalize();

...

/* Searching. We have three search interfaces:
   - counting query - how many points exist between these values
   - search returning a std::vector
   - search using a given array

   Note that all searches are inclusive. So if you give an X range of 5-6,
   both 5 and 6 match.

   If you only need the number of points, the counting query is the
   fastest. If you need high-performance searches, the last interface is
   recommended, due to avoiding repeated memory allocation. */

u32 numberOfPoints = mytree.count(minX, maxX, minY, maxY);
std::vector<struct topsecret*> *myvec = mytree.search(minX, maxX, minY, maxY);
// Note that you must delete the returned vector when you're done.

u32 preallocated = 16;
struct topsecret *array[preallocated];

mytree.search(array, preallocated, minX, maxX, minY, maxY);
/* Now array contains pointers to all the found data.
   The given array size is updated to contain the number of found points.

   If the updated number is more than the size of the array, the array
   wasn't big enough to hold all the results. The array is filled, and the
   remaining results are discarded. */

5. Where do I get it?

librangetree is open source, under the AGPLv3 license.

You can download it at https://sourceforge.net/projects/librangetree/files/latest/download.

Commercial licensing is also available. Contact us for details.

6. Contributing

Found a bug, or have a patch? Open a ticket at http://sourceforge.net/p/librangetree/tickets/

7. Limits

Limits

Once created, it is read-only
The maximum dataset it can hold is 2^32 (4 billion points)

Tip
If you need more, consider splitting your data at a higher level.
Creation is not thread-safe - searches are
The coordinates must be integers of some size

8. FAQ

What are the search and space efficiencies?

The space usage is O(n log n), and the search efficiency is O(log^2 n). The creation time is O(n log n).

Note that while the theoretical numbers are these, the numbers in practise seem to be better.
I hear fractional cascading enables O(log n) search?

True, but it would also nearly double the RAM use. This was considered too excessive.

Using one million points on a 64-bit platform, the tree takes about 267 mb of RAM currently. Pointers are a major part of that, but due to an optimization on 64-bit platforms, RAM use is only slightly more than on 32-bit (about 11% more).

Talking in real-world terms, the average search would save about four binary searches (in the array you’re reporting from), in exchange for few ten pointer hops (in arrays that aren’t used otherwise). As the cache effects favor the current implementation, the speedup from this technique would not be as big as theoretically promised.
How is data ownership handled?

librangetree does not touch your data. It is never deleted or freed.

Since librangetree only stores pointers, you’ll need to manage your data outside the tree.

Tip
If you don’t have a list of all the pointers anymore, you can get it by doing a search for the full range of the data type.
Yay, finally!

Tell me about it. When I discovered I needed this type of searches, I found there were 40 years of papers on the structure, but no code.

Later on I found there was one implementation, in CGAL, but given I needed it in a proprietary program, that was a no go.

In case you’re curious what my case was, I needed a way to list all visible planets in a huge universe.

9. Changelog

v1.3.1

Removed the arbitrary limit on having at least two points. Trees of a single point can now be created.

v1.3

Added a memory pool for ptys. 0-17% speedup in creation, 0-10% speedup in searches.

This drops RAM use by about 4%. For 1M points, the tree now takes 267mb instead of 278mb.

v1.2

Added a memory use optimization for 64-bit platforms.

This brings RAM use more in line with 32-bit platforms, speeding up counting queries by 0-15%. The effect on reporting speed is not known, but I estimate it’s slightly slower. This optimization can be disabled if desired.

Before this, a tree with 1M points took 430mb. Now it takes 278mb.

v1.1

Improved the use of negative values
Removed heap allocation from searches, improvement of 0-12%
Documentation updates