B+Tree Performance

The B+Tree classes provide in the Virtual Machinery libraries are very efficient. The table below shows the times taken (in milliseconds) for the standard test (description of standard test follows table) on a number of operating environments and CPUs.

OS JDK 6.0 Windows XP JDK 6.0 Windows 7 JDK 6.0 Mac OS X JDK 6.0 Mac OS X

CPU Intel Core 2 Duo 2 GHz Intel i7 1.60 GHz Intel Core Duo 1.83 Ghz Intel Core Duo 3.06 Ghz

Initial Addition (ms) 1.895 1.148 2.383 1.033

Random Addition (ms) 1.870 1.123 2.436 1.045

Random Deletion (ms) 1.308 0.766 1.775 0.752

Random Fetch 0.498 0.439 1.050 0.436

The standard test consists of 10000 random additions of entries with an average index length of 30 characters and an average data length of 52 characters. This is then followed by a further 20000 random operations divided randomly and roughly equally among additions, deletions and fetches.

Cacheing

This implementation of the BTree includes a cache. The cache allows any number of index and data pages in the tree to be held in memory. This significantly reduces access speeds to data in the tree since the number of relatively slow disk accesses is reduced. While both index and data pages can be held in memory if there is insufficient memory for all the pages to be held it is better to hold index pages rather than data pages. This is because index pages are accessed more frequently since any access to the data must first go through the index page 'tree'. The following table shows comparative figures are shown for various levels of cacheing of index and data pages.

If you wish to use a cached BTree then you can set the number of index and data pages to be buffered in the methods to create new and use existing BTrees.

OS	JDK 6.0 Windows XP	JDK 6.0 Windows 7	JDK 6.0 Mac OS X	JDK 6.0 Mac OS X
CPU	Intel Core 2 Duo 2 GHz	Intel i7 1.60 GHz	Intel Core Duo 1.83 Ghz	Intel Core Duo 3.06 Ghz
Initial Addition (ms)	0.017	0.008	0.019	0.015
Random Addition (ms)	0.014	0.012	0.024	0.011
Random Deletion (ms)	0.017	0.012	0.018	0.016
Random Fetch (ms)	0.040	0.007	0.025	0.009

Key Compression

Furtherspace efficiency is provided by compressing the keys. A number of algorithms are available to do this depending on the nature of the keys. The method used here is to substitute the common key prefix in subsequent keys e.g. if you have three keys in the one block -

ACCOUNT_BLOGGS_ANNE
ACCOUNT_BLOGGS_JOE
ACCOUNT_BLOGGS_JOE JR

These are represented in the file as -

+ACCOUNT_BLOGGS_ANNE
+JOE
+ JR

Where the ASCII value of the character at + indicates the number of common characters shared with the previous record (up to a maximum of 255). If + is 0 then there is no compression. The first index record in a page is never compressed. Compression introduces an overhead to the manipulation of data in the BTree but shows definite advantages in very large trees where there is sufficient commonality of index prefix to allow a reduction in the number of index levels and hence access time. It also allows the programmer to provide sensible keys to the data without worrying unnecessarily about the space overhead.

Click here for pricing and licensing details and to buy Virtual Machinerys B+Tree implementation online.

Some of the icons in this page are generously provided by MySiteMyWay