Please use this identifier to cite or link to this item:
|Title:||Clustering techniques for minimizing external path length|
|Publisher:||MORGAN KAUFMANN PUB INC|
|Citation:||PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES,342-353|
|Abstract:||There are a variety of main-memory access structures, such as segment trees, and quad trees, whose properties, such as good worst-case behaviour, make them attractive for database applications. Unfortunately, the structures are typically 'long and skinny', whereas disk data structures must be 'short-and-fat' (that is, have a high fanout and low height) in order to minimize I/O. We consider how to cluster the nodes (that is, map the nodes to disk pages) of main-memory access structures such that although a path may traverse many nodes, it only traverses a few disk pages. The number of disk pages traversed in a path is called the external path length. We address several versions of the clustering problem. We present a clustering algorithm for tree structures that generates optimal worst-case external path length mappings; we also show how to make it dynamic, to support updates. We extend the algorithm to generate mappings that minimize the average weighted external path lengths. We also show that some other clustering problems, such as finding optimal external path lengths for DAG structures and minimizing weights for optimal height mappings, are NP-complete. We present heuristics for these problems. We present a performance study (using quadtrees on actual image data as an example) which shows that our algorithms perform well. Our algorithms can also be applied for clustering complex objects in object-oriented databases.|
|Appears in Collections:||Proceedings papers|
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.