door in the open

Range Tree

2015-11-06

OK, so you have a bunch of markers on your map, let us say one for each human language on the planet. Now you choose a random point, let us say 42 N 42 E, and you want to know which language markers fall within a certain radius. What is a beautiful way to achieve this?

Here is an ugly solution: loop over all the markers checking the great circle distance between each of these and 42 N 42 E. This is not only ugly but also human-noticeably slow, especially if it is not a single point you are interested in, but thousands of points, e.g. enough for a nice heatmap.

Unsurprisingly, the article's title is a conspicuous hint to a more elegant approach. If we store all the markers' latitudes in a range tree, we would be able to quickly pin down the ones lying within the band [42 -r; 42 + r]. And if we have another range tree for the longitudes, then the intersection of the sets returned by both searches will be the set of markers lying within the tetragon defined by the latitude and longitude tangents to the circle we are interested in. Of course, we still have to do the great circle distances check in order to cut the tetragon's edges and make it into a proper circle, but the loop iterations will be significantly fewer.

How would our range tree look like? The leaves are the sorted latitudes/longitudes (i.e. the values we will be searching for). From there rootwards, each node contains the largest leaf (value-wise) found on its left branch. Thus, when we want to use the tree, we do binary search for the range boundaries and collect every leaf between. Read the code:

class RangeTree:
	
	def __init__(self, leaves):
		"""
		The leaves must be [] of (value, item,) tuples.
		"""
		assert len(leaves) > 0
		leaves = sorted(leaves)
		
		self.tree = RangeTree.create_tree(leaves)
	
	
	@staticmethod
	def create_tree(leaves):
		"""
		Recursively create a tree out of the leaves given.
		"""
		if len(leaves) in (1, 2,):
			left = {
				'value': leaves[0][0],
				'item': leaves[0][1],
				'left': None,
				'right': None
			}
			
			if len(leaves) == 2:
				right = {
					'value': leaves[1][0],
					'item': leaves[1][1],
					'left': None,
					'right': None
				}
			else:
				right = None
			
			value = leaves[0][0]
		
		else:
			half = int(len(leaves) / 2)
			
			left = RangeTree.create_tree(leaves[:half])
			right = RangeTree.create_tree(leaves[half:])
			
			value = left['value']
			node = left['right']
			
			while node is not None:
				if node['value'] > value:
					value = node['value']
				node = node['right']
		
		return {
			'left': left,
			'right': right,
			'value': value
		}
	
	
	@staticmethod
	def search_tree(tree, a, b):
		"""
		Recursively search the tree given.
		"""
		if tree['value'] < a:
			if tree['right'] is None:
				return set()
			return RangeTree.search_tree(tree['right'], a, b)
		
		if tree['value'] > b:
			if tree['left'] is None:
				return set()
			return RangeTree.search_tree(tree['left'], a, b)
		
		
		if tree['left'] is None and tree['right'] is None:
			return set([tree['item']])
		
		
		items_left = set()
		if tree['left'] is not None:
			items_left = RangeTree.search_tree(tree['left'], a, b)
		
		items_right = set()
		if tree['right'] is not None:
			items_right = RangeTree.search_tree(tree['right'], a, b)
		
		return items_left.union(items_right)
	
	
	def search(self, a, b):
		try:
			assert a < b
		except AssertionError:
			return set()
		
		return RangeTree.search_tree(self.tree, a, b)



tree = RangeTree([(1, 'one'), (2, 'two'), (3, 'three')])
print(tree.search(2, 4))

If you are not reading this article just for fun but actually need this, then you might end up wondering how many degrees north exactly is 42 N + 500 km. Solving this is a very nice spherical trigonometry refresher which I will leave to you :)