Structuring our data for streaming

By Paul Harris on 14th March 2016 | Development
When it comes to presenting a whole world’s worth of data in real-time on a mobile device, unsurprisingly there are a lot of constraints and challenges. One such challenge is getting that data from our servers to a user’s application in a timely manner.
To avoid wasting network time, we want to send as little data as possible. That means we don’t want to download every building in Texas while we’re in the middle of London, we don’t want to look for street names as we pan over the Atlantic Ocean, and we definitely don’t want to fetch everything on the planet whenever the camera is zoomed out too far.
What we want is to minimise the amount of data that needs to be downloaded. Also, to avoid the overhead of multiple web requests, we want to minimise the number of individual downloads.
To achieve this, the client app needs to know where to look for just the resources it needs. We accomplish this by using a few data structures that make it easy to index those resources. Essentially, we split the world into tiles, and then only require the client to download the tiles that it needs at any given time.
We use a kind of quadrilateralized spherical cube – or quad sphere – to organise our data. At the highest level, we partition the resources of our world into six cube faces and build a cube map. So Face 0 contains the Americas, Face 1 contains parts of Asia and Australia, Face 2 contains the Arctic, the UK, and Canada, and so on.
So now we know that if we’re looking at Africa, the tiles that we want are going to be filed under Face 4.
There are advantages to using a cube map over other possible ways to split up the data. Most notably, it means that each tile is the same shape and size as all the others. This property approximately holds even when we further split up those tiles. This is good because we don’t want the number of downloads to vary wildly between the north pole and the equator.
How do we split things up further? We make a quadtree structure out of each face. We do this by splitting each tile, or node, into four children. Then we do that 13 more times. This gives us 15 different resolutions of cube map, from one tile per face at the lowest level of resolution (call it level 0) to 268,435,456 tiles per face at the highest resolution (level 14).
That might seem like a lot of tiny tiles, but each one is still about 500 metres long on each side.
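We can sanity-check that figure with some back-of-the-envelope arithmetic. This sketch assumes a perfectly spherical Earth and perfectly square tiles, which the real quad sphere projection only approximates:

```python
import math

# Rough sanity check on the level 14 tile size (assumes a spherical Earth;
# the actual quad sphere projection distorts tiles slightly).
EARTH_RADIUS_M = 6_371_000
surface_area = 4 * math.pi * EARTH_RADIUS_M ** 2  # ~5.1e14 m^2

tiles_at_level_14 = 6 * 4 ** 14          # six faces, each split 14 times
tile_area = surface_area / tiles_at_level_14
tile_edge = math.sqrt(tile_area)         # treat each tile as a square

print(round(tile_edge))  # → 563, consistent with the ~500 metre figure
```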
So how does this help us? Let’s imagine we’re taking a really close look at the centre of London. We can probably see a handful of those level 14 tiles. Let’s say around 10 of them, each with some shiny, detailed buildings. So the app has to make around 10 web requests, one for each of those tiles.
Now let’s say we want to zoom out, far enough that we can see four times as many tiles. Instead of doing four times the work and downloading 40 level 14 tiles, we can download 10 level 13 tiles, each of which covers the same area as four level 14 tiles. These larger tiles can contain all the same buildings as the smaller ones, but at a lower level of detail.
This means that (ideally) no matter how large an area we look at, we have roughly the same amount of resources to go and fetch. That’s great! Now we just need some kind of index to identify those tiles.
Morton keys (or Morton codes) are simply single numbers used to index a 2D space. They have the nice property of preserving locality: each Morton key represents a point in 2D space that is close to the point represented by the previous key. The result is what’s known as a Z-order curve.
Now if we draw a quadtree over it, we can see that the curve passes through every child node of each tile before moving on to the next tile of the same size. This is equivalent to a depth-first traversal of the quadtree, which is why Morton keys like this are also known as quadkeys. Breadth-first traversal is also possible with slightly different structuring or ordering of the Morton keys.
But what does a Morton key actually look like? Basically, we just number the four children of each quadtree cell from 0 to 3. Then we number the four children of cell 0 from 00 to 03, the four children of cell 1 from 10 to 13, and so on.
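This numbering scheme has a handy consequence: a tile’s key is just its parent’s key with one extra base-4 digit. A minimal sketch (the helper names here are our own, for illustration):

```python
def children(key: str) -> list[str]:
    """The four child tiles one level down: append one base-4 digit."""
    return [key + digit for digit in "0123"]


def parent(key: str) -> str:
    """The enclosing tile one level up: drop the last digit."""
    return key[:-1]


print(children("50"))    # ['500', '501', '502', '503']
print(parent("500123"))  # 50012
```

So zooming out a level just means truncating keys by one digit.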
As in the above image, our Morton keys are more N-order than Z-order. The principle is the same; only the orientation of the curve differs.
Now because we want to index the quadtrees on each face of a cube, we simply prepend the face number to the Morton key. So all Morton keys on face 5 will begin with 5.
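Putting those pieces together, the key for a tile can be built by interleaving the bits of its x and y coordinates within a face and prefixing the face number. This is a sketch under one possible digit convention — exactly which pair of bits maps to which digit is a choice, and our N-order keys may use a different one:

```python
def tile_key(face: int, x: int, y: int, level: int) -> str:
    """Morton key for the tile at (x, y) on `face`, `level` splits down.

    Digit convention (an assumption): digit = 2 * y_bit + x_bit,
    taking bits from most significant to least significant.
    """
    digits = []
    for i in range(level - 1, -1, -1):
        digits.append(str(((y >> i) & 1) * 2 + ((x >> i) & 1)))
    return str(face) + "".join(digits)


print(tile_key(5, 0, 0, 3))  # 5000 — a corner tile of face 5 at level 3
print(tile_key(2, 1, 1, 1))  # 23
```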
So now every tile of our map can be indexed by a Morton key. The level 0 tiles (the entire cube faces) are numbered 0, 1, 2, 3, 4, and 5. The level 1 tiles on face 2 are numbered 20, 21, 22, and 23, and so on. We host the tile resources themselves in a directory structure that matches up with the Morton key. So if you’re looking for the buildings at tile 500123, then the URL for it looks something like: http://<our_cdn>/5/0/0/1/2/3/Buildings
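The mapping from key to URL is then just one directory per key digit. A small sketch (`<our_cdn>` is the placeholder host from the example above, and `resource` is a name like Buildings):

```python
def tile_url(key: str, resource: str) -> str:
    """CDN path for one tile's resource: one directory per key digit."""
    return "http://<our_cdn>/" + "/".join(key) + "/" + resource


print(tile_url("500123", "Buildings"))
# http://<our_cdn>/5/0/0/1/2/3/Buildings
```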
All the client has to do in order to fetch map resources now is this:
- Determine the level of resources to fetch, based on how zoomed out the camera is.
- Determine the set of tiles at that level that are visible.
- Use the Morton keys of those tiles to make web requests for the resources.
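The three steps above can be sketched end to end. Everything here is illustrative — the visible-tile rectangle, the digit convention, and the helper names are assumptions, not the app’s actual code:

```python
def keys_for_view(face: int, level: int,
                  x0: int, y0: int, x1: int, y1: int) -> list[str]:
    """Morton keys for every tile in an axis-aligned rectangle of
    visible tile coordinates (inclusive) on one cube face."""
    keys = []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            digits = "".join(
                str(((y >> i) & 1) * 2 + ((x >> i) & 1))
                for i in range(level - 1, -1, -1)
            )
            keys.append(str(face) + digits)
    return keys


# A 2x2 view at level 1 on face 2 needs exactly four requests:
for key in keys_for_view(2, 1, 0, 0, 1, 1):
    url = "http://<our_cdn>/" + "/".join(key) + "/Buildings"
    print(url)
```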
Simple, right? And that’s how we get all this to the client.
There are a few other advantages we gain from using quadtrees and Morton keys, as well as a few other optimisations we make when streaming resources, such as caching tiles that we’ve downloaded previously. We might cover some of these in future posts.