Published by breki on 23 Oct 2009 at 06:33 pm
For the last couple of months I’ve been working on database support for Kosmos. The work started as a request to produce bitmaps of the whole UK in the British national grid reference system from OSM data.
The current version of Kosmos loads all the OSM data into memory, which obviously becomes a problem for larger areas. Some time ago I worked on optimizing Kosmos memory usage and it proved quite successful, but there is still a physical limit on how large an area can be rendered.
The latest UK OSM data takes about 200 MB of zipped XML (2.5 GB unzipped). This is way too much for the existing in-memory system, so I had to find another way to render the UK without reading the XML file directly.
After a bit of investigation, the obvious candidate for OSM data storage proved to be PostgreSQL / PostGIS. By “obvious” I mean that there were already some tools which could import OSM data into the PostgreSQL database, namely Osmosis. I was totally new to PostgreSQL, but the actual installation of the database engine was pretty easy.
The next step was importing the OSM data using Osmosis. Although this can be achieved in a few simple steps, there were quite a few “tricks” I had to learn the hard way before the performance of the database was good enough to produce map renderings in a realistic time:
- The PostgreSQL database engine uses non-optimal default settings, so you need to do a bit of investigation to set certain things up. One tip: if you have several disks, make sure your data is stored on a disk separate from the system.
- The Osmosis OSM DB import command has a few settings of its own which can have a dramatic effect on DB query performance. First of all, you need to use the extended OSM DB schema, which contains a bbox column for ways. Otherwise the spatial queries would always have to contain multiple table joins, which slows things down terribly.
- The Kosmos rendering rule engine was designed for an instant-access, in-memory data source. The way the data was fetched wasn’t really compatible with slow DB sources, so I had to do a lot of refactoring on the engine before the performance was good enough for any serious work.
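To illustrate why a stored bounding box matters, here is a minimal Python sketch. It is not Kosmos or Osmosis code: the `Way` class and the function names are made up for this example, and a real database would use a spatial index rather than a linear scan. The point is only that, with the bbox precomputed at import time, deciding whether a way might fall inside the visible map area takes four comparisons and no lookup of the way's nodes.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (min_lon, min_lat, max_lon, max_lat)
BBox = Tuple[float, float, float, float]


@dataclass
class Way:
    """Hypothetical way record with a bbox stored alongside it."""
    way_id: int
    bbox: BBox


def bbox_of(points: List[Tuple[float, float]]) -> BBox:
    """Compute the bounding box of (lon, lat) points, done once at import."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return (min(lons), min(lats), max(lons), max(lats))


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def ways_in_view(ways: List[Way], view: BBox) -> List[int]:
    """Return ids of ways whose stored bbox overlaps the view box."""
    return [w.way_id for w in ways if bboxes_overlap(w.bbox, view)]
```

Without the stored box, the same question needs the coordinates of every node of every way, which in a relational schema means joining the ways table to the way-nodes table and then to the nodes table.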
After a few weeks of work, I finally managed to get some decent results. The rendering code generated the UK for approximate zoom levels 7, 8 and 9 in about two hours on my machine, with a level of detail similar to Mapnik’s layer of the OSM main map. This may seem like a lot of time, but considering there were about 11 million nodes and 1.5 million ways to process, I’m quite pleased. And there are further improvements still possible in the rendering engine which could cut this time at least in half.
British National Grid
One of the biggest worries I had was how the rendering engine would behave when rendering with a map projection and reference system different from the “standard” Mercator and WGS-84. This actually proved to be the easiest thing to solve: the rendering engine internally uses the National Grid coordinates and the spatial DB queries transform those transparently to WGS-84 (and back). I also needed to implement a new map projection, but this was just a matter of writing a few lines of math code.
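For readers curious what those few lines of math look like, below is a Python sketch of the forward Transverse Mercator projection with the National Grid parameters (Airy 1830 ellipsoid, true origin at 49°N 2°W), following the series formulas published by Ordnance Survey. This is not the Kosmos code, and the function name is my own. Note one assumption: the input latitude and longitude must already be on the OSGB36 datum. The datum shift from WGS-84 is a separate step and is not shown here.

```python
import math

# Airy 1830 ellipsoid semi-axes (metres) and National Grid parameters
A = 6377563.396
B = 6356256.909
F0 = 0.9996012717            # scale factor on the central meridian
LAT0 = math.radians(49.0)    # latitude of true origin
LON0 = math.radians(-2.0)    # longitude of true origin
E0 = 400000.0                # easting of true origin
N0 = -100000.0               # northing of true origin


def latlon_to_national_grid(lat_deg, lon_deg):
    """Project OSGB36 latitude/longitude (degrees) to (easting, northing)."""
    phi = math.radians(lat_deg)
    dlam = math.radians(lon_deg) - LON0

    e2 = 1.0 - (B * B) / (A * A)
    n = (A - B) / (A + B)
    s, c = math.sin(phi), math.cos(phi)
    t2 = math.tan(phi) ** 2

    nu = A * F0 / math.sqrt(1.0 - e2 * s * s)
    rho = A * F0 * (1.0 - e2) / (1.0 - e2 * s * s) ** 1.5
    eta2 = nu / rho - 1.0

    # Meridional arc from the true origin to latitude phi
    d, p = phi - LAT0, phi + LAT0
    m = B * F0 * (
        (1 + n + 1.25 * n**2 + 1.25 * n**3) * d
        - (3 * n + 3 * n**2 + 2.625 * n**3) * math.sin(d) * math.cos(p)
        + (1.875 * n**2 + 1.875 * n**3) * math.sin(2 * d) * math.cos(2 * p)
        - (35.0 / 24.0) * n**3 * math.sin(3 * d) * math.cos(3 * p)
    )

    # Series coefficients in powers of the longitude difference
    c1 = m + N0
    c2 = nu / 2.0 * s * c
    c3 = nu / 24.0 * s * c**3 * (5 - t2 + 9 * eta2)
    c3a = nu / 720.0 * s * c**5 * (61 - 58 * t2 + t2 * t2)
    c4 = nu * c
    c5 = nu / 6.0 * c**3 * (nu / rho - t2)
    c6 = nu / 120.0 * c**5 * (5 - 18 * t2 + t2 * t2 + 14 * eta2 - 58 * t2 * eta2)

    northing = c1 + c2 * dlam**2 + c3 * dlam**4 + c3a * dlam**6
    easting = E0 + c4 * dlam + c5 * dlam**3 + c6 * dlam**5
    return easting, northing
```

At the true origin the series terms vanish, so `latlon_to_national_grid(49.0, -2.0)` gives the false origin offsets themselves, which makes for a handy sanity check.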
Since the map was supposed to be “professional” looking, there were a few things that needed to be implemented or improved in the existing rendering engine. These features will probably be included in the next generation of Kosmos, which will be released sometime next year.
One of the first (and hardest) things needed for a good looking map is to make sure the text labels do not overlap each other. This is called (automatic) label placement. Good algorithms for label placement are quite complex and I didn’t really have time to implement a full-blown algorithm, so I chose to do a simple point-selection: the algorithm removes labels for smaller places (towns) until there are no more overlaps (see the sample map at the beginning of the post). I’ll write more about this feature some other time.
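As a rough illustration of this kind of point selection, here is a small Python sketch. It is only one plausible reading of the idea, not the actual Kosmos algorithm: I am assuming that labels are ranked by population and that each label occupies a fixed axis-aligned box in pixels, and the `Label` class is invented for the example. Instead of removing small places after the fact, it visits places from largest to smallest and keeps a label only if it does not collide with one already kept, which gives an overlap-free result in a single pass.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (left, top, right, bottom) in pixels
Box = Tuple[float, float, float, float]


@dataclass
class Label:
    """Hypothetical label record: place name, ranking value, pixel box."""
    text: str
    population: int
    box: Box


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if the two boxes overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def select_labels(labels: List[Label]) -> List[Label]:
    """Keep larger places first; drop any label that would overlap a kept one."""
    kept: List[Label] = []
    for label in sorted(labels, key=lambda l: l.population, reverse=True):
        if not any(boxes_overlap(label.box, k.box) for k in kept):
            kept.append(label)
    return kept
```

The trade-off is the one mentioned above: the result is never cluttered, but a label is either shown at its fixed position or not shown at all, whereas a full label placement algorithm would try alternative positions first.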
Better Relations Support
Some of the features that need to be rendered are now defined in OSM using relations. One example of this is national parks. So a new algorithm for consolidating a relation’s ways into a single polygon was implemented.
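The basic idea of such consolidation can be sketched in a few lines of Python. This is an illustrative simplification and not the Kosmos implementation: ways are represented as plain lists of node ids, the function name is hypothetical, and I am assuming that chains which never close are simply discarded. Inner rings (holes) and multiple outer rings would need extra handling.

```python
from typing import List


def assemble_rings(ways: List[List[int]]) -> List[List[int]]:
    """Join ways (lists of node ids) end to end into closed rings.

    A way may be used in either direction. Chains that cannot be
    closed are dropped.
    """
    remaining = [list(w) for w in ways if len(w) >= 2]
    rings: List[List[int]] = []

    while remaining:
        chain = remaining.pop()
        extended = True
        # Keep attaching ways to the tail until the chain closes
        # or no remaining way fits.
        while chain[0] != chain[-1] and extended:
            extended = False
            for i, way in enumerate(remaining):
                if way[0] == chain[-1]:
                    chain.extend(way[1:])
                elif way[-1] == chain[-1]:
                    chain.extend(reversed(way[:-1]))
                else:
                    continue
                del remaining[i]
                extended = True
                break
        # A valid ring is closed and has at least three distinct corners.
        if chain[0] == chain[-1] and len(chain) >= 4:
            rings.append(chain)

    return rings
```

For example, three ways `[1, 2, 3]`, `[5, 4, 3]` and `[5, 6, 1]` share endpoints pairwise, so they are stitched into one closed ring even though the second way runs in the opposite direction to the others.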
Better Sea Filling Support
The coastline processing algorithm is now more resilient: it will ignore poorly connected coastlines and will still render all of those which are properly defined. This improvement is already in use in the new GroundTruth version released a week ago.
During the work on PostgreSQL support I learned about SpatiaLite, a spatial extension to SQLite. As you may know, SQLite is a popular self-contained database engine which stores the data in a single file. So I decided to play around with it a bit and see if it could also be used in Kosmos. I was a bit skeptical about whether SpatiaLite would be fast enough for map rendering, but I can already say that it managed to achieve some very good results. I’ll write about SpatiaLite in the next blog post.