Published by breki on 18 Jan 2011 at 11:03 pm
Maperitive: Reading OSM PBF Files
UPDATE: the post below was based on premature assumptions that my new PBF code is actually working. It turns out it had a number of serious bugs which made reading look faster than it actually is. Here’s a followup post.
For the last couple of days I’ve been working on a PBF file reader for Maperitive. PBF is a binary file format for storing OSM geo data using Google’s protocol buffers.
It’s been a steep learning curve, since I had to learn three things at the same time: protocol buffers, the protobuf-net library for .NET and the PBF format itself. I’m mostly satisfied with the protobuf-net library, although the lack of any new development activity worries me a little bit.
I’ve finished most of the PBF reading stuff this evening and I was eager to test the new code against the old XML reader. I used Geofabrik’s Denmark data. Here are some rough results:
The PBF file loads 7.6 times quicker than the .OSM.bz2 file. This is a really good result, mostly thanks to the way the PBF format has been designed.
Loading PBF data uses a quarter less memory than loading the XML file. I’m talking about the memory used in the process of loading, not for storing the loaded OSM data; the data is internally stored in exactly the same way for both PBF and XML reading. This result surprised me a bit. I guess the extra memory consumed by the XML reader is due to the XML parser itself and/or the fact that a lot more strings are generated when reading XML OSM tags. PBF uses string tables and thus saves a lot of space by reusing common strings.
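To illustrate the string table idea mentioned above, here is a rough Python sketch. It is not Maperitive’s code and it does not read the real PBF wire format; the function names and the sample tags are made up for illustration. It only shows the general principle: each distinct key or value is stored once, and tags refer to it by index.

```python
# Illustrative sketch only: not Maperitive code and not the PBF wire format.
# All names and sample data below are hypothetical.

def build_string_table(tagged_elements):
    """Return (table, encoded), where each tag is a pair of table indices."""
    table = []      # unique strings, in first-seen order
    index_of = {}   # string -> its position in the table
    encoded = []    # per element: list of (key_index, value_index) pairs

    def intern(s):
        # Add the string to the table only the first time it is seen.
        if s not in index_of:
            index_of[s] = len(table)
            table.append(s)
        return index_of[s]

    for tags in tagged_elements:
        encoded.append([(intern(k), intern(v)) for k, v in tags.items()])
    return table, encoded


def decode_tags(table, encoded):
    """Rebuild the tag dictionaries from the table and the index pairs."""
    return [{table[k]: table[v] for k, v in pairs} for pairs in encoded]


elements = [
    {"highway": "residential", "name": "Main Street"},
    {"highway": "residential", "oneway": "yes"},
    {"highway": "primary", "oneway": "yes"},
]
table, encoded = build_string_table(elements)
```

In this toy example the three elements carry twelve key and value strings in total, but the table holds only seven distinct ones. A reader that looks tags up through such a table has far fewer strings to create than one that parses every key and value out of XML text, which is my guess at where part of the memory difference comes from.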


Chris Hill on 19 Jan 2011 at 14:22 #
I look forward to being able to use the same .pbf files in Maperitive that I can use in mkgmap. More support for the pbf format will reinforce it as the best way to share OSM in binary format.
breki on 19 Jan 2011 at 20:21 #
I agree, although yesterday’s post was a bit overoptimistic. Today I discovered a couple of bugs in the reading code, so the end result is much less glamorous. I’ll update the post when I finish the implementation.
igorbrejc.net » Maperitive Build 1108 on 20 Jan 2011 at 22:17 #
[...] My previous post about PBF reading successes was written way too prematurely. It turned out my PBF reading code had some serious bugs which made reading look much faster than it actually was (one of the reasons was that I neglected to read OSM node keys/values when written in PBF dense node format). [...]