Archive for the 'development' Category

Published by breki on 12 Feb 2012

GroundTruth News

Alexander Ovsov has kindly translated my GroundTruth posts into Romanian. Quoting Alexander:

Our global volunteer project is called “Translation for Education”. It is located all around the world, with no headquarters. Its purpose is translating articles on interesting scientific subjects such as biology, chemistry, geology, medicine, and information technology, in order to assist students and university staff who are not very good at foreign languages and help them become familiar with relevant scientific news from abroad. We translate from English into the smaller indigenous European (Indo-European) languages, not the big ones.

Thanks, Alexander, and keep up the good work!

Although I haven’t done any GroundTruth work for a long time, a couple of days ago I started migrating (and upgrading) the source code from my local Subversion repository to Mercurial on Bitbucket. The code is in turmoil, but I hope I’ll be able to take some time away from other work to clean it up and fix some bugs that users have found over the last year. I will also try to incorporate some of the new stuff I’ve developed for Maperitive.

Published by breki on 20 Jan 2011

Maperitive Build 1108


My previous post about PBF reading successes was written way too prematurely. It turned out my PBF reading code had some serious bugs which made reading look much faster than it actually was (one reason being that I neglected to read OSM node keys/values when they are written in the PBF dense nodes format).

I’ve since written some extensive tests that compare, object by object, the OSM database contents read from XML and PBF files of the same area (thanks, Geofabrik), so I’m now 95% sure the PBF code works OK. Performance-wise, the (final?) results are much less glamorous than they first looked: PBF reading is “only” 2.5 times faster than reading .OSM.bz2 files, while in terms of memory consumption the two are pretty much the same. I’m curious how other OSM software like Osmosis measures up on this.

I had hoped I could speed up PBF reading by spreading the work across several processor cores. What I did was use Microsoft’s Parallel Extensions library to split the fetching of PBF file blocks and the actual parsing of them across two (or more) cores. This resulted in only about a 10% increase in overall speed (tested on my two-core machine, so on more cores the result could be better).

It actually proved pretty hard to divide the work in a balanced fashion. Since the file reading is sequential, it can only be done by one thread/core, so you want to put as little other work on that core as possible. As soon as a file block’s bytes are fetched from the file, they are handed off to another core, which parses them (in protocol buffers terms) and then extracts OSM objects from them. The catch is that you don’t want to enqueue too many file blocks at once, since they take up valuable memory (which is already filled with extracted OSM objects). So I ended up using a blocking queue, which means the main thread (the one reading the file) waits until at least one core is available before putting another file block into the queue.
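In .NET 4 terms, this pattern maps naturally onto a bounded BlockingCollection. Here is a minimal sketch of the idea – the reader and parser methods are placeholders, not Maperitive’s actual code:

    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    class PbfBlockPipelineSketch
    {
        // Bounded queue: Add() blocks when the queue is full, so unparsed
        // file blocks can't pile up and eat memory that is already needed
        // for storing the extracted OSM objects.
        private readonly BlockingCollection<byte[]> blocks =
            new BlockingCollection<byte[]>(boundedCapacity: 4);

        public void Run()
        {
            // Consumer: parses file blocks on another core.
            Task parser = Task.Factory.StartNew(() =>
            {
                foreach (byte[] block in blocks.GetConsumingEnumerable())
                    ParseBlock(block);
            });

            // Producer: the single thread that reads the file sequentially.
            byte[] raw;
            while ((raw = ReadNextFileBlock()) != null)
                blocks.Add(raw); // waits here if the parser falls behind

            blocks.CompleteAdding(); // lets GetConsumingEnumerable() finish
            parser.Wait();
        }

        // Placeholders for the actual file reading and protobuf decoding:
        private byte[] ReadNextFileBlock() { return null; }
        private void ParseBlock(byte[] block) { }
    }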

I’ve also tried a micro-management strategy – using multiple cores to extract individual OSM objects – but this only really works for ways and relations. Current PBF extracts use the dense nodes format, which is delta-encoded and thus forces you to read nodes sequentially on a single thread of execution. I guess this is the price of a format that tries to satisfy two different (and inherently conflicting) goals: less space and less CPU.
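To illustrate why dense nodes resist parallelization: IDs and coordinates are stored as deltas from the previous entry, so an entry cannot be decoded without first decoding everything before it. A sketch (the array and method names are illustrative, not the actual PBF classes):

    // Every value is a delta from the previous entry, so decoding entry i
    // requires the running sums of entries 0..i-1 - strictly sequential work.
    long id = 0, lat = 0, lon = 0;
    for (int i = 0; i < idDeltas.Length; i++)
    {
        id += idDeltas[i];
        lat += latDeltas[i];
        lon += lonDeltas[i];
        AddNode(id, lat, lon); // can't be handed to another core independently
    }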

I’m fairly new to Parallel Extensions and there are probably better ways of handling this, but I’ll leave it for the future.

Anyway, a new Maperitive release is out, grab it from the usual place.

Published by breki on 18 Jan 2011

Maperitive: Reading OSM PBF Files

UPDATE: the post below was based on the premature assumption that my new PBF code actually worked. It turned out to have a number of serious bugs which made reading look faster than it actually was. Here’s a followup post.

For the last couple of days I’ve been working on a PBF file reader for Maperitive. PBF is a binary format for storing OSM geodata using Google’s protocol buffers.

It’s been a steep learning curve, since I had to learn three things at the same time: protocol buffers, using the protobuf-net library for .NET, and the PBF format itself. I’m mostly satisfied with the protobuf-net library, although the lack of any recent development activity worries me a little.
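For the curious, here is roughly what the outer layer of PBF reading looks like with protobuf-net: each file block is a 4-byte big-endian length, a BlobHeader message, and then the blob itself. This is a simplified sketch based on the PBF spec, not Maperitive’s actual reader (it skips things like partial reads and error handling):

    using System;
    using System.IO;
    using ProtoBuf; // protobuf-net

    // Contract class matching the PBF spec's BlobHeader message
    // (field numbers taken from the OSM PBF .proto definition).
    [ProtoContract]
    class BlobHeader
    {
        [ProtoMember(1)] public string Type { get; set; }   // "OSMHeader" or "OSMData"
        [ProtoMember(3)] public int DataSize { get; set; }  // size of the blob that follows
    }

    static class PbfFraming
    {
        public static void ReadBlocks(Stream file)
        {
            byte[] lenBytes = new byte[4];
            while (file.Read(lenBytes, 0, 4) == 4)
            {
                // The header length is stored in network (big-endian) byte order.
                if (BitConverter.IsLittleEndian)
                    Array.Reverse(lenBytes);
                int headerLength = BitConverter.ToInt32(lenBytes, 0);

                byte[] headerBytes = new byte[headerLength];
                file.Read(headerBytes, 0, headerLength);
                BlobHeader header =
                    Serializer.Deserialize<BlobHeader>(new MemoryStream(headerBytes));

                byte[] blob = new byte[header.DataSize];
                file.Read(blob, 0, header.DataSize);
                // ... decompress the blob (usually zlib) and decode its
                // PrimitiveBlock, which holds the actual OSM objects.
            }
        }
    }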

I finished most of the PBF reading code this evening and was eager to test it against the old XML reader. I used Geofabrik’s Denmark extract; here are some rough results:

  • The PBF file loads 7.6 times quicker than the .OSM.bz2 file. This is a really good result, mostly thanks to the way the PBF format has been designed.
  • Loading PBF data uses about a quarter less memory than loading the XML file. I’m talking about the memory used during loading, not for storing the loaded OSM data – the data is stored internally in exactly the same way for both PBF and XML reading. This result surprised me a bit; I guess the extra memory consumed by the XML reader is due to the XML parser itself and/or the fact that many more strings are generated when reading XML OSM tags. PBF uses string tables and thus saves a lot of space by reusing common strings (see the short illustration below).
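To illustrate that last point: within a PBF primitive block, tag keys and values are just integer indices into the block’s string table, so a string like “highway” is stored only once per block. In illustrative terms (these are made-up names, not the actual generated classes):

    // Keys/values are parallel arrays of indices into the block's string
    // table; common strings like "highway" are stored once and reused.
    for (int i = 0; i < way.KeyIndexes.Count; i++)
    {
        string key = stringTable[way.KeyIndexes[i]];
        string value = stringTable[way.ValueIndexes[i]];
        osmWay.AddTag(key, value);
    }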

Published by breki on 18 Oct 2010

Web Testing & Gallio: A Little Helpful Trick

When testing Web apps automatically with unit testing frameworks, it can be a pain in the butt to pinpoint the right HTML element. A lot of the time tests fail because you used a wrong locator, but since the browser closes automatically after the test, you have no access to the HTML code of the page to see what’s actually there.

Fortunately, Gallio provides a class called TestContext which holds information about the currently running test, including whether it has failed so far. This can be used to run custom handling code during the test teardown:

        [TearDown]
        protected virtual void Teardown()
        {
            // Only dump the page source when the test has failed,
            // so passing tests don't spam the log.
            if (TestContext.CurrentContext.Outcome.Status == TestStatus.Failed)
            {
                using (TestLog.BeginSection("Failed web page HTML"))
                    TestLog.Write(WebDriver.PageSource);
            }
        }

In the above snippet, we record the current Web page’s HTML code into Gallio’s log (the TestLog class). To avoid spamming the log, we do this for failed tests only.

Gallio provides a powerful framework which I think is very much underused, mostly because the documentation is not very detailed (to say the least).

Published by breki on 25 Sep 2010

Poor Man’s Task Tracking Tool, Revisited

Back in the days before Maperitive was first released, I wrote a post about how I use simple text files to keep track of things I have to implement (and things already implemented).

It turns out the to-do list has grown so much that it is very difficult to decide what to implement in which order. Some features or bug fixes land in the middle of implementing other features, and I’ve frequently had to resort to SVN branches to work things out.

So I got the idea of using a Google Docs spreadsheet to keep a list of tasks. But a simple list was not enough: I wanted the spreadsheet to tell me which tasks should be implemented first and which can wait. I added two columns to the list: priority and complexity. Then there’s a third column, score, which is calculated from the priority and complexity using a simple formula. Complexity is measured in the “ideal hours” the task is supposed to take (a rough estimate, of course), while priority is a value (usually an integer from 1 to 5) denoting how important the task (or feature) is.
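I won’t reproduce the exact formula here, but to make the idea concrete: something as simple as dividing priority by complexity already behaves as described, floating important, cheap tasks to the top. With priority in column B and complexity in column C, an illustrative (made-up) score cell would be:

    =B2 / C2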

"to do" list using Google Docs

After entering the tasks, I simply use the spreadsheet’s “Sort sheet Z → A” function to bring the tasks with the highest score to the top of the list.

Simple, but effective.

Published by breki on 11 Sep 2010

Storing Your Source Code


UPDATE: I received a very helpful comment which seems to invalidate some of the statements in this post. Be sure to read the comment. I’ll post further updates once I’ve investigated the matter some more.

For the past three or four years I’ve been using Subversion installed on my local development machine. Initially I used a custom installation on top of Apache, which took me quite a few hours to set up (basically, if you want more than one repository, Apache is a must). A couple of years later I switched to VisualSVN Server, which is a great free self-contained SVN server installation.

This all works great, but the biggest problem is accessing the repository from the outside world, both in terms of security and because I don’t want my SVN repository computer running all the time (I’m a believer in keeping machines turned off when you’re not using them).

On the other hand, I’ve started using distributed VCSs like git and Mercurial for my open-source projects. The biggest benefit I see is that you keep your own repository on your development computer, so the history of changes is always available – something that isn’t really an option when you’re working offline with SVN.

So I’ve started thinking about experimenting with a commercial VCS hosting solution like Assembla or xp-dev.com for my closed-source projects. Apart from the decision of which provider to use (they both seem to get good reviews), the biggest question is: which VCS?

Although git and Mercurial are all the rage now, I don’t see much benefit in using them for one-man projects on the VisualStudio platform. Let me explain why.

Integration With VisualStudio

I’ve gotten so used to AnkhSVN that I simply cannot work without it. Renaming files, moving them around the solution, automatic refactoring using Resharper – that’s all handled pretty well by AnkhSVN. I still use TortoiseSVN for commits, but inside VisualStudio, Ankh is the king. I never use SVN from the command line and I don’t need to. AnkhSVN is simply a great productivity booster.

And this is why using git or Mercurial is such a pain in VisualStudio. I frequently use the “Rename class” refactoring in Resharper, which renames the class file, too. This goes undetected by git and Mercurial, and I end up with “missing files” when committing.
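For what it’s worth, both tools can be told to reconcile such renames after the fact by content similarity – though it means dropping to the command line, which is exactly what I’d rather avoid:

    hg addremove --similarity 90   # pairs missing and unknown files up as renames
    git add -A                     # stages the delete+add; git infers the rename from content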

Local Repositories

While having your own repository on the development computer is a truly great thing, not having it isn’t such an issue if your online SVN repository is available most of the time. And the problem with VS integration far outweighs the other benefits of a distributed VCS when you’re running a one-man shop.

Decisions, Decisions…

So I’ll probably start using a commercial SVN hosting option, at least as a trial. Most of the providers offer limited free plans, so it’s a good place to start…

Published by breki on 06 Sep 2010

Random Thoughts

I feel I’ve been neglecting my blog lately, and it’s a shame. It’s not that I don’t have stuff to write about; it’s just that I’m so immersed in developing various projects (mostly Maperitive) that I can’t seem to find the time and energy to write something interesting.

Yes, I know one of the first rules of blogging is not to apologize for not writing. But anyway, I’ve decided I should write more regularly but deliver it in smaller packages. That way writing shouldn’t look so intimidating, and it should be easier for me to do.

How’s Maperitive

There has been a lot going on behind the scenes in Maperitive. I’ve opened many different fronts, but I’ve mostly worked on improving the GUI. Maperitive started out as a (more or less) command-line application, and now I’m slowly trying to improve its usability. Right now the GUI is still too intimidating for a non-technical user, and a lot of work is still needed to improve this.

Like most of the other code in Maperitive, the GUI framework has been written mostly from scratch, with some reuse of existing Kosmos code. I had to reinvent the wheel at each step, since there aren’t many good WinForms GUI frameworks out there (in fact, I only know of one, which is a beast, and I didn’t feel the urge to invest a huge amount of time in learning it). The advantage of this is that it forced me to get to know the problems I’m trying to solve and not just sweep them under the rug with some 3rd-party library.

All in all, I’m quite satisfied with the new architecture. One of the main reasons I decided to pull the plug on Kosmos and start with clean code was to make sure the new architecture lets me add new features more easily and makes the whole code base more manageable. I think I’ve achieved this, mostly by sticking to dependency injection and using Windsor Castle, an inversion of control container.
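In case you haven’t seen Windsor before, registration is a simple fluent API. A trivial sketch with made-up component names (not Maperitive’s actual classes):

    using Castle.MicroKernel.Registration;
    using Castle.Windsor;

    // Illustrative components only:
    public interface IMapRenderer { }
    public class DefaultMapRenderer : IMapRenderer { }
    public class MainController
    {
        public MainController(IMapRenderer renderer) { /* ... */ }
    }

    public static class Bootstrap
    {
        public static MainController Start()
        {
            var container = new WindsorContainer();
            container.Register(
                Component.For<IMapRenderer>().ImplementedBy<DefaultMapRenderer>(),
                Component.For<MainController>());

            // The container satisfies constructor dependencies by itself.
            return container.Resolve<MainController>();
        }
    }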

The application framework being built for Maperitive is generic enough to be reusable for other desktop applications, which could come in handy if I find the time to work on anything else. But right now I have so many ideas for new features in Maperitive that I doubt I’ll run out of work in the next year (or more).

Published by breki on 24 Aug 2010

Windsor Castle: Strange Resolving Behavior

A user reported a bug in Maperitive – it throws:

Castle.MicroKernel.Resolvers.DependencyResolverException: Could not resolve non-optional dependency for 'Karta.DataSources.OsmFileMapDataSource' (Karta.DataSources.OsmFileMapDataSource). Parameter 'fileName' type 'System.String'

I tried to reproduce this behavior using a simple unit test, but I couldn’t, so I’m posting the actual code. This is where the exception occurs:

return windsorContainer.Resolve<OsmFileMapDataSource>();

And this is what the OsmFileMapDataSource constructors look like:

        public OsmFileMapDataSource(
            string fileName,
            IFileSystem fileSystem,
            IMapDataLayerFactory layerFactory)
        {
            ...
        }

        public OsmFileMapDataSource(IMapDataLayerFactory layerFactory)
        {
           ...
        }

Needless to say, both IFileSystem and IMapDataLayerFactory are registered in the container (IMapDataLayerFactory is registered as a typed factory, by the way). OsmFileMapDataSource is also registered as an implementation of itself. And I’m using version 2.1.0.6655 of the library.
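For context, here is roughly how I’d expect such a setup to be registered. This is my reconstruction in the modern fluent style – the post doesn’t show the actual registration code, the API details may differ slightly in version 2.1.0.6655, and FileSystem is a made-up implementation name:

    using Castle.Facilities.TypedFactory;
    using Castle.MicroKernel.Registration;
    using Castle.Windsor;

    var container = new WindsorContainer();
    container.AddFacility<TypedFactoryFacility>();
    container.Register(
        Component.For<IFileSystem>().ImplementedBy<FileSystem>(),
        Component.For<IMapDataLayerFactory>().AsFactory(), // interface-based typed factory
        Component.For<OsmFileMapDataSource>());

    // This is the call that throws when the three-parameter constructor
    // is declared first: Windsor picks it and can't resolve 'fileName'.
    var dataSource = container.Resolve<OsmFileMapDataSource>();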

What’s strange is that if I move the second constructor in front of the first one, the component is resolved without problems. I’m not sure if this is intended behavior, but I doubt the order of constructors should be a determining factor in how components are resolved.

But as I said, I couldn’t reproduce this behavior using a simplified test code, so I guess I should start debugging it instead.

Published by breki on 06 Aug 2010

Maperitive: Enhanced Usability & Scripting Support

Maperitive running scripts

For the last week or so I’ve been busting my fingers on one of the harder things to implement in a desktop GUI: keeping the application responsive while executing longer-running tasks. By longer-running I mean anything that takes more than a couple of seconds.

Simple desktop applications tend to run everything synchronously: when the user presses a button, the action runs, and only after it finishes is control given back to the user. Simple, but totally crappy. The problem is that the action runs on the same thread that services the GUI, so until the action finishes, the application cannot even refresh itself or respond in any meaningful way to user clicks or key presses. And since there is no refresh, you cannot show any progress indicators to the user. I know I wouldn’t want to wait half an hour for something to finish without some reassurance that the application is still alive and not just waiting for the electricity to run out.
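The standard WinForms way out is to push the work onto a background thread and marshal progress updates back to the GUI thread. Maperitive’s task framework is more elaborate than this, but the core pattern looks something like the following sketch (DoSliceOfLongRunningTask, progressBar and statusLabel are placeholders for the real work and controls):

    using System.ComponentModel;

    // BackgroundWorker runs DoWork on a thread-pool thread, while
    // ProgressChanged and RunWorkerCompleted fire back on the GUI thread.
    var worker = new BackgroundWorker { WorkerReportsProgress = true };

    worker.DoWork += (sender, e) =>
    {
        for (int i = 0; i <= 100; i++)
        {
            DoSliceOfLongRunningTask(i);  // placeholder for the real work
            worker.ReportProgress(i);     // safely marshaled to the GUI thread
        }
    };
    worker.ProgressChanged += (sender, e) => progressBar.Value = e.ProgressPercentage;
    worker.RunWorkerCompleted += (sender, e) => statusLabel.Text = "Done.";

    worker.RunWorkerAsync(); // returns immediately - the GUI keeps repainting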

Maperitive already had a lot of responsiveness code implemented, but the design was still “under construction”. An additional complication was that I want Maperitive to have a good script runner, and scripts can take a very long time (downloading OSM data, generating tiles etc.). After a lot of trial and error I finally managed to implement the whole thing as a consistent package. And believe me when I say it was not easy.

So what is new:

  • When running scripts (or longer tasks), Maperitive draws an indicator on the map (see the screenshot above) and launches a “sort of modal” mode – most of the GUI controls are disabled so the script doesn’t get confused by some inadvertent user action. However, the application stays responsive: you can follow the script’s progress in the command log.
  • Aborting scripts: as the indicator says, you can press the Escape key to abort the script. If you prefer torturing your mouse instead of your keyboard, there’s an “Abort task” button in the bottom right corner which does the same thing (a sketch of one way to wire this up follows below).
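Here’s one plausible way to wire up such an abort with .NET 4’s cooperative cancellation – a sketch, not Maperitive’s actual implementation (IScriptCommand is a made-up interface):

    using System.Collections.Generic;
    using System.Threading;

    public interface IScriptCommand { void Execute(); }

    public class ScriptRunnerSketch
    {
        private readonly CancellationTokenSource cts = new CancellationTokenSource();

        // Called on the GUI thread by the Escape key handler
        // or the "Abort task" button.
        public void Abort() { cts.Cancel(); }

        // Runs on a background thread; checks the token between commands
        // so the script stops at a clean command boundary.
        public void Run(IEnumerable<IScriptCommand> commands)
        {
            foreach (IScriptCommand command in commands)
            {
                if (cts.Token.IsCancellationRequested)
                    break;
                command.Execute();
            }
        }
    }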

Not all running tasks have been switched to the new system yet. Loading OSM files is one example of a task that will still block the GUI, but I will gradually improve these things.

You can download the latest release at http://maperitive.net/download/

Enjoy!

Published by breki on 21 Jul 2010

Too Much Version Control?

