A Blog About Programming, Search & User Experience

C/C++ is still the only way to have high performance on Mobiles

[Edit 15-Nov-2012] I had questions on reddit about the data-structures and algorithms we used. We develop an embedded search engine for mobiles and tests were done on our own data-structure that is far more efficient than SQLite or other platform options for this use-case. [/Edit]

When it comes to programming languages and performance, you can read all and its opposite on the web. It’s definitely a very controversial topic!

For Algolia the story started when researching an instant suggest algorithm and I used Java for two reasons:

  • Main reason: our first client was using Java on Google App Engine
  • Secondary: at that stage, I was doing a lots of refactoring and Eclipse is very efficient for these tasks

Once our algorithm was designed, I started to optimize performance on a desktop computer (core I7 950). For this, I indexed all titles of the english version of wikipedia (4 millions titles) and I optimized the Java code mainly by reducing the number of allocations. All instant suggest queries were then faster than 10ms.

As we planned from the begining to support iOS and Android, I then ported the Java code to Android. I ended up with a few modifications (we needed to support old Android versions which have not a full support of Java SDK).

In order to test performance, I created a small “demo” app: CitiesSuggest, a mobile application based on Geonames database to look for a city name. I filled the index with all places with a known population. In the end, the database contained 270 000 city names.

As could be expected, the resulting application was quite sluggish on my Galaxy S2. Worst queries could take more than one second of CPU.

I then applied all possible methods described in android documentation and was able to reduce the response time to 700ms for the longuest queries. This is better but still gives end-users an impression of sluggishness! Fortunately common queries worked well enough to present our very first demos at LeWeb’12 London. I was subsequently able to much improve the user experience by adding asynchronous support. At that point, we decided to start implementing of the objective-C version for iOS.

I started from scratch a new implementation fully done in objective-C without any premature optimization. I then developed the same CitiesSuggest application for iOS. I first got stuck with some basic UI stuff. For example I needed to reimplement an UILabel that supports highlighting! The standard UILabel does no support this and other implementations like Three20 just have too many dependencies (you can download AlgoliaUILabel on github, I released  the code under Apache licence). Once the UI was ready, I could see the actual performance was catastrophic! Instant suggest queries were between 200ms and 10 seconds on an iPhone 4S!

After profilling, I discovered that 95% of the time was spent in Objective-C messaging and ARC. Actually, I have millions of calls to very small methods with 1 or 2 lines of code and I found a very good explanation in the internal of objc_msgSend (mainly on mikeask.com). There is in fact a hash table lookup behind objc_msgSend! This explains most of my performance problems.

To fix these problems, I started to replace all this low level objective-C implementation by a C++ implementation with inlined methods. Eventually, I finished with Objective-C code for high level classes while all the core is C++.

This change has dramatically improved performance with instant-search queries between 2ms and 90ms on a iPhone 4S. I struggled to find complex enough queries to reach 90ms! This resulted in a very nice user experience with a remarkably slick demo :)

After this succes, I decided to use the same C++ code on android with Android NDK. With this approach I reduced our maximum query time from 700ms to 80ms on a Galaxy S2. But I was really disappointed by the Android display as I did not reached the same interactive experience than with iOS. The display of results stays slower on Android, probably because I did not spend enough time to optimize this part.

In the end, I lost a lot of time with Java and Objective-C trying to optimize code but the real solution was to use C/C++. I am fully convinced that this is just not possible to reach the same speed with Objective-C only or Java only.

And there is also another good property with C++ code: Blackberry supports C++ and there is a lot of chance that Windows Phone 8 SDK will support C++ when released in a few weeks. Actually I do not see any other alternative than C/C++ when you are looking for performance on mobile devices, which is our case :)

I hope this article will avoid you to spend as much time on Java/Objective-C optimizations as I did.

  • http://www.algolia.com/ Julien Lemoine

    This blog post is on reddit !
    http://bit.ly/THfWmg

  • Ciprian Mustiata

    Sorry, but did you try the Mono for Android? I mean performance is obviously relatively comparable to C++. I agree that Android’s lower end optimizing JIT is way slower than a LLVM based compilation backend. The single downside of Mono for Android/MonoTouch is that the IDE + platform costs money, but as for me performance was not an issue. And if you think about Windows Phone 8 SDK, they are smart enough to use a static compiler too (using the cloud to compile) so it should offer a good performance too.

    • http://www.algolia.com/ Julien Lemoine

      No we haven’t tried Mono for Android. The big advantage of C/C++ is that your optimization is working for Android, iOS and Windows Phone 8.

      In fact we have also done a lot of optimizations that are not possible in C# (code inlining, reduction of page faults on a mmaped file, ….).

      • Ciprian Mustiata

        Code inlining is done in may ways by the backend optimizer, isn’t so? And as Android from SDK 4.0 uses LLVM, as Mono for Android, if they match good patterns they both are inlined. And don’t take my word for it, Microsoft blog announced that both Mono and .Net offers it: http://blogs.microsoft.co.il/blogs/sasha/archive/2012/01/20/aggressive-inlining-in-the-clr-4-5-jit.aspx I fully agree that are optimizations that are C++ only, as I think there are optimizations that work smooth only only (like creating of small objects in a fairly cheap fashion, if for example you will have a parsing). Also, JIT on demand can create some nice code cases (as Mono for Android is statically compiled, but new code is JIT-ed) so you can supposedly create an expression to evaluate a record in your DB as working, and at the end JIT it (using for example C# hosted compiler) so you will have a “native” evaluator. What I mean, that the title “… C++ the only way” is it true to some extent, but still it matter a lot on your application and coding style.

    • dD

      Oh yeah, “relative comparable to C++” on what scale factor? Raw number crunching C# mono is at its best twice as slow as C++, 5-6 times slower on average including up to 10+ times slower at specific workloads. On top of what you have to pay for it. Plus it is far from Qt in the graphics area.

  • http://www.itoctopus.com/ itoctopus

    Hi Julien,

    C is just one level above assembly language (so it’s normal for it to be much faster) – although I have to wonder how it works within Android.

    Is there a guide that explains how to use C/C++ in Android? It would be very interesting to try something like this to see myself.

    • http://www.algolia.com/ Julien Lemoine

      Yes there is a SDK for native development on Android that is called NDK, it contains all cross compilers to compile your C/C++ code to all architectures (arm, armv7, x86 and mips) :http://developer.android.com/tools/sdk/ndk/index.html

  • Rigoberto

    Was it possible to avoid the use of the hash table while still use objective C?

    • http://www.algolia.com/ Julien Lemoine

      Yes there is one but it is painful compare to using C++, and you have no inlining:

      SEL theSelector = @selector(methodName:);
      IMP theImplementation = [self methodForSelector:theSelector];
      theImplementation(self, theSelector);

  • http://mooseyard.com/Jens Jens Alfke

    I agree that for really low-level performance-sensitive code, like your search engine, you’ll want to go down to the metal (which today means C or carefully-limited C++; I remember the days when C was derided and you had to go down to assembly for “real” work.)

    But most people aren’t writing code like that, so your experience — while valid for your project — is of limited use in general. In most application code, performance bottlenecks come from poor choice of algorithms, overuse of memory, or lack of concurrency … not from raw CPU limitations. I’ve been writing Objective-C code for 13 years and it’s quite rare in my experience for message-sends to be more than a blip in profile results. (And where they are, you can rewrite that piece of code in C/C++ without having to abandon Obj-C for the rest.)

    It sounds from your description (“without any premature optimization”, “millions of calls to very small methods with 1 or 2 lines of code”) as though you’re very used to C++ and inline methods, or at least an inlining JIT. As with any other language, it takes time to learn the right way to write Objective-C code to make it work well, especially on a mobile platform.