Hacking Mapcache with ImageMagick

To generate tiles for the map stack used by FieldTrip GB we are using 4 Mapserver instances deployed to an OpenStack private cloud. This means we can generate all our tiles relatively quickly on inexpensive commodity hardware. The problem is that the resulting PNG tile images look beautiful but are far too big for users to download to their mobile devices in any quantity. We tried Mapserver's built-in JPEG format, but our cartographers were not happy with the results. One of my colleagues came up with the bright idea of using ImageMagick to compress the PNG to JPEG instead, and the result (using a quality setting of 75) was much better. We can use the ImageMagick command line with the following script:

convert_png_to_jpeg_delete_png.sh

#!/bin/sh
# Convert each PNG passed on the command line to a JPEG at quality 75.
for var in "$@"
do
  echo "converting $var to jpg"
  convert "$var" -quality 75 "${var%.png}.jpg"
  # rm "$var"
done

and feed it the existing cache of generated PNG tiles using find and xargs:

find . -name '*.png' -print0 | xargs -0 -P4 ../convert_png_to_jpeg_delete_png.sh
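A note on renaming with tr: tr does character-by-character translation, not substring replacement, so tr '.png' '.jpg' only behaves for names (like the all-numeric tile names here) that contain no stray 'p' or 'n'; shell suffix expansion is safer. A quick illustration with a hypothetical filename:

```shell
var="map.png"

# tr maps '.'->'.', 'p'->'j', 'n'->'p', 'g'->'g' on every character,
# so any 'p' elsewhere in the name gets mangled too.
echo "$var" | tr '.png' '.jpg'   # -> maj.jpg

# Parameter expansion strips only the trailing suffix.
echo "${var%.png}.jpg"           # -> map.jpg
```

For tile names like 000.png both forms happen to agree, which is why the original pipeline worked.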

So the cartographers finally relented and we now have much smaller files to download to devices. The only problem is that the ImageMagick conversion script takes forever to run (well, all right – 2 days). It's not because ImageMagick is slow at compression – it's super fast. It's that the I/O overhead is huge, as we are iterating over 16 million inodes. So our plan of scaling up commodity hardware (a 4-CPU virtual machine) is failing. A better solution is to do the JPEG conversion at the same time as the tile caching – that way you only deal with one tile at the point it is written to the cache, so there is much less overhead.

So it's time to hack some of the Mapcache code and get ImageMagick to apply the above compression just after the PNG is written to the cache.

This just involves editing a single source file in the lib directory of the Mapcache source distribution (mapcache-master/lib/cache_disk.c). I'm assuming below that you have already downloaded and compiled Mapcache, and that you have also installed the ImageMagick packages, including the devel package.

First of all, include the ImageMagick header file:

#include <wand/magick_wand.h>

Then locate the function _mapcache_cache_disk_set. This is where Mapcache actually writes the image tile to disk.

First we add some variables and an exception macro at the top of the function.

MagickWand *m_wand = NULL;
MagickBooleanType status;

#define ThrowWandException(wand) \
{ \
  char *description; \
  ExceptionType severity; \
\
  description = MagickGetException(wand, &severity); \
  (void) fprintf(stderr, "%s %s %lu %s\n", GetMagickModule(), description); \
  description = (char *) MagickRelinquishMemory(description); \
  exit(-1); \
}

Then, right at the end of the function, we add the MagickWand equivalent of the convert command line shown above; the two MagickSet* calls do the actual compression.

if(ret != APR_SUCCESS) {
  ctx->set_error(ctx, 500, "failed to close file %s:%s",filename, apr_strerror(ret,errmsg,120));
  return; /* we could not create the file */
}

/* ******* ImageMagick code here ******* */

ctx->log(ctx, MAPCACHE_INFO, "filename for tile: %s", filename);
MagickWandGenesis();
m_wand = NewMagickWand();
status = MagickReadImage(m_wand, filename);
if (status == MagickFalse)
  ThrowWandException(m_wand);
/* MagickSetImageFormat(m_wand, "JPG"); */
char newfilename[200];
snprintf(newfilename, sizeof(newfilename), "%s", filename);
int blen = strlen(newfilename);
if(blen > 3)
{
  /* overwrite the 3-character "png" suffix in place with "jpg" */
  newfilename[blen-3] = 'j';
  newfilename[blen-2] = 'p';
  newfilename[blen-1] = 'g';
  MagickSetImageCompression(m_wand, JPEGCompression);
  MagickSetCompressionQuality(m_wand, 75);
  ctx->log(ctx, MAPCACHE_INFO, "filename for new image: %s", newfilename);
  MagickWriteImage(m_wand, newfilename);
}
/* Clean up */
if(m_wand) m_wand = DestroyMagickWand(m_wand);
MagickWandTerminus();

And that's it. Now it's just a simple matter of working out how to compile and link it.

After a lot of hmm'ing and ah-ha'ing (and reinstalling ImageMagick to a more recent version using excellent advice from here) it came down to making the following changes to Makefile.inc in the Mapcache source root directory:

INCLUDES=-I../include $(CURL_CFLAGS) $(PNG_INC) $(JPEG_INC) $(TIFF_INC) $(GEOTIFF_INC) $(APR_INC) $(APU_INC) $(PCRE_CFLAGS) $(SQLITE_INC) $(PIXMAN_INC) $(BDB_INC) $(TC_INC) -I/usr/include/ImageMagick
LIBS=$(CURL_LIBS) $(PNG_LIB) $(JPEG_LIB) $(APR_LIBS) $(APU_LIBS) $(PCRE_LIBS) $(SQLITE_LIB) -lMagickWand -lMagickCore $(PIXMAN_LIB) $(TIFF_LIB) $(GEOTIFF_LIB) $(MAPSERVER_LIB) $(BDB_LIB) $(TC_LIB)
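If the compiler can't find wand/magick_wand.h, it's worth checking where your distro's devel package actually installed the headers before relying on the hard-coded -I/usr/include/ImageMagick path above (depending on the ImageMagick release, the header may be named magick_wand.h or MagickWand.h). A quick check:

```shell
# Locate the MagickWand headers installed by the ImageMagick devel package;
# adjust the -I flag in Makefile.inc to the directory containing "wand/".
find /usr/include -name 'magick_wand.h' -o -name 'MagickWand.h'
```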

Then run make as usual to compile Mapcache and you're done! The listing below shows the output and the difference in file sizes:

ls -l MyCache/00/000/000/000/000/000/
total 176
-rw-r--r--. 1 root root  4794 Jul 23 13:56 000.jpg
-rw-r--r--. 1 root root 21740 Jul 23 13:56 000.png
-rw-r--r--. 1 root root  2396 Jul 23 13:56 001.jpg
-rw-r--r--. 1 root root  9134 Jul 23 13:56 001.png
-rw-r--r--. 1 root root  8822 Jul 23 13:56 002.jpg
-rw-r--r--. 1 root root 46637 Jul 23 13:56 002.png
-rw-r--r--. 1 root root  8284 Jul 23 13:56 003.jpg
-rw-r--r--. 1 root root 45852 Jul 23 13:56 003.png
-rw-r--r--. 1 root root   755 Jul 23 13:55 004.jpg
-rw-r--r--. 1 root root  2652 Jul 23 13:55 004.png
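For a rough measure of the overall saving across the whole cache (rather than just this one directory), something like the following works; MyCache is the example cache root from the listing, and -printf is a GNU find extension:

```shell
# Total bytes used by the PNG tiles vs the converted JPEG tiles.
find MyCache -name '*.png' -printf '%s\n' | awk '{s+=$1} END {print "png bytes:", s+0}'
find MyCache -name '*.jpg' -printf '%s\n' | awk '{s+=$1} END {print "jpg bytes:", s+0}'
```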

original PNG tile

converted to JPEG at 75% compression

Fourth International Augmented Reality Standards Meeting

I'm just back from the Fourth International AR Standards Meeting that took place in Basel, Switzerland, and I'm trying hard to collect my thoughts after two days of intense and stimulating discussion. Apart from anything else, it was a great opportunity to finally meet some people I've known from email and discussion boards on "the left-hand side of the reality–virtuality continuum".

Christine Perry, the driving spirit, inspiration and editor-at-large of the AR Standards Group, has done a fantastic job bringing so many stakeholders together: standards organisations such as the OGC, Khronos, Web3D Consortium, W3C, OMA and WHATWG; browser and SDK vendors such as Wikitude, Layar, Opera, ARGON and Qualcomm AR; hardware manufacturers (Canon, SonyEricsson, NVIDIA); several solution providers such as MOB Labs and mCrumbs – oh, and a light sprinkling of academics (Georgia Tech, Fraunhofer iDG).

I knew I'd be impressed and slightly awestruck by these highly accomplished people, but what did surprise me was the lack of any serious turf fighting. Instead, there was a real sense of pioneering spirit in the room. Of course everyone had their own story to tell (which just happened to be a story that fitted nicely into their organizational interests), but it really was more about people trying to make some sense of a confusing landscape of technologies and thinking in good faith about what we can do to make it easier. In particular, it seemed clear that the standards organisations felt they could separate the problem space fairly cleanly between their specialist areas of interest (geospatial, 3d, hardware/firmware, AR content, web etc.). The only area where these groups had significant overlap was sensor APIs, and some actions were taken to link in with the various Working Groups working on sensors to reduce redundancies.

It seemed to me that there was some agreement about how things will (eventually) look for AR content providers and developers. Most people appeared to favour the idea of a declarative content mark-up language working in combination with a scripting language (Javascript), similar to the geolocation API model. Some were keen on the idea of this all being embedded into a standard web browser's Document Object Model. Indeed, Rob Manson from MobLabs has already built a prototype AR experience using various existing (pseudo) standards for web sensor and processing APIs. The two existing markup content proposals, ARML and KARML, are both based on the OGC's KML, but even here the idea would be to eventually integrate the KML content and styling model into a generic HTML model, perhaps following the HTML/CSS paradigm.

This shared ambition to converge AR standards with generic web browser standards is a recognition that the convergence of hardware, sensors, 3d, computer vision and geolocation is a bigger phenomenon than AR browsers or augmented reality. AR is just the first manifestation of this convergence and of "anywhere, anytime" access to the virtual world, as discussed by Rob Manson on his blog.

To a certain extent, the work we have been discussing here on the geo mobile blog – using HTML5 to create web-based mapping applications – is a precursor to a much broader sensor-enabled web that uses devices such as the camera, GPS and compass not just to enable 2d mapping content but all kinds of applications that can exploit the sudden happenstance of millions of people carrying around dozens of sensors, cameras and powerful compute/graphics processors in their pockets.

Coming back from this meeting, I'm feeling pretty upbeat about the prospects for AR and the emerging sensor-augmented web. Let's hope we are able to keep the momentum going for the next meeting in Austin.

App Ecosystem

Earlier this week I attended the Open Source Junction Context Aware Mobile Technologies event organized by OSS Watch. Due to a prior engagement I missed the second day and had to leave early to catch a train. It was a pity, as the programme was excellent and there were some terrific networking opportunities, although it sounds like I was fortunate to miss the geocaching activity, which the Twitter feed suggested was very wet and involved an encounter with some bovine aggression.

During the first two sessions I did attend, there were quite a few people, including myself, talking about the mobile web approach to app development. I made the comment that the whole mobile web vs. native debate was fascinating and current, and that mobile web was losing. But everyone seemed to agree that apps are a pretty bad deal for developers and that making any money from them is about as likely as winning the lottery. This got me thinking on the train to Edinburgh about the "app ecosystem" and what that actually means. A very brief Google search did not enlighten me much, so I sketched my own app food chain, shown below.

It's no surprise that the user is right at the bottom, as all the energy that flows through this ecosystem comes from the guy with the electronic wallet.

But I think it's going to be a bit of a surprise for app developers (content providers) to see themselves at the top of this food chain (along with Apple and Google), as it doesn't feel like you are king of the jungle when the app retailer's cut is so high and the prices paid by users are so low.

It will be interesting to see if Google, who are not happy with the number of paid apps in the Google Marketplace, cut the developer a better deal, or if the Microsoft apps built on top of Nokia try to gain market penetration by attracting more high-quality content. My guess is: not yet. The problem for developers is that the app retailers can grow at the moment just on the sheer number of new people buying smartphones. This keeps prices artificially low and means app retailers are not competing all that much for content. But smartphone ownership is in fact growing so fast that pretty soon (approx. 2 years?) everyone who wants or can afford a smartphone is going to have one. How do app retailers grow then? They are going to have to get users to part with more money for apps and content, either by charging more or by attracting advertising revenue. Even though there are a lot of app developers out there, apps users will pay for are scarce, and retailers are going to have to either pay more to attract the best developers and content to their platform, or make life easier for content providers by adopting open standards. So maybe the mobile web might emerge triumphant after all.

OpenLayers Mobile Code Sprint

Last week EDINA had the opportunity to take part in the OpenLayers Mobile code sprint in Lausanne. A group of developers from across the world gathered to add mobile support to the popular Javascript framework.

After a week of intensive development we have been able to add a number of new features allowing OpenLayers to function on a wide range of devices, not only taking advantage of the touch events available on iPhone and some Android mobiles to allow touch navigation, but also enabling the OpenLayers map to be responsive and useful on other platforms, or even unexpected devices!

Jorge Gustavo Rocha and I worked on adding support for HTML5 offline storage, covering storing maps and feature data in the user's local browser using the Web Storage and Web SQL standards. Here is the example sandbox, which allows the user to store map tiles for the area they are viewing; these are automatically used instead of downloading the online image when possible. More details on this and other features added can be found on the OpenLayers blog.

I have to say I wasn’t sure what to expect, and I have certainly found it rewarding contributing to OpenLayers and working with such a dedicated and talented team of developers. Far more was achieved than I would have thought possible in such a short space of time. Very inspiring stuff!

Comparing AR Browsers

I’m currently researching capabilities of augmented reality browsers as part of a future JISC Observatory Report on Augmented Reality In Smartphones, aimed at helping developers and content publishers working in Higher Education exploit this technology to create novel and exciting new learning experiences. If all goes well this report will be in its final stages in the New Year.

I've started off by developing a classification of augmented reality browsers that aims to assist developers and content publishers in navigating the confusing landscape of applications and frameworks that has emerged as augmented reality technology becomes increasingly ubiquitous. There are some existing taxonomies of augmented reality applications in general, such as Papagiannakis et al., but as far as I know there is no existing classification of the browsers and application frameworks recently made available on smartphone devices.

The table below shows the results of several days' research trawling through documentation and experimenting with the browsers on our development phones (iPhone 3GS and Android Legend).

classification of AR browsers


A full explanation of the classification criteria I'm working with would take several blog posts, so you'll have to wait for the JISC report, but below I give a very brief summary and report some early findings. It will be interesting to get feedback from members of the geo mobile community on whether the classification (always contentious) makes sense to them.

Criteria 1: Registration and tracking

GPS/sensor: yes / no
Marker-based: yes / no / API / src / plugin
Markerless: yes / no / API / src

Criteria 2: Built in User Actions

Post text: user can post a text message to the current location/orientation of the handset. Often users can choose a 2d icon or sometimes a 3d icon to represent the message in the browser's reality view.
Post image: user can post an image already in the handset gallery to the current location/orientation of the handset.
Post snap: user can take a picture using the device camera and then upload the image to the POI server.
Post 3d: user can select a 3d model and make it viewable to the public or friends at the current location/orientation of the handset.
WebView: the developer can offer arbitrary web-based services to the user through an embedded web viewer.
Social: the user has access to a social network platform including common actions such as follow, invite, comment etc. Typically user-generated content, such as posted text messages, can be configured so that only friends in the user's social network can see the content.
Visual Search: user can take a photo of a real-world object, such as a book cover, and obtain information about the object using image recognition technology.

Criteria 3: Publishing API

Crowd: crowd-sourced content is published by regular users using facilities available in the browser itself. Typically, images, audio clips and text, as well as a predefined gallery of 3d objects, are available for crowd-sourced content publishing.
Open [key]: the platform provides an API that allows developers to publish their own data. For open keys there is no registration fee for developers and no practical limit on users' access to the published content. This also includes platforms that allow developers to publish their content without any key or registration at all.
Commercial [key]: a publishing API is available but some kind of fee or restriction on use is applied by the platform provider.
Bundled: data is bundled into the app itself. This assumes the developer has access to the browser source code and can therefore create and publish their own apps for download.

Criteria 4: Application API

Open [key]: a developer can reuse browser code and APIs to create their own version of the browser and is free to publish the application independently of the platform provider.
Restr[icted]: a developer can create their own version of the application but license restrictions apply to publishing the app.
Com[mercial]: a commercial license is required to develop applications using the framework/API.
Custom[ize only]: the developer cannot add any real functionality to the application, but the visual appearance can be changed and optional functionality switched on or off.

Criteria 5: AR Content

2d: POI are represented by 2d image icons, text or bubbles. Typically the icons can be touched to activate an action (more information view, map view, directions, call etc.).
3d: a 3d object can be superimposed on the reality (camera) view in 3d space to give the impression that the object is part of the natural environment.
3d-anim[ated]: a 3d object is superimposed on the reality (camera) view and parts of the model can be made to move using 3d animation techniques.

Criteria 6: [P]oint [O]f [I]nterest actions

Info: ability to link to a web page with more information about the object.
Audio: ability to play a sound clip.
Video: ability to play a video clip.
Music: play a music track on the device music player.
Map/Take Me There: see the POI as a pin on a map, with the option to show a route from the current location to the POI location.
Search [shop]: ability to find search results using a search engine or shopping channels.
Call: can click a button to make a phone call to the number in the POI response.
Email: can write an email message to the email address provided in the POI response.
SMS: can write an SMS text message to the mobile number provided in the POI response.
Social: various social network actions such as comment, share, profile.
Events: allows the developer to define their own events when the user interacts with a POI.

Criteria 7: offline mode

Online only: the application requires a network connection at all times to work properly.
Offline: the application also works offline – data is updated by obtaining a new version of the application.
Cached layer: channels or layers can be cached while online.

It's possible to view the criteria in the table from the perspective of the user (I want to do cool stuff) and from the perspective of the developer (I want more control over what the user can see and do). To visualize this I scored the browsers against user and developer preferences, using a fairly arbitrary scoring system based on the criteria discussed above, adding an extra "build quality" criterion to score the user axis and a "developer tools" criterion for the developer axis, to get the bubble chart below. The size of the bubbles represents the corporate strength of the organisation behind the browser. As the AR browser game is a winner-takes-all land grab, having some strong venture capital or solid private investment is not an irrelevant consideration in choosing a tool that meets your needs. It goes without saying that charts such as the one below need to be taken with a pinch of salt. It all really depends on what your needs are as a developer or content publisher.

What we left out

We did not include the following browsers in our evaluation:

SREngine: this looks like a promising framework for performing visual search but appears to be focused on the Japanese market for the moment. As a result we struggled to find enough English-language documentation to fill in our classification matrix, and we also could not get the app from the UK AppStore.
GeoVector World Surfer: World Surfer provides an appealing browser that allows you to point the handset and discover POI. There is a reality view, but it only works on a single POI the user has already selected. There does not appear to be any developer access to either the publishing or application framework at this point.
AcrossAir: an AR browser that is sold as a marketing tool to content providers. The vendor controls both publishing and application development, so there is not much scope for developers to utilize this platform unless they have a marketing budget.
RobotVision: one of the most impressive independent offerings, but there are no APIs for developers and the project seems to have stalled, with no recent updates on the AppStore.