Comparing AR Browsers

I’m currently researching the capabilities of augmented reality browsers as part of a future JISC Observatory Report on Augmented Reality In Smartphones, aimed at helping developers and content publishers working in Higher Education exploit this technology to create novel and exciting learning experiences. If all goes well, this report will be in its final stages in the New Year.

I’ve started off by developing a classification of augmented reality browsers to help developers and content publishers navigate the confusing landscape of applications and frameworks that has emerged as augmented reality technology becomes increasingly ubiquitous. There are some existing taxonomies of augmented reality applications in general, such as Papagiannakis et al., but as far as I know there is no existing classification of the browsers and application frameworks recently made available on smartphone devices.

The table below shows the results of several days’ research trawling through documentation and experimenting with the browsers on our development phones (iPhone 3GS and Android Legend).

[Table: classification of AR browsers]

A full explanation of the classification criteria I’m working with would take several blog posts, so you’ll have to wait for the JISC report, but below I give a very brief summary and report some early findings. It will be interesting to get feedback from members of the geo mobile community on whether the classification (always contentious) makes sense to them.

Criteria 1: Registration and tracking

GPS/sensor yes / no
Markerbased yes / no / API / src / plugin
Markerless yes / no / API / src
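As a rough illustration, the cell values above could be encoded as follows. This is only a sketch: the class and field names are hypothetical, and the comments on API, src and plugin are my assumed reading of those abbreviations rather than definitions from the report.

```python
from dataclasses import dataclass
from enum import Enum


class Support(Enum):
    """Cell values used for a tracking method (assumed reading)."""
    YES = "yes"        # available in the browser
    NO = "no"          # not available
    API = "API"        # assumed: reachable through a developer API
    SRC = "src"        # assumed: possible by changing the source code
    PLUGIN = "plugin"  # assumed: provided by a plugin


@dataclass(frozen=True)
class Tracking:
    gps_sensor: Support
    marker_based: Support
    markerless: Support

    def __post_init__(self):
        # The list above allows only yes / no for GPS/sensor
        # and has no "plugin" value for markerless tracking.
        if self.gps_sensor not in (Support.YES, Support.NO):
            raise ValueError("GPS/sensor must be yes or no")
        if self.markerless is Support.PLUGIN:
            raise ValueError("markerless has no plugin value")


# A made-up browser, not a row from the table.
example = Tracking(Support.YES, Support.PLUGIN, Support.NO)
```

Rejecting values the list does not allow keeps a hand-filled matrix from drifting away from the criteria as more browsers are added.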

Criteria 2: Built in User Actions

Post text user can post a text message to the current location / orientation of the handset. Often users can choose a 2d icon or sometimes a 3d icon to represent the message in the browser reality view.
Post image user can choose an image already in the handset gallery and post it to the current location / orientation of the handset.
Post snap user can take a picture using the device camera and then upload the image to the POI server
Post 3d user can select a 3d model and make it viewable to public or friends at the current location / orientation of the handset.
WebView the developer can offer arbitrary web based services to the user through an embedded web viewer.
Social the user has access to a social network platform including common actions such as follow, invite, comment etc. Typically user-generated content such as posted text messages can be configured so that only friends in the user’s social network can see the content.
Visual Search user can take a photo of a real world object such as a book cover and obtain information about the object using image recognition technology

Criteria 3: Publishing API

Crowd crowd sourced content is published by regular users using facilities available in the browser itself. Typically, images, audio clips and text as well as a predefined gallery of 3d objects are available for crowd sourced content publishing.
Open [key] Platform provides an API that allows developers to publish their own data. For open keys there is no registration fee for developers and no practical limit on users’ access to the published content. This also includes platforms that allow developers to publish their content without any key or registration at all.
Commercial [key] A publishing API is available but some kind of fee or restriction on use is applied by the platform provider.
Bundled data is bundled into the app itself. This assumes developer has access to the browser source code and can therefore create and publish their own apps for download.

Criteria 4: Application API

Open [key] A developer can reuse browser code and APIs to create their own version of the browser and is free to publish the application independently of the platform provider.
Restr[icted] A developer can create their own version of the application but license restrictions apply to publishing the app.
Com[mercial] a commercial license is required to develop applications using the framework/API.
Custom[ize only] The developer cannot add any real functionality to the application but the visual appearance can be changed and optional functionality switched on or off.

Criteria 5: AR Content

2d POI are represented by 2d image icons, text or bubbles. Typically the icons can be touched to activate an action (more information view, map view, directions, call etc.)
3d a 3d object can be superimposed on the reality (camera) view in 3d space to give the impression that the object is part of the natural environment.
3d-anim[ated] a 3d object is superimposed on the reality (camera) view and parts of the model can be made to move using 3d animation techniques.

Criteria 6: [P]oint [O]f [I]nterest actions

Info ability to link to a web page with more information about the object
Audio ability to play a sound clip
Video ability to play a video clip
Music play music track on device music player
Map/Take Me There see POI as a pin on a map with option to show route from current location to POI location.
Search [shop] ability to find search results using search engine or shopping channels
Call can click button to make phone call to number in POI response
Email can write email message to email address provided in POI response
SMS can write SMS text message to mobile number provided in POI response
Social various social network actions such as comment, share, profile
Events allows developer to define their own events when user interacts with POI
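Several of these actions depend on what the POI response contains (a phone number, an email address and so on). A minimal sketch of that idea follows, assuming a POI response is a plain dictionary. The field names are hypothetical and do not come from any particular browser's API.

```python
# Hypothetical POI response fields paired with the action each one enables,
# listed in the same order as the criteria above.
FIELD_ACTIONS = [
    ("url", "Info"),
    ("audio_url", "Audio"),
    ("video_url", "Video"),
    ("location", "Map/Take Me There"),
    ("phone", "Call"),
    ("email", "Email"),
    ("mobile", "SMS"),
]


def available_actions(poi):
    """Return the actions a browser could offer for this POI response."""
    actions = []
    for field, action in FIELD_ACTIONS:
        # An action is offered only when its field is present and non-empty.
        if poi.get(field):
            actions.append(action)
    return actions


# A made-up POI response for illustration.
example_poi = {
    "title": "Library",
    "url": "http://example.org/library",
    "location": (55.94, -3.19),
    "phone": "+44 0000 000000",
}

print(available_actions(example_poi))
```

Developer-defined events (the last item in the list) would sit outside a fixed lookup like this, since the developer supplies the handler rather than the browser.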

Criteria 7: Offline mode

Online only application requires a network connection at all times to work properly
Offline application also works offline – data is updated by obtaining a new version of the application
Cached layer Channels or layers can be cached while online.
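The "cached layer" behaviour can be sketched roughly as below. This is an illustration under stated assumptions, not how any of the browsers implement it: `fetch` is a hypothetical caller-supplied function that returns a layer's POI list and raises `OSError` when there is no connection, and the cache is simply one JSON file per layer.

```python
import json
from pathlib import Path


def load_layer(name, fetch, cache_dir):
    """Return POI data for a layer, caching it while online.

    Falls back to the cached copy when `fetch` fails with OSError,
    and re-raises the error if the layer was never cached.
    """
    cache_file = Path(cache_dir) / f"{name}.json"
    try:
        pois = fetch(name)
    except OSError:
        if cache_file.exists():
            return json.loads(cache_file.read_text())
        raise
    # Online: refresh the cached copy for later offline use.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(pois))
    return pois
```

By contrast, an "online only" browser would stop at the first failed fetch, and an "offline" one in the sense above would ship the JSON inside the application and never call `fetch` at all.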

It’s possible to view the criteria in the table from the perspective of the user (I want to do cool stuff) and from the perspective of the developer (I want more control over what the user can see and do). To visualize this I scored the browsers against user and developer preferences, using a fairly arbitrary scoring system against the criteria discussed above and adding an extra “build quality” criterion to score the user axis and a “developer tools” criterion to the developer axis to get the bubble chart below. The size of each bubble represents the corporate strength of the organisation behind the browser. As the AR browser game is a winner takes all land grab, having some strong venture capital or solid private investment is not an irrelevant consideration in choosing a tool that meets your needs. It goes without saying that charts such as that below need to be taken with a pinch of salt. It all really depends what your needs are as a developer or content publisher.
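To make the scoring idea concrete, here is a minimal sketch of how a bubble's position and size could be derived. Everything specific in it is an assumption for illustration: the split of criteria between the two axes, the unweighted sums, and the example numbers are made up and are not the scores behind the chart.

```python
from dataclasses import dataclass

# Assumed split of the seven criteria between the two axes.
USER_CRITERIA = ("user_actions", "ar_content", "poi_actions", "offline")
DEVELOPER_CRITERIA = ("registration", "publishing_api", "application_api")


@dataclass
class BrowserScores:
    name: str
    criteria: dict           # criterion name -> score
    build_quality: int       # extra criterion on the user axis
    developer_tools: int     # extra criterion on the developer axis
    corporate_strength: int  # drives the bubble size


def bubble_point(scores):
    """Return (user score, developer score, bubble size) for one browser."""
    user = sum(scores.criteria[c] for c in USER_CRITERIA) + scores.build_quality
    developer = (
        sum(scores.criteria[c] for c in DEVELOPER_CRITERIA)
        + scores.developer_tools
    )
    return (user, developer, scores.corporate_strength)


# Made-up scores for a fictional browser.
example = BrowserScores(
    name="ExampleBrowser",
    criteria={
        "registration": 2, "user_actions": 3, "publishing_api": 1,
        "application_api": 0, "ar_content": 2, "poi_actions": 4, "offline": 1,
    },
    build_quality=3,
    developer_tools=2,
    corporate_strength=5,
)

print(bubble_point(example))  # (13, 5, 5)
```

A plain sum treats every criterion as equally important; anyone with a particular need, such as offline use, would want to weight the relevant criteria more heavily before reading anything into the chart.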

What we left out

We did not include the following browsers in our evaluation:

SREngine: this looks like a promising framework for performing visual search but appears to be focused on the Japanese market for the moment. As a result we struggled to get enough English language documentation to fill in our classification matrix and also could not get the app from the UK AppStore.
GeoVector World Surfer: World Surfer provides an appealing browser that allows you to point the handset and discover POI. There is a reality view but this only works on a single POI the user has already selected. There does not appear to be any developer access to either publishing or application framework at this point.
AcrossAir: an AR browser that is sold as a marketing tool to content providers. The vendor controls both publishing and application development so there is not much scope for developers to utilize this platform unless they have a marketing budget.
RobotVision: one of the most impressive independent offerings, but there are no APIs for developers and the project seems to have stalled with no recent updates on the AppStore.