Friday 19 October 2007

The Potential of Smartphones

So often in the mobile phone business, people have approached these devices as merely mobile versions of immobile technology. Thus the "mobile web", "mobile mail", "mobile phone", and so on. But what if we approached smartphones from the perspective of what they are, what they can do, and what we could do with that?

An example of a technology that is completely "home grown" to the mobile community is SMS. This was not a planned service in the same way as MMS or WAP -- the operators and handset manufacturers did not carefully design and market SMS. It was simply a capability available on the phones and network that suited people's needs, and so it took off. (Note that SMS was not "mobile instant messaging"; it was simply a messaging facility for operator use which turned out to work wonderfully peer-to-peer. Its mode of operation is, in fact, quite different from IM, and it is only recently that efforts have been made to fit it into an IM type of UI, such as on the iPhone.)

So what could we come up with in the future, and how do we go about it? Or do we have to rely on accidents?

I think the first thing we need to understand with this approach is exactly what a phone is, and how it fits into people's lives. So let's tackle that one first.

What is a smartphone?
If you are intimately familiar with smartphones, you can skip this section. Still, it's fascinating to take stock of just how much functionality modern smartphones pack in, and to think about the uses of that technology independent of the actual features.

A smartphone:
  • Is a small computer with substantial CPU and memory resources
  • Is carried almost everywhere
Smartphones have:
  • Relatively small screen (sometimes touch sensitive)
  • Small keypad or keyboard
  • Built-in phone, for telecommunications with other people
  • Microphone and the ability to record from it
  • Speaker and the ability to play music and sounds
  • Usually a camera (or two) with the ability to capture stills and video
  • Internet connection which is usually, but not always, available
  • Bluetooth connection which can detect and communicate with neighbouring devices
  • Infrared connection which can communicate with neighbouring devices
  • Positioning information, available via cell ID or built-in GPS
  • Databases for contacts and calendar information

Some smartphones have:

  • Light sensor
  • Motion detectors (eg. iPhone, Nokia 5500)
  • Near field RFID units (eg. mobile Suica)

What can we do with this?

So, the question is then, what can we do with the (fairly impressive) bundle of functionality that millions of users carry around with them every day?

Well, some ideas are pretty straightforward:

  • Use it as a PDA (to keep contacts, calendar, and notes)
  • Use it as a mobile web browser (slowly starting to become viable as screens get bigger and, more importantly, CPUs get fast enough to present the web in readable ways on a small screen)
  • Use it as a mobile email terminal (RIM has been particularly successful in this area, although something like an E61 or P990i/M600i on an operator-provided IMAP push service is just as good, and much cheaper)
  • Use it as a constantly-updated weather chart
  • Use it as a navigation unit, with maps, current location (via GPS), dynamic routing, and even dynamically updated traffic status
  • Use it as an e-book reader

What's obvious about these ideas is that they have all been transferred from devices that already exist. PDAs, web browsers (on desktops and laptops), email, online weather, GPS navigation, and e-book readers are all technologies that have been around for quite some time. Putting them on a smartphone certainly makes them more accessible, and thus more useful, but doesn't really transform the way they integrate with people's lives.

Are there other ideas that can be built from the smartphone's capabilities itself? Yes, of course there are, and here are some we've seen:
  • Lifeblog: using the camera, location and time information, and recording snippets of your life along with some comments on it, creating a multimedia "life blog". http://www.nokia.com/lifeblog
  • Sensor: using Bluetooth and a personalised profile to discover and meet people in your immediate vicinity with matching interests. http://www.nokia.com/sensor

The problem with these ideas, and why they haven't taken the market by storm, is that they're really just curiosities. They don't meet a real need. How many people complain that they don't have a sufficiently rich record of their lives? Not many.

New ideas

Are there other ideas that take advantage of the capabilities of a smartphone and meet a real need? I think there are many, and I'll talk here about one, which I would love to see implemented.

Imagine a service on your smartphone which took advantage of its computing ability, its knowledge of your location, and its connection to the internet. Imagine that you could specify a destination and a desired arrival time, and this service could go off and discover all the ways you could get to that destination at that time, then present you with the options, and then book your chosen options, and finally remind you about when you needed to get moving to get there, and guide you through the process.

For example, I might want to fly from my home on the Gold Coast, Australia, to a hotel in Hong Kong. The software would start by attempting to find a route from start to end. Once it knew this, it would attempt to find services on this route, starting with the most irregular and expensive (ie. the flight from Australia to Hong Kong), and then work down to the simplest (eg. getting to and from the airports). Then it would present me with a range of "best-case" options (no point confusing me with lots of almost-identical options), and I would choose what I wanted for the various legs. Remembered preferences (for example, I prefer the train over driving) would make the choices easier by prioritising them so my most likely choices are the first ones I see. Finally, it would take my choices, book them for me, and keep the e-ticketing information.
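The option-ranking step in particular is easy to sketch. Here is a minimal toy illustration (not any real booking API; the `Option` class, the `PREFERENCES` weights, and the leg names are all invented for the example) of how remembered preferences could prioritise each leg's options so the most likely choices come first, trimmed to a few "best-case" candidates:

```python
from dataclasses import dataclass

@dataclass
class Option:
    leg: str
    mode: str
    price: float

# Remembered preferences: higher weight means a more preferred mode.
# (These weights are invented for the example.)
PREFERENCES = {"train": 2.0, "bus": 1.0, "car": 0.5}

def rank_options(options):
    """Sort one leg's options so the user's most likely choices come
    first: preferred modes first, cheapest within the same preference."""
    return sorted(options, key=lambda o: (-PREFERENCES.get(o.mode, 0.0), o.price))

def best_case(options_by_leg, keep=3):
    """Trim each leg to a few best-case options, so the user isn't
    confused by lots of almost-identical choices."""
    return {leg: rank_options(opts)[:keep]
            for leg, opts in options_by_leg.items()}
```

With these weights, a "getting to the airport" leg would list the train before the car, reflecting the preference described above, with price only breaking ties within a mode.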

Then, when the time came to make the trip, the software would remind me when to start (maybe by putting appointments in my calendar), would present the e-ticket information when I needed it, and would guide me through the confusion of interchanges, and so on.

All of this information is available on-line today. All of the technology required to do this is available. Much of the infrastructure, such as mapping and routing technology, is freely available for this type of "mash-up".

What's missing is a good UI running on the phone, which integrates well with the phone's capabilities (its small keyboard and screen, its calendar database and positioning technology), and provides fluid, friendly interaction.

There are obvious add-ons to this service, such as the ability to find and book accommodation given your parameters and choices. An even more powerful addition would be the ability to carpool with others who are heading in the same direction. Sharing aggregate data with transport providers could even allow them to improve the quality and efficiency of their services.

Mass market?

The question is, are these types of services useful for many people? Do they have mass market appeal?

If the purpose is to plan large-scale trips like Australia to Hong Kong (or even interstate within large countries), the answer would have to be no. However, if the technology can handle small-scale trips like meeting someone in an unknown pub at a certain time, then this is far more useful to the general user.

The tipping point is based on usability and price. It has to be easier to use the phone to discover, book, and schedule your trip than it is to do it yourself. If you are looking at a regular trip (such as a commute), it's unlikely that you'd use such technology, unless it gave you benefits such as carpooling or a discount ticket (offered by the transport provider to encourage use of such services so that it could better plan its capacity). But an irregular yet still planned trip using public transport -- such as a weekend outing, or a meeting with mates or for work -- presents planning and information-gathering demands that a smartphone could easily handle.

Personally, the idea of being able to quickly and easily find my way to a meeting, without wasting time at connections or stressing about figuring out the best way there, sounds like a dream come true.

The key to these ideas

Perhaps the key to this approach is to understand the smartphone as an "invisible" tool. A tool that is simply the conduit for desires and information. The idea I've mapped out can include peer-to-peer functionality (with carpooling), which many believe to be a key to success, as well as information and service delivery.

This is the beauty of smartphones: they can span so many "worlds" that they can do all sorts of exciting things. Let's not just create "mobile" versions of existing, desk-bound services, let's try to create truly unique services with the capabilities available right now!

Thursday 18 October 2007

A Future for Symbian Smartphones

Come dream with me for a while. Dreaming about technology, especially when those dreams are firmly rooted in current reality, is a truly helpful way to determine what trends and features are important to the big picture.

Imagine it's ten years in the future.

The mobile internet is a given, devices are tightly integrated, with precise positioning available, along with sophisticated mapping information, routing, and so forth. RFIDs (radio frequency IDs) abound, allowing RFID readers in smartphones to detect what is around them at any time. Businesses have moved much more of their catalogs online, encouraging richer consumer-level e-commerce.

But your smartphone is no longer the monolithic device we're used to. Its functionality has now spread much further. Your watch alerts you, shows simple information (such as what the alert is about), and allows simple interaction (for example, accept, reject, or delay). Your headset now sits inside your ear, offering noise cancellation or transparency, depending on context, as well as voice recognition. Your glasses offer a large, high-resolution heads-up display. A lightly textured piece of cloth on the side of your leg is pressure sensitive, allowing discreet, chorded input (pressing different combinations of fingers together to indicate a single letter). All of these are networked via a low-power personal area network (the descendant of Bluetooth) to the phone, which you hardly ever take out of your pocket.

Your phone itself roams from network to network, using whatever resources are available to it at the time, depending on your preconfiguration. It has terabytes of storage, a powerful CPU, and a variety of high-speed wireless data connections. But you rarely take it out of your pocket, except to take a high-resolution photo or video.

Imagine grocery shopping
It's grocery shopping time. Can't remember what to buy? A few quick commands bring up your phone's memory of the RFIDs it detected in the fridge that morning (you have an old fridge which isn't on the internet), your Food Management software compares that to your observed weekly requirements (the average of your requirements each week), and makes a tentative list.
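The list-building step here is essentially a set difference. A toy sketch (in Python purely for illustration; the item names and quantities are made up, and real RFID payloads would of course be richer than plain strings):

```python
def tentative_list(detected, weekly_needs):
    """Compare the RFID tags seen in the fridge against the observed
    weekly requirements, returning each shortfall as item -> quantity.

    detected:     {item: count of tags the phone saw this morning}
    weekly_needs: {item: average count consumed per week}
    """
    return {item: need - detected.get(item, 0)
            for item, need in weekly_needs.items()
            if detected.get(item, 0) < need}
```

Anything the fridge is short of, relative to the weekly average, lands on the tentative shopping list with the quantity to buy.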

As you walk along the supermarket shelves, the heads-up display brings up alerts when you approach items you want (as the phone detects their RFIDs), even showing you what they look like, and emphasising if they're on sale (as specified in the store's online catalog).

While shopping, your list suddenly gets some urgent updates. Your spouse, Kris, has just added some triggered shopping requests for you, and since you're already in the shop, the trigger has added the items to your list. At the same time your watch buzzes subtly. A quick glance shows that those shopping items have a purpose: friends are coming to dinner in two hours, the recipe will take an hour to prepare (according to the online database it was snarfed from), and you're half an hour from home. Not much time to waste. You accept the alert.

Imagine travelling
As you leave the shop, pushing the trolley through the RFID reader and typing your PIN in acceptance of the charge to your account (offered by your phone), your schedule suddenly changes, though you're not yet aware of it. Your friends' flight has been delayed and their phones have informed yours.

You become aware of the extra time you have when you walk past the newsagent, and a triggered event reminds you that you want to check out the range of birthday cards. Surprised, you actually look at your schedule, and note that the dinner has been delayed by half an hour. Still, you're not in the mood to look at cards, so you reject the suggestion and continue home.

On the way home you listen to the latest e-book on your phone, read out over the car's sound system. Close to home you need to make a detour to avoid an accident. The e-book is interrupted with a gentle request: "Jenny's going to be ready to go home from school in four minutes, the detour could go by the school, would you like to take her home with you?"

You query the system, "Where is Kris?" And it responds that your spouse is still at home.

You accept the opportunity to share journeys with your daughter, and stop at the school, listening to the e-book for the couple of minutes until Jenny gets into the car. Then on the way home you talk about your days. Waiting at a set of lights, Jenny unfolds her phone's display to show you a diagram she drew at school -- its clarity of layout impresses you, although Jenny points out that it works even better on a display bigger than the unfolded A4 display of her phone.

At home
When you arrive home, your spouse asks you to prepare the dinner. You find the activity for cooking already tentatively in your schedule, with a link to the recipe. The cooking schedule is linked to the ETA for your friends, and needs to start soon (they travelled a little earlier than expected). So, you put away the food you don't need, and start cooking, following the instructions and pictures on your glasses' heads-up display (even for things you don't need glasses for, the heads-up display is so useful that you tend to use what used to be called "cosmetic spectacles", ie. glasses that have no corrective function).

While waiting for a particular stage you check out the afternoon's headlines (who wastes time sitting down to watch the news anymore?). As you approach the last stage, you see that your friends are scheduled to arrive in a few minutes, and so they do. You spend the rest of the evening happily offline, catching up. Your phone withholds all but emergency alerts (of which, thankfully, there are none).

How does it work?
This sort of scenario seems like so much science fiction to us, doesn't it? And yet there's not much there that current technology can't manage (the prevalence of RFIDs, the integration of data systems, and the more advanced interfaces to the phone are the major advances).

The point is that this seamless integration, of phone software and services with online services, and of phone hardware and systems with other interface systems, is the way that smartphones are heading.

For example, doing voice recognition in a headset has significant advantages in keeping personal area (or near field) communications to a minimum. Having glasses with heads-up displays capable of running substantial chunks of software has the same benefit, in addition to ensuring smooth animations without high network bandwidths or embarrassing dropouts.

Achieving this type of distributed, highly integrated software requires a system that is both a light user of resources (glasses have very little space or weight for CPUs and batteries, so native software, rather than resource-consuming software layers, will be required) and able to be distributed (i.e. a microkernel-based system with strong inter-process communication). That's Symbian (but not Windows or Linux).
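To make the idea of distribution concrete, here is a toy model (in Python rather than Symbian C++, purely for illustration; the `Bus` class and the message formats are invented) of small messages flowing between headset, phone, and watch over a shared personal-area bus -- the message-passing pattern that a microkernel's IPC makes natural:

```python
import queue

class Bus:
    """Toy personal-area network: every registered device gets an inbox,
    and any device can post a small message to any other."""
    def __init__(self):
        self.inboxes = {}

    def register(self, name):
        self.inboxes[name] = queue.Queue()

    def send(self, dest, msg):
        self.inboxes[dest].put(msg)

    def receive(self, name):
        return self.inboxes[name].get_nowait()

bus = Bus()
for device in ("headset", "phone", "watch"):
    bus.register(device)

# The headset recognises speech locally and sends only the text,
# keeping near-field traffic to a minimum.
bus.send("phone", {"from": "headset", "text": "where is Kris"})

# The phone handles the query and pushes a short alert to the watch.
query = bus.receive("phone")
bus.send("watch", {"alert": "Kris is at home"})
```

The point of the sketch is that each device does its own heavy lifting (speech recognition in the headset, rendering in the glasses) and only small, structured messages cross the network.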

And note what happens in the scenario: there is little need for a PC for most everyday tasks. Rather than desktop OSes scaling down, it seems likely to me that powerful device OSes will scale up.

Is this possible? Yes. Is it likely? Well, that depends on the vision of the handset and accessory manufacturers, operators, ISVs, and even retailers. Open interfaces are critical to this vision, and are the part of it least likely to come true.

Let's all hope that there are enough people with vision to help make this a reality.

Friday 12 October 2007

State of Play for Symbian ISVs

This article is a quick summary of the state of play for Independent Software Vendors creating software and services for Symbian OS devices. Each section contains a short history, current situation, and likely future directions (including suggestions).

Issues tackled include why ISVs should be supported at all, the range of devices available to Symbian ISVs, the possibilities for the ecosystem in ISV-provided services, the channels to market for third party software, the range of technical documentation available to support development, the process of Symbian Signing, the state of APIs available to ISVs, and the development tools available for software creation.

While not all of these may be of concern to you, certainly one or two will be of interest to anyone involved in the Symbian ecosystem.

1. Development tools

Development tools are critical to ISVs -- they can make the difference between being able to make an application profitably and being unable to do anything at all.

In the Symbian world, we've had three "preferred" development tools over the last seven or so years:

Visual C++ -> Metrowerks -> Carbide

Carbide has finally become usable in v1.2, and is a quite reasonable IDE. However, it still has some serious limitations that are very obstructive in terms of developing Symbian apps.
  1. On-device debugging doesn't work properly (it is very clunky with static DLLs, which Symbian encourages developers to use, and doesn't work at all for memory-shortage debugging)

  2. There are simply no RAD development tools for UIQ 3; given the complexity of UIQ 3's very powerful resource files, such tools would be a great help.

2. Application Program Interfaces (APIs)

Until recently, Symbian supported three separate APIs: S60, S80, and UIQ. In the last year or so that has narrowed down to two: S60 3rd Edition and UIQ 3. This is obviously an improvement.

However, getting here has been very painful. Constant changes to S60 (including breaking compatibility in many areas), in addition to the radical change between UIQ 2 and UIQ 3, have necessitated serious porting effort for many vendors. This effort would have been better spent improving the products.

At least with UIQ 3, the effort spent on porting now works on two classes of phone (touch screen and soft key).

In the future, this should be able to be better managed, and it seems as if this will be the case. Nokia has a "platform evolution" page which shows that future editions of S60 have a "Compatibility promise". So long as Nokia sticks to this, ISVs will be much more productive. As far as UIQ is concerned, they seem to be inherently more careful about compatibility than Nokia, and took the opportunity of the forced platform incompatibility caused by PlatSec to totally revamp their APIs, ready for the future.

3. Signing

The whole Platform Security (PlatSec) issue, which requires application signing for certain purposes, started out as a bit of a nightmare. Signing was expensive, clumsy, and non-trivial. The simple fact that an application required re-signing for even a change to text resources prevented many developers from signing applications, because it "locked in" the release version, preventing incremental fixes and improvements. Furthermore, the expense of signing versus the potential return was all out of whack.

Symbian have been slowly working on this, introducing initiatives such as freeware signing (very slow and quite heavily restricted though it is), cheaper Publisher certificates, cheaper test houses, and so on.

They have just announced a new set of proposals, which are very encouraging, especially new signing methods such as the "Express Signed" process (which allows instant signing with the submitted apps being audited later). It seems that Symbian recognise the limitations of the signing process, as well as the strengths, and are making sincere efforts to minimise the pain while maintaining the advantages. I think Symbian are doing a good job here.

However, automated test tools must be delivered (UIQ 3's automated test tool hasn't worked for well over a year, preventing developers from pre-testing their applications before submission), and more precise test plans must be available (or how can you prepare an application for testing?).

4. Documentation

Back in the EPOC days, Symbian started with very poor documentation, but this was partially mitigated by the availability of much of the UI source (which also helped in debugging).

With the release of S60 and UIQ SDKs, we were moved on to slightly better (but still poor) doco with no UI source -- a significant (and very frustrating) backwards step.

The current situation is certainly much better, with more complete documentation and a growing set of examples. However the SDKs are still incomplete. (For example, where is the UIQ 3 documentation describing how stand-in controls work? This is a fundamental UI concept in UIQ 3, and yet there is nothing to assist an ISV to create their own stand-ins.)

The SDKs need serious work to complete them. The S60 and UIQ developers need to remember that this is an OO framework, so deep knowledge of the framework is needed for subclassing, etc. It seems that none of the OS/UI providers fully understand this.

Microsoft is still the role model in this regard.

5. Channels to market

Symbian development started with no official channels to market. Developers simply sold on their own websites. Over the years, independent resellers (such as Handango) have popped up. Some of these have been adopted by operators or handset manufacturers as official channels (and some have lost that official status).

The situation now is that premium channels (vendor and operator) have higher requirements (eg. signing), making it more difficult to get quick, light product to market, or to provide incremental updates. Generic channels (such as Handango or Mobile2day) present significantly fewer barriers, but still demand a substantial cut of the sale (often almost half) for relatively little effort on their part (they do no front-line or Level 1 support, for example, merely sales support). Furthermore, agreements between these distributors are often incompatible (for example, Handango requires no references to anything but Handango in software they sell, which, combined with the re-signing required for textual changes like that, adds a real cost to dealing with Handango). Finally, generic channels don't always have great results anyway, since they are pretty much invisible to the average smartphone user.

Premium channels are now starting to be promoted on phones (such as Sony Ericsson's move to put their application shop on the home screen of the P1i), but this still needs more work.

I think that, in the future, we need time-of-handset-sale channels. Many users still don't realise smartphones are extensible, and so don't try to improve their user experience even though they may want to if presented with that option at time of sale. Operators' shops could do this, but have been very poor at implementing it.

SE (with their choice program in Asia) and Nokia (with their Downloads facility) seem to be attempting to improve this situation, but these initiatives need plenty of promotion, and some retail-level training (ie. by the operators) would probably help.

6. Services

In the past there has been no provision of a services platform by the operators or handset manufacturers, so any services provision (such as Mobimate's weather downloads) has been purely ad hoc and provided by the ISVs themselves.

Nokia's Ovi may be the first approach to making service provision easier. Hopefully Ovi will be open to ISVs, providing a framework for services, including access to infrastructure such as mapping, routing, etc. This would allow much richer provision of services, which benefits everyone: users, operators (service provision has to travel over their networks), and handset manufacturers (their handsets have more, richer features).

SE needs to mirror this with a UIQ services facility.

Important service infrastructure that would be very beneficial to provide in such offerings would include: better sync platforms, including open support for sync plugins (at both ends); generic databases for data storage; web-scraping engines for collating information; location, mapping, and routing tools; and so on.

7. Devices

In the area of device availability and capability, the Symbian licensees (at least Nokia and SE) haven't made too many mistakes. Certainly Nokia's proliferation of devices with minor compatibility issues has caused (and, judging from the new N-Gage issues, is still causing) some problems, and SE's struggle to produce new devices has caused its own problems. But compared to the other platforms, Symbian has been well served.

Some real improvements that I would like to see in the near future would include features like video out that scales to a larger screen, and other features to help the phones double as PCs. Symbian OS is a powerful, real OS (even theoretically capable of being distributed across multiple devices thanks to its microkernel and IPC model), so there's no reason to limit it.

Of course, better development tools would be helpful here, too, as well as remote testing capability such as already provided by Nokia.

8. Why should ISVs be supported at all?

Now, given how demanding ISVs seem from the article so far, handset manufacturers may well be asking themselves, "What is the point of all this? Why bother with these demanding ISVs?"

It's not hard to provide an answer. Just look at Microsoft.

How successful would Windows be without its vast ecosystem of ISVs? Even when MS has been predatory towards particularly successful types of third party software (such as office applications and web browsers), MS have still benefited enormously from all that innovation that they simply cannot do themselves.

In fact, considering how unexciting PCs and Windows are by themselves, could you ever imagine their extraordinary level of success without ISVs? The phone manufacturers have devices that are, inherently, much more exciting than a PC, but ISVs are capable of multiplying that wow factor, transforming these cool little camera/music-player/phones into devices that co-ordinate, communicate, track, instruct, protect, inform, educate, capture, assist, remind, and guide your life.

As for the operators, well, they stand to benefit even more -- what use is a data network without useful applications to use that data?

Wednesday 12 September 2007

Causes of piracy in the Smartphone market

Some time ago, Alex Kac of WebIS wrote an appeal to users of cracked software.

From DreamSpring's perspective, the same issues are very significant. Despite the smartphone market being so large, the market for third party applications seems woefully small. Why is this?

Certainly the prevalence of cracked software is one contributing factor, and a very major one at that.

Our own research has uncovered cracker forums where people gather, praise DreamConnect (our major product and money-maker), and ask when the cracked version is coming out. Can you imagine how galling it is for people who have spent bucketloads of money and months of effort crafting a product to see users praising a cracker for his crack of that product! What are these crackers, and more to the point the people who use the cracked software, thinking?

And that's a valid question. Do people really think US$25 is too much to pay for software that they'll freely praise on a cracking forum? It seems so. It seems that they'd rather run the risk of installing trojans on their phones than pay a lunch or two to support the product (and to get product support).

Trying to understand why people do that is critically important. I'm not going to pretend that I have the answers, but here's a few thoughts. If you have anything to contribute, such as your reasons for hesitating to buy software, please leave comments.

My own attitude

Before I get started, I should make it clear that I find myself reluctant to pay money for software to run on my smartphone (a P990i, which I am very happy with). So I can sympathise with those who are reluctant to buy software. However, long ago, I made a decision to not copy software (or CDs, etc.). If I'm not willing to pay for it, I don't deserve to have it (assuming "it" costs money).

As a result, I try to find freeware to do what I need, or just get by without it. On the P990i, I use the free version of Swiss Manager, for example, because I don't feel I can justify the pro version. I also use Mobipocket because it's free. I currently only have free Bibles in Olivetree (although I plan to buy a modern translation). I own a copy of Documents To Go (because I finally gave up on QuickOffice and its buggy Bluetooth keyboard interaction), and that's about it (apart from DreamConnect, of course, which fixes several fatal flaws in the Contacts application).

So I understand a bit of where people are coming from, and these thoughts come from my attitudes as much as observing others.

Volatility of the platform and device

A major factor in my reluctance to spend money is the volatility of the smartphone platform, and the limited working lifetime of the phones themselves.

Take a look at this page from Nokia: S60 Platform Evolution. Note that in S60's short life so far (up to S60 3rd Edition), we've had two compatibility breaks (one fairly major, and one complete). (UIQ has had only one break, but that was huge. Our porting effort from DC 2 to DC 3 was equivalent to porting from UIQ 2.1 to S60!)

Add into this the fact that people change phones regularly, and you can see that, even if they remain loyal to the platform, there is no guarantee that their current investment in software will transfer to their next phone. In fact, with such major breaks in compatibility, vendors will almost be forced to charge again for their new versions (as we did, due to the huge effort involved).

And people don't remain loyal to one platform, since these devices are more than just computing platforms. People make decisions on which handset to buy based on lots of features, not just their software platform.

Phones differ vastly compared to PCs. Just think how different an N95 is from an M600i:
  • Different OS

  • Totally different pointing mechanism (nothing -- a joystick is just arrow keys arranged in a certain pattern, it is not a pointing mechanism, unlike IBM's keystick, or a touchpad -- vs. a touch screen)

  • Lots of application-specific buttons (music) vs. one (internet button)

  • Numeric keyboard vs. Qwerty keyboard

  • Thick vs thin (this is the trivial sort of difference laptops manage, but it's more important with phones, because you carry them in your pocket)

  • GPS vs. none

  • Wi-Fi vs. none

  • 5MP camera vs. none

  • Different included applications

These differences are huge compared to the differences between a MacBook Pro (which our marketing department uses) and a Dell laptop (engineering -- how stereotypical, eh?). The main differences between those two are the platform, the included apps, and the thickness. They both have virtually identical technology included. Oh, the MacBook has a built-in webcam (which is utterly pathetic compared to even the most basic modern cameraphone), but newer Dells have that too.

So, given these differences, platform and software compatibility are almost swamped in the decision-making process.

The end result of all this is that a user of smartphone software only has a very short period to get a return on their investment. This leads to a very tight market for ISVs.

User Attitudes to Phones

Another contributing factor, which is related to the volatility of phone platforms, is the general attitude of users towards phones. Very few users think of phones as software platforms. In fact, most people I talk to view a phone in much the same way they view a washing machine: it's an appliance that does what it says on the box (or not, depending on quality).

So the vast majority of smartphone buyers simply don't look for add-on applications. The only way to change this attitude is by a concerted effort from both the handset manufacturers and operators. The handset manufacturers have finally started doing this, through various measures from branding to prominent positioning of a link to the software shop, but the operators are still bumbling along.

From my observations, it seems likely that the user needs to be presented with the opportunity to extend their phone while they are in the process of purchasing it. This would require the operator's shops to have some form of mechanism (along with limited training for their staff) which would profile the user and present them with applications that they would be likely to find useful. If they could then purchase these apps included in the price of the phone and plan, then I would expect much better uptake. At the very least, a range of applications should be visibly available for purchase at the retail shop.

But I have never yet seen this done, and I've searched across several continents.

Conclusions

Piracy is a serious issue for mobile software developers. The nature of the hardware platform encourages it, and the nature of the retail process discourages proper purchase of software.

Moves to make pirated software easier to detect are only useful insofar as users desire to avoid pirated software. Currently, there is not a great desire to do so, and large-scale promotion is required to remedy this.

So, ISVs simply have to hope that the operators and handset manufacturers wake up to the value that independent software adds, and try to protect and encourage the industry before it gets starved to death.

Addendum (27th Sept)

I've just bought a new phone in Hong Kong, one of the most open mobile phone markets in the world (almost all phones are unlocked and the vast majority of new phones are sold SIM-free). It was a new Sony Ericsson P1i, which has just been released here and seems to be selling well. During the sales process I was allowed to see the phone working, shown the IMEI number on the screen and on the box (for guarantee purposes), and given gifts consisting of a box of Chinese add-ons (a third party battery, USB battery charger, and phone case) and an Adidas cap. At no point was I offered any extra software for the phone. At no point was it even hinted that this device could be used for features beyond what came in the box.

I bought this phone at one of the largest retailers in HK: Broadway (the big shop in Mong Kok, on Sai Yeung Choi St N). So that's the situation in one of the more progressive markets. What hope do we ISVs have elsewhere (apart from very close relationships with the operators or handset vendors)?

Tuesday 3 July 2007

Carnival of the Mobilists #80

Carnival #80 is up over at mobilejones. As always, it's worth checking out. While you're there, have a browse around the blog.

Unfortunately, mobilejones mischaracterises my piece by implying that I condemn all web apps on the basis of a sample size of two: Google Gears (which isn't even an application) and Blogger. It's likely he read in a hurry, so he didn't pick up on two important points:
  • Google Gears is a framework, and I analyse how it will impact the reliability of web applications in a mobile context
  • Google Blogger is used for illustrative purposes -- it's always easier to talk about concepts by using a concrete example, and that's what Blogger was for my purposes.

Obviously I was a little too subtle in these points for a hurried read, but then these analyses have not been intended for quick reads (unlike earlier posts), but rather for careful consideration. Hopefully most readers will understand that.

Note that I'm not claiming that I am 100% right -- that would be rather arrogant of me. Still, at least I've worked through the matter, and even presented some models for evaluating how and when to choose the different software platforms. Hopefully this is useful.

Monday 25 June 2007

Why Web 2.0 won't work on smartphones -- Part III of Smart Phone or Mobile Browser

Since I wrote Part II of Smart Phone or Mobile Browser there has been a bit of activity on the Mobile Web 2.0 front:
  • Google Gears has been released (I noted this in an addendum)
  • Apple has announced that the iPhone's only development platform is its Safari browser.

This has stirred up the silt of online commentary, but hasn't really changed the lay of the land at all. Mobile Web 2.0 is still a poor cousin of native clients. And here are the reasons why.

The Story So Far

In Part I, I looked at the history of web applications, and the special character of markets (or the Japanese market, at least) where they've been arguably successful. In Part II I presented three factors that are required for applications to be enjoyable and practical to use and that Web apps are, by nature, poor at: responsiveness, reliability and privacy.

Has the Story Changed in the last month?

Google Gears changes the reliability equation a bit, but the equation is still in favour of native clients:

R(native clients) = R(client app) * R(client OS) * R(client HW)

while a Google Gears based app has a reliability of:

R(GG app) = R(client app) * R(browser) * R(client OS) * R(client HW) * R(GG framework)

It's easy to see that a GG app adds the (un)reliability of the browser and Google Gears framework into the equation. Thus, given the fabled instability of web browsers, and unknown stability of the GG framework, the client application code must be vastly more reliable in order to equal a native client application's reliability.
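To make the comparison concrete, here's a toy calculation of the two equations above. All the numbers are invented purely for illustration; they are not measurements of any real software.

```python
# Toy reliability figures, purely illustrative -- not measurements.
r_app = 0.99                          # reliability of the application code itself
r_os, r_hw = 0.99, 0.995              # shared by both scenarios
r_browser, r_gg = 0.97, 0.98          # extra factors for a Gears-based app

r_native = r_app * r_os * r_hw                      # R(client app) * R(client OS) * R(client HW)
r_gg_app = r_app * r_browser * r_os * r_hw * r_gg   # same code, plus browser and Gears

# How reliable would the app code need to be to break even with native?
break_even = r_app / (r_browser * r_gg)

print(round(r_native, 4), round(r_gg_app, 4), round(break_even, 4))
```

With these (invented) figures the Gears app loses nearly five percentage points of reliability, and the app code would need a reliability greater than 1.0 -- i.e. an impossibility -- to make up the difference.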

As for Apple's iPhone announcement, this has no impact at all. Some (presumably) ignorant commentators have waxed lyrical over how the delivery channel problems faced by mobile software vendors are magicked away by Apple's approach. I assume these commentators are merely ignorant (as opposed to plain dumb), since smart phones have supported this approach for some time, through recent releases of Opera mobile and Nokia's Safari-based web browser.

In any case, neither of these announcements really shifts the playing field much. Web apps are still much easier to deliver, but no easier to advertise or promote. And Web apps still have significant responsiveness, reliability, and privacy issues.

Is that the end of the matter, then? No, it is not. There is still one more critical factor to consider.

Usability

A major factor that I didn't talk about in the last post is usability. Anyone who has struggled with a complex web application like SugarCRM, or even a simple one, such as Google Blogger, which I'm using now, will be aware of the peculiar user interface that Web APIs force on application developers.

Let's take the simple case of Blogger, as an example. Basically, Blogger is a text editor which saves to a type of bulletin board instead of a file system. Frankly, it is a terrible text editor. It has no styles, only two types of lists, no table editor, extremely primitive layout capabilities, and a very old-fashioned spell checker (press a button to get your spell checking done). The problem is, Blogger doesn't allow you to use any old text editor, since the text editor is tightly integrated into the whole blogging framework. Tch, tch.

Even with its primitive features, the Blogger editor has terrible usability. The only shortcuts appear to be Ctrl+B, Ctrl+I, Ctrl+S, and Ctrl+Z, for bold, italic, publish (not save), and undo. There are no shortcuts for changing the font, and there is no way to insert a hyperlink whose text differs from its target without editing the raw HTML (shocking, I know). The editor is not even WYSIWYG! To access the various critical parts of the user interface, such as saving a draft, previewing (yes, previewing: no WYSIWYG here) the text, running a spellcheck, or starting a list, there are buttons to click with your mouse.

There is no menu bar, so any menus need to be attached to one of these on-screen buttons. Menu items have a very limited range of shortcuts to choose from, because the browser itself has already claimed many of the shortcuts.

In other words, what we end up with is a user interface with very constrained models of interaction, mostly point and click. There aren't even context menus to make interacting with objects easier. All of these useful UI mechanisms (keyboard shortcuts, menu bar, context menus, drag and drop) are either not available to the Web 2.0 developer, or they are very difficult to code (and thus lead to unreliable software).

How does this affect Mobile Web 2.0? Badly, as it happens.

The preferred UI interaction method for Web 2.0 apps is the on-screen button, clicked with a mouse. For touchscreen smartphones, this method of interaction may seem quite natural, until you look at how much screen real estate is needed for all these buttons. Ever tried to use Blogger on a VGA screen? It requires a lot of scrolling about, since Blogger only scales down to a certain size. This transforms simple pointing and clicking/tapping into the much more time-consuming sequence of scrolling, pointing, clicking, and scrolling again.

Mobile devices have created whole different device interaction models (such as the softkey model), which Web 2.0 apps simply have no access to at all. (How does a Web 2.0 app register commands on the softkeys, or place commands in the browser's menu?)

Maybe the iPhone's touch interface will work with carefully tuned Web 2.0 apps, but I can't see them getting very sophisticated without becoming quite clunky. And all the cool effects of the iPhone's UI (zooming and sliding and scaling) will presumably be very difficult, if not impossible, to achieve.

The Alternative

So what is the alternative model to a Mobile Web 2.0 application? Clearly the "mostly connected" nature of a smartphone should be exploited, but so too should its native capabilities. So the obvious model is a native client application that synchronizes, when needed and as possible, with online services.

The native client application, with smartphone-resident data silo, has the benefit of:

  • Responsiveness
    Native client applications don't require bulky frameworks to start up or be kept in memory before they can run. I was reminded lately of how demanding this can be when using Google Maps on my P990i. It is a Java application, and Java takes quite a long time to start up and chews up a phenomenal amount of memory. Of course, on some devices (such as BlackBerrys) Java is the native framework, so it doesn't impose such a hit. But on devices where it is not, it is a very poor choice of API. And so is a web browser.

  • Reliability
    I think I've demonstrated pretty clearly why native applications are better at this, assuming that they are written with even a little attention.

  • Privacy
    Native applications, with a local data store, can keep information private to the device. When information is sent to the online service, it can be encrypted (if it is merely to be stored) or anonymized (if it is to be used to generate a reply). Whether this is being done can be validated by a third party, which is something that is simply not possible with an online software service of any type.

  • Usability
    Native applications can be developed to exploit the UI methods of the device. I was reminded of the importance of this, once again, by Google Maps. For some reason, Google Maps uses a UI that mixes its own menu style with the built-in style. It leads to constant cognitive dissonance as things react in unexpected ways. Even Opera Mini 4, which is a model of usability in its scrolling and zooming behaviour, suddenly gets weird when you tap on an input field or the like. (Oh, and it consumes a truly shocking amount of memory.)

When native apps have benefits in all these areas, which all have a direct impact on the end user, it's merely laziness or a very cavalier attitude to the user experience that encourages people to use the Web 2.0 platform for serious application delivery.

Having said that, where the main focus is browsing, rather than manipulating, non-critical information, Web apps are great. And this is where Web apps belong: in making a better Web.

Tuesday 5 June 2007

Carnival #76 is online

Carnival of the Mobilists #76 is now online at Twofones.

It's a very interesting read, this one, with lots of discussion about Mobile Web 2.0 and the technical issues surrounding it (which have shifted slightly since last week, with the announcement of Google Gears). It also has some interesting discussion on the new Palm Foleo -- a device as close to the old Psion Series 7 as I've seen for quite some time, but surrounded by all sorts of bizarre commentary (including some from Jeff Hawkins himself).

Anyway, head on over there, and make sure to have a look around Greg Clayman's blog while you're there.

Wednesday 30 May 2007

Smart Phone or Mobile Browser - Part II

In my first post on this topic, I talked about the history of web-based applications, and also quickly took a look at Japan, the land of mobile browsers (as opposed to smartphones).

In this post, I'll dig into specific issues with applications in general, and web-based apps in particular. So let's get stuck in.

In order for applications to be enjoyable to use, there are a number of factors that must meet certain strict requirements. Putting aside prettiness (which is a factor, but is less critical than these three), we all require the following from applications: responsiveness, reliability, and privacy (this last one is increasingly an issue in a thoroughly networked world). Let's look at each in turn.

Responsiveness

Responsiveness is a critical usability feature in any modern, GUI-based application. It is critical in GUI applications because we interact at such a fine-grained level with the application, performing little operations one at a time, such as typing a single character, selecting a single item, or choosing a single command.

Responsiveness is even more important in mobile applications, because we are operating these applications in high-demand situations. Situations where we need to achieve our goal in a strictly limited amount of time. Poor responsiveness will obstruct us from achieving our goal, and will force us into using alternative mechanisms (for example, something other than our phone) to achieve our goals.

There are two forms of responsiveness that are relevant to this discussion:

  • Interaction responsiveness

    This is the speed at which an application responds to our individual interactions: typing a character, selecting an item, or issuing a command.

    AJAX, Flash, and other technologies have vastly improved this type of responsiveness in Web 2.0 apps. There are still restrictions, but this type of responsiveness is rarely a problem any more. It has never been a problem for client-based software, except when the hardware was simply too underpowered, or an inappropriate technology (like Java) was used.
  • Invocation responsiveness

    This is the speed with which an application instance can be invoked. In other words, how long does it take to bring up the application in order to do some work in it?

    This is an area that Web 2.0 applications are poor at. Web 2.0 apps need to be downloaded into the browser at every invocation. Browsers are heavy pieces of client software, and tend not to be left pre-loaded in phones, so the invocation of the browser is another issue that tells against Web 2.0 apps. Finally, few phones give bookmarks equal priority with installed application icons, so reaching a Web 2.0 app requires a multistep process -- start the browser, open the bookmark list, choose a bookmark, log in. This last problem is easy to solve, though, and would make a nice, simple utility.

    An example of poor invocation responsiveness is the way people will often use a physical phone directory in preference to a web based one, because turning their computer on, starting it up, starting up the web browser, and going to the online directory takes far longer than picking up the phone book. The fact that searching for names can be much faster online than in the book (an example of interaction responsiveness) is often irrelevant.

    Interestingly, mobile phones and PDAs in general make great efforts to improve invocation responsiveness, being designed to be constantly turned on, carryable, etc., so any reduction in invocation responsiveness really cuts against the grain.

So, Web 2.0 apps are now responsive in interaction, but still poor in invocation. The invocation issue is exacerbated in a mobile environment because of the unreliability of the mobile data connection. Which leads into the next topic.

Reliability

The reliability of any application can be expressed via a simple equation.

Let's take reliability as a number between 0 (never works) and 1.0 (works 100% of the time), ie. a probability of something working.

Let's then write the reliability of a component as R(component).

So:

R(system) = R(component1) * R(component2) ...

In other words, the reliability of a system is the product of the independent reliabilities of its components. (This requires the reliabilities to be independent -- if they are dependent, the easiest way to handle this is to collapse them into one measurement.)

So, the reliability of client-based software is:

R(client software) = R(client app) * R(client OS) * R(client HW)

Or, in English, the reliability of client-based software is the product of the reliabilities of the client application software itself, the client operating system it's running on, and the client hardware.
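The product rule is easy to express in a few lines of Python. The figures below are invented, purely to show the mechanics:

```python
from functools import reduce

def system_reliability(*components: float) -> float:
    """The reliability of a system is the product of the (independent)
    reliabilities of its components."""
    return reduce(lambda acc, r: acc * r, components, 1.0)

# Hypothetical figures for an app, its OS, and its hardware.
r_client_software = system_reliability(0.99, 0.98, 0.995)
print(round(r_client_software, 4))
```

Note how multiplying several "pretty good" numbers already pulls the total below any single component -- which is the whole point of the equation.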

As a specific example, the reliability of DreamConnect 3 (a UIQ 3 contacts manager) is:

R(DC 3 app) = R(DC3) * R(UIQ3) * R(phone)

Until recently R(UIQ3) was pretty poor, so R(DC 3 app) overall was unsatisfactory. However, now all reliabilities are up, and R(DC 3 app) is at a level where a user can be happy. From the user's perspective, it is tempting to think only the final reliability matters, but users are more sophisticated than that. They can deduce that R(UIQ 3) was poor, for example, and that will encourage them to move to a different platform. Users can even differentiate between R(OS) and R(HW) if they have enough experience. As ISVs, we have direct control only over R(app), but we do have indirect control over R(OS) and R(HW) by deciding what platform and hardware to support. It is worth bearing this in mind.

So, how about Web 2.0 apps? What does their reliability equation look like?

R(web app) = R(AJAX app) * R(browser) * R(client OS) * R(client HW) * R(network) * R(server app) * R(web server) * R(server OS) * R(server HW)

As you can see, there are lot more components involved. Let's quickly work through them:
  • AJAX app: This is the component of the Web 2.0 app that runs inside the browser on the client machine. It may use some technology other than AJAX, but I'm just using that as a convenient label.
  • Browser: The browser is an important component of this type of solution -- it has the misfortune of being used as a development environment but needing to meet the expectations of a user application.
  • Client OS and Client HW: Same as for normal client software, except the reliability of the local storage has less of an impact on this scenario, since it is used only for invoking the client OS and browser, and not for storing the application data.
  • Network: The reliability of the network is a critical part of the functionality of a Web 2.0 app. The app is invoked across the network, various amounts of functionality are implemented in the server, and all data is stored on the server. The network reliability is thus fairly critical.
  • Server app: This is the component of the application that runs on the server side -- it often involves database code, (and the underlying database software, which is usually very reliable), etc.
  • Web server: The web server software itself, which is important to the function of a Web 2.0 app. Web servers are generally very reliable pieces of software.
  • Server OS and HW: The server's operating system and hardware, which is generally very reliable, more so than client equivalents.

So, what's the end result? Well, as mentioned above, the R(server) component (R(web server) * R(server OS) * R(server HW)) is very reliable, and we could probably approximate it as approaching 1.0, and so remove it from the equation. This still leaves:

R(web app) = R(AJAX app) * R(browser) * R(client OS) * R(client HW) * R(network) * R(server app)

We can further simplify this by saying that R(app SW) = R(AJAX app) * R(server app) and we could assume that, since this is under the control of the developer, it's likely to equal R(client app). So:

R(web app) = R(app SW) * R(browser) * R(client OS) * R(client HW) * R(network)

Now we can see that Web 2.0 software is less reliable than client software by the following factor (dividing the two equations):

R(web app) / R(client software) = R(browser) * R(network)

In the past, R(browser) has been very poor, and has dramatically impacted the reliability of any Web 2.0 software. I would argue that R(browser) is still a significant issue, and counts heavily against web software, including on a PC. Of course, the impact of R(browser) is less than, say, a storage failure, so long as the software is designed properly (ie. regular saving of information -- Google has just recently recognised this by putting a very regular auto-save feature into the Web 2.0 app I'm using to write this).

On the other hand, R(network) varies widely between desktop-bound PCs and mobile smartphones. R(network) nowadays is usually quite high for fixed networks, but for mobile networks it is still quite low, especially for 3G and when moving. For example, I only need to travel ten minutes west to drop out of 3G coverage into 2G, and a few minutes further (into the mountains) to drop out of 2G coverage as well. If I were using a Web 2.0 mapping/routing application (such as Google Maps), it would fail me almost as soon as I left the city heading west.

In conclusion, then, R(network) is an absolute killer for Web 2.0 style apps on the mobile. Michael Mace observed this in a less formal way a while back.

Privacy

Privacy is an issue that hasn't really surfaced yet. Since I have a background in security (working on secure OSes as well as Internet security), it's one that I'm keenly aware of.

At present, Web 2.0 apps are either about sharing information, which reduces the privacy concerns, or simply make promises about privacy. There are limited technological systems in place to ensure the privacy of your data.

I have a family blog. It is restricted to only the people that I invite, namely my family. Because I've restricted it to just these people, I can feel free to write about my family's private life. Or can I? What assurance do I have that programmers at Google aren't poring over the (rather boring and very unsordid) details of my life? What assurance do I have that Google won't suddenly copy my private ponderings to the top of every search result they return to their millions of users? Well, I have Google's word. That's all.

Does Google's word protect me from a mistake in their software? No. Does Google's word protect me from a malicious programmer within (or even outside) Google? No.

Imagine this: it is 2017, MS has collapsed under its own weight. Google rules supreme. For some reason, you want to bring a suit against Google, and you are preparing your legal arguments. Using Google Office's word processor. Which saves the text on Google's servers. Encrypted by software that runs on Google's servers. How easy is it for Google to capture your password (they don't need to change the software on your machine -- it's uploaded to your machine every time you open it, so they just change it on their server, which they have every right to do and you can't prevent them doing) and to decrypt and pore over your arguments? Google may desire to do no evil, but how can we trust them to keep their word?

In contrast, client-based software allows firewalling, packet sniffing, and so on, to ensure absolute security.

But the current situation is much worse than that. I'm not even aware of any Web 2.0 apps that provide encryption. Let alone anonymization (so that the app provider can't snoop on your behavior). But both of these, in combination as often as possible, are crucial privacy protections. We're so used to relying on the inaccessibility (except by direct physical access) of our storage, but that's not a part of the Web 2.0 world.

How does encryption work for Web 2.0? Well, it only works when a) you don't want to share the data publicly, and b) you don't want the server to process the data (channel encryption, such as SSL, can still be used, though). So any document, calendar, or database should be encrypted, with the decryption key known only to you. If you wish to share pieces of this, those pieces should be stored separately, with no links back to the encrypted data (which would unnecessarily violate the security of your main data store). Why, then, aren't Google calendars encrypted? Or Google mail messages, etc.? Well, because no-one cares about privacy. Yet.

And what about anonymization? This is a technology that's useful when your data needs to be processed by a server, but doesn't need to be associated with you in particular. For example, a search query doesn't need to be associated with you (unless you want it to be, in order to benefit from the search engine's knowledge of your interests), neither does a location-based services request. Does Google search or Google maps use anonymization? No, because people aren't asking for it and it has a cost associated with it (you need a third party -- the anonymizer -- and it doubles the traffic in the cloud).
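As a rough sketch of anonymization's weaker cousin -- pseudonymization -- a client could tag requests with a random per-session token instead of a user ID, so the server can correlate requests within one session but cannot link them to a person. Everything here, names included, is my own invention, not any real service's API:

```python
import secrets

def make_session_token() -> str:
    """A fresh random token per session, never derived from the user's identity."""
    return secrets.token_hex(16)

def build_query(session_token: str, text: str) -> dict:
    """What goes over the wire: the throwaway token and the query, no user ID."""
    return {"session": session_token, "q": text}

token = make_session_token()
request = build_query(token, "restaurants near the harbour")
assert "user" not in request   # nothing in the request identifies the user
```

True anonymization would go further, routing the request through a third party so the service never even sees the client's network address -- which is the doubled traffic cost mentioned above.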

While both of these technologies have costs (encryption increases processor load and slightly increases network load), their benefits will eventually become so clearly important that we will see them implemented. I don't have time to talk about all the concerns here, but Bruce Schneier's blog is a great source for this sort of thing. Unfortunately, Web 2.0 apps are difficult to validate (because they're so easy to modify and can even present different versions to different users), so this is another stroke against Web 2.0 apps.

Conclusion

Phew! This has been a long trek. But at the end we can see that, while Web 2.0 apps make sense for some situations on desktop PCs, they have significant disadvantages for mobile usage.

In my next post, I'll talk about a third way, namely the Web Services model, in which client applications use web services to deliver a powerful solution. This is something that smartphones can excel at.

Addendum

Google has released Google Gears which is an attempt to reduce the impact of R(network) described above. Google Gears allows Web 2.0 apps to work with a client-side cache while disconnected from the network, and to synchronise the local cache with the server when the network is available.

This is a great piece of technology, if it works, since it massively reduces the impact of R(network) just as connected clients (to be discussed in my next post) do. Basically, it transforms Web 2.0 apps into connected clients. Web 2.0 apps still have the disadvantage of R(browser), of course (not to mention the memory and performance impacts of the browser and associated technologies), but this is a worthwhile improvement.
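The cache-and-sync pattern that Gears enables can be sketched in a few lines. This is my own toy model of the idea, not the Gears API:

```python
class OfflineStore:
    """A toy local cache that accepts writes while offline and
    flushes them to the server when a connection returns."""

    def __init__(self):
        self.cache = {}       # local copy of the data
        self.pending = []     # writes not yet seen by the server

    def write(self, key, value):
        self.cache[key] = value           # always succeeds, even offline
        self.pending.append((key, value))

    def sync(self, server: dict):
        """Push queued writes to the server (represented here as a plain dict)."""
        for key, value in self.pending:
            server[key] = value
        self.pending.clear()

store = OfflineStore()
store.write("draft", "my blog post")   # works with no network at all
server = {}
store.sync(server)                     # later, when R(network) cooperates
assert server["draft"] == "my blog post"
```

Real synchronisation also has to resolve conflicting edits from multiple devices, which this last-write-wins toy ignores; that is where most of the engineering effort goes.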

Tuesday 29 May 2007

Carnival of the Mobilists #75

Well, it's another carnival, and I'm privileged to participate in this one.

As always, there are lots of good posts that give you a good feel for the directions in which the mobile industry is heading. Make sure you have a look around Andreas Constantinou's site, which is hosting the carnival this week. He has quite a number of thought-provoking posts (I must confess I don't agree with everything he says, but that doesn't mean it's not interesting or worth reading).

The carnival has a very "web services" sort of feel to it this week, as the mobile world gets more caught up in the Web 2.0 circus. I was interested to read that there is a difference between the Mobile Web 2.0 and Mobile 2.0. I'm still not really sure why anyone would use a term like Mobile 2.0, but I guess it conveys some of the meaning that it's a collision of the traditional telecoms world with the Internet world.

Having experience working with both (including projects with a sophisticated OSI networking stack), I remain cynical about how well Internet protocols can replace telecoms ones. But then, I use VoIP as my main outgoing telecoms for my home and work phones, so I guess it can do a half-decent job...

Anyway, check out the posts, and stay tuned for Smart Phone or Mobile Browser - Part II here in the next couple of days, where I'll discuss the Mobile Web 2.0 (and its alternatives) at more length.

Tuesday 22 May 2007

Smart Phone or Mobile Browser

[This post has been languishing for a month. I think it's time to post its beginning.]

It seems that some people are expecting smart phones to evolve into what is effectively just a mobile browser, providing a "thin client" for web-based applications (eg. AJAX apps like Google Mail, etc.).

This sounds familiar to me -- I've been through this before, in the PC industry. So is it different this time? Let's take a closer look.

History

First, some history.

In the mid-nineties, when the Web was taking off, a number of companies saw the opportunity to make a revolutionary type of personal computer to replace the existing model of the PC (running Windows 3.1 back when this started). This type of computer was called the NC, short for Network Computer. Its greatest proponents were the Acorn/Oracle partnership, and Sun. The idea was that the NC would be a moderately thin client (thicker than a dumb terminal, such as an X-Windows terminal, but thinner than a PC). Basically, the NC would run apps natively, but they would be downloaded from the network, and data would reside on the network.

This is a very similar model to the new AJAX style web-based applications.

Of course, history tells us that this approach was a massive flop, and it killed Acorn, amongst any number of other companies. Since the winning technology could hardly be considered to have won through technical superiority (we're talking about Windows 95 and NT 3 here), it must have been the underlying architectures that had significant advantages and disadvantages.

Interestingly, I was in favour of the NC at the time, mostly because I was a dedicated Acorn user, and didn't want to see the most innovative PC company go under. This time I find myself "naturally" on the other side.

So why did the NC fail? Well, the answer's two-fold, but pretty simple:


  1. Implementing sophisticated applications in Java (the language of choice for NCs) was probably as easy as doing so for Windows (back then), but the results were vastly inferior. NCs back then were running 30MHz ARM CPUs or the like, and Java was just too heavy for this (or PC) hardware. PC applications were simply more responsive than Java apps.
  2. The network simply wasn't up to handling all these apps and associated data. I remember working in Novell KK (Japan) in the latter half of '94 when Mosaic and the Web were gathering steam, and we still only had a 64kbps ISDN line shared between the 200 or so staff there!

The burning question is whether things have changed enough that these major issues are no longer severe enough to simply kill the approach. (And, for mobile, I think the answer is pretty clearly "no"; see Michael Mace's post here for some insights.)

Japan

So, what about Japan? Clearly their phones are mobile browsers, and not smartphones. Perhaps they provide an example of the future.

Yes, well Japan has always marched to the beat of a different drum. In the nineties, when the NC was being hyped, the PC was still rare amongst home users in Japan. Much more common were turnkey word processors (a type of technology almost completely dead in the West at that time).

To really understand Japan, you would have to understand what they're actually doing with their mobile browsers. I have to confess that I don't know what, exactly, they're doing. But I'm fairly sure it's merely consuming content and creating trivial content, rather than generating sophisticated content, or doing the sophisticated processing that on-device apps are so good at.

In addition, you need to be aware of the geographical character of Japan and other target markets. Japan is heavily urbanised, with around a quarter of the population in greater Tokyo. The rest of the population are scattered through densely settled valleys and plains, with almost unpopulated mountains between them. This makes it very easy to build a mobile network that covers almost everyone. (Especially given the population and relative size of the country.)

Compare Japan to Australia which, despite being heavily urbanised, has such a low population density that travelling a few hundred kilometres from the population centres will dump you into a sparsely populated (but still populated) vast land, almost impossible to cover with mobile networks.

So Japan can (as usual) be viewed as a special case, and not an indicator of things to come.

Next post I'll look further at issues with browser-based applications on mobile devices, including:

  • Responsiveness (a common issue, partly addressed by technologies like AJAX)
  • Reliability (a problem for mobile devices that no longer really exists for desktops)
  • Privacy (a problem that no-one seems to be taking seriously, on any platform)

Monday 23 April 2007

A bit more on the CLI

David Beers engages with my comments on his blog, Software Everywhere.

He's right when he says I'm thinking of a traditional CLI, in terms of the syntax, if not in terms of how it interacts with its environment and arguments. And there's a reason for that -- the traditional CLI syntax combines expressiveness, terseness, regularity, and a strong mnemonic in a very powerful way.

David's proposal is like a traditional CLI, but with a dash of natural language, extensive autocompletion and a reduction in expressiveness to achieve terseness. (The autocompletion is actually so extensive that it is really a major part of the UI, and strains the definition of CLI -- perhaps it should be called a CCLI, for Completing Command Line Interface.) By dismissing scripting, for example, the expressiveness of a UI is substantially reduced. There should be concomitant advantages, and in David's model there are (terseness and interactivity).

However, I'm suspicious of how effective his system would be, given the level of terseness he claims, at expressing many of the things that I want to do with my smartphone -- even on a regular basis. (For example, how can "Hang up, Add caller, Record call, Write note, etc., all [be] accessible with a single keypress if you only have one hand free and don't want to tap these options on the touchscreen" when you are in the middle of writing a note? How does it know that the character is intended for a command rather than entry into the note -- doesn't that require another keypress? And what if I wanted to call the number under the cursor, rather than one out of my contacts -- how do I differentiate?)

Anyway, to illustrate what I'm talking about with the tradeoff of terseness, expressiveness and other factors, let's think about why CLIs aren't simply natural languages.

Natural Language vs Traditional CLI

Natural language delivers expressiveness at substantially greater levels than the traditional CLI, but at the cost of terseness and (more importantly) regularity. (Obviously it doesn't even need a mnemonic.)

The reason that CLIs have never attempted to be like natural languages (NLs) is simply the lack of regularity of NLs. Software thrives on regularity and chokes on irregularity. Now software has certainly improved, but has it improved that much? I don't think so.

Even if software had improved that much, NLs bring other problems when used in a CLI.

In order to get the no-mnemonic-needed benefit of NLs, you have to support the end-user's own NL. That includes radically different grammars. For example, in English, imperative commands, which are generally what you would issue to computers, can start with a verb, followed by the object (the subject is implicitly the computer). In Japanese, on the other hand (the only other language I can speak), verbs are always at the end, even in imperative commands (eg. "Eat your rice" is "Gohan o tabete", where gohan is rice and tabete is the imperative form of eat). Let's just ignore the different character sets (in Japanese both gohan and tabete would be written with kanji -- Chinese characters -- and hiragana) which would add a whole other layer on top of this interpretive framework. Or rather, let's not!
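The word-order problem alone can be sketched in a few lines. This is a deliberately toy illustration, not a real interpreter: the language codes, grammar table, and example commands are all my own invention, and it ignores everything (particles aside) that makes real NL parsing hard.

```python
# Toy sketch: an NL command interpreter needs per-language word-order
# rules just to find the verb and the object. Real grammars are far
# messier than this two-slot table.

WORD_ORDER = {
    "en": ("verb", "object"),   # English imperative: "delete note"
    "ja": ("object", "verb"),   # Japanese imperative: "gohan o tabete"
}

def parse_command(tokens, lang):
    """Return (verb, object) from a two-word imperative command."""
    # Drop Japanese case particles; yet another per-language rule.
    tokens = [t for t in tokens if t not in ("o", "wo")]
    slots = dict(zip(WORD_ORDER[lang], tokens))
    return slots["verb"], slots["object"]
```

Even this trivial sketch needs a grammar table and a particle filter per language; a traditional CLI fixes the word order once and sidesteps the whole issue.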

Another problem is natural languages' lack of terseness, due to their generality. The o in the Japanese sentence above, for example, marks gohan as the object of the verb. But in a traditional CLI this is not necessary. If you want to see a hint of what this is like, take a look at a COBOL program. COBOL was designed with similar goals: to make it easy for end-users to write software (or instruct computers) with little training. Of course, we all know that COBOL was not easy to use, and just ended up irritating programmers with its worthless verbosity for decades. (OK, I'm speaking as a C, now C++, programmer here -- maybe COBOL's verbosity didn't annoy everyone.)

These are all still problems for natural language input. Certainly extensive auto-completion reduces the impact of verbosity, but it doesn't remove it. The main trade-off with auto-completion is between verbosity and mnemonic value. The other trade-off is between regularity and expressiveness. There is leakage between these tradeoffs, so they're not clear-cut (eg. prepositions, articles, etc. all add to expressiveness as well as having mnemonic power, and they trade off verbosity and, in many languages, regularity).

Basically, moving away from the traditional CLI brings a trade-off space of at least four dimensions that needs to be navigated. Throw in the fact that there are multiple NLs with radically different characteristics, probably forcing different trade-offs, and the fact that not all languages are as easy to input as English, and it becomes, to put it mildly, rather challenging.

To top it off, scripting is one of the major benefits of a CLI, and a benefit even to an ordinary end-user. It's not enough to dismiss scripting as too advanced for an end-user, since people use it all the time in their personal interactions with other people. The challenge is to make it easy enough for an end-user to engage in it. While using an NL as a CLI helps that, the regularity issue would be almost crippling for any software attempting to implement NL scripts.

Alternatives to the CLI

What are the equivalent challenges for my proposed five-way hierarchical menuing system? Clearly character sets are not a challenge (Unicode has made it easy to display any character set, and display is all that's required). Neither are an NL's grammatical peculiarities, since any syntax would be simple, regular, and independent of NL. Menuing doesn't feed well into scripting, so that's a weakness of this system (it's not incompatible with scripts, it just doesn't naturally support them in the same way that a CLI does). Expressiveness is limited, but chiefly by the choice of grammar (for example, Apple's menuing system is very simple: object-verb, implemented via selection followed by menu command choice). Regularity is a complete non-issue, of course. So that leaves mnemonic value, and this is where the trickiness lies in this solution.

In order to have strong mnemonic value, a hierarchical menu has to be structured in a logical, sensible fashion that is going to reflect the end-user's own understanding and thought structures. And it has to do it out of the box (nobody will spend ages configuring software -- Apple learned that the hard way with Newton's handwriting recognition). To make matters difficult, the hierarchy has to be universal, ie. cover the full range of commands that can be issued at any time (otherwise it violates one of the purposes of David's UI -- to be able to perform any action on a piece of data). This is a matter for considerable research.

Just a quick note to relate this all back to the status quo: the traditional GUI, as implemented on all smartphones, uses contextual menus and selection mechanisms to achieve a basic object-verb level of expressiveness. The expressiveness is severely limited by the application "silos", though. Unfortunately, opening the range of commands up would overwhelm the command activation method (flattish menus). The greater expressiveness of GUIs (where needed) is achieved by dialogs. These allow very complex interactions (such as the logic-based search rules of DreamConnect), with great mnemonics, but hopeless regularity and verbosity and no chance of scripting.

Conclusions

Clearly, near-NL CLIs are not really viable, even on PCs, despite the level of expressiveness that they would deliver. Hierarchical menus have their own issues, but certainly show potential for mobile devices, I think. CCLIs have potential, especially if the command language is as expressive as traditional command lines, but I'm skeptical about the lack of expressiveness of David's proposal (or about whether it can achieve David's claims of terseness given a reasonable level of expressiveness). The real issue is that, for mobile devices, terseness is crucial, and needs to be achieved without sacrificing too much expressiveness.

Still, the proof is in the pudding. I'd like to see any of these systems running. Any would be better than the status quo, I reckon.

Saturday 14 April 2007

The CLI is cool again!

It seems that the CLI is cool again. David Beers has an excellent post on the CLI on a mobile here. Inspired by this I decided to try out Enso. Unfortunately, despite a truly great demo video, Enso was a big disappointment -- it simply didn't support what I use a computer for (I'm a programmer, but I also do writing, photo manipulation, and video editing, amongst other things). Enso failed for me because it didn't reach into the data that I manipulate with its CLI, making the CLI almost useless. (It didn't even autocomplete filenames for me -- I had to manually map files/directories into Enso's namespace!)

I come from a CLI background (UNIX), and still use vim as my main editor (vim is a truly modal editor, with functionality like search and replace supported via command line). So you could say I'm favourably predisposed towards command lines. But let's try to analyse the benefits and disadvantages of command lines and GUIs (Graphical User Interfaces).

CLI vs. GUI

(Advantages & Disadvantages)

CLI advantages:
  • Direct access to commands
  • Random access to objects like files is easy (esp. with completion)
  • Easy to manipulate one or more database-style objects with textual search/replace style commands
  • Easy to implement history and/or repetitive operations (eg. scripts/macros)

CLI disadvantages:
  • Commands not displayed (completion helps)
  • Difficult to define graphical/text selections
  • Difficult to directly manipulate graphical objects/text (NB: graphical objects can include objects that are simply represented graphically, such as calendar events)
  • Feedback from operations is limited/implicit

GUI advantages:
  • Commands displayed (in menus)
  • Easy to define graphical/text selections
  • Easy to directly manipulate graphical objects
  • Feedback from operations is usually explicit

GUI disadvantages:
  • Indirect access to commands
  • Clumsy random access (since objects are displayed spatially, it is hard to keep a wide range of them visible)
  • Difficult to manipulate more than one database-style object
  • Difficult to implement history and/or repetitive operations

Proposal

CLIs and GUIs clearly have different strengths and weaknesses. In summary, CLIs are better at issuing commands, dealing with database-style records or files, and making selections based on textual searches; GUIs are better at direct manipulation of objects that can be represented graphically, and at arbitrary but contiguous selections.

Using the two forms of UI in combination offers significant benefits:
  • remove indirectness of menu access
  • allow sophisticated sorting/searching and replacing, even in combination with the GUI's graphical representations (to increase contiguity of selection)
  • allow history/redo/macro capabilities.

Implementation

So how do we implement this? For a PC, with its large keyboard, screen, and pointer, I propose the following solution. Get rid of the menu bar (on the top of windows in Windows, and at the top of the screen on the Mac) and replace it with a command bar at the top of the screen. This bar will show the command line as it's entered, and drop down a translucent list showing any autocompletion options or history.

The command line should be accessible with a simple key toggle (like Enso, but probably modal, since holding down a command key while typing limits what you can type). But the CLI should also interact with the GUI's elements, for example by highlighting items which may match (i.e. are in the autocompletion list) with a special "tentative" highlight.
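The core of that autocompletion is just prefix matching over the command history and available commands. A minimal sketch (the function name, history entries, and everything else here are illustrative, not from any real API):

```python
# Sketch: prefix completion over command history. The same match list
# would feed both the drop-down and the "tentative" GUI highlights.

def complete(prefix, candidates):
    """Return the candidates matching the typed prefix, in order."""
    return [c for c in candidates if c.startswith(prefix)]

history = ["open report.doc", "open readme.txt", "close all"]
```

As the user types, each keystroke narrows the match list; a real implementation would also rank by recency and frequency, but the principle is the same.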

Command scripts should be able to span apps. So, for example, to do a search and replace in multiple documents, you might be able to simply type:

for file in C:\Documents\DS*.doc
  open file in Word
  Word::Replace "Series 60" "S60"
  Word::SaveClose
end

This should fill in the files with autocompletion during the first line, and then execute it at completion.
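The underlying operation that hypothetical script expresses is easy to sketch today, if we substitute plain-text files for Word documents (Word automation is assumed away here; the folder and glob pattern are illustrative):

```python
# Sketch: batch search-and-replace over matching files -- the same
# job the hypothetical cross-app script above would perform via Word.
from pathlib import Path

def batch_replace(folder, pattern, old, new):
    """Open each file matching pattern, replace old text, save it back."""
    for path in Path(folder).glob(pattern):
        path.write_text(path.read_text().replace(old, new))
```

The point of the command-bar version is that the end-user would build this interactively, with autocompletion filling in the filenames, rather than writing a script file.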

Mobile Solution?

The problem with this scenario is pretty obvious: it is heavily keyboard reliant. Without a full alphabetic keyboard the commands suddenly become more indirect again (with some form of input method intervening). And the long lists generated by autocompletion aren't very friendly on a small screen. Furthermore, keyboard styles vary so much from device to device (and even within the same device, in an increasing number of devices inspired by Sony Ericsson's P-Series phones) that it would be difficult for a user to become fluent in this style of interaction.

My preference is actually for a tree-structured command space, navigated using the keypad or stylus/finger. See DreamScribe for an initial implementation of such an idea. The benefit of this type of UI for mobile phones is manifold:
  • It can be used one-handed on one-handed phones
  • It works identically in both keypad and stylus-driven UIs, and doesn't require any additional hardware
  • Commands are visible
  • Commands are grouped by topic (like menus, but in an even more sophisticated way)
  • "Muscle memory" can be used for commands, with careful design (so the same command is always in the same place in the hierarchy)
  • Commands can be used from anywhere, unlike menus which need to be contextual in order to maintain their navigability (since they have a very flat hierarchy)
  • Commands can either be combined with the GUI (basically replacing menu commands) or feed into a CLI (with textually specified arguments)
  • Arguments can also be mapped into the hierarchical system, so that rather than long, linear lists of autocompletion options, hierarchies of possible arguments can be presented
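To make the idea above concrete, here is a minimal sketch of a command tree navigated with a five-way joypad: up/down move between siblings, right descends into a submenu, left backs out, and select would fire the current command. The command hierarchy itself is invented for illustration.

```python
# Sketch: five-way navigation of a tree-structured command space.
# Because the tree is fixed, the same key sequence always reaches the
# same command -- which is what enables "muscle memory".

TREE = {
    "Message": {"New": None, "Reply": None, "Forward": None},
    "Call":    {"Dial": None, "Hold": None, "Record": None},
    "Note":    {"New": None, "Edit": None},
}

class Navigator:
    def __init__(self, tree):
        self.path = []                    # stack of (siblings, index)
        self.siblings = list(tree.items())
        self.index = 0

    def current(self):
        return self.siblings[self.index][0]

    def down(self):                       # next sibling (wraps)
        self.index = (self.index + 1) % len(self.siblings)

    def up(self):                         # previous sibling (wraps)
        self.index = (self.index - 1) % len(self.siblings)

    def right(self):                      # descend into submenu, if any
        name, children = self.siblings[self.index]
        if children:
            self.path.append((self.siblings, self.index))
            self.siblings = list(children.items())
            self.index = 0

    def left(self):                       # back out to the parent level
        if self.path:
            self.siblings, self.index = self.path.pop()
```

So "Record call" is always down-right-down-down from the root, whatever application you happen to be in, and argument hierarchies could hang off the leaves in exactly the same way.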

There is a lot of potential in such a UI. Keystick (now "morphing" into Kanuu) showed that this was possible even with text input, and it looks like they're extending it to all sorts of navigation, just as I've suggested above. DreamScribe mapped calendar and contact attributes into a hierarchy, and the sky's the limit, really. See also Ring-writer as an innovative approach in this area.

Conclusion

So, while I think the combo CLI/GUI I suggest above would be great for PCs, I don't think it has much future in the mobile space. I really think the five-way hierarchy provided by joypads is a much better solution for mobiles.

Tuesday 3 April 2007

Carnival of the Mobilists #66 and #67 and responses

This is a bit late, but Carnival of the Mobilists #66 is at All About Symbian, one of my major Symbian news sources. My post on Contextuality is in that carnival.

Also, Carnival #67 is up at Wap Review and it has an interesting post from David Beers which, in the latter half, bounces off my idea of contextuality, and extends it out into the interrelationship between applications and data.

To be honest, I really wasn't thinking along these lines, for two reasons: 1) I can write applications, but I'm not in a position to write OSes, and 2) I've seen too many failures of frameworks that have tried to achieve this.

Regarding reason 2: I love the idea of the user being able to use any tool he owns on his current context. However the Newton and Pink (or Taligent) both showed how difficult this is to do in reality (the Newton got further, but only because it was less ambitious). Apple aren't alone in trying this; MS have given up on their DB-based filesystem, which was trying to do a similar thing. In fact, MS have been talking about the idea for well over a decade. The most successful attempt at this approach that I've personally seen was the Oberon project, which actually allowed any text to be treated as a command. Brilliant stuff, but quite limited in the real world.

I've had so many hopes for this type of capability dashed: OLE, OpenDoc, Novell's software bus, the Newton's data soup, PenPoint's object oriented integration, Symbian's DNL (Dynamic Navigation Links, which do actually work, but are missing the key functionality of "vectorability" -- maybe more on this later), etc. etc.

It's made me very cynical about this. But I still have hope. Maybe one day we'll all get things sorted enough that software will start getting out of the way and actually helping people do stuff.

(Oh yes, regarding reason 1, maybe it's worth thinking about how to create this sort of open environment hosted by an application framework, rather than natively via the OS... Hmm...)

Thursday 29 March 2007

Cost/Benefits of open sourcing for Symbian

Over at The Mobile Lantern, Fabrizio Errante raises the idea of open sourcing Symbian OS. This has been raised before, so I thought it worth addressing.

Many people seem to forget that open sourcing has costs as well as benefits. The question for any particular product/project is, do the costs of open sourcing outweigh the benefits, or vice versa?

Let's do a quick overview for Symbian:

Benefits
  • Access to a bigger pool of developers to work on the codebase
  • Access to niche-specialist developers
  • Codebase becomes free (benefit to the customer)
  • Buzz
Costs
  • Codebase becomes free (cost to the provider, who can no longer make money from licensing)
  • Loss of control of codebase
  • Loss of control of developer quality (to be honest, much OSS is of very dubious quality, leading me to believe that either the developers don't have the time/energy to put in quality work, or there are few quality developers working on OSS)
  • Product direction driven by niches, rather than mainstream
Analysis

I contend that the above costs and benefits (which are not exhaustive), apply to all OSS projects, not just Symbian. But the impact of these costs and benefits are different for different types of projects. So how do they affect Symbian?

Well, the main one bandied about is the codebase becoming free. I have seen quite a number of comments about how Symbian will eventually lose out to Linux due to the COG (cost of goods) pressures on phones. The most recent example is from an analyst at ABI Research. This, of course, flies in the face of significant evidence that indicates otherwise: the continuing success of Windows as a desktop OS, despite intense pricing pressures on PCs. But then, analysts usually don't seem terribly connected to reality.

The problem with all of this is, ironically, exactly what MS argues: OSS software is only free in the sense that the codebase, as it is, doesn't cost anything. Of course, the chances that the existing codebase will be satisfactory are fairly slim. (The irony in MS pointing this out, in case you haven't guessed, is that MS's codebase is far from satisfactory, and MS seem incapable of rendering it so.)

So the TCP (Total Cost of Production; made that one up in lieu of searching for the real term) of a Linux based phone is likely to remain at least as high as a Symbian one. So there is no real benefit to the customer (i.e. the handset manufacturers), and a very real cost to Symbian (they can't get any licensing revenues if Symbian is OSS).

How about the access to more developers and niche-specialist developers? Well, this benefit is offset by the cost that these developers are a) relatively poor quality and b) relatively uncontrolled. For Symbian, these costs matter. Symbian is creating an OS, not some dodgy web 2.0 app. Quality is critical for this type of software, and so is tight control. A poorly controlled API leads to fragmentation (just look at Linux). As for quality, Linux has demonstrated that OSS can deliver quality, but it seems to come at the cost of fragmentation (again) and bloat (both of code size and app UI).

Don't get me wrong, I use Linux (Fedora), but only as servers, where bloat and fragmentation aren't as critical. But on a phone? Give me a break!

The problem is that these forces are a natural by-product of the OSS process. They can be fought, but not completely neutralised. And they are all inimical to Symbian's strengths.

So should Symbian consider OSS?

No way!

Should Symbian release source code via a free license to developers? Hey, that's a different (and great) idea. And so should Nokia (with S60) and UIQ. The old days of ER5's Eikon source being available in the SDK were both better (access to the source meant inheritance was a lot easier in some ways because you could see what you were inheriting) and worse (Psion was lazy with documenting the code, because you had the source). Heck, if MS can release source to some of its UI code, I'm sure Symbian can.

Wednesday 21 March 2007

Carnival of the Mobilists #65 at Golden Swamp

Judy Breck, at http://www.goldenswamp.com/ has graciously included my previous post, Why the iPhone's UI won't scale, in Carnival of the Mobilists #65. Check out the whole Carnival, there are always plenty of good posts to read, and make sure you have a look around Judy's whole site as well.

Monday 19 March 2007

Contextuality on mobile devices

In my last entry I mentioned contextuality. What is it? Can it be precisely defined? Why is it important? Let's answer all those questions.

Well, it's a term that I coined, and it means:
Showing sufficient context for the information that the user is reading or editing.

If a UI lacks contextuality, it doesn't display the context of the information the user's interested in. Let's continue with the iPhone as an example:

In the calendar, editing an entry involves selecting the entry and going into an "entry edit mode" (in Symbian, especially UIQ, this is called the "edit view"). The problem with the entry edit mode is that you can only see information relating to the current entry. Unfortunately, for calendar entries, the relevant context includes the other entries in that day. Hiding the rest of the day from the user means that when they make changes to that entry, they don't know how that will interact with other entries. (Will it cause a conflict? Will it provide enough time to move from activity to activity?)

This is a common mistake, particularly on mobile devices because of their limited screen real estate. But it's a mistake that's often made even on the desktop. A good example of how it should be done is, again, provided by Mac OS X, with iCal. iCal has a slide out pane that provides edit controls for editing activities. This pane slides out from the day/week/month view, which stays visible, keeping the activities in that view updated with changes in the edit pane (and vice-versa).

This toolpane concept has been around for a long time. I first encountered it in ArtWorks running under RISC OS on Acorn's Archimedes machines (the first ARM-based computers) back around 1991. It still works brilliantly on ArtWorks' descendant, Xara Xtreme (my favourite drawing application, available here).

Most other tools, such as anything from Adobe, Macromedia, MS, etc. tended to violate contextuality by dumping you into a dialog to do anything, which robbed you of the context of what your changes did to your document. (Dialogs have another problem: preventing "direct manipulation", but that's a subject for another post.)

Fortunately everyone seems to be cottoning on to contextuality on desktop apps (although a lot of Web 2.0 apps, such as this one courtesy of Google, set us back several years). However, the same can't be said in the mobile world.

Because of the space limitations of mobile screens, it's very important to understand how much context is enough. Desktop apps can be profligate with their screen space, but we can't. So how much context is enough?
The minimal context for a piece of data can be found by
determining the smallest amount of surrounding data which cannot
be completely reordered without significantly changing its meaning.

So, for example, a single contact record is the smallest chunk of context for contact information. Whole contacts can be reordered without changing their meaning (and, indeed, all contact applications support this -- it's called changing the sorting order). But fields within a contact cannot be reordered to the extent that they end up in other contacts, because that would change their meaning. So a single contact is the context of contact information.
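The reordering test above can be made concrete. In this sketch (the contact records are invented), reordering whole contacts preserves the data's meaning, but moving a field between contacts does not:

```python
# Sketch: contacts as records. Reordering whole records is harmless;
# reordering fields across records changes what the data says.

contacts = [
    {"name": "Alice", "phone": "555-0101"},
    {"name": "Bob",   "phone": "555-0202"},
]

def same_meaning(a, b):
    """Two contact lists mean the same thing if they contain the same
    records, regardless of the order the records appear in."""
    key = lambda c: sorted(c.items())
    return sorted(a, key=key) == sorted(b, key=key)
```

Reversing the list (a sort-order change) keeps the meaning; swapping the phone numbers between Alice and Bob does not, which is why the single contact, and not the individual field, is the minimal context.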

What about calendar activities? Well, this one actually depends on the user's purpose, and that's why calendar applications (and, indeed, paper representations of calendars) have multiple types of views. But at its most focused, a calendar activity's minimal context is a whole day of activities. If the activities are moved around in the day, they change their meaning.

(Sometimes users want to work on a larger scale, such as a week or month, and thus there are week and month views to calendars, but often a day is enough, and each day can be considered interchangeable or independent.)

Contextuality is important. Without it the user is robbed of important information about what they are viewing or changing. Yet how often is it considered? Indeed, UIs like Series 60, UIQ, Palm OS, Windows Mobile, and the iPhone seem designed to violate this concept.

Could this be one of the reasons why people are so reluctant to edit information on their mobile devices (even when they have perfectly adequate keyboards, and there are significant benefits to making updates on the run)?