To be a catalyst for social development, the ubiquitous cell phone must be able to access locally relevant and freely available data.
Those of us in the developed world live in an environment where information is literally everywhere. In addition to physical media such as newspapers, books, and magazines, invisible signals carry data to our smartphones, tablets, and laptops. The uncountable pages in the World Wide Web leave nearly no question unanswered, and using mobile devices to obtain data has become natural. Information and communication technology (ICT) has become so convenient that we scarcely think about it.
For those in the developing world, however, information is less than pervasive. Although many people have a cell phone, access costs and user literacy barriers make acquiring data a deliberate, complicated, and expensive undertaking. Those in the developing world can’t effortlessly pluck invisible information from the air and must go to great lengths to find what they need.
Three representative ICT projects—two based in India, and the other in South Africa—seek to make access to information ubiquitous in the developing world. These systems fit naturally into the users’ environment, effectively making the technology invisible and providing the underprivileged with natural, convenient access to a wide range of data sources.
Given that many people in India lack the textual literacy to access information, the IBM World Wide Spoken Web seeks to create a hyperlinked information service parallel and complementary to the WWW that is voice- rather than text-driven (A. Kumar et al., “WWTW: A World Wide Telecom Web for Developing Regions,” Proc. ACM SIGCOMM Workshop Networked Systems for Developing Regions [NSDR 07], ACM, 2007; www.dritte.org/nsdr07/files/papers/s4p1.pdf).
The system consists of a network of VoiceSites, voice-driven applications analogous to websites that consist of one or more user-created voice pages—for example, VoiceXML files. VoiceSites are identified by global VoiNumbers, virtual numbers that map onto a physical phone number or to some other uniform resource identifier such as a SIP URI, and connected through VoiLinks.
Users can create VoiceSites with any telephony interface, ensuring that the underprivileged can generate content, as Figure 1a shows.
VoiLinks use the Hyperspeech Transfer Protocol to preserve and transfer context when the system transfers a call from one VoiceSite to another (S.K. Agarwal et al., “HSTP: Hyperspeech Transfer Protocol,” Proc. 18th Conf. Hypertext and Hypermedia [HT 07], ACM, 2007, pp. 67-76).
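The relationship between VoiNumbers, VoiceSites, and VoiLinks can be pictured with a small toy model. The sketch below is illustrative only: the class names, the `referrer` and `language` context keys, and the example numbers are assumptions made for this illustration. It does not reproduce the actual HSTP message format or the Spoken Web implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VoiceSite:
    name: str
    # Outgoing VoiLinks: spoken label -> VoiNumber of the linked VoiceSite.
    voilinks: Dict[str, str] = field(default_factory=dict)


@dataclass
class CallSession:
    caller_id: str
    current_site: str  # VoiNumber of the VoiceSite the caller is on
    # Context that should survive a transfer (hypothetical keys).
    context: Dict[str, str] = field(default_factory=dict)


class Directory:
    """Maps each VoiNumber to an endpoint (phone number or SIP URI)."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, str] = {}
        self._sites: Dict[str, VoiceSite] = {}

    def register(self, voinumber: str, endpoint: str, site: VoiceSite) -> None:
        self._endpoints[voinumber] = endpoint
        self._sites[voinumber] = site

    def follow(self, session: CallSession, label: str) -> str:
        """Follow a VoiLink and return the endpoint the call moves to."""
        target = self._sites[session.current_site].voilinks[label]
        # Keep the existing context and record where the caller came from.
        session.context["referrer"] = session.current_site
        session.current_site = target
        return self._endpoints[target]
```

In this model, a caller on a forum VoiceSite who follows a "prices" link is handed to the linked site's endpoint, and the session still carries the caller's earlier context together with the originating VoiNumber.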
As Figure 1b shows, the Spoken Web browser enables users to navigate across VoiceSites, issue simple commands, and save bookmarks. Because the system must be accessible from any phone, the browser is implemented as a server-side entity.
With the Spoken Web, instead of talking only to their neighbors, rural villagers can post questions and answers on spoken forums, and companies can reach out to farmers to coordinate crop drops. Elsewhere, people in India are using the Spoken Web as a platform to show off their singing talent and send voice greetings to one another (S.K. Agarwal et al., “User-Generated Content Creation and Dissemination in Rural Areas,” Information Technologies & Int’l Development, vol. 6, no. 2, 2010, pp. 21-37).
Because interaction is all spoken and in users’ local language, the technology is both invisible and intuitive. As the number and kinds of VoiceSites increase, however, researchers face a challenge in creating natural and simple interfaces.
Just as in the developed world, people in the developing world are captivated by smartphones’ sleek form factor and touchscreens. The Spoken Web team has therefore explored ways of providing smartphone-like interaction features to low-end cell phones.
TapBack, for example, lets callers tap on the back of their phone to navigate on their browser while simultaneously listening to content (S. Robinson et al., “TapBack: Towards Richer Mobile Interfaces in Impoverished Contexts,” Proc. Conf. Human Factors in Computing Systems [CHI 11], ACM, 2011, pp. 2733-2736). The device’s microphone picks up the tapping sounds, and the phone transmits this signal via telephony to the Spoken Web server for processing.
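One simple way a server could pick tap-like sounds out of call audio is to look for short bursts of energy that stand well above the background level. The following is a minimal sketch of that idea, not the TapBack team's published method. The function name, frame size, energy ratio, and refractory period are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def detect_taps(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: float = 10.0,
    energy_ratio: float = 8.0,
    refractory_ms: float = 150.0,
) -> List[float]:
    """Return approximate onset times, in seconds, of short loud bursts."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))

    # Mean energy of each non-overlapping frame.
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    if not energies:
        return []

    # The median frame energy serves as a crude background estimate.
    background = sorted(energies)[len(energies) // 2]
    threshold = max(background * energy_ratio, 1e-6)

    # Ignore frames that follow a tap too closely, so that one tap
    # spanning several frames is not counted more than once.
    refractory_frames = int(refractory_ms / frame_ms)
    taps: List[float] = []
    last_tap_frame = -refractory_frames - 1
    for i, energy in enumerate(energies):
        if energy > threshold and i - last_tap_frame > refractory_frames:
            taps.append(i * frame_len / sample_rate)
            last_tap_frame = i
    return taps
```

A fixed ratio over the median is easy to follow but fragile: speech, line noise, and the content the caller is listening to would all raise the background, which is one reason robust capture is difficult in practice.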
Despite the challenges of robustly capturing sounds on the TapBack system’s “touchscreen,” Spoken Web researchers continue to investigate the potential of more complex audio input, such as scratching. In addition, altering the materials in the phone’s casing to create surfaces with varying acoustic properties could make it possible to generate different signals.
Another option is to capture sounds generated on the surfaces of other, nearby objects. Because people in rural India often share phones, TapBack-equipped devices must be tuned not to an individual but to a set of users. Villagers commonly put a phone on speaker mode so they can access the Spoken Web together. It might be possible to exploit this communal use and enable a wider gesture set by obtaining input from, for example, tabletop surfaces (C. Harrison and S.E. Hudson, “Scratch Input: Creating Large, Inexpensive, Unpowered and Mobile Finger Input Surfaces,” Proc. 21st ACM Symp. User Interface Software and Technology [UIST 08], ACM, 2008, pp. 205-208).
In South Africa, a research group from the University of Cape Town is working with nongovernmental organizations (NGOs) and Microsoft Research to disseminate mostly educational and health-related information to low-income communities. This information is typically presented in a multimedia format so that it’s comprehensible to people who aren’t textually literate.
In South Africa, voice communication and SMS are prohibitively expensive, making a solution like Spoken Web untenable. Data download costs are also too great for multimedia material. The question is how to distribute digital media in a natural way and at no cost to the user.
One proposed solution is the Big Board system, an electronic notice board that uses Bluetooth technology to download information free of charge to a user’s handset (A. Maunder, G. Marsden, and R. Harper, “Making the Link—Providing Mobile Media for Novice Communities in the Developing World,” Int’l J. Human-Computer Studies, Sept. 2011, pp. 647-657).
A user takes a photo of an item displayed on a 40-inch LCD display screen and sends it, via Bluetooth, to the computer powering Big Board. That computer performs image recognition on the photo, discovers what topic the user is interested in, and sends relevant information back to the user’s handset. This can be images, videos, music—any media type the handset can process.
Initial Big Board trials in various communities around Cape Town were successful but raised cost concerns: large LCD displays and computers to drive them aren’t cheap. The system also needs a constant source of electricity, further limiting its deployment in developing regions.
In addition to the technical shortcomings, researchers determined that it was prohibitively difficult for many target users to travel to the buildings in which the screens were installed. For Big Board to be truly invisible, users had to be able to access the technology easily.
By studying urban and rural mobility patterns, the researchers concluded that installing the system in minibus taxis—the ubiquitous form of transport in sub-Saharan Africa—would reach the most people. Given cost and intermittent power limitations, that meant porting the system to a cellular handset and substituting stickers applied inside the taxi cabin for display screens.
After trying various configurations, the researchers settled on a server running Windows Mobile on users’ handsets and stickers like those in Figure 2. The images require a barcode frame, as the handset doesn’t have a desktop PC’s image-processing power. Standard QR codes won’t work, as they would require a text description that the target audience couldn’t read.
People who travel to and from work in these taxis and are textually illiterate can now freely access digital information that would otherwise be difficult if not impossible to obtain. It might seem a strange solution from the point of view of the developed world, but for many South African commuters, the Big Board system is easy both to learn and to use.
As in South Africa, NGOs in India have difficulty providing information to users who can’t afford to access it through traditional technologies. One such NGO, Pragati (“progress”), provides healthcare assistance—primarily HIV/AIDS education and testing—as well as various ancillary services such as microfinance, counseling, and advocacy—to sex workers in the city of Bangalore.
Pragati communicates notifications, announcements, and reminders to the women it serves by word of mouth, which can be challenging, as health workers must physically go to the areas where their clients live or solicit. Because most sex workers are illiterate, flyers or posters aren’t useful, and normal broadcast media such as radio or television are too diffuse and expensive. In addition, many sex workers are very poor, socially stigmatized, and nomadic, so it’s easy to lose track of individuals.
On the other hand, 97 percent of sex workers in India have mobile phones, exceeding the national average wireless teledensity of 76 percent. In fact, many maintain two separate devices or dual-SIM phones, one for work and the other for home use.
For Pragati, these facts pointed to an automated calling system as the most practical solution. In the developed world, most users find such systems—which assail us with everything from political ads to dental appointment reminders—irritating. However, given the ubiquity of cell phones in the developing world, broadcast calls are the easiest means to reach the most users.
To test this approach, Microsoft Research India developed a phone-based broadcasting system that could automatically dial sex workers’ phones and play prerecorded or custom-generated messages (N. Sambasivan and E. Cutrell, “Designing a Phone Broadcasting System for Urban Sex Workers in India,” Proc. Conf. Human Factors in Computing Systems [CHI 11], ACM, pp. 267-276). As Figure 3 shows, Pragati used the system to broadcast general announcements regarding HIV testing, health-related training sessions, and other events it sponsored as well as individualized reminders about microfinance loan deadlines.
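The mix of general announcements and individualized reminders can be pictured as a queue of outbound calls. The sketch below is a hypothetical illustration, not the Microsoft Research India system. The `OutboundCall` type, the `build_call_queue` function, and the recording identifiers are assumed names, and no real telephony API is involved.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class OutboundCall:
    phone: str
    audio: str  # identifier of the recording to play


def build_call_queue(
    recipients: Iterable[str],
    announcement: str,
    reminders: Dict[str, str],
) -> List[OutboundCall]:
    """Queue one announcement per number, plus any personal reminder."""
    queue: List[OutboundCall] = []
    seen = set()
    for phone in recipients:
        if phone in seen:  # do not call the same number twice
            continue
        seen.add(phone)
        queue.append(OutboundCall(phone, announcement))
        if phone in reminders:
            queue.append(OutboundCall(phone, reminders[phone]))
    return queue
```

A real deployment would also need retry handling for unanswered calls and sensible calling hours, which this sketch leaves out.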
Overall, Pragati found the automated calling system to be a useful tool for helping the NGO fulfill its mission. While most people in the developed world quickly hang up upon receiving a “robocall,” more than 80 percent of the women Pragati broadcast messages to listened to them in their entirety. Indeed, sex workers often mistook the automated messages for live calls from the project field coordinator. An unexpected side benefit was that the system reached more people than were actually contacted, as call recipients shared message content with friends who weren’t linked to the system.
Automated calling systems such as the one Pragati implemented exploit forms of interaction that are familiar to most users (answering a call, accessing voice mail) and are a potentially powerful way to bring a range of information to people who, because of literacy, financial, or social constraints, might not be able to access that information in any other way.
The cellular handset has been wildly successful in the developing world, but to be a catalyst for social development, as many hope, it must be able to access locally relevant and freely available data. The projects reported here have made a positive impact in a few, specific contexts, but for most users, very visible technology barriers continue to obscure the information they need.