ALUMNI QUARTERLY
SUMMER 1999

On the third floor of a computer laboratory in Cambridge, Massachusetts, Tony Jebara waves to a yellow stick figure appearing on his computer screen. It waves back. Jebara then raises his arms menacingly, and the figure bends down, as if cowering. Jebara isn't doing some digital aerobics routine -- he's teaching a computer the appropriate responses to common gestures. And the computer seems to be a quick study: it always reacts correctly, without being told what to do.

Artificial intelligence (AI) is no longer the stuff of science fiction. Limited forms of AI are now used in devices such as hand-held computers that recognize handwriting and robots that explore active volcanoes. While current research is a long way from building anything close to the sinister, silky-voiced HAL of the 1968 film 2001: A Space Odyssey, researchers like Jebara, a doctoral candidate at the Massachusetts Institute of Technology (MIT) Media Laboratory, are trying to evolve computers into smarter tools: not faster, smaller or more colourful, but more intelligent and adaptive to human needs. Active instead of passive.

Jebara with a "wearable" slung over his shoulder.

Top: Tony Jebara dons a "wearable" computer to guide him in a game of eight ball.

Digital evolution

"We don't want to spoonfeed information to computers anymore," says Jebara, who was born in Syria and raised in Paris and Montreal. "We want them to acquire information on their own. So there's an effort to take computers -- which are literally deaf, dumb and blind -- and add some kind of perception. That way they can get information about the world without somebody manually typing it in or using a mouse."

One way of making computers aware of their users and environments is through computer vision. For his undergraduate degree in electrical engineering at the McGill Faculty of Engineering's Centre for Intelligent Machines (CIM), Jebara studied ways of getting computers to track and recognize faces using a database of hundreds of mug shots. His current stick figure program at the Media Lab, dubbed "action-reaction learning," builds on that research, tracking the faces and hands of two human participants through cameras. As the participants wave or clap at one another, the program uses a new probabilistic model to predict responses to the gestures. Finally, it replaces one of the participants with the yellow stick figure, which gives the appropriate responses based on what it has learned during the interaction.
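
How does the figure learn without being told what to do? In the simplest possible terms -- and this is only a toy sketch in Python, with invented gesture labels rather than Jebara's actual model, which works on continuous tracking data -- the program tallies which reaction tends to follow each observed gesture, then replies with the most probable one:

    # Toy sketch of action-reaction learning: count which reaction follows
    # each observed gesture, then answer new gestures with the likeliest one.
    from collections import Counter, defaultdict

    reaction_counts = defaultdict(Counter)

    def observe(action, reaction):
        """Record one (action, reaction) pair seen between the two participants."""
        reaction_counts[action][reaction] += 1

    def react(action):
        """Reply with the most frequently observed reaction to this gesture."""
        counts = reaction_counts.get(action)
        return counts.most_common(1)[0][0] if counts else "idle"

    # Invented gesture labels, for illustration only.
    observe("wave", "wave")
    observe("raise_arms", "cower")
    print(react("raise_arms"))  # -> "cower"

The real system works with raw tracking data and probability distributions rather than hand-labelled gestures, but the principle is the same: the appropriate response is inferred from watching people interact, not written in by a programmer.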

"The first step was getting computers to start sensing their environments," says Jebara. "Now we want them to start sensing to learn. An infant that grows up in a black box won't learn very much about the world, and that's basically the problem with a computer. The idea is to set up the right architecture so the computer can use its perceptions to learn to reason about the world."

According to the conventional wisdom about AI, Jebara's goal is a digital pipe dream. First developed in the 1950s by such scientists as MIT's Marvin Minsky, AI has since been widely regarded as a failure. Films like 2001 were fuelled by the field's early goal of creating a humanlike intelligence from computers. AI did register initial successes -- some computers were scoring As on MIT calculus exams by the 1960s -- but overconfidence led researchers to concentrate on practical problems, such as robotics and chess-playing computers, ignoring the fundamentals of reasoning. Progress in general intelligence slowed. By the mid-1980s, an "AI winter" set in, and funding for the half-billion-dollar industry started drying up.

AI research has so far yielded what Minsky recently called "collections of dumb specialists in small domains." One example is Deep Blue, the IBM supercomputer that beat world chess champion Garry Kasparov in a 1997 regulation match. Though it can calculate 200 million chess positions per second compared to Kasparov's three, Deep Blue is not widely considered intelligent, and cannot "learn" from its opponent as it plays.

"There's no 'discovery' of chess with Deep Blue," says Jebara, who became interested in AI because of its relation to cognition and metaphysics. "When it plays, it's really the handiwork of a few engineers writing out every single possible rule about the game. When you learn, you discover concepts on your own; you don't have someone write them into you."

Jebara demonstrates a 3D face recognition system.

Jebara's colleague Thad Starner with glasses that project an image of the Internet onto his retina.

Computers that bloom in spring

But other, limited forms of AI are flourishing. Jebara and other researchers now use "Bayesian networks" to design smarter computers. Inspired by the work of the Reverend Thomas Bayes, an 18th-century British mathematician and cleric, Bayesian nets are sophisticated inference systems that track statistical information about cause and effect, allowing programs to learn from experience. Microsoft has worked Bayesian nets into Windows 98's "smart" help subprograms, and has also designed systems to filter out junk e-mail and to mine the seas of data on the Internet for patterns that can better serve online shoppers. And with quantum leaps in computing power and falling computing costs, an "AI spring" -- or, at least, a thaw -- may be upon us. For example, after dropping off the map in the 1980s, computer-vision companies are now bringing artificial sight to new heights.
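
The basic ingredient is Bayes' rule, which turns observed frequencies into a degree of belief. Here is a minimal sketch in Python of how a junk e-mail filter might apply it; the numbers and the single-word test are invented for illustration, and this is not Microsoft's actual system:

    # Bayes' rule: the probability a message is junk, given that it contains
    # a particular word, computed from how often that word shows up in junk
    # mail versus legitimate mail.
    def p_junk_given_word(p_word_given_junk, p_word_given_good, p_junk):
        p_good = 1.0 - p_junk
        evidence = p_word_given_junk * p_junk + p_word_given_good * p_good
        return (p_word_given_junk * p_junk) / evidence

    # Illustrative numbers only: "free" appears in 60% of junk mail and 5% of
    # good mail, and 20% of all incoming mail is junk.
    print(p_junk_given_word(0.60, 0.05, 0.20))  # prints 0.75

A full Bayesian network chains many such calculations together, so that evidence about one variable updates the system's beliefs about all the others.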

"Before, to make a product that did anything useful, it would have to be very, very expensive," says McGill's Martin Levine, Jebara's former supervisor at CIM and a computer-vision specialist. "The ability to now take pretty much the same algorithms as before and run them so much faster has made many different applications practical." Successful spinoffs from the field include diagnostic magnetic resonance imaging and satellite technology, but new applications possess decision-making abilities, such as automatic security or traffic surveillance programs that eliminate the need for constant human monitoring. One of Levine's current research projects is to find a generic visual aging process for computer face recognition. Programs that could predict what someone would look like today given an old photograph would be useful both in identifying criminal suspects and finding missing children.

Although computers are now very good at picking a face out of a crowd of mug shots, at CIM Jebara spent over a year working on a nagging problem: how to get programs to recognize faces that aren't looking directly at the camera. Jebara would sometimes put in two all-nighters in a row in the CIM lab; progress could be painfully slow. Eventually, he became one of the first researchers to demonstrate a real-time, three-dimensional face tracking program at research conferences.

"It's rare to find somebody as enthusiastic as Tony was," says Levine. "My most difficult problem with Tony was to keep him from trying new ideas before he really worked out the old ones. But he was fun to work with." That enthusiasm has landed Jebara in some unique situations. Last year at MIT, he enrolled in a difficult graduate-level Media Lab course called Mathematical Modeling. About 12 other students initially attended the course, but not for long. Everyone except Jebara dropped out. "Of course, I would have to do all the assignments on the board, because the teacher would always call upon me!" he laughs. "Once, I spent all night doing the assignments. While the teacher was lecturing me the next morning, I was passing out, I was going cross-eyed!"

Hard work does have its perks. When Jebara earned his master's degree last summer at MIT, the convocation address was given by U.S. President Bill Clinton. Jebara has also travelled around the world attending conferences, and has been featured on ABC's Nightline, the BBC and in Scientific American for his work on "wearable" computers, something researchers call the next step in the evolution of portable devices such as pagers, cell phones and personal digital assistants.

Researchers at MIT's Media Lab work on computer "vision" and teach computers to react to their gestures.

The apparel oft proclaims the man

Wearables are devices that can extend the user's senses, improve memory, monitor health or even help you become a pool shark. Jebara developed a program called "Stochasticks" (from the statistical probability term "stochastic") in which users don goggles equipped with a camera and video display that helps them line up their billiard shots. The system uses probabilistic object recognition to select the best strategic shot for a game of eight ball in real time. Another Jebara project, DyPERS (Dynamic Personal Enhanced Reality System), uses similar hardware and software, but gives users virtual memories like steel traps. DyPERS can recognize hundreds of everyday objects and associate audiovisual experiences with them. For example, the system can make a video recording of an important conversation with a business partner, connect the conversation with his or her business card and later play back the conversation whenever the card is seen. Aside from the obvious mnemonic benefits (e.g., a daily to-do list associated with one's watch), DyPERS has many educational possibilities: a museum tour guide's comments could be linked with images of specific artworks, or a teacher could record the names of certain objects in a foreign language as an aid to students learning it. How obtrusive would the system be? The technology to miniaturize the equipment into a standard-looking pair of glasses and an Armani suit already exists. Move over, Maxwell Smart and James Bond.
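
At its core, DyPERS is an associative memory: see an object, recall the experience attached to it. A bare-bones sketch in Python of that association step follows; the labels, file name and recognize-then-play-back flow are invented stand-ins for the real vision and recording hardware:

    # Bare-bones sketch of DyPERS-style association: link a recorded clip to
    # a recognized object, then replay the clip whenever the object is seen.
    recordings = {}

    def record(object_label, clip):
        """Associate an audiovisual clip with a recognized object."""
        recordings[object_label] = clip

    def on_object_seen(object_label):
        """When the camera recognizes a known object, play back its clip."""
        clip = recordings.get(object_label)
        if clip:
            print("Playing back", clip)

    record("business_card", "conversation_with_partner.mov")
    on_object_seen("business_card")  # -> Playing back conversation_with_partner.mov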

"How many people know that there's a computer in their cell phone?" asks Alex Pentland, Jebara's supervisor and head of the vision and modeling group at the Media Lab. "Most people don't. Do they care? No." Pointing to watches and eyeglasses, Pentland says wearable devices have and will continue to be part of people's everyday attire, only now they're getting a lot smarter: "It's not about computers. It's about delivering services to people. Services that people want. We can now deliver all sorts of services we could never deliver before." Many Federal Express couriers now wear hand-mounted package bar-code readers, while firefighters use thermal-imaging helmets to locate people in smoke-filled rooms, but consumer wearables of the near future will include devices that monitor wearers' stress levels, direct tourists to the nearest attractions or translate sign language into audible English.

Wearables can also get you out of a jam, says Jebara. At a conference in Killington, Vermont, Jebara realized he had left a crucial file for a demonstration on a Media Lab server back in Cambridge. Without a modem, Jebara had no way to retrieve the file. He turned to Thad Starner, then a Media Lab researcher who was also at the conference. Starner, a "cyborg" who donned a wearable every day, said, "No problem."

Starner's wearable rig partly consisted of specially designed glasses that projected an image of the Internet onto his retina, and a hand-held, abbreviated keyboard device called a Twiddler. Staring vacantly, Starner keyed a series of commands into his Twiddler and located Jebara's file. A minute later, his gaze refocused on Jebara: he had downloaded the file and could transfer it to Jebara's laptop. Thanks to Starner's cyborgian habits, the demonstration went off without a hitch.

Ultimately, Jebara envisions wearables in a symbiotic relationship with humans, with the computers learning from their environments as humans benefit from their personalized services. That, he says, can advance the original AI goal of building a humanlike computer. "Wearables are my answer to robotics, which is a really huge undertaking," says Jebara. "We don't have to make a robot. We can just have these wearable systems which coexist with us. They use us to get around and to see what's important, and we use them as personal assistants, remembrance devices and our gateway to the Internet."

Out of the mouths of babes

Jebara is now focusing his doctoral research on taking computer learning and perception to the next level. He says his stick figure could evolve into a virtual talking head, similar to the 1980s pop icon Max Headroom, but one that can learn and reproduce the appearance, speech and idiosyncrasies of a real person. It could become a user's "personal agent" for Internet representation, for example.

"In the future, computers will be much more like people," says Jebara. "They will be able to understand us, rather than us trying to understand them. They will be adaptive, becoming experts in you, rather than you in them."

And what of the dream of artificial intelligence? Are programs like Jebara's waving stick figure the digital embryos of future programs with a humanlike intelligence and behaviour? Jebara believes that growth is all up to the circuits and silicon: "You don't design it. You let it design itself. It sits there and watches and infers all its reasoning and interacts the way a child interacts with its environment, which is a very rich place to learn."

If Jebara's track record is any guide, one day that child may start asking some very adult questions.