Thwarting worms, battling viruses
Unsung (super)heroes keep the kinks out of computer systems
By Steven Schultz
Princeton, N.J. -- Matthew Petty had just pulled his sailboat into the Forked River near Barnegat Bay on a lazy Labor Day weekend when a call came in from work, Princeton's Office of Information Technology. Peter Olenick, manager of networking services, reported that the University's power had gone out and core computers that serve the whole campus were shutting down.
Petty, the technician who manages the main computer room, gathered his family, packed up the car and headed for Princeton. As Petty drove, nearly a dozen other staff members were already taking their places at the master console, repairing damage and bringing vital systems back online.
"Everything that could go wrong did go wrong," said Petty. Even though power came back, the backup power system had failed, leaving the computers vulnerable to further power fluctuations. Working late into that Saturday night, Petty helped repair the power supply while teams of administrators and programmers restored e-mail, databases, networking and other key services. "By Sunday morning, everything was back to normal," Petty said. When most faculty and staff returned on Tuesday, "they had no idea anything happened."
That is the goal of the people who work in the enterprise infrastructure services section of the Office of Information Technology. When power goes out, computer viruses attack or systems become unexpectedly overburdened, these OIT staff members drop what they are doing and work to keep disruptions to a minimum (see related story on page 6).
Most of the time, however, the group prevents problems from occurring in the first place. People like Dianne Kaiser, who regularly comes to work at 4 a.m. to perform routine maintenance, and Charles Augustine, who develops automatic systems for monitoring the health of computers, are part of the behind-the-scenes teams that keep the electronic backbone of the University in working order.
"If we're doing our job, nobody notices us," said Daniel Oberst, who directs the infrastructure services group.
Indeed, the OIT staff has made the University's computer systems remarkably reliable, said Betty Leydon, vice president for information technology. The rapid growth of computer systems and constant changes in technology make it very challenging to avoid and diagnose problems, she said.
"I consider these folks unsung heroes," said Leydon. "There is so much that has to go on behind the scenes in order to make things run smoothly. And, in fact, it does run smoothly."
The heart of computing resources at Princeton, from a physical standpoint, is a room in the basement of the OIT building on Prospect Avenue. A secure steel door leads to a cavernous space filled with row after row of tall metal cabinets. Each cabinet is loaded top-to-bottom with beefier-looking versions of the familiar desktop computer. Neat bundles of brightly colored cables -- pink, yellow, orange -- emerge from the floor and splay across panels of jacks.
"Every time you send an e-mail or write a purchase order, it's running on a computer in this room," said Christopher Dietrich, one of the administrators responsible for computers that run the Unix operating system. At one end of the room, Petty points out a single black cable that slips up a column into the ceiling. It is one of the University's main Internet connections, a cable through which nearly every Web page and e-mail file flows in and out of the campus.
Hardly a week goes by without Petty wheeling in new boxes of computers. In 1996, the University had 25 servers, the machines that process activity on the computer network. The number has now grown to 285. Over the same period, the amount of disk space on those machines has increased nearly 500-fold.
The people who make these physical resources work occupy the rest of the building. Oberst directs 49 staff members who play the most direct role in planning, maintaining and protecting the computer systems. One group, directed by Donna Tatro, is responsible for all the servers and the various e-mail services. Another, directed by Charles Augustine, handles database administration and systems for backing up data and automatic monitoring. Lee Varian oversees an architecture group responsible for planning the way various systems interact with each other and maintaining security. A group under Lorene Lavora maintains many Web sites and works on the computers and software necessary for the University's home page.
Each of these groups works with other parts of the Office of Information Technology, including the help desk and software and hardware support groups, the networking experts and the people who develop, purchase and maintain centrally available software, such as the University's tools for handling purchases and human resources data. The OIT staff, in turn, works closely with dozens of computer support people in departments and libraries around the University.
"Our folks really need to collaborate because no one person could solve the problems," said Leydon. "It always requires a team effort, and they work very well together as a team."
For information technology staff in other departments, collaborating with OIT can give them freedom to focus on their own projects. "The infrastructure group has certainly been very helpful to us at the library," said Marvin Bielawski, the deputy University librarian. "We do maintain a lot of our own machines and because of the work they've done -- e-mail filtering, spam folders, help with security updates -- we don't have to worry about any of that. It is wonderful."
All hands on deck
For others on campus, the OIT team becomes visible when a crisis emerges. That was the case last summer when a series of computer viruses and "worms" caused havoc with computer systems worldwide. The first attack hit the campus at the end of July. Unlike conventional viruses, these worms infected computers on the University network even if no one opened a bad e-mail attachment. "It was all hands on deck throughout the week. People were working around the clock," said Oberst.
Although the worm itself was not particularly destructive, its stealth made removing it difficult. Even Microsoft did not have an immediate answer, Oberst said. OIT staff programmer Peter Everett took the lead in writing software that removed the malicious computer code from University computers.
Just as the campus was recovering, two more worms hit with even greater ability to incapacitate the University network. Staff members again scrambled to block and remove the worms, but OIT administrators saw another problem looming: Within a week, students would arrive with thousands of infected, unprotected computers.
"We knew they spent all summer hooked up to cable modems and in cyber cafés, and they were all infected," said Oberst.
OIT staff member Mary Ng, along with Anthony Scaturro, the information technology security officer, and OIT Software Services, created a CD that removed the worm from any operating system. As students returned, OIT marketed the CD with signs that said, "Got PC? Get CD." Working around the clock, OIT's help desk and software support staffs and student assistants handed out more than 2,500 CDs.
"It became a fire we could fight," said Oberst. "If we had not had the CDs it would have been a fool's errand. It gave us enough breathing room."
The CD was so successful that many other schools adopted it, including Dartmouth College, the University of Chicago, Yale University, Seattle University, the State University of New York at Albany, Michigan's public university network and the Lawrenceville School.
"Without the diligence and expertise of the OIT staff, these attacks could have been far more destructive," said Provost Amy Gutmann. "It is very reassuring to know that resources so vital to the University are in such good hands."
A chief goal in recovering from such incidents is learning from them and preventing new ones from arising, said Oberst. In some cases, corrections can be made right away. In December, OIT moved quickly to improve an online registration system for undergraduates. The system worked well for seniors and juniors but bogged down when sophomores registered. "They all hit the 'register' button at 7 a.m.," said Oberst. "We didn't know students were up at that hour, but they were and, in this case, they overwhelmed the system."
Finding the weak point in the system required close collaboration among hardware, software, application, Web and networking experts. By the time freshmen registered a few days later, the system worked smoothly, with 435 students registering between 7 and 7:01 a.m.
Issues such as computer security require longer-term efforts, said Oberst. "Security was a thing where you just asked nicely [for people to take appropriate precautions] and hoped it worked out," he said. "But now it has to have a little more teeth to it." One policy change has required departmental administrators to install critical software updates within four business days or face being taken off the network.
Scaturro said the University will take an important step forward this spring with the adoption of a formal information policy, which will guide the development of policies and procedures for keeping computers secure and protecting privacy. "There is a common misconception that computer security is focused on computers," he said. "Security is an information issue and a people issue; it can evolve into a technological issue."
As with all the challenges OIT faces, said Oberst, "the ultimate goal is to recognize a problem before it starts."