Researchers seek to take 'byte' out of complex data

In computer language, a byte is a unit of storage, usually the size of one character of information.

While even the most casual user of computer technology is familiar with the concept of gigabytes--that is, a billion bytes of data--it's unlikely that even hard-core keyboard tappers often consider the concept of terabytes, each a trillion bytes. That's because terabytes represent more data than a person can consume in a lifetime.

These days, two Mississippi State University computer science professors are attempting to get a handle on terabytes, so to speak. They're working to discover patterns in data so large and complex that scientists can't handle it without the aid of computerized analysis.

"Knowledge discovery" is the rather bland sounding tag for this highly automated process of complex data analysis to extract useful information.

Julia Hodges and Susan Bridges are using a $335,000 grant from the Office of Naval Research to support their work. They are collaborating with scientists at the Naval Oceanographic Office at the Stennis Space Center near Bay St. Louis.

"Knowledge discovery is related to the machine learning work being done in the field of artificial intelligence," said Hodges.

Hodges and Bridges will work to develop a system that can be applied to vast amounts of oceanographic data. Their products will be an object-oriented oceanographic database and tools that automatically extract useful information--"knowledge"--from the database.

"Typical databases are intended for small transactions like airline reservations, while an object-oriented database represents much more complex relationships among the different kinds of data," Bridges explained. "Traditional database systems don't handle such complexity well."

Because there is no efficient way to search, manage or organize highly complex data in traditional systems, "scientists are limited in how the data can be manipulated," Hodges added.

In their current study, data will include satellite imagery, acoustic imagery, altimetry data, and data from a variety of sensors.
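As a rough illustration only, and not the researchers' actual design, the idea of an object-oriented model that links different kinds of data can be sketched in a few lines of Python. Every class name, field name and coordinate below is a hypothetical stand-in chosen for the example.

```python
from dataclasses import dataclass, field

# All class names, field names and values are hypothetical illustrations.

@dataclass
class Observation:
    """Common base for anything recorded at a location."""
    latitude: float
    longitude: float

@dataclass
class SatelliteImage(Observation):
    band: str = "infrared"

@dataclass
class AcousticImage(Observation):
    frequency_khz: float = 12.0

@dataclass
class AltimetryReading(Observation):
    sea_height_m: float = 0.0

@dataclass
class SurveyRegion:
    """A region that holds direct references to its observations."""
    name: str
    observations: list = field(default_factory=list)

    def of_kind(self, kind):
        """Return the observations in this region that are instances of `kind`."""
        return [obs for obs in self.observations if isinstance(obs, kind)]

region = SurveyRegion("example region")
region.observations.append(SatelliteImage(30.0, -89.0))
region.observations.append(AcousticImage(30.1, -89.2, frequency_khz=100.0))
region.observations.append(AltimetryReading(30.2, -89.1, sea_height_m=0.4))

# Ask the region for just one kind of data.
acoustic = region.of_kind(AcousticImage)
print(len(acoustic), acoustic[0].frequency_khz)
```

In this sketch, each kind of data is its own type, and the region refers to its observations directly, so a question such as "which acoustic images belong to this region?" becomes a single method call.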

Knowledge discovery techniques will provide meaningful ways to analyze the data. For example, an oceanographer may wish to analyze large collections of acoustic imagery data in order to classify the topography of the ocean bottom in a particular location.

While the two researchers are applying knowledge discovery to scientific databases, they say the technique is also increasingly being used commercially.

"For example, marketers might use knowledge discovery to identify consumer buying patterns to target better particular catalogs," Hodges said.