Digging for Database Gold: The Devil’s in the Details
One of the biggest challenges for companies in highly regulated industries such as healthcare and insurance is integrating data to generate reports that satisfy regulatory requirements, says Drew Bradstock, product manager of DB2, IBM’s signature database family. In healthcare, Bradstock estimates, 95 percent of data processing is done to satisfy regulatory requirements. Regardless of where the data you need lives, he says, storing it in Extensible Markup Language (XML) makes it much more tractable and easier to subject to deep-dive analysis. Big Blue is one of XML’s staunchest champions in the database world.
“We have a technology called pureXML,” Bradstock says. “It’s just our way of implementing the XML standard. We made it very flexible to change. If a healthcare provider needs to add more information, XML makes it very easy to add or remove fields without having to rebuild all the reports or rebuild all the work for last year. It’s going to be adopted in industries like healthcare because of the huge amount of data in slightly different formats being shipped.”
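To make the point about adding fields concrete, here is a minimal sketch in Python. It is not pureXML or any DB2 interface; it uses only the standard library's XML parser, and the element names (claim, provider, amount, diagnosis) and values are invented for illustration. The second record carries an extra field that the first lacks, yet the existing report keeps working unchanged.

```python
import xml.etree.ElementTree as ET

# Hypothetical claim records. The second claim has an extra <diagnosis>
# field (a placeholder code) that the first one does not have.
CLAIMS_XML = """
<claims>
  <claim id="1"><provider>Clinic A</provider><amount>120.50</amount></claim>
  <claim id="2"><provider>Clinic B</provider><amount>80.00</amount><diagnosis>J10</diagnosis></claim>
</claims>
""".strip()


def total_by_provider(xml_text):
    """Existing report: sums amounts per provider and ignores unknown fields."""
    root = ET.fromstring(xml_text)
    totals = {}
    for claim in root.findall("claim"):
        provider = claim.findtext("provider", default="unknown")
        amount = float(claim.findtext("amount", default="0"))
        totals[provider] = totals.get(provider, 0.0) + amount
    return totals


def diagnoses(xml_text):
    """New report: reads the added field only from records that carry it."""
    root = ET.fromstring(xml_text)
    return [
        claim.findtext("diagnosis")
        for claim in root.findall("claim")
        if claim.find("diagnosis") is not None
    ]


if __name__ == "__main__":
    print(total_by_provider(CLAIMS_XML))
    print(diagnoses(CLAIMS_XML))
```

The older report never mentions the new field, so nothing about it has to be rebuilt when records start arriving with extra information.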
But universal adoption of XML has a long way to go. “One of the biggest reasons people haven’t moved to XML is that it’s a new way of doing things,” Bradstock says. “People always see it as a huge amount of work. ‘Well, the old way works fine, we’re still getting reports.’ They’re not looking at new ways to figure out trends in healthcare or cause factors for disease.”
And there’s the money factor. The U.S. government’s greater focus on healthcare reform may be the booster shot that XML needs. “You’ll see a wholesale-scale move to XML as soon as government ponies up the money for it and actually puts real money on the table, versus policy speeches,” Bradstock says. “It’ll start at the more leading-edge enterprises — Blue Cross/Blue Shield has been a big adopter of it, and some of the research hospitals as well. But I don’t see that for another three years or so.”
Storing data as XML creates a data lingua franca that paves the way for productive analysis. There’s also a quest to make databases themselves smarter, a critical initiative since 80 percent of all data is unstructured and lives as words. “Text analytics woven into a database is what’s really going to pick up,” Bradstock says.
That’s a long way from the origin of databases as data processors. “The challenge for the industry is that databases were primarily designed to do automatic transaction processing,” says Joydeep Das, senior product manager for Sybase, a leading innovator in database technology. “That worked well and we got huge productivity gains. Those systems were meant to do individual transactions. But when you look at analysis, the challenge is to dig the vast amount of these transactions and collectively analyze and summarize all the data to provide information that one can act upon.”
Traditional database systems store information contiguously in rows. Sybase has its roots in creating row-oriented transaction database systems such as its SQL Anywhere family of database solutions, but has adopted a new column-oriented technology to facilitate analysis. “Traditional systems tend to fall apart at the seams when it comes to handling that amount of data,” Das says. “Newer technology enables you to store data differently and process it differently. It allows you to store and analyze data in a much more user-friendly way.” Sybase has incorporated this technology in its Sybase IQ line of servers for analytic business intelligence, data warehousing and reporting. “The point of a data warehouse is to have a single version of the truth,” he says. “You consolidate all your data there and it becomes the system of record.”
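The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the general idea, not Sybase IQ's actual storage format, and the table and column names are made up. In the row layout, totaling one field means walking every complete record; in the column layout, the values of that field already sit together.

```python
# Row-oriented layout: each record is stored as one contiguous unit.
# The data and column names below are hypothetical.
ROWS = [
    (1, "east", 250.0),
    (2, "west", 100.0),
    (3, "east", 75.5),
]
COLUMN_NAMES = ("order_id", "region", "amount")


def to_columns(rows, names):
    """Re-arrange the same data so that each column is stored together."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def total_from_rows(rows):
    """Touches every whole record just to read a single field."""
    return sum(row[2] for row in rows)


def total_from_columns(columns):
    """Reads only the one column the analysis needs."""
    return sum(columns["amount"])


COLUMNS = to_columns(ROWS, COLUMN_NAMES)

if __name__ == "__main__":
    print(total_from_rows(ROWS), total_from_columns(COLUMNS))
```

Both functions return the same answer; the column version simply reads less data, which is the property that matters when a query summarizes a few fields across a very large number of transactions.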
As part of the quest for smarter databases, developers are reevaluating the traditional analytic model. Analysis done the old way, Das says, “pulled data from the database into workbench tools. But the volumes of data have become so large that pulling all that data out of the repository into those tools is a non-starter. Instead of pushing the data into the logic, the solution is really to push the logic into the data. The new paradigm that’s emerging is to push the calculations into the database by plugging in algorithms. When you have such large amounts of data, why do you want to pull it out of the database? Why not just apply logic right at the source and get the answers?”
This is not the death knell for stand-alone analytic tools, though. “Not everything has to be done in the database,” Das says. “Some of it will still be done in the front-end tool. But the heavy-duty lifting can be done by the database.”
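The contrast Das draws between pulling data into a tool and pushing the logic into the database can be sketched with Python's built-in sqlite3 module. SQLite stands in here purely for convenience; it is not one of the products discussed, and the sales table and its contents are hypothetical. The first function copies every row out and does the arithmetic in the client, while the second asks the database to summarize and returns only the answers.

```python
import sqlite3


def build_db():
    """Create a small in-memory database with a hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 250.0), ("west", 100.0), ("east", 75.5)],
    )
    return conn


def totals_pulled_out(conn):
    """Old model: fetch every row, then summarize in the client tool."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_database(conn):
    """New model: the database does the summing; only results come back."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query))


if __name__ == "__main__":
    conn = build_db()
    print(totals_pulled_out(conn))
    print(totals_in_database(conn))
```

With three rows the difference is invisible, but the second query transfers one row per region no matter how many transactions sit in the table, which is why the heavy lifting tends to move to the database as volumes grow.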


By Ned Smith, 14 Jul 2009. Filed under: CMS, Featured Article, Infrastructure, Showcase, analytics.