Netezza
What is Netezza?
Division of Data
Database is distributed across multiple (100+) SPUs. Each SPU controls and manages its own slice of the DB.
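As an illustration of slicing a table across SPUs, here is a minimal Python sketch. It assumes rows are assigned by hashing a distribution key; the slides do not say how rows are assigned, so the hash choice, the `spu_for_key` and `distribute` helpers, and the sample rows are all hypothetical. The SPU count of 108 is taken from the blade count mentioned under Issues.

```python
import zlib
from collections import defaultdict

NUM_SPUS = 108  # blade count quoted under "Issues"; any positive integer works


def spu_for_key(key: str, num_spus: int = NUM_SPUS) -> int:
    """Map a distribution key to an SPU index using a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_spus


def distribute(rows, key_field, num_spus=NUM_SPUS):
    """Group rows into per-SPU slices; each slice would live on one SPU's disk."""
    slices = defaultdict(list)
    for row in rows:
        slices[spu_for_key(str(row[key_field]), num_spus)].append(row)
    return slices


# Hypothetical table of 1000 rows, distributed on the "id" column.
rows = [{"id": i, "value": i * i} for i in range(1000)]
slices = distribute(rows, "id")
```

Every row lands in exactly one slice, and the same key always maps to the same SPU, which is what lets each SPU work on its slice independently.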
Division of Labor
SPU FPGA handles basic filtering tasks. SPU CPU handles record-level processing: filtering, parsing, projecting, logging, etc. SPU CPU also handles most operations on intermediate results: sorts, joins, aggregates. Frontend CPU handles the remaining operations.
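The division of labor above can be pictured as a pipeline of stages. The following Python sketch is only a software analogy under stated assumptions: the function names, the sample records, and the choice of a grouped count as the aggregate are all hypothetical, and nothing here models real FPGA behavior.

```python
from collections import Counter


def fpga_filter(records, predicate):
    """Stage 1 (FPGA in the slide): drop non-matching records as they stream in."""
    return (r for r in records if predicate(r))


def spu_project(records, columns):
    """Stage 2 (SPU CPU): keep only the columns the query needs."""
    return ({c: r[c] for c in columns} for r in records)


def spu_partial_count(records, group_col):
    """Stage 3 (SPU CPU): aggregate locally, one partial result per slice."""
    return Counter(r[group_col] for r in records)


def frontend_merge(partials):
    """Stage 4 (frontend CPU): combine the per-slice partial results."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Two hypothetical slices, one per SPU.
slices = [
    [{"region": "east", "amount": 5, "note": "a"},
     {"region": "west", "amount": 50, "note": "b"}],
    [{"region": "east", "amount": 70, "note": "c"},
     {"region": "east", "amount": 90, "note": "d"}],
]

partials = [
    spu_partial_count(
        spu_project(fpga_filter(s, lambda r: r["amount"] >= 10), ["region"]),
        "region",
    )
    for s in slices
]
result = frontend_merge(partials)
```

The point of the layering is that each stage shrinks the data before handing it on, so the frontend only sees small partial aggregates rather than raw records.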
IC netlist example
Flattened netlist of 3.5 million transistors and 10 million wires. Task: search the netlist for AND structures.
IC example results
Combinatorial explosion makes it impossible to directly join all candidate matches for each element. The search can be constrained better using the fanouts of signals internal to the circuit. Individual SQL queries that find possible matches for the individual transistors took under 10 seconds. The search found all uses of the AND macro, as well as many other (1300+) identical structures generated through other means.
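To make the fanout constraint concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for the Netezza system. The `transistor` schema, the net names, and the sample devices are assumptions, not the schema used in the original work. Instead of a full AND macro, the query looks for a smaller pattern: two NMOS transistors in series whose shared midpoint net has a fanout of exactly 2, as in the pull-down stack of a NAND gate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transistor (id TEXT, type TEXT, gate TEXT, source TEXT, drain TEXT)"
)
conn.executemany(
    "INSERT INTO transistor VALUES (?, ?, ?, ?, ?)",
    [
        # A two-input NAND: parallel PMOS pull-ups, series NMOS pull-downs.
        ("p1", "p", "A", "VDD", "Y"),
        ("p2", "p", "B", "VDD", "Y"),
        ("n1", "n", "A", "MID", "Y"),
        ("n2", "n", "B", "GND", "MID"),
        # An inverter on the NAND output (NAND + inverter = AND).
        ("p3", "p", "Y", "VDD", "Z"),
        ("n3", "n", "Y", "GND", "Z"),
        # Decoy: series-looking NMOS devices whose shared net X has fanout 3.
        ("n5", "n", "C", "X", "W"),
        ("n6", "n", "D", "GND", "X"),
        ("n7", "n", "E", "X", "V"),
    ],
)

QUERY = """
WITH fanout AS (
    SELECT net, COUNT(*) AS n
    FROM (
        SELECT gate AS net FROM transistor
        UNION ALL SELECT source FROM transistor
        UNION ALL SELECT drain FROM transistor
    )
    GROUP BY net
)
SELECT a.id, b.id
FROM transistor AS a
JOIN transistor AS b ON a.source = b.drain AND a.id <> b.id
JOIN fanout AS f ON f.net = a.source
WHERE a.type = 'n' AND b.type = 'n' AND f.n = 2  -- internal net touches only these two devices
ORDER BY a.id, b.id
"""

matches = conn.execute(QUERY).fetchall()
```

Without the `f.n = 2` condition the decoy devices would also join; requiring the internal net's fanout to match the pattern prunes those candidates before any larger join is attempted.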
Ontology example
Expand out all possible interpretations of a phrase. The ontology specifies lexical elements, IS-A relations, concepts, and constraints on concepts. The goal is to search the space, expanding concepts to find all matches to a given phrase.
Ontology results
Partially unfolded ontology: greatly expands database size, but reduces iterations / recursions.
Recoded ontology triples as integers. Run time: 5.58 sec. vs. 262 sec. (roughly a 47x difference). Multiple queries can be pipelined.
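The two ideas in these results, integer-coded triples and a partially unfolded ontology, can be sketched in Python as follows. This is an illustration under assumptions, not the method used in the original work: the `encode` and `unfold_is_a` helpers and the toy triples are hypothetical, and "unfolding" is interpreted here as precomputing the transitive closure of the IS-A relation.

```python
def encode(triples):
    """Replace each string term with a small integer id; return coded triples and the id map."""
    ids = {}

    def to_id(term):
        return ids.setdefault(term, len(ids))

    coded = [(to_id(s), to_id(p), to_id(o)) for s, p, o in triples]
    return coded, ids


def unfold_is_a(coded, is_a_id):
    """Precompute every (descendant, ancestor) pair reachable through IS-A links."""
    parents = {}
    for s, p, o in coded:
        if p == is_a_id:
            parents.setdefault(s, set()).add(o)

    closure = set()
    for start in parents:
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


# Hypothetical toy ontology.
triples = [
    ("beagle", "is_a", "dog"),
    ("dog", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
]
coded, ids = encode(triples)
closure = unfold_is_a(coded, ids["is_a"])
```

Three stored IS-A links unfold into six ancestor pairs, which shows the trade-off named above: the table grows, but a question such as "is a beagle an animal?" becomes a single lookup instead of a chain of recursive joins.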
Issues
Works if you can reduce your problem to SQL queries. All of the problems were based on graph expansion / exploration; how about other domains? Issues of database partitioning: how does arbitrary slicing across 108 blades affect performance / scalability, especially for non-sparse problems? Strawman comparison to a workstation-class machine: how does a traditional DB server / storage cluster compare?