What Does Scalable Mean?
Operationally:
In the past: Works even if data doesn't fit in main memory
Now: Can make use of 1000s of cheap computers
Algorithmically:
In the past: If you have N data items, you must do no more
As data sizes go up, you may only get one pass at the data
The data is streaming -- you better make that one pass count
Ex: Large Synoptic Survey Telescope (30TB / night)
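A one-pass computation over streaming data can be sketched as follows. The choice of statistic (running mean and variance via Welford's update) and the name `running_mean_variance` are assumptions made for illustration; the slides do not name a specific algorithm. The point is only that each item is seen once and never stored.

```python
def running_mean_variance(stream):
    """Compute count, mean, and population variance in a single pass.

    Uses Welford's online update, so no item is kept after it is read.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


# A generator stands in for a stream: it can be consumed only once.
readings = (x for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
count, mean, variance = running_mean_variance(readings)
```

Memory use stays constant no matter how long the stream runs, which is what makes a single pass sufficient here.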
5/15/13
Query: GATTACGATATTA
time = 0
TACCTGCCGTAA = GATTACGATATTA?
No.
time = 1
CCCCCAATGAC = GATTACGATATTA?
No.
GATTACGATATTA
time = 17
GATTACGATATTA
40 records, 40 comparisons
N records, N comparisons
The algorithmic complexity is order N: O(N)
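The record-by-record comparison above can be sketched as a linear scan. The function name `linear_scan` and the three-record list are illustrative assumptions; the comparison counter is there only to make the O(N) cost visible.

```python
def linear_scan(records, query):
    """Compare the query against each record in order.

    Returns (index of the first match or -1, number of comparisons made).
    """
    comparisons = 0
    for i, record in enumerate(records):
        comparisons += 1
        if record == query:
            return i, comparisons
    return -1, comparisons


records = ["TACCTGCCGTAA", "CCCCCAATGAC", "GATTACGATATTA"]
index, comparisons = linear_scan(records, "GATTACGATATTA")
missing_index, worst_case = linear_scan(records, "AAAA")
```

When the query is absent, every record is compared once, so the worst case is exactly N comparisons for N records.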
TTTTCGTAATT
AAAATCCTGCA
AAACGCCTGCA
TTTACGTCAA
GATTACGATATTA
CTGTACACAACCT
GATTACGATATTA
Position in the list: 0% to 100%
time = 0
No match.
Skip to 75% mark
GATTACGATATTA
GGATACACATTTA
time = 1
No match.
Go back to 62.5% mark
GATATTTTAAGC
GATTACGATATTA
GATTACGATATTA
GATTACGATATTA = GATTACGATATTA
Match!
Walk through the records until we fail to match.
GATTACGATATTA
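The skip-ahead and go-back steps above behave like a binary search, which only works if the records are kept in sorted order; that is an assumption here, since the surviving slide text does not state it. A minimal sketch using Python's standard `bisect` module follows. The name `find_all` is hypothetical, and a second copy of the query was added to the sample data so the final walk has something to do.

```python
from bisect import bisect_left


def find_all(sorted_records, query):
    """Return the indices of every record equal to query in a sorted list."""
    # Binary search for the leftmost position where query could sit: O(log N).
    i = bisect_left(sorted_records, query)
    matches = []
    # Walk through the records until we fail to match.
    while i < len(sorted_records) and sorted_records[i] == query:
        matches.append(i)
        i += 1
    return matches


records = sorted([
    "TTTTCGTAATT", "AAAATCCTGCA", "AAACGCCTGCA", "TTTACGTCAA",
    "GATTACGATATTA", "CTGTACACAACCT", "GGATACACATTTA", "GATATTTTAAGC",
    "GATTACGATATTA",  # duplicate added for illustration
])
hits = find_all(records, "GATTACGATATTA")
```

Each probe halves the remaining range, so locating the first match takes on the order of log N comparisons rather than N, plus one comparison per matching record during the walk.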
Relational Databases
*almost
TACCTGCCGTAA
GATTACGATATTA
time = 0, 1, 17
time = 0, 1, 2, 3, 7
40 records, 6 workers
O(N/k)
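Splitting the scan across k workers can be sketched as below. This is an illustration of the partitioning idea only: the names `scan_chunk` and `parallel_scan` are hypothetical, the filler records are made up, and threads in CPython will not actually speed up this CPU-bound loop. In the deck's setting each chunk would live on a separate machine.

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil


def scan_chunk(chunk, offset, query):
    """Linear scan of one worker's slice; returns global indices of matches."""
    return [offset + i for i, record in enumerate(chunk) if record == query]


def parallel_scan(records, query, k):
    """Split records into at most k contiguous chunks and scan them concurrently."""
    size = ceil(len(records) / k) if records else 1
    offsets = range(0, len(records), size)
    with ThreadPoolExecutor(max_workers=k) as pool:
        futures = [
            pool.submit(scan_chunk, records[o:o + size], o, query)
            for o in offsets
        ]
        return sorted(i for f in futures for i in f.result())


records = ["ACGT"] * 40           # filler records, illustrative only
records[17] = "GATTACGATATTA"
hits = parallel_scan(records, "GATTACGATATTA", k=6)
chunk_size = ceil(40 / 6)         # comparisons per worker in the worst case
```

With 40 records and 6 workers, no worker handles more than 7 records, so the elapsed time scales with N/k rather than N.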