Professional Documents
Culture Documents
02 Information Retrieval - Boolean Retrieval
02 Information Retrieval - Boolean Retrieval
02 Information Retrieval - Boolean Retrieval
Information Retrieval
2. Boolean Retrieval
Data Shapes
2
Data Shapes: Tables
3
Data Shapes: Trees
4
Data Shapes: Graphs
5
Data Shapes: Cubes
6
Data Shapes: Unstructured text
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam vel erat nec dui
aliquet vulputate sed quis nulla. Donec eget ultricies magna, eu dignissim elit.
Nullam sed urna nec nisl rhoncus ullamcorper placerat et enim. Integer varius
ornare libero quis consequat. Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Aenean eu efficitur orci. Aenean ac posuere tellus. Ut id
commodo turpis.
Praesent nec libero metus. Praesent at turpis placerat, congue ipsum eget,
scelerisque justo. Ut volutpat, massa ac lacinia cursus, nisl dui volutpat arcu,
quis interdum sapien turpis in tellus. Suspendisse potenti. Vestibulum pharetra
justo massa, ac venenatis mi condimentum nec. Proin viverra tortor non orci
suscipit rutrum. Phasellus sit amet euismod diam. Nullam convallis nunc sit
amet diam suscipit dapibus. Integer porta hendrerit nunc. Quisque pharetra
congue porta. Suspendisse vestibulum sed mi in euismod. Etiam a purus
suscipit, accumsan nibh vel, posuere ipsum. Nulla nec tempor nibh, id
venenatis lectus. Duis lobortis id urna eget tincidunt.
7
Information Retrieval
8
Information Retrieval
9
Information Retrieval
11
Example of corpus
CHAPTER I holmes.txt (3 MB)
Mr. Sherlock Holmes
Mr. Sherlock Holmes, who was usually very late in the mornings, save
upon those not infrequent occasions when he was up all night, was
seated at the breakfast table. I stood upon the hearth-rug and picked
up the stick which our visitor had left behind him the night before.
It was a fine, thick piece of wood, bulbous-headed, of the sort which
is known as a "Penang lawyer." Just under the head was a broad silver
band nearly an inch across. "To James Mortimer, M.R.C.S., from his
friends of the C.C.H.," was engraved upon it, with the date "1884."
It was just such a stick as the old-fashioned family practitioner
used to carry--dignified, solid, and reassuring.
"Well, Watson, what do you make of it?"
Holmes was sitting with his back to me, and I had given him no sign
of my occupation.
"How did you know what I was doing? I believe you have eyes in the
back of your head."
12
Example
Q: lawyer
13
Naive approach
Q: lawyer
14
Example of corpus
CHAPTER I holmes.txt (3 MB)
Mr. Sherlock Holmes
Mr. Sherlock Holmes, who was usually very late in the mornings, save
upon those not infrequent occasions when he was up all night, was
seated at the breakfast table. I stood upon the hearth-rug and picked
up the stick which our visitor had left behind him the night before.
It was a fine, thick piece of wood, bulbous-headed, of the sort which
is known as a "Penang lawyer." Just under the head was a broad silver
band nearly an inch across. "To James Mortimer, M.R.C.S., from his
friends of the C.C.H.," was engraved upon it, with the date "1884."
It was just such a stick as the old-fashioned family practitioner
used to carry--dignified, solid, and reassuring.
"Well, Watson, what do you make of it?"
Holmes was sitting with his back to me, and I had given him no sign
of my occupation.
"How did you know what I was doing? I believe you have eyes in the
back of your head."
15
Result
Q: lawyer
CHAPTER I
Mr. Sherlock Holmes
Mr. Sherlock Holmes, who was usually very late in the mornings, save
upon those not infrequent occasions when he was up all night, was
seated at the breakfast table. I stood upon the hearth-rug and picked
up the stick which our visitor had left behind him the night before.
It was a fine, thick piece of wood, bulbous-headed, of the sort which
is known as a "Penang lawyer." Just under the head was a broad silver
band nearly an inch across. "To James Mortimer, M.R.C.S., from his
friends of the C.C.H.," was engraved upon it, with the date "1884."
It was just such a stick as the old-fashioned family practitioner
used to carry--dignified, solid, and reassuring.
"Well, Watson, what do you make of it?"
Holmes was sitting with his back to me, and I had given him no sign
of my occupation.
"How did you know what I was doing? I believe you have eyes in the
back of your head."
16
Result
Q: lawyer
CHAPTER I
Mr. Sherlock Holmes
Mr. Sherlock Holmes, who was usually very late in the mornings, save
upon those not infrequent occasions when he was up all night, was
seated at the breakfast table. I stood upon the hearth-rug and picked
up the stick which our visitor had left behind him the night before.
It was a fine, thick piece of wood, bulbous-headed, of the sort which
is known as a "Penang lawyer." Just under the head was a broad silver
band nearly an inch across. "To James Mortimer, M.R.C.S., from his
friends of the C.C.H.," was engraved upon it, with the date "1884."
It was just such a stick as the old-fashioned family practitioner
used to carry--dignified, solid, and reassuring.
"Well, Watson, what do you make of it?"
Holmes was sitting with his back to me, and I had given him no sign
of my occupation.
"How did you know what I was doing? I believe you have eyes in the
back of your head."
17
Naive approach
$ grep lawyer corpus.txt
18
Naive approach
$ grep lawyer corpus.txt
19
Naive approach
20
Naive approach
$ grep lawyer corpus.txt | grep Penang
21
Naive approach
$ grep lawyer corpus.txt | grep Penang
which was a Penang-lawyer weighted with lead, was just such a weapon
is known as a "Penang lawyer." Just under the head was a broad silver
22
Naive approach
Q: lawyer AND
Penang AND
NOT silver
23
Naive approach
$ grep lawyer corpus.txt | grep Penang | grep –v silver
24
Naive approach
$ grep lawyer corpus.txt | grep Penang | grep –v silver
which was a Penang-lawyer weighted with lead, was just such a weapon
25
Grepping
Regular expressions
foo|bar[a-z]+
Wildcards
.*
26
Grepping: shortcomings (1)
27
Grepping: shortcomings (1)
What if we want
to search
an extremely large
collection:
The Web
?
28
Picture: 123RF / shtanzman
Grepping: shortcomings
Steven Levy
In The Plex: How Google Thinks, Works, and Shapes Our Lives
29
Grepping: shortcomings
Page said, sure, he'd go and download the web and get the structure.
Steven Levy
In The Plex: How Google Thinks, Works, and Shapes Our Lives
30
Grepping: shortcomings
Page said, sure, he'd go and download the web and get the structure.
Steven Levy
In The Plex: How Google Thinks, Works, and Shapes Our Lives
31
Grepping: shortcomings
Page said, sure, he'd go and download the web and get the structure.
Steven Levy
In The Plex: How Google Thinks, Works, and Shapes Our Lives
32
Grepping: shortcomings (2)
33
Grepping: shortcomings (2)
Proximity search?
34
Grepping: shortcomings (3)
35
Grepping: shortcomings (3)
How do we rank?
36
Indices
37
Usefulness of indices
Steven Levy
In The Plex: How Google Thinks, Works,
and Shapes Our Lives 38
Document
39
Document
Documents
40
Document
Documents
41
Term
42
Term
Sherlock
lawyer
Switzerland
Unterwalden nid dem Wald
ETH Zürich
person
watch
run
paper
book
...
43
Incidence Matrix
Documents
Terms
44
Incidence Matrix
Documents
46
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
47
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
48
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
49
Boolean Query
50
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
51
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
52
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
53
Boolean Query
NOT u
54
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
55
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
56
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
NOT
NOT u
57
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
NOT u
58
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
NOT u
59
Boolean Query
u AND x
60
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
61
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
u
Terms
62
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
u
Terms
AND
u AND x
63
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
u AND x
64
Boolean Query
u OR x
65
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
66
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
u
Terms
67
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
u
Terms
OR
u OR x
68
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
u AND x
69
Effectiveness of an IR system
70
Information need
Impact of global warming on the
development of butterfly ecosystems in
Southern part of the Canton of Uri
between 2011 and 2013
71
Information need
Impact of global warming on the
development of butterfly ecosystems in
Southern part of the Canton of Uri
between 2011 and 2013
72
Results
73
Results
74
Results
Relevant Not relevant
Returned results
75
Results
Relevant Not relevant
Returned results
Precision =
76
Results
Relevant Not relevant
Returned results
true positives
Precision = returned
77
Results
Relevant Not relevant
Returned results
Precision =
78
Results
Relevant Not relevant
Returned results
Precision = 50%
79
Results
Relevant Not relevant
Returned results
Precision =
80
Results
Relevant Not relevant
Returned results
Precision = 100%
81
Results
Relevant Not relevant
Returned results
Precision =
82
Results
Relevant Not relevant
Returned results
Precision = 0%
83
Results
Relevant Not relevant
Returned results
Precision =
84
Results
Relevant Not relevant
Returned results
Precision = 80%
85
Results
Relevant Not relevant
Positives
86
Results
Relevant Not relevant
Positives
Negatives
87
Results
Relevant
Positives
Recall =
Negatives
88
Results
Relevant
Positives
true positives
Recall =
relevant
Negatives
89
Results
Relevant
Positives
Recall = 50%
Negatives
90
Results
Relevant
Positives
Recall = 100%
Negatives
91
Results
Relevant
Positives
Recall = 0%
Negatives
92
Results
Relevant
Positives
Recall = 67%
Negatives
93
Type I vs. Type II error
Relevant Not relevant
Type I error
Positives
(false positives)
Type II error
Negatives
(false negatives)
94
High precision
Relevant Not relevant
Positives
Negatives
95
High recall
Relevant Not relevant
Positives
Negatives
96
Compromise
Compromise
97
Inverted indices
98
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
99
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
w Typically 500,000
100
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
Typically one million
u
v
Terms
w Typically 500,000
101
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
Typically one million
u
v
Terms
w Typically 500,000
102
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
Typically one million
u
v
Terms
w Typically 500,000
v
Terms
y
How many zeros?
104
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
Typically one million
u
v
Terms
one billion 1s
w Typically 500,000
y
How many zeros?
105
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
Typically one million
u
v
Terms
one billion 1s
w Typically 500,000
499 billion 0s
y
How many zeros?
106
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
Typically one million
u
v
Terms
w Typically 500,000
x 99.8% empty!
y
107
Incidence Matrix
Documents
1 2 3 4 5 6 7 8 9 10
t
Typically one million
u
v
Terms
w Typically 500,000
x 99.8% empty!
y
This is space-inefficient.
108
Can we store this in a better way?
Documents
01 2 3 4 5 6 7 8 9 10
1
t 1 0 0 1 0 0 0 0 1 1
u B0 0 0 0 1 1 1 1 0 0C
B C
v B0 1 0 1 0 1 0 1 0 1C
Terms
B C
w B0 0 0 0 1 0 0 0 0 0C
B C
x @1 0 1 1 0 0 1 0 0 0A
y 0 0 0 0 1 0 0 1 0 1
109
Can we store this in a better way?
Documents
1 2 3 4 5 6 7 8 9 10
t
v
Terms
110
Can we store this in a better way?
Documents
1 2 3 4 5 6 7 8 9 10
t 1, 4, 9, 10
u 5, 6, 7
v
Terms
2, 4, 6, 8, 10
w 5
x 1, 3, 4, 7
y 5, 8, 10
111
Inverted index (inverted file)
t 1, 4, 9, 10
u 5, 6, 7
v 2, 4, 6, 8, 10
Terms
w 5
x 1, 3, 4, 7
y 5, 8, 10
112
Inverted index (inverted file)
t 1, 4, 9, 10
u 5, 6, 7
v 2, 4, 6, 8, 10
Terms
w 5
x 1, 3, 4, 7
y 5, 8, 10
Vocabulary
113
Inverted index (inverted file)
t 1, 4, 9, 10
u 5, 6, 7 Posting
v 2, 4, 6, 8, 10
Terms
w 5
x 1, 3, 4, 7
y 5, 8, 10
114
Inverted index (inverted file)
t 1, 4, 9, 10
u 5, 6, 7 Posting Assumption:
Each document has a Document ID
v 2, 4, 6, 8, 10 (docID)
Terms
w 5
x 1, 3, 4, 7
y 5, 8, 10
115
Inverted index (inverted file)
t 1, 4, 9, 10
u 5, 6, 7
v 2, 4, 6, 8, 10 Postings list
Terms
w 5
x 1, 3, 4, 7
y 5, 8, 10
116
Inverted index (inverted file)
t 1, 4, 9, 10
u 5, 6, 7
v 2, 4, 6, 8, 10 Postings
Terms
w 5
x 1, 3, 4, 7
y 5, 8, 10
117
Inverted index (inverted file)
t 1, 4, 9, 10
u 5, 6, 7
v 2, 4, 6, 8, 10 Dictionary
Terms
w 5
x 1, 3, 4, 7
y 5, 8, 10
118
Document frequency
t 1, 4, 9, 10 4
u 5, 6, 7 3
v 2, 4, 6, 8, 10 5
Terms
Count
w 5 1
x 1, 3, 4, 7 4
y 5, 8, 10 3
119
Document frequency
t 1, 4, 9, 10 4
u 5, 6, 7 3
v 2, 4, 6, 8, 10 5
Terms
120
Building an inverted index
Document 1
You come most carefully upon your hour.
Document 2
Take thy fair hour, Laertes; time be thine,
Document 3
My hour is almost come,
Document 4
Possess it merely. That it should come to this!
121
Tokenize
You come most carefully upon your hour
122
Linguistic pre-processing
you come most carefully upon your hour
123
Assign document IDs
you 1 take 2 my 3 possess 4
come 1 fair 2 hour 3 it 4
most 1 thy 2 is 3 merely 4
carefully 1 hour 2 almost 3 that 4
upon 1 Laertes 2 come 3 it 4
your 1 time 2 should 4
hour 1 be 2
come 4
thine 2
to 4
this 4
124
Sort
almost 3 is 3 take 2
be 2 it 4 that 4
carefully 1 it 4 thine 2
come 1 this 4
Laertes 2
come 4 thy 2
merely 4
come 3 time 2
most 1 to 4
fair 2
my 3 upon 1
hour 3
possess 4 you 1
hour 1
should 4 your 1
hour 2
125
Merge
almost 3 is 3 take 2
be 2 that 4
carefully 1 it 4 thine 2
come 1 3 4 this 4
Laertes 2
thy 2
merely 4
time 2
most 1 to 4
fair 2
my 3 upon 1
hour 1 2 3
possess 4 you 1
should 4 your 1
126
Merge
almost 3 Laertes 2
thine 2
be 2 merely 4 this 4
carefully 1 most 1 thy 2
come 1 3 4 my 3 time 2
fair 2 possess 4 to 4
hour 1 2 3 upon 1
should 4
is 3 you 1
take 2
your 1
it 4 that 4
127
Add document frequency
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
128
Single word query
hour
129
Single word query
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
130
Single word query
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
131
Single word query
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
Document 1
You come most carefully upon your hour.
Results
Document 2
Take thy fair hour, Laertes; time be thine,
Document 3
132
My hour is almost come,
Boolean (conjunctive) query
133
Single word query
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
134
Single word query
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
135
Single word query
come 3 1 3 4
hour 3 1 2 3
136
Single word query
come 3 1 3 4
Intersect 1 3
hour 3 1 2 3
137
Single word query
come 3 1 3 4
Intersect 1 3
hour 3 1 2 3
Document 1
You come most carefully upon your hour.
Results
Document 3
My hour is almost come,
138
Boolean (conjunctive) query
hour OR this
139
Single word query
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
140
Single word query
almost 1 3 Laertes 1 2
thine 1 2
be 1 2 merely 1 4 this 1 4
carefully 1 1 most 1 1 thy 1 2
come 3 1 3 4 my 1 3 time 1 2
fair 1 2 possess 1 to 1 4
4
hour 3 2 3 upon 1 1
1 should 1 4
is 1 3 you 1 1
take 1 2
your 1 1
it 1 4 that 1 4
141
Single word query
hour 3 1 2 3
this 1 4
142
Single word query
hour 3 1 2 3
Union
this 1 4
143
Single word query
hour 3 1 2 3
Union 1 2 3 4
this 1 4
144
Single word query
hour 3 1 2 3
Union 1 2 3 4
this 1 4
Document 1
You come most carefully upon your hour.
Document 2
Take thy fair hour, Laertes; time be thine,
Results Document 3
My hour is almost come,
Document 4
Possess it merely. That it should come to this! 145
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
146
Intersection algorithm
List A 1 2 4 5 8 9 10 12
Initialization:
two pointers
List B 1 3 4 6 7 8 11 12
147
Intersection algorithm
List A 1 2 4 5 8 9 10 12
Initialization:
two pointers
List B 1 3 4 6 7 8 11 12
Intersection of A and B
148
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B
149
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1
150
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1
151
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1
152
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1
153
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1
154
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1
155
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4
156
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4
157
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4
158
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4
159
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4 8
160
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4 8
161
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4 8
162
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4 8
163
Intersection algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4 8 12
164
Intersection algorithm
List A 1 2 4 5 8 9 10 12
End
List B 1 3 4 6 7 8 11 12
Intersection of A and B 1 4 8 12
165
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
166
Union algorithm
List A 1 2 4 5 8 9 10 12
Initialization:
two pointers
List B 1 3 4 6 7 8 11 12
167
Union algorithm
List A 1 2 4 5 8 9 10 12
Initialization:
two pointers
List B 1 3 4 6 7 8 11 12
Union of A and B
168
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B
169
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1
170
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1
171
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2
172
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2
173
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3
174
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3
175
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4
176
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5
177
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6
178
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6 7
179
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6 7
8 180
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6 7
9 8 181
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6 7
10 9 8 182
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6 7
11 10 9 8 183
Union algorithm
List A 1 2 4 5 8 9 10 12
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6 7
12 11 10 9 8 184
Union algorithm
List A 1 2 4 5 8 9 10 12
End
List B 1 3 4 6 7 8 11 12
Union of A and B 1 2 3 4 5 6 7
12 11 10 9 8 185
Optimizing
186
Optimizing
100,000 docs
187
Optimizing
100,000 docs
188
Optimizing
100,000 docs
Maintain intermediate
result in memory
189
Optimizing
100,000 docs
Maintain intermediate
result in memory
190
Optimizing
100,000 docs
Maintain intermediate
result in memory
191
More complex queries
192
More complex queries
193