
Query DSL In Elasticsearch

Narayan Kumar
Software Consultant
Knoldus Software LLP
Agenda

● Overview of Elasticsearch

● What is Query DSL?

● Queries VS Filters

● Types of queries

● Demo
Overview of Elasticsearch

Elasticsearch is an open-source, distributed, real-time search & analytics engine.

➢ High availability
➢ Scales massively
➢ Multi-tenancy
➢ Schema free
➢ Fault tolerance
➢ Lucene based
➢ RESTful API
➢ JSON over HTTP
What is Query DSL?

➢ It is a rich, flexible query language.

➢ Elasticsearch provides a full Query DSL based on JSON to define queries.

➢ We can think of the Query DSL as an AST (abstract syntax tree) of queries, consisting of two types of clauses:

Leaf query clauses: These look for a particular value in a particular field, such as the match, term or range queries.

Compound query clauses: These wrap other leaf or compound queries and are used to combine multiple queries in a logical fashion.
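The two clause types can be pictured as a small tree. The sketch below walks a query body written as a Python dict and lists its leaf clauses. Both leaf_clauses and the COMPOUND set are hypothetical helpers for illustration, not part of Elasticsearch, and the field names are borrowed from the examples later in this deck.

```python
# Clause names treated as compound in this sketch (an assumption, not a full list).
COMPOUND = {"bool", "dis_max", "constant_score", "boosting"}

def leaf_clauses(clause):
    """Return the names of the leaf clauses found in a query clause (a dict)."""
    found = []
    for name, body in clause.items():
        if name not in COMPOUND:
            found.append(name)  # leaf: match, term, range, ...
            continue
        # Compound: descend into every nested clause or list of clauses.
        for child in body.values():
            children = child if isinstance(child, list) else [child]
            for c in children:
                if isinstance(c, dict):
                    found.extend(leaf_clauses(c))
    return found

query = {
    "bool": {
        "must": {"match": {"body": "starbucks"}},
        "filter": [
            {"term": {"verb": "post"}},
            {"range": {"actor.friendsCount": {"gte": 10, "lte": 500}}},
        ],
    }
}
print(leaf_clauses(query))
```

Here the bool clause is the only compound node; everything it wraps is a leaf.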
Queries VS Filters

Queries              Filters
-------              -------
full text search     exact matching
relevance scoring    binary yes / no
heavier              fast
not cacheable        cacheable
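One way to see the difference in a request: the full-text clause goes where it is scored, and the exact-match clause goes into the filter part of a bool query, where it only answers yes or no. This is a minimal sketch, assuming the body and verb fields used in the later examples; build_search is a made-up helper name.

```python
import json

def build_search(text, verb):
    """Full-text clause in query context, exact-match clause in filter context."""
    return {
        "query": {
            "bool": {
                # Scored: contributes to the relevance of each hit.
                "must": {"match": {"body": text}},
                # Not scored: only includes or excludes documents.
                "filter": {"term": {"verb": verb}},
            }
        }
    }

body = build_search("starbucks", "post")
print(json.dumps(body, indent=2))
```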


Types of queries

➢ Match All Query

➢ Full text queries

➢ Term level queries

➢ Compound queries
Match All Query

The simplest query, which matches all documents, giving them all a _score of 1.0.

Example:

{
  "query": {
    "match_all": {}
  }
}
Full text queries
The high-level full text queries are usually used for running full text queries
on full text fields like the body of an email.

These are full text queries:

➢ match query
➢ multi_match query
➢ common_terms query
➢ query_string query
➢ simple_query_string query
Full text queries (continued)
match query: A family of match queries that accept text, numbers or dates, analyze
them, and construct a query.
{
  "query": {
    "match": {
      "body": {
        "query": "i spent at starbucks",
        "operator": "and"
      }
    }
  }
}
multi_match query: The multi_match query builds on the match query to allow multi-field
queries.
{
  "query": {
    "multi_match": {
      "query": "share post",
      "fields": [
        "verb"
      ]
    }
  }
}
Full text queries (continued)
common_terms query: The common terms query is a modern alternative to
stopwords which improves the precision and recall of search results (by taking
stopwords into account), without sacrificing performance.
"common": {
  "body": {
    "query": "i am spent at starbucks",
    "cutoff_frequency": 0.001,
    "low_freq_operator": "and"
  }
}

query_string query: A query that uses a query parser in order to parse its content.

"query": {
  "query_string": {
    "query": "(verb:post) AND (body:i am today OR body:came to starbucks)"
  }
}
Full text queries (continued)
simple_query_string query: A query that uses the SimpleQueryParser to parse its
context. The simple_query_string query will never throw an exception, and discards
invalid parts of the query.

"query": {
  "simple_query_string": {
    "query": "\"at starbucks\" | today -starbucks",
    "fields": [
      "body"
    ],
    "flags": "OR|NOT|PHRASE"
  }
}
Term level queries
The term-level queries operate on the exact terms that are stored in the
inverted index. These queries are usually used for structured data like
numbers, dates, and enums, rather than full text fields.
These are term level queries:

➢ term query
➢ terms query
➢ range query
➢ exists query
➢ prefix query
➢ wildcard query
➢ regexp query
➢ fuzzy query
➢ type query
➢ ids query
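To make "exact terms in the inverted index" concrete, here is a toy inverted index in Python. It is only a sketch: the analyzer (lowercase, split on whitespace) is an assumption far simpler than a real Elasticsearch analyzer, and term_query and match_query are illustrative functions, not client API calls.

```python
from collections import defaultdict

def analyze(text):
    """Toy analyzer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()

def build_index(docs):
    """Map each analyzed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index

def term_query(index, term):
    # No analysis: the term must equal an indexed token exactly.
    return index.get(term, set())

def match_query(index, text):
    # The input is analyzed first; any resulting token may match (OR).
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits

docs = {1: "I spent at Starbucks", 2: "Came to Starbucks today"}
index = build_index(docs)
print(term_query(index, "Starbucks"))   # looked up as-is, not lowercased
print(match_query(index, "Starbucks"))  # lowercased before lookup
```

In this toy index the term lookup for "Starbucks" finds nothing because only the lowercased token was stored, while the analyzed lookup finds both documents.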
Term level queries (continued)
term query: The term query finds documents that contain the exact term
specified in the inverted index.
"term": {
  "actor.postedTime": "2010-11-17T03:55:57.000Z"
}

terms query: Filters documents that have fields that match any of the
provided terms.
"terms": {
  "verb": [
    "share",
    "post"
  ]
}
Term level queries (continued)
range query: Matches documents with fields that have terms within a
certain range.
"range": {
  "actor.friendsCount": {
    "gte": 10,
    "lte": 500
  }
}
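The range parameters read as plain comparisons: gte and lte are inclusive bounds, gt and lt are exclusive. The sketch below applies them to a single numeric value; in_range and the BOUNDS table are hypothetical names used only for illustration.

```python
import operator

# Map each range parameter to the comparison it stands for.
BOUNDS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

def in_range(value, bounds):
    """True if value satisfies every bound given, e.g. {"gte": 10, "lte": 500}."""
    return all(BOUNDS[name](value, limit) for name, limit in bounds.items())

friends_range = {"gte": 10, "lte": 500}
for count in (9, 10, 500, 501):
    print(count, in_range(count, friends_range))
```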

exists query: Returns documents that have at least one non-null value in the original field.

"exists": {
  "field": "actor.links.href"
}
Term level queries (continued)
prefix query: Matches documents that have fields containing terms with a
specified prefix.
"prefix": {
  "body": "rt"
}
wildcard query: Matches documents that have fields matching a wildcard
expression.
"wildcard": {
  "actor.preferredUsername": "ba*"
}

regexp query: The regexp query allows you to use regular expression term queries.
"regexp": {
  "actor.preferredUsername": "ba.*lan"
}
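All three pattern queries are matched against whole terms, not substrings. The sketch below imitates that with Python's re module, using the patterns from this slide. It is an approximation: Lucene's regular expression syntax differs from Python's in several operators, so only the whole-term anchoring is illustrated, and the three function names are made up for this example.

```python
import re

def prefix_match(prefix, term):
    """The term must start with the given prefix."""
    return term.startswith(prefix)

def wildcard_match(pattern, term):
    """'*' stands for any run of characters, '?' for exactly one character."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), term) is not None

def regexp_match(pattern, term):
    """The pattern has to cover the whole term, as if anchored at both ends."""
    return re.fullmatch(pattern, term) is not None

for name in ("balan", "balance", "alba"):
    print(name, wildcard_match("ba*", name), regexp_match("ba.*lan", name))
```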
Compound Queries
Compound query: Compound queries wrap other compound or leaf
queries, either to combine their results and scores, to change their
behaviour, or to switch from query to filter context.
The queries in this group are:

➢ constant_score query
➢ bool query
➢ dis_max query
➢ function_score query
➢ boosting query
➢ indices query
➢ and, or, not queries
➢ filtered query
➢ limit query
Compound queries (continued)
dis_max query: A query that generates the union of documents produced
by its subqueries.
"dis_max": {
  "queries": [
    { "term": { "verb": "share" } },
    { "term": { "verb": "post" } }
  ]
}
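Besides taking the union, dis_max also decides the score: a document gets the score of its best matching subquery, plus tie_breaker times the scores of the other matching subqueries (tie_breaker defaults to 0.0). A small sketch of that arithmetic follows; dis_max_score is a hypothetical function and the scores are made-up numbers.

```python
def dis_max_score(scores, tie_breaker=0.0):
    """scores: the _score of each subquery that matched one document."""
    if not scores:
        return None  # no subquery matched, so the document is not a hit
    best = max(scores)
    others = sum(scores) - best
    return best + tie_breaker * others

print(dis_max_score([2.0, 1.0]))                   # best score only
print(dis_max_score([2.0, 1.0], tie_breaker=0.5))  # best plus half of the rest
```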
boosting query: The boosting query can be used to effectively demote results
that match a given query.
"boosting": {
  "positive": { "term": { "verb": "post" } },
  "negative": {
    "range": {
      "actor.friendsCount": { "gte": 10, "lte": 500 }
    }
  },
  "negative_boost": 0.5
}
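The demotion is a multiplication: a hit must match the positive query, and if it also matches the negative query its score is multiplied by negative_boost. A minimal sketch with made-up scores; boosting_score is a hypothetical name.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Score of a document that already matched the positive query."""
    if matches_negative:
        return positive_score * negative_boost  # demoted, but still returned
    return positive_score

# A post from a user with 10 to 500 friends is pushed down, not removed.
print(boosting_score(2.0, matches_negative=True, negative_boost=0.5))
print(boosting_score(2.0, matches_negative=False, negative_boost=0.5))
```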
Compound queries (continued)
bool query: A query that matches documents matching boolean combinations
of other queries.
"bool": {
  "must": {
    "term": { "verb": "post" }
  },
  "filter": {
    "term": { "actor.displayName": "rajni" }
  },
  "must_not": {
    "range": {
      "actor.friendsCount": { "gte": 10, "lte": 500 }
    }
  },
  "should": [
    { "term": { "actor.twitterTimeZone": "casablanca" } },
    { "term": { "generator.displayName": "twitter for iPhone" } }
  ]
}
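How the four occurrence types combine can be written down as a truth function. The sketch below assumes the defaults documented for recent Elasticsearch versions: should clauses are optional when a must or filter clause is present, otherwise at least one of them has to match. bool_matches is a hypothetical helper that only decides matching, not scoring.

```python
def bool_matches(must, filter_, must_not, should, minimum_should_match=None):
    """Each argument is a list of booleans: did that clause match the document?"""
    if minimum_should_match is None:
        if must or filter_:
            minimum_should_match = 0  # should only affects the score
        else:
            minimum_should_match = 1 if should else 0
    return (
        all(must)
        and all(filter_)
        and not any(must_not)
        and sum(should) >= minimum_should_match
    )

# The example above: verb and displayName match, friendsCount is outside 10..500,
# and neither should clause matches. The document is still a hit.
print(bool_matches(must=[True], filter_=[True], must_not=[False], should=[False, False]))
```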
Compound queries (continued)
constant_score query: A query which wraps another query, but executes it
in filter context. All matching documents are given the same “constant”
_score.

"constant_score": {
  "filter": {
    "range": {
      "actor.friendsCount": {
        "gte": 10,
        "lte": 500
      }
    }
  }
}
Other DSL Queries

Joining queries: Performing full SQL-style joins in a distributed system like Elasticsearch is prohibitively expensive, so it offers restricted forms of join instead. Examples: nested query, has_parent query, has_child query.

Geo queries: These queries operate on geo_point and geo_shape fields.

Specialized queries: These queries do not fit into any other group. They cover specific requirements, for example the template query and the script query.

Span queries: These are typically used to implement very specific queries on legal documents or patents.
References

https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html

https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html

MEAP Edition Elasticsearch in Action Version 9


Thank you
