Professional Documents
Culture Documents
List All Indices: Shards & Replicas
List All Indices: Shards & Replicas
a single index of a billion documents taking up 1TB of disk space may not fit on the
disk of a single node or may be too slow to serve search requests from a single node
alone.
By default, each index in Elasticsearch is allocated 5 primary shards and 1 replica which
means that if you have at least two nodes in your cluster, your index will have 5 primary
shards and another 5 replica shards (1 complete replica) for a total of 10 shards per
index.
_cat api:
JSON is great… for computers. Even if it’s pretty-printed, trying to find relationships in
the data is tedious. Human eyes, especially when looking at an ssh terminal, need
compact and aligned text. The cat API aims to meet this need.
Whenever we ask for the cluster health, we either get green, yellow, or red.
Green - everything is good (cluster is fully functional)
Yellow - all data is available but some replicas are not yet allocated (cluster is fully
functional)
Red - some data is not available for whatever reason (cluster is partially functional)
Create an Index
Now let’s create an index named "customer" and then list all the indexes again:
PUT /customer?pretty
GET /_cat/indices?v
PUT /customer/_doc/1?pretty
{
"name": "John Doe"
}
As per pure REST conventions, POST is used for creating a new resource and
PUT is used for updating an existing resource. Here, the usage of PUT is
equivalent to saying I know the ID that I want to assign, so use this ID
while indexing this document.
If you don’t want to control the ID generation for the documents, you can
use the POST method. The format of this request is POST //, with the JSON
document as the body of the request:
Updating Documentsedit
In addition to being able to index and replace documents, we can also update
documents. Note though that Elasticsearch does not actually do in-place updates under
the hood. Whenever we do an update, Elasticsearch deletes the old document and then
indexes a new document with the update applied to it in one shot.
This example shows how to update our previous document (ID of 1) by changing the
name field to "Jane Doe":
POST /customer/_doc/1/_update?pretty
{
"doc": { "name": "Jane Doe" }
}
Update By Query APIedit
The simplest usage of _update_by_query just performs an update on every document in
the index without changing the source.
Executing Searchesedit
Now that we have seen a few of the basic search parameters, let’s dig in some more into
the Query DSL. Let’s first take a look at the returned document fields. By default, the full
JSON document is returned as part of all searches. This is referred to as the source
(_source field in the search hits). If we don’t want the entire source document returned,
we have the ability to request only a few fields from within source to be returned.
This example shows how to return two fields, account_number and balance (inside
of _source), from the search:
GET /bank/_search
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"]
}
This example returns all accounts containing the term "mill" or "lane" in the address:
GET /bank/_search
{
"query": { "match": { "address": "mill lane" } }
}
COPY AS CURLVIEW IN CONSOLE
This example is a variant of match (match_phrase) that returns all accounts containing
the phrase "mill lane" in the address:
GET /bank/_search
{
"query": { "match_phrase": { "address": "mill lane" } }
}
Let’s now introduce the bool query. The bool query allows us to compose smaller
queries into bigger queries using boolean logic.
This example composes two match queries and returns all accounts containing "mill"
and "lane" in the address:
GET /bank/_search
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
Query context
A query clause used in query context answers the question “How well does this
document match this query clause?” Besides deciding whether or not the
document matches, the query clause also calculates a _score representing how
well the document matches, relative to other documents.
Filter context
In filter context, a query clause answers the question “Does this document match
this query clause?” The answer is a simple Yes or No — no scores are calculated.
Filter context is mostly used for filtering structured data, e.g.