Elastic Search
ElasticSearch
[1.0]
0 61
Logo
[]
[2014.1.21]
1.3.1. Cluster
1.3.2. Shards
1.3.3. Replicas
1.3.4. Recovery
1.3.5. River
1.3.6. Gateway
1.3.7. discovery.zen
1.3.8. Transport
Java API
3.1.1. Node
3.1.2. TransportClient
3.7. MongoDB
1.0
2014.1.21
1.
Elasticsearch http://www.elasticsearch.cn/guide/
ElasticSearch
ES
Elasticsearch
1.
1.1.
ElasticSearch is an open-source, distributed, RESTful search engine built on Lucene.
Data is indexed and searched by sending JSON over HTTP.
1.2.
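Github
Github uses Elasticsearch to search 20TB of data, including 1.3 billion files and 130 billion lines of code.
In January 2013 Github moved its code search from solr to elasticsearch, running on a cluster of 26 storage nodes and 8 client nodes. See:
https://github.com/blog/1381-a-whole-new-code-search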
Foursquare
5 Foursquare Elasticsearch
Foursquare
Foursquare
SoundCloud
SoundCloud Elasticsearch 1.8
SoundCloud Alexa
236 SoundCloud
Fog Creek
Elasticsearch Fog Creek 400 3
StumbleUpon
Elasticsearch StumbleUpon
StumbleUpon
stumble
25 HBase elasticsearch
elasticsearch solr
solr elasticsearch
Mozilla
Mozilla WarOnOrange
json elasticsearch bug
Socorro Mozilla Hbase Postgres
Hbase elasticsearch
Sony
Sony elasticsearch
Infochimps
Infochimps 25 4TB
Infochimps
hadoop
Directory
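The storage of a lucene index is represented by a Directory.
IndexWriter
IndexWriter adds, deletes, and updates documents in a lucene index. These operations are applied in memory for speed; to make them durable they must be flushed to disk.
A flush (commit) is an expensive operation that needs a lot of io.
In addition, only one IndexWriter can be open on an index at a time.
Index Segments
IndexReader
IndexReader executes searches. An IndexReader is obtained from an IndexWriter and sees only the changes made before it was opened.
To see later changes, the IndexWriter must flush and the IndexReader must be reopened, which is why search is only near real-time.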
1.3.2. Partitioning
Possible approaches to scaling Lucene
Distributed Directory
Store the Lucene index files in chunks on a distributed system (Coherence, Terracotta, GigaSpaces or Infinispan).
IndexWriter and IndexReader then work against the distributed Directory.
.
lucene
ps solandra
IndexReader IndexReader
term
IndexWriter
Partitioning
ID
O(K*N) K
Term FieldN
5 term
5 5 term
50
K Term O(K)
Lucene Segment
The main problem is that whole notion of Lucene Segment which is inherent to a lot of
constructs in Lucene is lost.
google PageRank
faceting
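The O(K*N) and O(K) costs mentioned above can be illustrated with a small sketch. It assumes K query terms and N shards, and it picks a shard for a document from its ID. The class name, the method names, and the use of String.hashCode() are assumptions made for illustration; this is not the hash function Elasticsearch itself uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PartitioningSketch {
    // Document-based partitioning: derive the shard from the document ID.
    // String.hashCode() is an illustrative stand-in, not the Elasticsearch hash.
    static int shardFor(String docId, int numberOfShards) {
        return Math.floorMod(docId.hashCode(), numberOfShards);
    }

    // Document-based: every one of the N shards has to look up all K terms.
    static long lookupsDocumentBased(int k, int n) {
        return (long) k * n;
    }

    // Term-based: each of the K terms lives on exactly one shard.
    static long lookupsTermBased(int k) {
        return k;
    }

    public static void main(String[] args) {
        int shards = 5;
        Map<String, Integer> placement = new LinkedHashMap<>();
        for (String id : new String[] {"doc-1", "doc-2", "doc-3"}) {
            placement.put(id, shardFor(id, shards));
        }
        System.out.println("placement: " + placement);
        System.out.println("document-based lookups: " + lookupsDocumentBased(5, shards));
        System.out.println("term-based lookups: " + lookupsTermBased(5));
    }
}
```

Term-based partitioning needs fewer lookups, but as the text notes it gives up the Lucene Segment, which is why document-based partitioning is the model that fits Lucene.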
1.3.3. Replication
(replication) 2
scalability ()
slave nodes
Push Replication
[master]
(document) [replica]
(
Lucene )
You index the same document several times but we transfer much less data compared to
Pull replication (and Lucene is known to index very fast)
versioning
(:
) refresh IndexReader
primary
shard
Pull Replication
slavesegments
lucene
slaves master
commit slave
high availability slavea real time high available
slave 1
lucene
Data Persistency
elasticsearch transaction
log es kill -9
Transaction log
shared gateway snapshot peer shard recovery shard
Hot relocation
( )
gateway
Lucene segment files)
flushing commit
transaction log ie
transaction log replica replay
blocking
2.
http://www.elasticsearch.org/download/ elasticsearch
0.20.5es bug
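The unpacked distribution contains the directories bin, config, and lib; plugins are installed under
plugins.
2.1.
To start elasticsearch, run bin/elasticsearch on linux or bin/elasticsearch.bat on windows.
Nodes started with the same cluster.name form one es cluster.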
2.2.
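elasticsearch-servicewrapper runs es as a service, so that es does not stop when the console is closed with ctrl+c. Download: https://github.com/elasticsearch/elasticsearch-servicewrapper
Copy the service directory into the es bin directory, then run
bin/service/elasticsearch followed by one of these commands:
console  run es in the foreground
start    run es in the background
stop     stop es
install  install es as a service
remove   remove the installed service
The service directory contains elasticsearch.conf, which sets the java runtime parameters:
# es home directory
set.default.ES_HOME=<Path to ElasticSearch Home>
# minimum memory allocated to es, in MB
set.default.ES_MIN_MEM=256
# maximum memory allocated to es, in MB
set.default.ES_MAX_MEM=1024
# startup timeout, in seconds
wrapper.startup.timeout=300
# shutdown timeout, in seconds
wrapper.shutdown.timeout=300
# ping timeout, in seconds
wrapper.ping.timeout=300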
2.3.
The official Chinese analysis plugin for elasticsearch is smartcn. medcl has written two more Chinese analysis plugins for
es: ik and mmseg.
Installing ik
plugin -install medcl/elasticsearch-analysis-ik/1.1.0
The jar can also be downloaded from github:
https://github.com/medcl/elasticsearch-rtf/blob/master/elasticsearch/plugins/analysisik/elasticsearch-analysis-ik-1.2.5.jar
plugin --install //
plugin --url file://path/to/plugin --install plugin-name
Download the ik dictionary files into the config directory:
cd config
wget http://github.com/downloads/medcl/elasticsearch-analysis-ik/ik.zip --no-check-certificate
unzip ik.zip
rm ik.zip
Installing mmseg
bin/plugin -install medcl/elasticsearch-analysis-mmseg/1.1.0
Download the mmseg dictionary files into the config directory:
cd config
wget http://github.com/downloads/medcl/elasticsearch-analysis-mmseg/mmseg.zip --no-check-certificate
unzip mmseg.zip
rm mmseg.zip
ik configuration in elasticsearch.yml:
index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
or:
index.analysis.analyzer.ik.type : ik
mmseg configuration in elasticsearch.yml:
index:
  analysis:
    analyzer:
      mmseg:
        alias: [news_analyzer, mmseg_analyzer]
        type: org.elasticsearch.index.analysis.MMsegAnalyzerProvider
or:
index.analysis.analyzer.default.type : "mmseg"
mmseg tokenizers can also be configured:
index:
  analysis:
    tokenizer:
      mmseg_maxword:
        type: mmseg
        seg_type: "max_word"
      mmseg_complex:
        type: mmseg
        seg_type: "complex"
      mmseg_simple:
        type: mmseg
        seg_type: "simple"
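The analyzer is selected per field in the
mapping. An example
mapping:
{
  "page":{
    "properties":{
      "title":{
        "type":"string",
        "indexAnalyzer":"ik",
        "searchAnalyzer":"ik"
      },
      "content":{
        "type":"string",
        "indexAnalyzer":"ik",
        "searchAnalyzer":"ik"
      }
    }
  }
}
indexAnalyzer is the analyzer used when indexing and searchAnalyzer is the analyzer used when searching.
A mapping can also be created from java.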
To test the analyzer, call the analyze api on an existing index (indexname):
http://localhost:9200/indexname/_analyze?analyzer=ik&text= elasticsearch
ik: https://github.com/medcl/elasticsearch-analysis-ik
mmseg: https://github.com/medcl/elasticsearch-analysis-mmseg
medcl also maintains an es distribution with these plugins preinstalled:
https://github.com/medcl/elasticsearch-rtf
2.4.
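The elasticsearch config directory contains two files: elasticsearch.yml and logging.yml.
The first is the main es configuration file; the second configures logging. es logs through log4j,
so logging.yml is written like an ordinary log4j configuration.
The settings available in elasticsearch.yml are described below.
cluster.name: elasticsearch
The name of the es cluster; the default is elasticsearch. es automatically discovers other es nodes on the same network segment, so when several clusters share a segment this setting tells them apart.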
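The node name. By default a name is picked at random from the list in name.txt, which is bundled in the config folder of the es jar.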
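node.master: true
Whether this node is eligible to be elected master; the default is true. By default the first node started in the es cluster becomes
master, and a new master is elected if it fails.
node.data: true
Whether this node stores index data; the default is true.
index.number_of_shards: 5
The default number of shards per index; the default is 5.
index.number_of_replicas: 1
The default number of replicas per index; the default is 1.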
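path.conf: /path/to/conf
Path of the configuration files; the default is the config folder under the es root directory.
path.data: /path/to/data
Path of the index data; the default is the data folder under the es root directory. Several paths can be given, separated by commas:
path.data: /path/to/data1,/path/to/data2
path.work: /path/to/work
Path of temporary files; the default is the work folder under the es root directory.
path.logs: /path/to/logs
Path of the log files; the default is the logs folder under the es root directory.
path.plugins: /path/to/plugins
Path of the plugins; the default is the plugins folder under the es root directory.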
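bootstrap.mlockall: true
Set to true to lock the process memory. es slows down badly once the jvm starts swapping,
so make sure it never swaps: set the ES_MIN_MEM and ES_MAX_MEM environment variables to the same value and make sure the machine has enough memory for es.
The elasticsearch process must also be allowed to lock memory; on linux this is done with
`ulimit -l unlimited`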
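network.bind_host: 192.168.0.1
The ip address to bind to, ipv4 or ipv6; the default is 0.0.0.0.
network.publish_host: 192.168.0.1
The ip address other nodes use to reach this node. If it is not set it is determined automatically;
the value must be a real ip address.
network.host: 192.168.0.1
Sets both bind_host and publish_host at once.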
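transport.tcp.port: 9300
The tcp port for communication between nodes; the default is 9300.
transport.tcp.compress: true
Whether to compress data sent over tcp; the default is false.
http.port: 9200
The http port for external clients; the default is 9200.
http.max_content_length: 100mb
The maximum size of an http request body; the default is 100mb.
http.enabled: false
Whether to serve the http api; the default is true.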
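gateway.type: local
The gateway type. The default, local, is the local file system;
a distributed file system, hadoop HDFS, and amazon s3 are also supported.
gateway.recover_after_nodes: 1
Start data recovery once N nodes of the cluster have started; the default is 1.
gateway.recover_after_time: 5m
How long to wait before the initial recovery starts; the default is 5 minutes.
gateway.expected_nodes: 2
The expected number of nodes in the cluster; the default is 2. As soon as these N nodes have started, recovery begins immediately.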
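cluster.routing.allocation.node_initial_primaries_recoveries: 4
The number of concurrent recoveries per node during the initial recovery of primaries; the default is 4.
cluster.routing.allocation.node_concurrent_recoveries: 2
The number of concurrent recoveries per node when nodes are added or removed or shards are rebalanced; the default is 2.
indices.recovery.max_size_per_sec: 0
Bandwidth limit for recovery, for example 100mb; the default is 0, which means unlimited.
indices.recovery.concurrent_streams: 5
The maximum number of streams opened concurrently when recovering a shard from another shard; the default is 5.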
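discovery.zen.minimum_master_nodes: 1
The number N of master-eligible nodes a node must be able to see; the default is 1.
For larger clusters a higher value (2-4) is recommended.
discovery.zen.ping.timeout: 3s
The ping timeout used when discovering other nodes; the default is 3 seconds. Use a higher value on a poor network to avoid discovery errors.
discovery.zen.ping.multicast.enabled: false
Whether to discover nodes by multicast; the default is true.
discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]
The initial list of master nodes in the cluster, through which nodes joining the cluster are discovered.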
index.search.slowlog.level: TRACE
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 800ms
index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms
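For discovery.zen.minimum_master_nodes, a commonly recommended value is a majority of the master-eligible nodes, which keeps a partitioned cluster from electing two masters. The sketch below computes that majority. The class and method names are made up for illustration, and the majority rule is an assumption added here rather than something this guide states.

```java
class MasterQuorumSketch {
    // Majority of the master-eligible nodes: floor(n / 2) + 1.
    static int quorum(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        for (int nodes = 1; nodes <= 5; nodes++) {
            System.out.println(nodes + " master-eligible nodes -> minimum_master_nodes "
                    + quorum(nodes));
        }
    }
}
```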
2.5.
2.5.1. elasticsearch-head
elasticsearch-head is an html5 web front end for an elasticsearch cluster.
It can be installed as an es plugin, or downloaded and used by opening index.html directly.
git: https://github.com/Aconex/elasticsearch-head
browser
Structured Query
product boolquerytitle price 10
100
2.5.2. elasticsearch-bigdesk
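bigdesk is a cluster monitoring tool for elasticsearch that shows the state of an es cluster, such as cpu and http statistics.
git: https://github.com/lukas-vlcek/bigdesk
It is installed and used the same way as head.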
cpu
jvm
jvm jvm heap
heap gc
es
cpu cpu
ps
tcp http
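3. Modules
3.1.1. Cluster
A cluster consists of several nodes, one of which is the master node; the master is chosen by election. Master and non-master are a distinction internal to the cluster.
es is decentralized: seen from outside, the cluster has no central node.
The es cluster is logically a single whole, so talking to any one node is equivalent to talking to the whole es cluster.
3.1.2. Shards
An index shard. es can split a complete index into several shards, so that a large index is broken up and distributed over different nodes, which makes distributed search possible. The number of shards must be set before the index is created and cannot be changed afterwards.
3.1.3. Replicas
An index replica. es can keep several replicas of an index. Replicas improve fault tolerance, because a damaged or lost shard on one node can be restored from a replica.
They also improve query throughput, because es load-balances search requests automatically.
3.1.4. Recovery
Data recovery, also called data redistribution. When a node joins or leaves, es reallocates index shards according to machine load; a failed node also recovers its data when it restarts.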
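3.1.5. River
A river is a data source for es, and a way for other stores (such as a database) to synchronize data into es.
It runs inside es as a plugin service that reads data from the river and indexes it into es.
The official rivers are couchDB, RabbitMQ, Twitter, and Wikipedia.
3.1.6. Gateway
The gateway is how es persists its indices. By default es first holds the index in memory and persists it to disk when memory is full.
When the es cluster is shut down and restarted, the index data is read back from the gateway. es supports several
gateway types: the local file system (default), a distributed file system, Hadoop HDFS, and amazon
s3.
3.1.7. discovery.zen
The automatic node discovery mechanism of es. es is a p2p-based system: nodes discover each other by multicast, and point-to-point (unicast) discovery is also supported.
3.1.8. Transport
How es nodes talk to each other and to clients. Internally nodes communicate over tcp by default;
http (json), thrift, servlet, memcached, and zeroMQ transports are also supported through plugins.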
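The relation between Shards and Replicas can be checked with a little arithmetic: an index with S shards and R replicas has S * (1 + R) shard copies in total. The sketch below uses the defaults shown in the configuration section (5 shards, 1 replica). The class and method names are assumptions for illustration, and the per-node figure assumes an even spread, which a real cluster only approximates.

```java
class ShardCopiesSketch {
    // Each shard exists once as a primary plus once per replica.
    static int totalShardCopies(int numberOfShards, int numberOfReplicas) {
        return numberOfShards * (1 + numberOfReplicas);
    }

    // Upper bound on copies per node if the copies are spread evenly.
    static int maxCopiesPerNode(int totalCopies, int nodes) {
        return (totalCopies + nodes - 1) / nodes;
    }

    public static void main(String[] args) {
        int total = totalShardCopies(5, 1);
        System.out.println("total shard copies: " + total);
        System.out.println("copies per node on 4 nodes (at most): " + maxCopiesPerNode(total, 4));
    }
}
```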
4. Java API
4.1.
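There are two ways to work with es through the elasticsearch java api.
One is to start a Node that joins the es cluster and operate es through that node; the other is to connect to the es cluster with a
TransportClient.
4.1.1. Node
A node can be started inside the application's jvm, where it joins the es cluster.
Setting local to true makes the node local to the JVM:
import static org.elasticsearch.node.NodeBuilder.*;
Node node = nodeBuilder().local(true).node();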
4.1.2. TransportClient
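A TransportClient connects to an es cluster remotely without joining it.
Give it the ip address and transport port of one or more es nodes:
Client client = new TransportClient()
        .addTransportAddress(new InetSocketTransportAddress("host1", 9300))
        .addTransportAddress(new InetSocketTransportAddress("host2", 9300));
// on shutdown
client.close();
If the cluster name is not the default elasticsearch, set it explicitly:
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
Client client = new TransportClient(settings);
Setting client.transport.sniff to true lets the client sniff the rest of the cluster,
so the ip addresses of the other nodes are added automatically instead of being listed by hand.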
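A mapping defines the fields of a type and how they are indexed. Besides the api, a
mapping can be defined in a file.
The file is named [mapping name].json and placed under
config/mappings/[index name], where it applies to that index.
The default mapping can be overridden with a file named default-mapping.json in the config directory.
An example in json: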
{
  "mappings":{
    "properties":{
      "title":{
        "type":"string",
        "store":"yes"
      },
      "description":{
        "type":"string",
        "index":"not_analyzed"
      },
      "price":{
        "type":"double"
      },
      "onSale":{
        "type":"boolean"
      },
      "type":{
        "type":"integer"
      },
      "createDate":{
        "type":"date"
      }
    }
  }
}
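The following mapping is named productIndex.
In the json, properties under productIndex lists the fields, and type is the type of each field.
store sets whether the field value is stored, and "index":"not_analyzed" indexes the field without analyzing it.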
{
  "productIndex":{
    "properties":{
      "title":{
        "type":"string",
        "store":"yes"
      },
      "description":{
        "type":"string",
        "index":"not_analyzed"
      },
      "price":{
        "type":"double"
      },
      "onSale":{
        "type":"boolean"
      },
      "type":{
        "type":"integer"
      },
      "createDate":{
        "type":"date"
      }
    }
  }
}
Creating the index with the java api:
// note: es index names must be lowercase, so a real index needs a name such as productindex
client.admin().indices().prepareCreate("productIndex").execute().actionGet();
Then put the mapping:
import static org.elasticsearch.common.xcontent.XContentFactory.*;
XContentBuilder mapping = jsonBuilder()
        .startObject()
            .startObject("productIndex")
                .startObject("properties")
                    .startObject("title").field("type", "string").field("store", "yes").endObject()
                    .startObject("description").field("type", "string").field("index", "not_analyzed").endObject()
                    .startObject("price").field("type", "double").endObject()
                    .startObject("onSale").field("type", "boolean").endObject()
                    .startObject("type").field("type", "integer").endObject()
                    .startObject("createDate").field("type", "date").endObject()
                .endObject()
            .endObject()
        .endObject();
PutMappingRequest mappingRequest =
        Requests.putMappingRequest("productIndex").type("productIndex").source(mapping);
client.admin().indices().putMapping(mappingRequest).actionGet();
4.3.
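es indexes documents as json. With the java api a json document is built and then indexed into es:
.endObject();
client.prepareIndex("productIndex","productType").setSource(doc).execute().actionGet();
productIndex is the es index name and productType is the type name.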
4.4.
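The delete api removes a json document by id, or removes the documents that match a
Query.
Deleting by id
The following deletes the document with id 1 of type tweet in the index twitter:
DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
        .execute()
        .actionGet();
Deleting by Query
The following deletes the documents in productIndex whose title matches query:
QueryBuilder query = QueryBuilders.fieldQuery("title", "query");
client.prepareDeleteByQuery("productIndex").setQuery(query).execute().actionGet();
The delete api lets you set the threading model used when the api call is executed on the same node.
This is controlled by operationThreaded.
The default is operationThreaded true, which executes the operation on a different thread;
false executes it on the calling thread:
DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
        .setOperationThreaded(false)
        .execute()
        .actionGet();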
http://www.elasticsearch.org/guide/reference/api/delete.html
http://www.elasticsearch.org/guide/reference/java-api/delete.html
4.5.
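elasticsearch queries are expressed in json; the java api builds them with
QueryBuilder, which covers the elasticsearch query DSL. QueryBuilder instances are created through the factory class
QueryBuilders, and filters through FilterBuilders.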
import static org.elasticsearch.index.query.FilterBuilders.*;
import static org.elasticsearch.index.query.QueryBuilders.*;
QueryBuilder qb1 = termQuery("name", "kimchy");
QueryBuilder qb2 = boolQuery()
        .must(termQuery("content", "test1"))
        .must(termQuery("content", "test4"))
        .mustNot(termQuery("content", "test2"))
        .should(termQuery("content", "test3"));
QueryBuilder qb3 = filteredQuery(
        termQuery("name.first", "shay"),
        rangeFilter("age")
                .from(23)
                .to(54)
                .includeLower(true)
                .includeUpper(false)
);
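qb1 builds a TermQuery on the field name,
equivalent to a lucene TermQuery. qb2 builds a BoolQuery,
equivalent to a lucene BooleanQuery; its must, should, and mustNot clauses each take a QueryBuilder.
qb3 builds a TermQuery filtered by a
RangeFilter that restricts age to the range from 23 (inclusive) to 54 (exclusive).
Running a Query against elasticsearch:
SearchResponse response = client.prepareSearch("test")
        .setQuery(query)
        .setFrom(0).setSize(60).setExplain(true)
        .execute()
        .actionGet();
This searches the index test with query and returns up to 60 hits starting at offset 0.
The hits are read from the SearchResponse:
SearchHits hits = response.hits();
// iterate over the hits actually returned; there may be fewer than 60
for (SearchHit hit : hits) {
    System.out.println(hit.getSource().get("field"));
}
The SearchResponse holds the SearchHits, and hit.getSource().get("field") reads the value of field from the source of a hit.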
4.6.
elasticsearch supports bulk operations through the java api.
A BulkRequestBuilder collects several index/delete requests, which are then sent in one call:
import static org.elasticsearch.common.xcontent.XContentFactory.*;
BulkRequestBuilder bulkRequest = client.prepareBulk();
bulkRequest.add(client.prepareIndex("twitter", "tweet", "1")
        .setSource(jsonBuilder()
                .startObject()
                    .field("user", "kimchy")
                    .field("postDate", new Date())
                    .field("message", "trying out Elastic Search")
                .endObject()
        )
);
bulkRequest.add(client.prepareIndex("twitter", "tweet", "2")
        .setSource(jsonBuilder()
                .startObject()
                    .field("user", "kimchy")
                    .field("postDate", new Date())
                    .field("message", "another post")
                .endObject()
        )
);
BulkResponse bulkResponse = bulkRequest.execute().actionGet();
if (bulkResponse.hasFailures()) {
    // process failures by iterating through each bulk response item
}
4.7. MongoDB
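elasticsearch provides the river module for pulling data from a data source into es.
The official rivers include couchDB but not mongodb; a mongodb river is available on
git as elasticsearch-river-mongodb.
It was originally written by aparo for mongodb.
Requirements:
Elasticsearch 0.19.X
MongoDB 2.X
The river reads changes from the mongodb oplog.
The elasticsearch-mapper-attachments plugin is needed for gridfs.
%ES_HOME%\bin\plugin.bat -install elasticsearch/elasticsearch-mapper-attachments/1.4.0
Install elasticsearch-river-mongodb:
%ES_HOME%\bin\plugin.bat -install laigood/elasticsearch-river-mongodb/laigoodv1.0.0
The river can be created with
curl or, as in the following fragment, through the java api: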
        .setSource(
                jsonBuilder().startObject()
                    .field("type", "mongodb")
                    .startObject("mongodb")
                        .field("host","localhost")
                        .field("port",27017)
                        .field("db","testdb")
                        .field("collection","test")
                        .field("fields","title,content")
                        .field("db_user","user")
                        .field("db_password","password")
                        .field("local_db_user","admin")
                        .field("local_db_password","admin")
                    .endObject()
                    .startObject("index")
                        .field("name","test")
                        .field("type","test")
                        .field("bulk_size","1000")
                        .field("bulk_timeout","30")
                    .endObject()
                .endObject()
        ).execute().actionGet();
git: https://github.com/laigood/elasticsearch-river-mongodb
A more_like_this query in json:
{
  "more_like_this" : {
    "fields" : ["title", "content"],
    "like_text" : "text like this one"
  }
}
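fields: the fields to run the query against; defaults to _all
like_text: the text to find similar documents for; required
percent_terms_to_match: the fraction of terms that must match; defaults to 0.3
min_term_freq: terms that occur less often than this in the source text are ignored; defaults to 2
max_query_terms: the maximum number of terms in the generated query; defaults to 25
stop_words: a list of stop words, which are ignored
min_doc_freq: words that occur in fewer than this many documents are ignored
max_doc_freq: words that occur in more than this many documents are ignored
min_word_len: words shorter than this are ignored; defaults to 0
max_word_len: words longer than this are ignored
boost_terms: the boost factor applied to terms; defaults to 1
boost: the boost of the query; defaults to 1
analyzer: the analyzer used to analyze the text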
java api
5.
5.1.
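cluster.routing.allocation.allow_rebalance
Controls when rebalancing may happen, based on the state of all index shards in the cluster. Allowed values are always,
indices_primaries_active, and indices_all_active; the default is indices_all_active.
cluster.routing.allocation.cluster_concurrent_rebalance
How many shards may be rebalanced concurrently across the cluster; the default is 2.
cluster.routing.allocation.node_initial_primaries_recoveries
The number of initial primary recoveries allowed per node. Most clusters use the
local gateway, where these recoveries are fast, so a node can handle more of them.
cluster.routing.allocation.node_concurrent_recoveries
The number of concurrent recoveries allowed per node; the default is 2.
cluster.routing.allocation.disable_allocation
Disables shard allocation; mainly useful through the cluster update settings api.
cluster.routing.allocation.disable_replica_allocation
Disables replica allocation; mainly useful through the cluster update settings api.
indices.recovery.concurrent_streams
The number of streams a node opens to recover a shard from a peer shard; the default is 5.
Shard allocation can take a node attribute such as rack_id into account. Give the node the attribute:
node.rack_id: rack_one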
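Here the rack_id of the node is set to rack_one.
To make allocation aware of rack_id, list it as an awareness attribute:
cluster.routing.allocation.awareness.attributes: rack_id
Suppose two nodes are started with node.rack_id set to
rack_one and an index with 5 shards and 1 replica is created; the index is fully allocated on those two nodes. If two more nodes are then started with
node.rack_id set to rack_two, shards are relocated to balance the nodes, but a shard and its replica are never allocated on nodes with the same
rack_id value.
Several awareness attributes can be given:
cluster.routing.allocation.awareness.attributes: rack_id,zone
Forced awareness keeps replicas from being over-allocated on one attribute value while the nodes of another value are missing. With the settings
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
cluster.routing.allocation.awareness.attributes: zone
and two nodes started with node.zone set to zone1, an index with 5 shards and 1 replica gets
only its 5 primary shards allocated; the replicas are allocated only once nodes with node.zone set to
zone2 are started.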
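Allocation can also be filtered with include/exclude rules
on node attributes.
Suppose each node carries a
tag attribute, set for example with node.tag: value1 or node.tag: value2.
To allocate an index only on nodes whose tag is value1 or value2, set
index.routing.allocation.include.tag to value1,value2:
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.tag" : "value1,value2"
}'
To keep an index off certain nodes, set index.routing.allocation.exclude.tag to value3; the index is then not allocated on nodes whose
tag is value3:
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.exclude.tag" : "value3"
}'
include and exclude values support simple wildcards, for example value*.
The special attribute _ip matches the ip address of a node.
A node can carry several attributes:
node.group1: group1_value1
node.group2: group2_value4
and several include and exclude rules can be combined: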
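curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.group1" : "xxx",
    "index.routing.allocation.include.group2" : "yyy",
    "index.routing.allocation.exclude.group3" : "zzz"
}'
These settings can be changed at runtime with the update settings api.
Filters can also be set for the whole cluster with the cluster update settings api,
for example to exclude a node by ip: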
curl -XPUT localhost:9200/_cluster/settings -d '{
"transient" : {
"cluster.routing.allocation.exclude._ip" : "10.0.0.1"
}
}'
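The include and exclude rules above can be modelled in a few lines. The sketch assumes a node is eligible when its attribute matches at least one include value and no exclude value, with * as the only wildcard. The class and method names are assumptions, and this is a simplified model of the behaviour described here, not the Elasticsearch implementation.

```java
import java.util.regex.Pattern;

class AllocationFilterSketch {
    // Matches a value against a simple wildcard pattern such as "value*".
    static boolean matches(String pattern, String value) {
        // Quote the pattern literally, then turn each '*' into ".*".
        String regex = Pattern.quote(pattern).replace("*", "\\E.*\\Q");
        return value.matches(regex);
    }

    // True when the value matches any entry of a comma-separated list.
    static boolean matchesAny(String commaSeparated, String value) {
        for (String pattern : commaSeparated.split(",")) {
            if (matches(pattern.trim(), value)) {
                return true;
            }
        }
        return false;
    }

    // A null include means "no include rule"; a null exclude means "no exclude rule".
    static boolean canAllocate(String nodeTag, String include, String exclude) {
        boolean included = include == null || matchesAny(include, nodeTag);
        boolean excluded = exclude != null && matchesAny(exclude, nodeTag);
        return included && !excluded;
    }

    public static void main(String[] args) {
        System.out.println(canAllocate("value1", "value1,value2", null));
        System.out.println(canAllocate("value3", "value1,value2", null));
        System.out.println(canAllocate("value3", null, "value3"));
        System.out.println(canAllocate("value7", "value*", null));
    }
}
```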
5.2.
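Elasticsearch keeps separate thread pools for different kinds of operations:
index: index and delete operations; the default type is cached
search: search operations; the default type is cached
bulk: bulk operations; the default type is cached
refresh: refresh operations; the default type is cached
Each pool can be given a different type. The following settings belong to the type
blocking:
min: 1
size: 30
wait_time: 30s
cache
The cache type is an unbounded thread pool that starts a new thread whenever requests are pending:
threadpool:
    index:
        type: cached
fixed
The fixed type keeps a fixed number of threads and queues the requests that no thread can take yet.
size is the number of threads; the default is the number of cpu cores times 5.
queue_size is the size of the queue of pending requests; the default is -1, which means unbounded.
reject_policy controls what happens when the queue is full. The default, abort, fails the request;
caller runs the request on the calling io thread, which throttles execution at the network layer.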
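threadpool:
    index:
        type: fixed
        size: 30
        queue: 1000
        reject_policy: caller
blocking
The blocking type has a configurable min (default 1) and size (default
cpu cores times 5) number of threads, and a queue whose default queue_size is 1000. When the queue is full, the calling io thread waits up to
wait_time (default 60 seconds) and the request fails if it still has not been executed.
threadpool:
    index:
        type: blocking
        min: 1
        size: 30
        wait_time: 30s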
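The fixed type and its reject_policy have close analogues in java.util.concurrent, which makes the behaviour easy to try out. The sketch below builds a ThreadPoolExecutor with a bounded queue and either AbortPolicy (like abort) or CallerRunsPolicy (like caller). It is an analogy under that assumption, with made-up class and method names, and not how Elasticsearch constructs its pools.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ThreadPoolSketch {
    // size threads, a queue of queueSize entries (-1 = unbounded), and a reject policy.
    static ThreadPoolExecutor fixedPool(int size, int queueSize, boolean callerRuns) {
        BlockingQueue<Runnable> queue;
        if (queueSize < 0) {
            queue = new LinkedBlockingQueue<>();
        } else {
            // ArrayBlockingQueue needs a capacity of at least 1.
            queue = new ArrayBlockingQueue<>(Math.max(1, queueSize));
        }
        RejectedExecutionHandler policy = callerRuns
                ? new ThreadPoolExecutor.CallerRunsPolicy()
                : new ThreadPoolExecutor.AbortPolicy();
        return new ThreadPoolExecutor(size, size, 0L, TimeUnit.MILLISECONDS, queue, policy);
    }

    // Submits three tasks to a pool with one thread and one queue slot and
    // reports what happened to the third: "rejected" or run by the "caller".
    static String submitThree(boolean callerRuns) {
        ThreadPoolExecutor pool = fixedPool(1, 1, callerRuns);
        CountDownLatch release = new CountDownLatch(1);
        String[] ranOn = new String[1];
        try {
            pool.execute(() -> {                 // occupies the only worker thread
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool.execute(() -> { });             // fills the only queue slot
            pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
            return Thread.currentThread().getName().equals(ranOn[0]) ? "caller" : "queued";
        } catch (RejectedExecutionException e) {
            return "rejected";
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("abort policy:  " + submitThree(false));
        System.out.println("caller policy: " + submitThree(true));
    }
}
```

With the caller policy the submitting thread does the work itself, so a full queue slows the producer down instead of failing the request.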
5.3.
Java6Mustang 2006
Java7(Dolphin)
ElasticSearch Java6 7
Elasticsearch Java
ElasticSearch
ElasticSearch
Elasticsearch
Elasticsearch JVM
Elasticsearch 0.19.11
JVM parameter                         Elasticsearch default   Environment variable
-Xms                                  256m                    ES_MIN_MEM
-Xmx                                  1g                      ES_MAX_MEM
-Xms and -Xmx                                                 ES_HEAP_SIZE
-Xmn                                                          ES_HEAP_NEWSIZE
-XX:MaxDirectMemorySize                                       ES_DIRECT_SIZE
-Xss                                  256k
-XX:UseParNewGC
-XX:UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction    75
-XX:UseCMSInitiatingOccupancyOnly
-XX:UseCondCardMark                   (commented out)
Elasticsearch 256M 1GB
./bin/elasticsearch -f Elasticsearch
Elasticsearch
2GB RAM
ES_MIN_MEM/ES_MAX_MEM ES_HEAP_SIZE
ES_HEAP_NEWSIZE
ES_DIRECT_SIZE JVM NIO
64
Elasticsearch ( OOM)
Java JVM
JVM parameter              Garbage collector
-XX:+UseSerialGC           serial collector
-XX:+UseParallelGC         parallel collector
-XX:+UseParallelOldGC      parallel compacting collector
-XX:+UseConcMarkSweepGC    Concurrent-Mark-Sweep (CMS) collector
-XX:+UseG1GC               Garbage-First collector (G1)
UseParNewGC UseConcMarkSweepGC
UseConcMarkSweepGC UseParNewGC Serial collector
Java6
CMSInitiatingOccupancyFraction CMSConcurrent-Mark-Sweep
75.
JVM 75%
GC
UseCondCardMark card table marking
store UseCondCardMark Garbage-First
card table marking
ElasticSearch
I/O
Java JVM
JVM
OOM
JVM
CMS Java
Java Java
Elasticsearch 128K
256K Java7 Java6 Java7
continuations Continuations
JVM CPU
JVM Solaris Sparc 64
JVM Xss 512K, Solaris X86 320K, Linux
256K, Windows 32 Java6 320K, Windows 64 1024K
GB G
MB Lucene segment-based
CMS Lucene
index.merge.policy.segments_per_tier
Java JVM
GC
G1
1. 50% Java
2. promotion
3. gc compaction 0.5 1s
G1
G1 CPU
CPU CMS
Elasticsearch G1 stop-the-world
buffer memory I/O
G1 CPU
1.
2. log everything
3.
4.
5.
6.
7.
Elasticsearch
Elasticsearch GC warns
[2012-11-26 18:13:53,166][WARN ][monitor.jvm ] [Ectokid] [gc][ParNew][1135087][11248] duration [2.6m], collections [1]/[2.7m], total [2.6m]/[6.8m], memory [2.4gb]->[2.3gb]/[3.8gb], all_pools {[Code Cache] [13.7mb]->[13.7mb]/[48mb]}{[Par Eden Space] [109.6mb]->[15.4mb]/[1gb]}{[Par Survivor Space] [136.5mb]->[0b]/[136.5mb]}{[CMS Old Gen] [2.1gb]->[2.3gb]/[2.6gb]}{[CMS Perm Gen] [35.1mb]->[34.9mb]/[82mb]}
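This line is logged by JvmMonitorService. Its parts mean:
Logfile                                         Explanation
gc                                              a gc run
ParNew                                          the new parallel garbage collector
duration 2.6m                                   the gc took 2.6 minutes
collections [1]/[2.7m]                          one collection ran, within a period of 2.7 minutes
memory [2.4gb]->[2.3gb]/[3.8gb]                 memory usage was 2.4gb before and is 2.3gb now, out of 3.8gb in total
Code Cache [13.7mb]->[13.7mb]/[48mb]            memory used by the code cache
Par Eden Space [109.6mb]->[15.4mb]/[1gb]        memory used by the Par Eden Space
Par Survivor Space [136.5mb]->[0b]/[136.5mb]    memory used by the Par Survivor Space
CMS Old Gen [2.1gb]->[2.3gb]/[2.6gb]            memory used by the CMS Old Gen
CMS Perm Gen [35.1mb]->[34.9mb]/[82mb]          memory used by the CMS Perm Gen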
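When many such gc warnings have to be compared, the memory part of the line can be extracted with a regular expression. The sketch below assumes the format memory [before]->[after]/[total] with sizes such as 2.4gb, 136.5mb or 0b, as in the sample line. The class name, the method names, and the pattern are assumptions for illustration, not part of Elasticsearch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogSketch {
    // Captures the three sizes of "memory [before]->[after]/[total]".
    private static final Pattern MEMORY =
            Pattern.compile("memory \\[([^\\]]+)\\]->\\[([^\\]]+)\\]/\\[([^\\]]+)\\]");

    // Converts sizes such as "2.4gb", "136.5mb", "12kb" or "0b" to bytes.
    static long toBytes(String size) {
        String s = size.trim().toLowerCase();
        long unit = 1L;
        if (s.endsWith("gb")) {
            unit = 1L << 30;
            s = s.substring(0, s.length() - 2);
        } else if (s.endsWith("mb")) {
            unit = 1L << 20;
            s = s.substring(0, s.length() - 2);
        } else if (s.endsWith("kb")) {
            unit = 1L << 10;
            s = s.substring(0, s.length() - 2);
        } else if (s.endsWith("b")) {
            s = s.substring(0, s.length() - 1);
        }
        return Math.round(Double.parseDouble(s) * unit);
    }

    // Returns {before, after, total} in bytes, or null if the line has no memory part.
    static long[] parseMemory(String logLine) {
        Matcher m = MEMORY.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new long[] {toBytes(m.group(1)), toBytes(m.group(2)), toBytes(m.group(3))};
    }

    public static void main(String[] args) {
        String line = "duration [2.6m], collections [1]/[2.7m], total [2.6m]/[6.8m], "
                + "memory [2.4gb]->[2.3gb]/[3.8gb], all_pools {...}";
        long[] memory = parseMemory(line);
        System.out.println("freed bytes: " + (memory[0] - memory[1]));
        System.out.println("heap in use after gc: " + (100 * memory[1] / memory[2]) + "%");
    }
}
```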
JvmMonitorSer
3. Java sa Java
Java
4. Elasticsearch Elasticsearch
3
5. JVM
Elasticsearch
index.merge.policy.segments_per_tier parameter
6.
7.
8. CMS -XX:CMSWaitDuration
9. 6-8GB CMS
stop-the-world CMSInitiatingOccupancyFraction
GC G1
6.
6.1. Guice
elasticsearch google guice spring 100
spring
guice
elasticsearch guice es
jar es jar 10M
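org.elasticsearch.common.inject
In Guice, bindings are declared in a Module, and an application is assembled from Modules.
bind(A).to(B) tells Guice to inject an instance of B wherever an A is required.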
@Inject
public RealBillingService(CreditCardProcessor processor,
                          TransactionLog transactionLog) {
    this.processor = processor;
    this.transactionLog = transactionLog;
}
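What bind(A).to(B) buys can be shown without Guice itself. The sketch below is a hand-written stand-in: a small Bindings class maps an interface to a supplier of its implementation, and RealBillingService receives its collaborators through the constructor, as in the snippet above. The Bindings class, the fake implementations, and the methods charge, record, and bill are assumptions invented for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class BindingSketch {
    interface CreditCardProcessor { String charge(int amount); }
    interface TransactionLog { void record(String entry); }

    static class FakeProcessor implements CreditCardProcessor {
        public String charge(int amount) { return "charged " + amount; }
    }

    static class InMemoryLog implements TransactionLog {
        final List<String> entries = new ArrayList<>();
        public void record(String entry) { entries.add(entry); }
    }

    // Depends only on the interfaces; the concrete classes are chosen by the bindings.
    static class RealBillingService {
        private final CreditCardProcessor processor;
        private final TransactionLog transactionLog;

        RealBillingService(CreditCardProcessor processor, TransactionLog transactionLog) {
            this.processor = processor;
            this.transactionLog = transactionLog;
        }

        String bill(int amount) {
            String receipt = processor.charge(amount);
            transactionLog.record(receipt);
            return receipt;
        }
    }

    // Tiny stand-in for a Module: interface -> supplier of the implementation.
    static class Bindings {
        private final Map<Class<?>, Supplier<?>> map = new HashMap<>();

        <T> void bind(Class<T> type, Supplier<? extends T> implementation) {
            map.put(type, implementation);
        }

        <T> T get(Class<T> type) {
            return type.cast(map.get(type).get());
        }
    }

    static String demo() {
        Bindings bindings = new Bindings();
        bindings.bind(CreditCardProcessor.class, FakeProcessor::new);
        bindings.bind(TransactionLog.class, InMemoryLog::new);
        RealBillingService service = new RealBillingService(
                bindings.get(CreditCardProcessor.class),
                bindings.get(TransactionLog.class));
        return service.bill(10);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Guice does the same wiring automatically by reading the @Inject constructor, so the service never names a concrete class.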
PluginsModule
SettingsModule
NodeModule
NetworkModule
NodeCacheModule
ScriptModule
JmxModule: jmx
EnvironmentModule
NodeEnvironmentModule
ClusterNameModule
ThreadPoolModule
DiscoveryModule
ClusterModule
RestModule: rest
TransportModule: tcp
HttpServerModule: http
RiversModule: river
IndicesModule
SearchModule
ActionModule
MonitorModule
GatewayModule
NodeClientModule
6.2.
elasticsearch
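Indexing on the primary shard is handled by TransportIndexAction.shardOperationOnPrimary.
It first checks whether the mapping requires routing, and fails if the request carries none:
MappingMetaData mappingMd =
        clusterState.metaData().index(request.index()).mappingOrDefault(request.type());
if (mappingMd != null && mappingMd.routing().required()) {
    if (request.routing() == null) {
        throw new RoutingMissingException(request.index(), request.type(),
                request.id());
    }
}
It then branches on the operation type.
INDEX overwrites an existing document with the same id;
CREATE fails if a document with the same id already exists:
if (request.opType() == IndexRequest.OpType.INDEX)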
The operation is then prepared and executed by InternalIndexShard:
Engine.Index index = indexShard.prepareIndex(sourceToParse)
        .version(request.version())
        .versionType(request.versionType())
        .origin(Engine.Operation.Origin.PRIMARY);
indexShard.index(index);
InternalIndexShard looks up the mapping for the type and
parses the json source against that mapping into a ParsedDocument:
public Engine.Index prepareIndex(SourceToParse source) throws ElasticSearchException {
    long startTime = System.nanoTime();
    DocumentMapper docMapper =
            mapperService.documentMapperWithAutoCreate(source.type());
    ParsedDocument doc = docMapper.parse(source);
    return new Engine.Index(docMapper, docMapper.uidMapper().term(doc.uid()),
            doc).startTime(startTime);
}
The document is finally written to lucene by RobinEngine.
The lucene calls are in RobinEngine.innerIndex:
if (currentVersion == -1) {
    // document does not exist, we can optimize for create
    if (index.docs().size() > 1) {
        writer.addDocuments(index.docs(), index.analyzer());
    } else {
        writer.addDocument(index.docs().get(0), index.analyzer());
    }
} else {
    if (index.docs().size() > 1) {
        writer.updateDocuments(index.uid(), index.docs(), index.analyzer());
    } else {
        writer.updateDocument(index.uid(), index.docs().get(0), index.analyzer());
    }
}
The operation is also added to the Translog. The Translog records the operations that have not yet been
flushed:
Translog.Location translogLocation = translog.add(new Translog.Create(create));
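Why the operation is written to the Translog as well can be seen in a toy model. In the sketch below, operations are appended to a log when they are indexed, a flush moves them into the committed state and clears the log, and recovery after a crash replays whatever the log still holds. All names and data structures are assumptions chosen for illustration; the real Translog and engine are far more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TranslogSketch {
    // Operations since the last flush; stands in for the on-disk transaction log.
    private final List<String[]> translog = new ArrayList<>();
    // State already flushed; stands in for committed lucene segments.
    private final Map<String, String> committed = new HashMap<>();
    // In-memory state that would be lost in a crash.
    private final Map<String, String> inMemory = new HashMap<>();

    void index(String id, String source) {
        translog.add(new String[] {id, source}); // log first, so the operation survives a crash
        inMemory.put(id, source);
    }

    void flush() {
        committed.putAll(inMemory); // make the in-memory changes durable
        translog.clear();           // the logged operations are no longer needed
    }

    int pendingOperations() {
        return translog.size();
    }

    // Rebuilds the state after a crash: committed data plus a replay of the log.
    Map<String, String> recover() {
        Map<String, String> recovered = new HashMap<>(committed);
        for (String[] operation : translog) {
            recovered.put(operation[0], operation[1]);
        }
        return recovered;
    }

    public static void main(String[] args) {
        TranslogSketch shard = new TranslogSketch();
        shard.index("1", "first document");
        shard.flush();
        shard.index("2", "second document, not flushed yet");
        System.out.println("pending operations: " + shard.pendingOperations());
        System.out.println("documents after recovery: " + shard.recover().size());
    }
}
```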
7.
7.1.
lucene elasticsearchsolr
NOTE: will write new segments file in 5 seconds; this will remove 4708135 docs from the
index. THIS IS YOUR LAST CHANCE TO CTRL+C!
5...
4...
3...
2...
1...
Writing...
OK
Wrote new segments file "segments_2ch"
4708135
4708135
id
7.2.
7.2.1. gc
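A long gc pause stops the jvm from responding. When the ping from the master to a node fails 3 times in a row, zen
discovery removes the node from the cluster; 3 is the default number of ping retries.
Two remedies: 1. tune gc to shorten gc pauses; 2. raise the zen discovery settings
ping_retries and ping_timeout so that es tolerates longer pauses.
1 es Soft Reference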
7.2.3.
es RecoverFilesRecoveryException[[index][3] Failed to transfer [215] files
with total size of [9.4gb]]; nested: OutOfMemoryError[unable to create new native thread]; ]]
too many open file
jvm /
*1024*1024
1024 1024
7.2.4.
[7]: index [index], type [index], id [1569133], message [UnavailableShardsException[[index]
[1] [4] shardIt, [2] active : Timeout waiting for [1m], request:
org.elasticsearch.action.bulk.BulkShardRequest@5989fa07]]
2
12 one
7.2.5. jvm
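With bootstrap.mlockall: true, es logs the warning Unknown mlockall error 0 at startup, because by default linux lets a process lock only
45k of memory.
On linux, run ulimit -l unlimited before starting es.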
7.2.6. api
deleteByQuery BoolQuery id
BoolQuery 1024 100
es
bulkRequest
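One way to stay under the 1024-clause limit of a BoolQuery is to split a long id list into batches and issue one request per batch. The sketch below shows only the splitting step. The class name, the method names, and the choice of 1024 as batch size are assumptions for illustration, and each batch would still have to be sent to es, for example through a bulkRequest.

```java
import java.util.ArrayList;
import java.util.List;

class ClauseBatchSketch {
    // Default maximum number of clauses in a lucene BooleanQuery.
    static final int MAX_CLAUSES = 1024;

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> result = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            result.add(new ArrayList<>(items.subList(from, to)));
        }
        return result;
    }

    // Helper for the demo: the ids 0 .. n-1.
    static List<Integer> range(int n) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ids.add(i);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = batches(range(2500), MAX_CLAUSES);
        for (List<Integer> part : parts) {
            System.out.println("batch of " + part.size() + " ids");
        }
    }
}
```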