Published

2015/09/02

Last updated

2021/06/20

Article type

Public

Terminology

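Cluster : a set of Nodes. The default name is "elasticsearch" (if you run multiple environments, give each one a distinct name, such as "logging-dev", "logging-stage", and "logging-prod")
Node : a server that makes up a Cluster. It holds Shards (or their Replicas)
Index : a collection of Documents
Type : the type of a Document
Document : a unit of information in an Index that can be expressed as JSON. It belongs to one Type
Shard : a partition of an Index
Replica : a copy of a Shard. It must be held by a Node other than the one holding the original Shard (creating Replicas is optional)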

Installation

Look up the link to the latest version on Downloads | Elasticsearch.

$ sudo yum install java-1.8.0-openjdk
$ sudo rpm -ivh https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.7.2.noarch.rpm
$ sudo chkconfig --add elasticsearch
$ sudo service elasticsearch start

API examples

Checking status

$ curl 'localhost:9200/_cat/health?v'
$ curl 'localhost:9200/_cat/nodes?v'
$ curl 'localhost:9200/_cat/indices?v'
$ curl 'localhost:9200/_cat/shards?v'

green : everything is healthy
yellow : replicas could not be allocated (probably because there is no other node to hold them)
red : something is wrong
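
As a rough illustration of reading the health output programmatically, the sketch below splits a `_cat/health?v` style table (a header row followed by value rows) into dictionaries and maps the status color to the meanings listed above. The function names and the sample text are hypothetical; the sample is not captured output, and the exact columns may differ between versions.

```python
# Meanings copied from the list above.
HEALTH_MEANINGS = {
    "green": "everything is healthy",
    "yellow": "replicas could not be allocated",
    "red": "something is wrong",
}


def parse_cat_table(text):
    """Split whitespace-separated _cat output (with ?v header) into dicts.

    Assumes no cell is empty; tables with blank cells would need
    column-position parsing instead.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        values = line.split()
        if len(values) != len(header):
            raise ValueError("column count mismatch: %r" % line)
        rows.append(dict(zip(header, values)))
    return rows


def describe_health(status):
    """Return the short explanation for a health color."""
    key = status.strip().lower()
    if key not in HEALTH_MEANINGS:
        raise ValueError("unknown health status: %r" % status)
    return HEALTH_MEANINGS[key]


# Hypothetical sample, written by hand for illustration only.
SAMPLE = (
    "epoch timestamp cluster status node.total\n"
    "1441152000 00:00:00 elasticsearch yellow 1\n"
)

if __name__ == "__main__":
    for row in parse_cat_table(SAMPLE):
        print(row["cluster"], row["status"], "->", describe_health(row["status"]))
```

A single-node setup such as the one installed above would typically report yellow once an index with replicas exists, because there is no second node to hold them.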

Creating and deleting an index

To create or delete an index named customer, run the following. Adding pretty makes the response JSON pretty-printed.

$ curl -XPUT 'localhost:9200/customer?pretty'
$ curl -XDELETE 'localhost:9200/customer?pretty'

Creating, updating, and deleting documents

To add a document of type external to the customer index, run the following. If a document with the specified ID already exists, it is replaced.

$ curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{
  "name": "John Doe"
}'
$ curl -XGET 'localhost:9200/customer/external/1?pretty'

The ID can also be generated automatically.

$ curl -XPOST 'localhost:9200/customer/external?pretty' -d '
{
  "name": "Jane Doe"
}'

To update explicitly, use _update. The fields under doc are merged into the existing document.

$ curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
  "doc": { "name": "Jane Doe", "age": 20 }
}'

To delete, use DELETE. Specifying a condition with _query deletes all matching documents at once.

$ curl -XDELETE 'localhost:9200/customer/external/1?pretty'
$ curl -XDELETE 'localhost:9200/customer/external/_query?pretty' -d '
{
  "query": { "match": { "name": "John" } }
}'

Batch processing

Create two documents

$ curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'

Update id:1 and delete id:2

$ curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
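
The bulk body is newline-delimited JSON: one action line, followed by a payload line for index and update (delete has no payload), with a trailing newline at the end. A minimal sketch of assembling such a body, assuming a hypothetical helper named `build_bulk_body` that is not part of any Elasticsearch client:

```python
import json


def build_bulk_body(operations):
    """Build a _bulk request body.

    operations: list of (action, metadata, payload) tuples, where payload
    is None for "delete" and a dict for the other actions.
    """
    lines = []
    for action, metadata, payload in operations:
        # Action line, e.g. {"update": {"_id": "1"}}
        lines.append(json.dumps({action: metadata}))
        if action == "delete":
            continue  # delete takes no payload line
        if payload is None:
            raise ValueError("%s requires a payload" % action)
        lines.append(json.dumps(payload))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body([
        ("update", {"_id": "1"}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
        ("delete", {"_id": "2"}, None),
    ])
    print(body, end="")
```

The printed text can be saved to a file and sent with `--data-binary @file`, in the same way accounts.json is loaded later in this article.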

Search

Preparation

Add dummy data generated with JSON GENERATOR to an index as documents.

$ curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
$ curl 'localhost:9200/_cat/indices?v'

Searching via the URI

Match everything with the '*' wildcard and pretty-print the result with pretty. By default only 10 documents are returned.

$ curl 'localhost:9200/bank/_search?q=*&pretty'

Searching with a POST body

Sorting and the equivalent of LIMIT and OFFSET are also available. Specifying _source returns only the fields you need.

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance", "age"],
  "sort": { "age": { "order": "desc" } },
  "from": 10,
  "size": 1
}'
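
In this body, from plays the role of OFFSET and size the role of LIMIT. As a small sketch, a 1-based page number can be converted into those two values like this (the helper name `page_body` and its parameters are hypothetical):

```python
import json


def page_body(page, page_size, sort_field, fields):
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": {"match_all": {}},
        "_source": list(fields),
        "sort": {sort_field: {"order": "desc"}},
        "from": (page - 1) * page_size,  # OFFSET
        "size": page_size,               # LIMIT
    }


if __name__ == "__main__":
    # Page 11 with one document per page gives from=10, size=1,
    # the same values as the curl example above.
    body = page_body(11, 1, "age", ["account_number", "balance", "age"])
    print(json.dumps(body, indent=2))
```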

match query

account_number is 20

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match": { "account_number": 20 } }
}'

address contains mill (case-insensitive)

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match": { "address": "mill" } }
}'

address contains either mill or lane (case-insensitive)

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match": { "address": "mill lane" } }
}'

The same condition written as a bool query with should:

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'

address contains the phrase "mill lane" (case-insensitive)

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match_phrase": { "address": "mill lane" } }
}'

address contains both mill and lane (case-insensitive)

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'

address contains neither mill nor lane (case-insensitive)

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'

must, should, and must_not can be specified together in a single bool query.

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}'
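
When such bodies are generated by a program, the three clause lists can be assembled with a small helper. The following is only a sketch; `match` and `bool_query` are hypothetical names, not functions of an Elasticsearch client.

```python
import json


def match(field, value):
    """Return a single match clause."""
    return {"match": {field: value}}


def bool_query(must=(), should=(), must_not=()):
    """Combine clause lists into a bool query body, omitting empty lists."""
    clauses = {}
    for name, items in (("must", must), ("should", should), ("must_not", must_not)):
        if items:
            clauses[name] = list(items)
    if not clauses:
        raise ValueError("at least one clause is required")
    return {"query": {"bool": clauses}}


if __name__ == "__main__":
    # Same condition as the curl example above: age 40, state not ID.
    body = bool_query(must=[match("age", "40")], must_not=[match("state", "ID")])
    print(json.dumps(body, indent=2))
```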

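Filters

Unlike a query, a filter does not score the results and caches them in memory, so it is fast. Use a filter when you do not need the score, the value that expresses relevance. Filters can be combined with queries. The following example matches balances from 20000 to 30000 inclusive.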

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}'
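
Both gte and lte are inclusive bounds, so 20000 and 30000 themselves match. A minimal sketch that builds the same filtered body and mirrors the bound check in plain Python (the helper names are hypothetical, and the filtered wrapper follows the 1.x query DSL used in this article):

```python
import json


def filtered_range(field, gte, lte):
    """Build a filtered query: match_all restricted by an inclusive range."""
    if gte > lte:
        raise ValueError("gte must not exceed lte")
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"range": {field: {"gte": gte, "lte": lte}}},
            }
        }
    }


def in_range(value, gte, lte):
    """Plain-Python equivalent of the gte/lte condition."""
    return gte <= value <= lte


if __name__ == "__main__":
    print(json.dumps(filtered_range("balance", 20000, 30000), indent=2))
    print(in_range(30000, 20000, 30000))
```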

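Aggregations

This is a concept similar to SQL's GROUP BY. Because the regular query hits are returned as well, the example below sets size to 0 to suppress them. It groups by state and returns the 10 groups with the largest COUNT(*), in descending order.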

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      }
    }
  }
}'

Compute the average and sort by it in descending order (default: top 10 groups)

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}'

Classify into three groups, 20-29, 30-39, and 40-49, then aggregate by gender within each group and show the average balance

$ curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}'
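
In a range aggregation, from is inclusive and to is exclusive, which is why the 20-29 group is written as from 20, to 30. As a sketch, the ranges list can be generated from a start, an end, and a step; the helper names below are hypothetical.

```python
import json


def age_ranges(start, stop, step):
    """Return consecutive [from, to) buckets covering start..stop."""
    if step <= 0 or start >= stop:
        raise ValueError("need step > 0 and start < stop")
    return [{"from": low, "to": min(low + step, stop)}
            for low in range(start, stop, step)]


def bucket_for(value, ranges):
    """Return the bucket containing value, or None (from inclusive, to exclusive)."""
    for bucket in ranges:
        if bucket["from"] <= value < bucket["to"]:
            return bucket
    return None


if __name__ == "__main__":
    ranges = age_ranges(20, 50, 10)
    print(json.dumps(ranges))
    print(bucket_for(30, ranges))
```

The resulting list can be placed under "ranges" in the group_by_age aggregation shown above.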

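For more detailed API information, see the official documentation (Elasticsearch 1.7).

System configuration

If you installed from the rpm following the steps earlier on this page, you can configure the system by editing the following files.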

$ vi /etc/sysconfig/elasticsearch  # read by /etc/init.d/elasticsearch
$ vi /etc/elasticsearch/elasticsearch.yml

Specifying the data directory

/etc/sysconfig/elasticsearch

# Elasticsearch data directory
DATA_DIR=/var/lib/elasticsearch

Memory settings

/etc/sysconfig/elasticsearch

# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
ES_HEAP_SIZE=256m
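
The comment in the file suggests half of the available RAM, capped at 31g. A small sketch of that rule, assuming a hypothetical helper that takes the machine's RAM in megabytes and returns a value in the `256m` style used above:

```python
HEAP_CAP_MB = 31 * 1024  # "no more than 31g", from the comment above


def recommended_heap(total_ram_mb):
    """Return an ES_HEAP_SIZE value: 50% of RAM, capped at 31g."""
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    heap_mb = min(total_ram_mb // 2, HEAP_CAP_MB)
    return "%dm" % heap_mb


if __name__ == "__main__":
    for ram_mb in (512, 8 * 1024, 128 * 1024):
        print(ram_mb, "->", "ES_HEAP_SIZE=" + recommended_heap(ram_mb))
```

Under this rule the 256m setting above corresponds to a machine with 512 MB of RAM.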

IP restriction

/etc/elasticsearch/elasticsearch.yml

# Set both 'bind_host' and 'publish_host':
#
network.host: 127.0.0.1

Author: したくん


Elasticsearch: Basic Knowledge for Getting Started