Using the Query

The yii\elasticsearch\Query class is generally compatible with its counterpart, yii\db\Query, which is described in detail in the guide.

The differences are outlined below.

  • As Elasticsearch does not support SQL, the query API does not support join(), groupBy(), having(), and union(). Sorting, limit(), offset(), and where() are all supported (with certain limitations).

  • from() does not select the tables, but the index and type to query against.

  • select() has been replaced with storedFields(). It defines the fields to retrieve from a document, similar to columns in SQL.

  • As Elasticsearch is not only a database but also a search engine, additional query and aggregation mechanisms are supported. See the Query DSL documentation for how to compose queries.

Executing queries

The yii\elasticsearch\Query class provides the usual methods for executing queries: one() and all(). They return only the search results (or a single result).

There is also the search() method, which returns both the search results and all of the metadata retrieved from Elasticsearch, including aggregations.
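To illustrate the difference between the documents and the surrounding metadata, here is a minimal sketch in plain PHP. It assumes the array follows the usual shape of an Elasticsearch search response (hits, took, timed_out, aggregations). The function name splitSearchResponse() and the sample data are hypothetical and not part of the extension.

```php
<?php

/**
 * Splits a raw search response into documents and metadata.
 * Assumption: the array mirrors the Elasticsearch search API response.
 */
function splitSearchResponse(array $response): array
{
    $hits = $response['hits']['hits'] ?? [];

    return [
        // The matching documents, taken from each hit's _source.
        'documents' => array_map(
            static function (array $hit): array {
                return $hit['_source'] ?? [];
            },
            $hits
        ),
        // Metadata that is not part of the documents themselves.
        'took' => $response['took'] ?? null,
        'timedOut' => $response['timed_out'] ?? false,
        'aggregations' => $response['aggregations'] ?? [],
    ];
}

// Hand-written sample response, for illustration only.
$response = [
    'took' => 5,
    'timed_out' => false,
    'hits' => [
        'hits' => [
            ['_id' => '1', '_source' => ['name' => 'Alice', 'age' => 30]],
            ['_id' => '2', '_source' => ['name' => 'Bob', 'age' => 40]],
        ],
    ],
    'aggregations' => [
        'avg_age' => ['value' => 35.0],
    ],
];

$parts = splitSearchResponse($response);
echo count($parts['documents']), " documents\n";
```

In an application, an array of this kind would typically be the return value of search() on a query object, whereas all() would give back only the hits.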

The extension fully supports the highly efficient scroll mode, which allows retrieving large result sets. See batch() and each() for more information.

Number of returned records and pagination caveats

Unlike most SQL servers, which return all results unless a LIMIT clause is provided, Elasticsearch limits the result set to 10 records by default. To get more records, use yii\elasticsearch\Query::limit(). This is especially important when defining relations in ActiveRecord, where the record limit needs to be specified explicitly.

Elasticsearch is generally poorly suited to tasks that require deep pagination. It is optimized for search engine behavior, where only the first few pages of results have any relevance. While it is technically possible to go far into the result set using yii\elasticsearch\Query::limit() and yii\elasticsearch\Query::offset(), performance degrades as the offset grows.
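As a rough illustration of where offset-based paging stops being practical: by default Elasticsearch rejects search requests whose offset plus limit exceeds the index.max_result_window index setting, which is 10000 unless reconfigured. The sketch below computes the offset for a page and checks it against that window. The function pageWindow() and the constant name are hypothetical helpers, not part of the extension.

```php
<?php

// Default value of the Elasticsearch index setting index.max_result_window.
const DEFAULT_MAX_RESULT_WINDOW = 10000;

/**
 * Computes offset and limit for a 1-based page number and reports
 * whether the request stays inside the result window.
 */
function pageWindow(int $page, int $pageSize, int $maxWindow = DEFAULT_MAX_RESULT_WINDOW): array
{
    if ($page < 1 || $pageSize < 1) {
        throw new InvalidArgumentException('Page and page size must be positive.');
    }

    $offset = ($page - 1) * $pageSize;

    return [
        'offset' => $offset,
        'limit' => $pageSize,
        // The request is accepted only while offset + limit <= window.
        'withinWindow' => $offset + $pageSize <= $maxWindow,
    ];
}

$lastAllowed = pageWindow(500, 20); // offset 9980, ends exactly at 10000
$tooDeep = pageWindow(501, 20);     // offset 10000, would end at 10020

var_dump($lastAllowed['withinWindow'], $tooDeep['withinWindow']);
```

The resulting offset and limit values are what would be passed to offset() and limit(). Pages beyond the window are better served by scroll mode.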

One possible solution is to use scroll mode, which behaves similarly to cursors in traditional SQL databases. Scroll mode is implemented with the batch() and each() methods.

Error handling in queries

Elasticsearch is a distributed database. Because of its distributed nature, certain requests may be partially successful.

Consider how a typical search is performed. The query is sent to all relevant shards, then their results are collected, processed, and returned to the user. It is possible that not all shards are able to return a result. Yet, even with some data missing, the result may be useful.

With every query, the server returns some additional metadata, including data on which shards failed. Even if some shards failed, it is not considered a server error, so no exception is raised. This metadata is lost when using standard Yii2 methods like one() and all().

To get extended data, including shard statistics, use the search() method.
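A minimal sketch of inspecting the shard statistics follows. It assumes the response array contains the standard _shards section of an Elasticsearch search response (total, successful, failed, and an optional failures list). The function shardReport() and the sample data are hypothetical.

```php
<?php

/**
 * Summarizes the _shards section of a search response.
 * Assumption: the array mirrors the Elasticsearch search API response.
 */
function shardReport(array $response): array
{
    $shards = $response['_shards'] ?? [];
    $failed = (int) ($shards['failed'] ?? 0);

    return [
        'total' => (int) ($shards['total'] ?? 0),
        'successful' => (int) ($shards['successful'] ?? 0),
        'failed' => $failed,
        // True when at least one shard did not contribute results.
        'isPartial' => $failed > 0,
        // Human-readable reasons, when the server reported them.
        'reasons' => array_map(
            static function (array $failure): string {
                return $failure['reason']['reason'] ?? 'unknown';
            },
            $shards['failures'] ?? []
        ),
    ];
}

// Hand-written sample: one of five shards failed.
$response = [
    '_shards' => [
        'total' => 5,
        'successful' => 4,
        'failed' => 1,
        'failures' => [
            [
                'shard' => 2,
                'index' => 'customer',
                'reason' => ['type' => 'node_disconnected_exception', 'reason' => 'node left the cluster'],
            ],
        ],
    ],
    'hits' => ['hits' => []],
];

$report = shardReport($response);
if ($report['isPartial']) {
    echo "Partial result: {$report['failed']} of {$report['total']} shards failed\n";
}
```

Whether a partial result is acceptable depends on the application. A search page may show it as is, while a report that must be complete would probably retry or abort.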

The query itself can also fail for a number of reasons (connectivity issues, syntax errors, etc.), but that will result in an exception.

Error handling in bulk requests

In Elasticsearch a bulk request performs multiple operations in a single API call. This reduces overhead and can greatly increase indexing speed.

The operations are executed individually, so some can succeed while others fail. Having some of the operations fail does not cause the whole bulk request to fail. If it is important to know whether any of the constituent operations failed, the result of the bulk request needs to be checked.
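Checking the result could look like the sketch below. It assumes the decoded bulk response has the standard Elasticsearch shape: a top-level errors flag and an items list in which each entry is keyed by its action (index, create, update, or delete). The function failedBulkItems() and the sample data are hypothetical.

```php
<?php

/**
 * Returns one entry per failed operation in a bulk response.
 * Assumption: the array mirrors the Elasticsearch bulk API response.
 */
function failedBulkItems(array $bulkResponse): array
{
    // The top-level flag is false when every operation succeeded.
    if (empty($bulkResponse['errors'])) {
        return [];
    }

    $failed = [];
    foreach ($bulkResponse['items'] ?? [] as $position => $item) {
        // Each item has a single key naming the action that was performed.
        foreach ($item as $action => $result) {
            if (isset($result['error'])) {
                $failed[] = [
                    'position' => $position,
                    'action' => $action,
                    'id' => $result['_id'] ?? null,
                    'status' => $result['status'] ?? null,
                    'reason' => $result['error']['reason'] ?? 'unknown',
                ];
            }
        }
    }

    return $failed;
}

// Hand-written sample: the second operation hit a version conflict.
$bulkResponse = [
    'took' => 12,
    'errors' => true,
    'items' => [
        ['index' => ['_index' => 'customer', '_id' => '1', 'status' => 201]],
        ['create' => [
            '_index' => 'customer',
            '_id' => '2',
            'status' => 409,
            'error' => ['type' => 'version_conflict_engine_exception', 'reason' => 'document already exists'],
        ]],
    ],
];

foreach (failedBulkItems($bulkResponse) as $failure) {
    echo "Operation #{$failure['position']} ({$failure['action']}) failed: {$failure['reason']}\n";
}
```

The position in the items list matches the order in which the operations were submitted, so a failed entry can be mapped back to the original record and retried or logged.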

The bulk request itself can also fail (for example, because of connectivity issues), but that will result in an exception.