Elasticsearch
Installing Elasticsearch
Elasticsearch is available for download at https://www.elastic.co/downloads/elasticsearch. Use the version of Elasticsearch that LumisXP supports, as listed in the system requirements. For information on installation and configuration, see the Elasticsearch manual.
Note that Elasticsearch forms a cluster with any other instances that have the same value in the cluster name setting (cluster.name in elasticsearch.yml). Always change this value instead of leaving the default, so that another Elasticsearch instance running on the same network cannot form an unwanted cluster with it.
By default, Elasticsearch is accessible through ports 9200 and 9300, with no access control unless security is enabled. If security will not be enabled, allow only LumisXP to access the ports exposed by Elasticsearch. Instructions for enabling security are given below in Security Settings.
Elasticsearch must be configured not to automatically create the indices manipulated by the portal. For example, if you are using the default index prefix "lumisportal" in LumisXP, include the following setting in the configuration file elasticsearch.yml:
action.auto_create_index: "-lumisportal-*,+*"
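As a rough illustration of how the pattern list in the setting above is read, the Python sketch below mimics the matching. It assumes that the comma-separated patterns are tried from left to right, that the first matching pattern decides, and that an index matching no pattern is not created automatically. The function name and the sample index names are hypothetical, and the sketch is not Elasticsearch's actual implementation.

```python
from fnmatch import fnmatchcase


def auto_create_allowed(index_name: str, setting: str) -> bool:
    """Return True if `setting` would allow `index_name` to be auto-created.

    Assumption: patterns are evaluated in order and the first match wins.
    A leading "-" forbids creation; a leading "+" (or no prefix) allows it.
    """
    for expression in setting.split(","):
        expression = expression.strip()
        if not expression:
            continue
        allow = not expression.startswith("-")
        # Drop a single leading "+" or "-" to obtain the wildcard pattern.
        pattern = expression[1:] if expression[0] in "+-" else expression
        if fnmatchcase(index_name, pattern):
            return allow
    # Nothing matched: assume auto-creation is refused.
    return False


SETTING = "-lumisportal-*,+*"
print(auto_create_allowed("lumisportal-content", SETTING))  # False
print(auto_create_allowed("some-other-index", SETTING))     # True
```

With the default prefix, any index whose name starts with "lumisportal-" is left for the portal to create explicitly, while other indices can still be auto-created.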
By default, Elasticsearch allows only 500 simultaneous scroll contexts, a limit that is easily reached. When the limit is exceeded, Elasticsearch logs a message like the following:
Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.
When this happens, errors may occur in the BigData API. We therefore suggest increasing the limit to 100,000, although the value should be adjusted to suit the demands of the solution. To do so, include the following setting in the configuration file elasticsearch.yml:
search.max_open_scroll_context: 100000
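To judge how close a cluster is to this limit, the number of open search contexts can be read from the nodes stats API (GET /_nodes/stats/indices/search). The Python sketch below extracts the open_contexts value per node from such a response. The base URL, the function names, and the sample values are assumptions for illustration, and the request is shown without authentication.

```python
import json
import urllib.request


def open_contexts_per_node(stats: dict) -> dict:
    """Map each node name to its open search contexts in a nodes-stats response."""
    result = {}
    for node_id, node in stats.get("nodes", {}).items():
        name = node.get("name", node_id)
        search = node.get("indices", {}).get("search", {})
        result[name] = search.get("open_contexts", 0)
    return result


def fetch_search_stats(base_url: str) -> dict:
    """Fetch search statistics; base_url is an assumed address such as http://localhost:9200."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/indices/search") as response:
        return json.load(response)


# Sample shaped like a nodes-stats response (node id, name and count are made up).
sample = {"nodes": {"abc123": {"name": "node-1", "indices": {"search": {"open_contexts": 480}}}}}
print(open_contexts_per_node(sample))  # {'node-1': 480}
```

Against a live cluster, open_contexts_per_node(fetch_search_stats("http://localhost:9200")) would give the current figures, assuming that address is reachable.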
In some operations, LumisXP restricts the request to certain Elasticsearch indices, which can produce a very long request line.
By default, Elasticsearch limits this line to 4 KB. For LumisXP to function correctly, the limit must be increased to 10 MB (a higher value may be needed, depending on the solution adopted).
To do so, include the following setting in the configuration file elasticsearch.yml:
http.max_initial_line_length: 10MB
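The effect of naming many indices in one request can be estimated with simple arithmetic. The Python sketch below builds a request line of the form "GET /index1,index2,.../_search HTTP/1.1" and measures it against the 4 KB default. The index names and the function name are hypothetical, and the way LumisXP actually composes its requests is not described here, so this is only a rough estimate.

```python
from typing import Sequence

DEFAULT_LIMIT_BYTES = 4 * 1024  # the 4 KB default mentioned above


def request_line_length(index_names: Sequence[str],
                        endpoint: str = "_search",
                        method: str = "GET") -> int:
    """Length in bytes of an HTTP request line that targets the given indices."""
    target = "/" + ",".join(index_names) + "/" + endpoint
    line = f"{method} {target} HTTP/1.1"
    return len(line.encode("utf-8"))


# 200 hypothetical index names of 24 characters each.
names = [f"lumisportal-example-{i:04d}" for i in range(200)]
length = request_line_length(names)
print(length, length > DEFAULT_LIMIT_BYTES)  # 5021 True
```

Under these assumptions, about two hundred index names are already enough to exceed the default limit.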
Installing the ICU Analysis Plugin
LumisXP requires the ICU Analysis plugin to be installed on every Elasticsearch node that will be used.
To install this plugin, run the following command on each Elasticsearch node, in the bin folder of the Elasticsearch installation:
elasticsearch-plugin install analysis-icu
More detailed instructions for installing this plugin are available in the manual.
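Because the plugin is needed on every node, it can be useful to check for nodes where it is missing. The Python sketch below compares a list of node names with the rows returned by the cat plugins API (GET /_cat/plugins?format=json), where each row carries the node name in "name" and the plugin in "component". The function name and the sample data are hypothetical.

```python
from typing import Iterable, List


def nodes_missing_plugin(node_names: Iterable[str],
                         plugin_rows: Iterable[dict],
                         plugin: str = "analysis-icu") -> List[str]:
    """Return the nodes (sorted) that do not report `plugin` as installed."""
    installed_on = {row["name"] for row in plugin_rows if row.get("component") == plugin}
    return sorted(set(node_names) - installed_on)


# Sample data shaped like _cat/nodes and _cat/plugins output (values are made up).
nodes = ["node-1", "node-2", "node-3"]
plugins = [
    {"name": "node-1", "component": "analysis-icu", "version": "x.y.z"},
    {"name": "node-3", "component": "analysis-icu", "version": "x.y.z"},
]
print(nodes_missing_plugin(nodes, plugins))  # ['node-2']
```

The node list has to come from a separate source, such as GET /_cat/nodes?format=json, since a node without any plugin does not appear in the plugins output.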
Settings
To configure which Big Data repository LumisXP uses, go to Settings > Portal Settings > Environment Settings.
Security Settings
Security configuration in Elasticsearch is described in the Elasticsearch manual. LumisXP supports authentication using an API key.
To enable security on an Elasticsearch instance used by LumisXP, follow the steps below. While these steps are being carried out, LumisXP cannot access Elasticsearch and will log errors as a result. The errors should stop once all steps have been completed successfully.
- If security is disabled in Elasticsearch, enable it and restart Elasticsearch.
- Generate an API Key.
- In LumisXP, go to Settings > Portal Settings > Environment Settings and enter the details of the generated API Key, along with the other access configurations to Elasticsearch.
These steps also apply to an initial installation of LumisXP. When updating the LumisXP version, however, enable security in Elasticsearch only after LumisXP is up and running and all background processing in the execution queue has finished, so that the update process completes using the settings that were in place before the update started.
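For the key generation step, Elasticsearch offers the create API key endpoint (POST /_security/api_key), whose response includes the fields id and api_key. The Python sketch below builds a minimal request body and the Authorization header value that Elasticsearch accepts for API key authentication. The key name and the sample values are hypothetical, and which of these details LumisXP asks for in Environment Settings is not covered here.

```python
import base64
import json


def create_api_key_body(name: str) -> str:
    """JSON body for POST /_security/api_key, with only the key name set."""
    return json.dumps({"name": name})


def api_key_header(key_id: str, api_key: str) -> str:
    """Authorization header value: 'ApiKey ' + base64 of 'id:api_key'."""
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")
    return f"ApiKey {token}"


# Made-up values standing in for the id and api_key returned by Elasticsearch.
print(create_api_key_body("lumisxp-example-key"))
print(api_key_header("example-id", "example-secret"))
```

The header value can be used to test the key outside LumisXP, for example by sending a simple request to Elasticsearch with it.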
Mapping the analysis folder
The folder <lumisdata>/shared/data/elasticsearch/lumis-analysis must be mapped to <config>/lumis-analysis on each Elasticsearch server used by the portal (for example, through a mount point or symbolic link).
For example, assuming that LumisXP is running on Windows with its installation in C:\lumis\lumisportal, and that a local Elasticsearch located at C:\lumis\elasticsearch is being used, the mapping could be done with a junction point as follows:
mklink /J "C:\lumis\elasticsearch\config\lumis-analysis" "C:\lumis\lumisportal\lumisdata\shared\data\elasticsearch\lumis-analysis"
Similarly, assuming that LumisXP is running on Linux with its installation in /lumis/lumisportal, and that a local Elasticsearch with its configuration in /etc/elasticsearch is being used, the mapping could be done with a bind mount as follows:
mkdir /etc/elasticsearch/lumis-analysis
mount --bind /lumis/lumisportal/lumisdata/shared/data/elasticsearch/lumis-analysis /etc/elasticsearch/lumis-analysis
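After the mapping is created, it can be checked that both paths lead to the same directory. The Python sketch below does this with os.path.samefile, which compares the device and inode of the two paths. The function name is hypothetical, the paths are those of the Linux example above, and the behaviour with Windows junction points may depend on the Python version, so treat it as a starting point only.

```python
import os


def is_mapped(source_dir: str, mapped_dir: str) -> bool:
    """Return True when both paths exist and refer to the same directory."""
    if not (os.path.isdir(source_dir) and os.path.isdir(mapped_dir)):
        return False
    return os.path.samefile(source_dir, mapped_dir)


# Paths from the Linux example; prints False where the mapping does not exist.
print(is_mapped(
    "/lumis/lumisportal/lumisdata/shared/data/elasticsearch/lumis-analysis",
    "/etc/elasticsearch/lumis-analysis",
))
```

The check has to run on the Elasticsearch server itself, with a user that can read both locations.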
Creation of index templates for use with synonyms
If the solution uses the synonym functionality with a language other than Portuguese (code pt_BR), English (code en_US), or Spanish (code es_ES), an index template should be created for that language, so that the analyzers needed to apply synonyms correctly are available. For more information, see the technical documentation.
Known Issues
In certain situations, Elasticsearch may generate an error if a search produces too many aggregation buckets.
In particular, this may occur when user segmentation is performed using one of the options Did the action or Did not perform the action.
In this case, messages like the following may appear in the logs:
TooManyBucketsException[Trying to create too many buckets. Must be less than or equal to: [10000] but was [10001]. This limit can be set by changing the [search.max_buckets] cluster level setting.]
The action to take should be evaluated case by case, but if the problem appears in user segmentation, we recommend increasing this setting (search.max_buckets) to 30,000, preferably in a persistent manner. To do so, include the following setting in the configuration file elasticsearch.yml:
search.max_buckets: 30000
This value of 30,000 may, however, not be sufficient if the property bag lumis.service.analytics.usersegmentation.control.UserSegmentationData.subQueriesResultLimit has been defined with a value greater than this.
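Since search.max_buckets is a cluster level setting, it can also be changed persistently through the cluster update settings API (PUT /_cluster/settings) instead of elasticsearch.yml. The Python sketch below builds such a request body, taking the larger of 30,000 and the configured subQueriesResultLimit as a lower bound. That rule, the function names, and the base URL are assumptions for illustration, not a LumisXP recommendation, and the request is built without authentication and not sent.

```python
import json
import urllib.request


def max_buckets_body(sub_queries_result_limit: int, minimum: int = 30000) -> str:
    """JSON body that persistently sets search.max_buckets.

    Assumption: the value should be at least `minimum` and at least the
    configured subQueriesResultLimit; a higher value may still be needed.
    """
    value = max(minimum, sub_queries_result_limit)
    return json.dumps({"persistent": {"search.max_buckets": value}})


def build_settings_request(base_url: str, body: str) -> urllib.request.Request:
    """Prepare (but do not send) the PUT /_cluster/settings request."""
    return urllib.request.Request(
        f"{base_url}/_cluster/settings",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


print(max_buckets_body(10000))  # {"persistent": {"search.max_buckets": 30000}}
print(max_buckets_body(50000))  # {"persistent": {"search.max_buckets": 50000}}
```

Passing the prepared request to urllib.request.urlopen would apply the change, assuming the address is reachable and any required credentials have been added to the headers.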