Rewrite Your Elasticsearch Requests On the Fly #
Sometimes the QueryDSL generated by the service code turns out to be unreasonable. The usual fix is to modify the service code and release a new version. A release can take a long time, however: the production release window may not have opened yet, a freeze for a major network operation may be in progress, or additional code may need to be submitted before going live, and every release requires a large number of tests. Meanwhile, faults in the production environment must be fixed immediately, and customers cannot wait. What should be done in that case?
Don’t worry. You can use INFINI Gateway to dynamically repair queries.
Example #
See the following query example:
GET _search
{
"size": 1000000
, "explain": true
}
The size parameter is set to a very large value, a problem that goes unnoticed at first. As more and more data is generated, returning too many documents is bound to cause a sharp decline in performance.
In addition, enabling the explain parameter adds unnecessary performance overhead; this option is generally used only during development and debugging.
By adding the request_body_json_set filter to the gateway, you can dynamically replace the value at a specified JSON path in the request body. The configuration for the above example is as follows:
flow:
  - name: rewrite_query
    filter:
      - request_body_json_set:
          path:
            - explain -> false
            - size -> 10
      - dump_request_body:
      - elasticsearch:
          elasticsearch: dev
This configuration resets the explain and size parameters. The query is rewritten as follows before it is sent to Elasticsearch:
{
"size": 10, "explain": false
}
The problem is fixed while the service keeps running, without releasing a new version of the service code.
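To make the effect of the two rules concrete, here is a minimal Python sketch of the same top-level replacement. It only models the behavior described above and is not the gateway's implementation; the function name rewrite_top_level is hypothetical.

```python
import json


def rewrite_top_level(body: dict, overrides: dict) -> dict:
    """Return a copy of the request body with the given top-level keys replaced."""
    rewritten = dict(body)       # shallow copy, the original request is left untouched
    rewritten.update(overrides)  # existing keys keep their position, values are replaced
    return rewritten


# The problematic request body from the example.
original = {"size": 1000000, "explain": True}

# Equivalent of the rules "explain -> false" and "size -> 10".
rewritten = rewrite_top_level(original, {"explain": False, "size": 10})

print(json.dumps(rewritten))  # {"size": 10, "explain": false}
```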
Another Example #
Consider the following query. The programmer mistyped the name of the field to be queried: it should be name but was written as name1. The size parameter is also set to a very large value.
GET medcl/_search
{
"aggs": {
"total_num": {
"terms": {
"field": "name1",
"size": 1000000
}
}
}
}
The system has gone live, and the problem only shows up when the query is run. To fix it, add the following filter configuration to the gateway request flow:
flow:
  - name: rewrite_query
    filter:
      - request_body_json_set:
          path:
            - aggs.total_num.terms.field -> "name"
            - aggs.total_num.terms.size -> 10
            - size -> 0
      - dump_request_body:
      - elasticsearch:
          elasticsearch: dev
The above configuration replaces values in the JSON request body by path. It also adds a top-level size parameter set to 0 so that no documents are returned, because only the aggregation results are needed.
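The path rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the gateway's code: the helpers set_path and apply_rules are hypothetical, each rule is assumed to have the form "path -> value" with the value parsed as JSON, and keys are assumed not to contain dots.

```python
import json


def set_path(body: dict, path: str, value) -> None:
    """Set a value at a dot-separated path, creating missing objects on the way."""
    keys = path.split(".")
    node = body
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def apply_rules(body: dict, rules: list) -> dict:
    """Apply rules written as 'path -> value' (assumed syntax, value parsed as JSON)."""
    for rule in rules:
        path, raw = (part.strip() for part in rule.split("->", 1))
        set_path(body, path, json.loads(raw))
    return body


# The request body with the wrong field name and the oversized terms aggregation.
request = {"aggs": {"total_num": {"terms": {"field": "name1", "size": 1000000}}}}

rules = [
    'aggs.total_num.terms.field -> "name"',
    "aggs.total_num.terms.size -> 10",
    "size -> 0",  # new top-level key: return aggregation results only
]

rewritten = apply_rules(request, rules)
print(json.dumps(rewritten, indent=2))
```

After the rules are applied, the terms aggregation targets name with a size of 10, and the new top-level size of 0 suppresses the document hits.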
Another Example #
The user query is as follows:
{
"query":{
"bool":{
"should":[{"term":{"isDel":0}},{"match":{"type":"order"}}]
}
}
}
Now you want to replace the term query with the equivalent range query as follows:
{
"query":{
"bool":{
"should":[{ "range": { "isDel": {"gte": 0,"lte": 0 }}},{"match":{"type":"order"}}]
}
}
}
Use the following configuration:
flow:
  - name: rewrite_query
    filter:
      - request_body_json_del:
          path:
            - query.bool.should.[0]
      - request_body_json_set:
          path:
            - query.bool.should.[1].range.isDel.gte -> 0
            - query.bool.should.[1].range.isDel.lte -> 0
      - dump_request_body:
      - elasticsearch:
          elasticsearch: dev
In the above configuration, a request_body_json_del filter first deletes the first element of the should array, that is, the term subquery to be replaced.
Only the match query is left, now at index 0. A new should subquery is then added, so its index must be 1, and the attributes of the range query are set on it.
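The index arithmetic is easier to follow when traced step by step. The Python sketch below models the delete-then-set sequence; it is not the gateway's implementation. The helpers parse_path, delete_path, and set_path are hypothetical, and the rule that setting an index one past the end of an array appends a new object is an assumption based on the behavior described above.

```python
import json


def parse_path(path: str) -> list:
    """Split 'a.b.[1].c' into ['a', 'b', 1, 'c']; '[n]' segments become list indexes."""
    return [
        int(seg[1:-1]) if seg.startswith("[") and seg.endswith("]") else seg
        for seg in path.split(".")
    ]


def delete_path(body: dict, path: str) -> None:
    """Delete the object key or list element that the path points to."""
    *parents, last = parse_path(path)
    node = body
    for seg in parents:
        node = node[seg]
    del node[last]


def set_path(body: dict, path: str, value) -> None:
    """Set a value at the path, creating missing objects along the way."""
    *parents, last = parse_path(path)
    node = body
    for seg in parents:
        if isinstance(node, list):
            # Assumption: an index one past the end appends a new empty object.
            if seg == len(node):
                node.append({})
            node = node[seg]
        else:
            node = node.setdefault(seg, {})
    node[last] = value


body = {
    "query": {
        "bool": {
            "should": [{"term": {"isDel": 0}}, {"match": {"type": "order"}}]
        }
    }
}

# Step 1: remove the term subquery; the match subquery moves to index 0.
delete_path(body, "query.bool.should.[0]")

# Step 2: index 1 is now free, so the range subquery is created there.
set_path(body, "query.bool.should.[1].range.isDel.gte", 0)
set_path(body, "query.bool.should.[1].range.isDel.lte", 0)

print(json.dumps(body["query"]["bool"]["should"]))
# [{"match": {"type": "order"}}, {"range": {"isDel": {"gte": 0, "lte": 0}}}]
```

In this sketch the range subquery ends up after the match subquery rather than before it. The order of clauses inside should does not change which documents match, so the result is equivalent to the target query.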
Further Improvement #
The above examples replace queries unconditionally. In practice, you may need to decide whether a query should be replaced at all; for example, the replacement may be performed only when the JSON field _ctx.request.body_json.query.bool.should.[0].term.isDel exists.
The gateway's conditional logic is very flexible. The configuration is as follows:
flow:
  - name: cache_first
    filter:
      - if:
          and:
            - exists: ['_ctx.request.body_json.query.bool.should.[0].term.isDel']
        then:
          - request_body_json_del:
              path:
                - query.bool.should.[0]
          - request_body_json_set:
              path:
                - query.bool.should.[1].range.isDel.gte -> 0
                - query.bool.should.[1].range.isDel.lte -> 0
          - dump_request_body:
      - elasticsearch:
          elasticsearch: dev
The feature is superb!
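As a rough illustration of the exists condition used above, the following Python sketch checks whether a path is present in a request body before any rewrite is attempted. It is a model of the idea only, not the gateway's code: the exists function is hypothetical, and the path is written relative to the request body, that is, without the _ctx.request.body_json prefix.

```python
def exists(body, path: str) -> bool:
    """Return True if every segment of the dot-separated path is present in body."""
    node = body
    for seg in path.split("."):
        if seg.startswith("[") and seg.endswith("]"):
            # '[n]' segments address list elements.
            index = int(seg[1:-1])
            if not isinstance(node, list) or index >= len(node):
                return False
            node = node[index]
        else:
            if not isinstance(node, dict) or seg not in node:
                return False
            node = node[seg]
    return True


PATH = "query.bool.should.[0].term.isDel"

# A request that contains the term subquery: the rewrite should run.
with_term = {
    "query": {"bool": {"should": [{"term": {"isDel": 0}}, {"match": {"type": "order"}}]}}
}

# A request without it: the body should be forwarded unchanged.
without_term = {"query": {"match": {"type": "order"}}}

print(exists(with_term, PATH))     # True
print(exists(without_term, PATH))  # False
```

Guarding the rewrite this way matters because the delete and set rules are position-based: applied to a request with a different shape, they could remove or add the wrong clause.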