logo

NJP

Database Performance: Improving Slow OR and JOIN Queries

Import · Jan 11, 2017 · article

This guide is written by the ServiceNow Technical Support Performance team (All Articles). We are a global group of experts that help our customers with performance issues. If you have questions about the content of this article we will try to answer them here. However, if you have urgent questions or specific issues, please see the list of resources on our profile page: ServiceNowPerformanceGTS

This document is targeted at advanced developer level audience. I will discuss various methods to solve two problems. The first problem is when there is a query against a single table with conditions against multiple fields joined by an OR. The second problem is when a query requires a JOIN between two tables. Both solutions involve running multiple simple queries instead of a single complex query and splicing the results via script - effectively emulating a SQL sub-query.

NOTE: Beware, some of these solutions require increased code complexity and potentially requires dealing with large datasets in memory! Most of the time you can fix slow a query through optimizing at the database level using indexes or simply avoiding complex queries in the first place. However, the strategies described in this article might be worth considering if you have exhausted all other options. TRY SIMPLER SOLUTIONS FIRST.

Also, perhaps more importantly, the scripted solutions listed here include code that builds lists of sys_id's that will be used in filters. This is very memory intensive and we have included a hard limit of 1,000 records in our code to avoid end-user impact. If you are considering a script that will build a filter by returning a list of sys_id's from code, please write protective code to ensure you do not build too big of a list. Even a list as small as 10,000 sys_id's can severely impact system-wide performance. Your mileage may vary!

PROBLEM#1 - Queries with Multiple OR Operations on One Table

Sometimes MariaDB can't do an efficient query when using OR operations. Consider the following query:

SELECT … FROM u_table WHERE (a = X OR b = X OR c = X) AND d = 5

MariaDB tries to use indexes to optimize the execution path of each query. An index is like a phone book; it is a listing of records ordered by one field and perhaps sub-ordered by a additional fields. With three OR conditions (like above), the optimizer has a hard time because it needs to be able to search each of three fields and see if any of them match at all. Going back to our phone book analogy, to do a search like this you would need to have three phone books and then merge the results together. This can be done with something called an Index Merge* (see below), but in some cases, the same results can be achieved much faster by doing 3 separate queries and then merging the results within ServiceNow's javascript code.

SELECT sys_id FROM u_table WHERE a = X AND d = 5

SELECT sys_id FROM u_table WHERE b = X AND d = 5

SELECT sys_id FROM u_table WHERE c = X AND d = 5

SELECT … FROM u_table WHERE sys_id IN () <- MUCH BETTER!

*Index Merge

Some complex OR queries are better served by an Index Merge. For example, in the scenario above, suppose that 300,000 of the records in u_table have d = 5. The optimizer cannot sufficiently narrow down the result set by using an index on d. It cannot use a compound index on a, b, c because they are joined by OR statements. But if there were three indexes, one index where each of the three columns (a, b, and c) are listed as the first field in the index, then the MariaDB optimizer can use an Index Merge algorithm. This is equivalent to having three phone books, searching each of them individually for matching results, and then merging the results. Sometimes using an Index Merge is a more efficient way of dealing with inefficient queries that contain multiple OR conditions. The most common use case for this type of query is in the Auto-complete for reference fields functionality (also see here) or in complex user created reports (especially ones that use the dreaded über OR - where two parenthetical clauses are joined by an OR). To see if an index merge will work for your query you would need to ensure that you Create a table index for each field in your query that is joined by an OR condition.

MySQL :: MySQL 5.5 Reference Manual :: 8.2.1.4 Index Merge Optimization

SOLUTION #1 - Just Split It Into Two Queries

In some cases, you may be able to simply split the behavior into separate queries with separate results and still satisfy your end user use case. If you are creating a very complex query that is running slowly ask yourself if the customer would accept a solution that has multiple request/results instead of one merged result. If the performance is much faster with separate results, customer might be willing to bend on the requirements.

For example:

1 big report takes 35 seconds:

(Closed today) OR (Opened today) OR (Active=true AND Assigned To=Service Desk)

3 little reports take 4.5 seconds (even less if you can multi-thread them, as is done for reports on a home page, for example).

  1. Closed Today (1.3 seconds)
  2. Opened Today (1.2 seconds)
  3. Active=true AND Assigned To=Service Desk (2 seconds)

SOLUTION #2 - Merge Two Separate Query Results in Javascript

NOTE: Read note above regarding using complex coding solutions rather than optimizing at the database

Most queries in ServiceNow are done through the UI using the Condition Builder tool. In order to leverage a scripted sub-query method in the Condition Builder you need a piece of script that can be referenced anywhere. The standard way to do this is with a global script include.

1.Create a global Script Include called "eitherOr" with the following script:

var eitherOr = function(value, table, fields, active) {
  var value = value || gs.getUserID();
  var active = (typeof active == "undefined") ? true : Boolean(active);
  var result = [];
  var queryTable = function(table, field, qValue, resultAr) {
            var grQuery = new GlideRecord(table);
            grQuery.addQuery(field, qValue);
            if (active) {
                      grQuery.addActiveQuery();
            }
            grQuery.setLimit(1001);
            grQuery.query();
            //gs.print(grQuery.getEncodedQuery());
            var count = 0;
            while(grQuery._next()) {
                      resultAr.push(grQuery.sys_id+"");
                      if (count++ > 1000){
                                gs.addErrorMessage("SubQueryFilter: " + table +
                                "." + field + " = " + qValue +
                               " returned over 1,000 results. Results of this list are trimmed.");
                                break;
                      }
            }
            return resultAr;
  }
  for (var ia = 0; ia < fields.length; ia++){
            result = queryTable(table, fields[ia], value, result);
  }
  //gs.print("result: " + result.join(","));
  return result.join(",");
}

2. In the Condition Builder where you want to leverage the eitherOr method, create a scripted condition like the one below

image

NOTE: It is bad practice to hard code a sys_id. A better option would be to programmatically return the sys_id of the current logged in user, like the following:

javascript: eitherOr(gs.getUserID(), 'incident', ['assigned_to', 'caller_id']);

Once you realize how much sub-queries can improve the speed of complex OR queries you will see how this principle can be applied in other places as well. Another common area is with JOIN queries in which the WHERE clause includes conditions from two different tables.

PROBLEM #2 - Queries with slow WHERE Conditions from Two Separate Tables

MariaDB has an internal optimizer that tries to find the most efficient way to query data. The main way that MariaDB improves query retrieval time is by using indexes. An index is like a phone book that orders all the results from some table in a sequence by a specified field or fields.

Suppose you have code using a dot-walk and query like this:

var endOfLifeItems = new GlideRecord("item");
endOfLifeItems.addQuery("active", 1);
endOfLifeItems.addQuery("customer" = "Acme");
endOfLifeItems.addQuery("model.end_of_life", ">=", "2017-01-01");
endOfLifeItems.addQuery("model.end_of_life", "<=", "2017-02-01");
endOfLifeItems.query();

SELECT ... FROM item LEFT JOIN model ON item.model = model.idWHERE item.active = 1AND item.customer = 'Acme'AND model.end_of_life >= '2017-01-01'

AND model.end_of_life <= '2017-02-01';

View original source

https://www.servicenow.com/community/developer-articles/database-performance-improving-slow-or-and-join-queries/ta-p/2299441