logo

NJP

Improving performance when importing data with more than one field as coalesce key

Import · Dec 05, 2016 · article

If you are planning to import thousands of records into your instance and you have a complex coalesce key to update data, this post is for you. Easy import, data load, and import sets are wonderfully designed to import data into your instance.

ServiceNow uses two steps to import data:

Data import is crafted to a very high specification, where the loading happens on the Data Sources while the transforming happens on the Transformation maps. Each execution is controlled by an Import set that displays the history of the data imported. Transformations maps can have coalesce field (keys to avoid duplicates) to allow them to update records.

I will focus on showing an example of a transformation map with a "complex key" (more than one field as coalesce) to update the target records which also avoid duplicates from reference fields (see below) and make one just one query (instead of multiple internal queries if selecting multiple coalesce fields).

image

On a transformation map, one, or several, 'coalesce' fields define when a record is updated. Whilst the transformation maps are flexible and configurable, when using complex "keys" some transformations are better with a "field map script" as coalesce (aka "conditional coalesce").

A few notes on coalesce fields:

  • Coalesce field searches benefit from indexes on the target data field they are mapped to.
  • sys_id is indexed on all tables, making searches faster if they are used for mappings.
  • Using a reference field as coalesce can cause duplicates if the referenced data has duplicates (see below).
  • Setting multiple fields as coalesce, they could cause multiple queries for each of the coalesce fields on the target data, increasing import times.

On my example, I will use the alm_stock_rule table. To the untrained eye, you would think it contains only strings, and integers.

image

A closer look at the alm_stock_rule table show the fields Model, Parent stockroom, and Stockroom are references to another table data (Reference fields). Reference fields store a sys_id for each referenced record in the database, but the sys_id is not shown. The reference field shows the display value.

(empty) or blank does not means the reference field is empty. It could be that the reference field display value is (empty) or blank. Always validate this by reviewing if it contains a sys_id value or not on the record itself e.g. Review the XML data of the record.

image

Coalesce using one-to-one field mapping on the transformation map

On reference fields, you can import data using the sys_id of the target 'referenced' data. However, most times, you would like to import data into "alm_stock_rule", using the display value instead to match the existing records.

image

For this example, we would use "Stockroom", "Model", "Parent stockroom", "restocking option" as key for updates.

On this transformation map, we would define the "Stockroom", "Model", "Parent stockroom", "restocking option" fields with coalesce "true"

image

imageHere is a list of pros and cons I've generated on using one-to-one field mapping on the transformation map for the coalesce fields:

PROS CONS
It is very configurable per field You have no control on the final searches performed to match the coalesce fields values to the target data. This means that more than one search could be triggered. Worst case scenario is that more than one per each coalesce fields may be triggered.
It is easy to understand If some of the coalesce fields source data is empty, it can trigger a query for (field=NULL) and the remaining coalesce fields which is unlikely to follow the indexes
No scripting is required It depends on the field mapping options available
You can map more than the display value of the reference field by using "referenced value field name" If some of the coalesce fields data holds very limited values (e.g. choice field) and the target table is very large, the query could be slow. e.g. you add impact as part of you coalesce fields, and your target table is incident. There is a case where query could be "select ... from incident where impact = 1" which could be a large query if you have a large incident table.
It is easier see which fields on the target table requires indexes (if the data is unique enough) It could cause duplicates if reference fields are used as coalesce (see below)

Duplicate records could appear if reference fields are used as coalesce.

Notes on coalesce on reference fields

In this example, the model we are importing is "APC 42U 3100 SP2 NetShelter." I have created two records on the model referenced table (it is not the target table itself but the 'Product Model' table which is referenced by 'model'). As this happens, the coalesce fields will match two, then the import will create a new unwanted record instead of updating it. This is a common problem as not all tables holds unique values.

image

On the import set, those records will show as State = Inserted when it should show ignored or updated

image

Using a reference field as coalesce can cause duplicates if the referenced data has duplicates

image

Coalesce on field map scripts

An alternative coalesce would be a "Script" mapping to the target "sys_id".

For this example, I will explain a technique of creating a simple coalesce field by field map script to the sys_id of the target. As sys_id have an index already, so the last search with the script result as coalesce is minimal. You would like to do this to have more flexibility on the final search generated to update your data.

When using a field map script, the previous example transformation map would look as follow:

image

Then set the field map script to match the sys_id on the target and make it the ONLY with coalesce = true.

image

On the field map script, add the script to find the correct target record:

image

Here is the script I used to find the target record:

answer = function(a) {           var list_to_compare=[["u_stockroom","stockroom.display_name"],                     ["u_parent_stockroom","parent_stockroom.display_name"],                     ["u_restocking_option","restocking_option"],                     ["u_model","model.display_name"]];           return findmatch(list_to_compare, source, map.target_table,false,true);  

}(source);  

View original source

https://www.servicenow.com/community/developer-blog/improving-performance-when-importing-data-with-more-than-one/ba-p/2290684