

IntegrationHub ETL in ServiceNow | Share the ServiceNow Wealth

Import · Dec 23, 2020 · video

So what we want to talk about is one of the new features released just after Knowledge this year that really improves the capability for customers to ingest data into the CMDB from third-party sources, or from manual data sources, in a much more trusted manner. Previously we just used data sources and a simple transform map, and 99% of customers out there would simply coalesce on one value, with no understanding whatsoever of the Identification and Reconciliation Engine (IRE) that we want to utilize before data gets into the CMDB. IntegrationHub ETL is that tool set, and it allows customers to leverage the IRE effectively without needing a thorough understanding of it.

I don't want to spend too much time on this slide deck — I've only got a few slides — and I hope most people on the call have a familiarity with the CMDB and the value it brings to the ServiceNow platform. But one thing we've certainly recognized over the years of working in this space is that it's very easy for customers to load in junk data. The CMDB then becomes an untrustworthy tool, so they don't leverage it to enhance their processes. Look at the incident form or the change form: they may have hidden the configuration item reference field, and they don't use it at all to receive any of the benefits from the CMDB. A lot of that comes down to not trusting the data, or using flat files that were loaded in one time and never updated.

So we want to push our customers to use ServiceNow's discovery tools. It helps us from an implementation standpoint, but it also generates the highest quality data loading into the CMDB, and it guarantees that data goes into the tables where ServiceNow expects it to reside. Some of the table naming conventions aren't particularly fantastic, so it can be confusing for customers to know where they should store, say, an application. Using Discovery and Service Mapping takes away that headache.

But we typically still have scenarios where customers do want to bring their own data to the table, whether that's connected into another third-party tool — the likes of an SCCM, a Jamf, an Altiris — or it could just be their own repository of information, maybe on a SharePoint, or in a maintained Excel sheet that we want to steer away from, but sometimes there's no other option than to leverage that data. Usually this comes in when the information isn't discoverable: maybe ownership information, location information, something relating to policy and compliance that we want to store inside the CMDB.

So what we're going to do is take a look at how we can leverage this new IntegrationHub ETL, which realistically acts as the front end for the Robust Transform Engine. That engine transforms the data from the schema of the data source into the appropriate format for ServiceNow, then pushes it through the Identification and Reconciliation Engine, which helps reduce the likelihood of duplicate data being put into the CMDB, or of data from other sources overwriting existing data in ways we don't want.

We're in our instance right now, and I'm just looking at configuration items created today — so we're starting with a blank slate. A customer has come to us with this spreadsheet and said, "Hey, we want this to form our CMDB." Prior to having IntegrationHub ETL and the Robust Transform Engine, it was a bit of a challenge to insert this data effectively into the right places, and also to generate relationships of the appropriate type between these different CIs. We can see right here we've got good information to work from, but it's likely going into different classes: we've got some application information, some business applications, some hardware information we want to insert as well, and some network card information that's going to be a slightly different class again. So this could be a little tricky — we'd have to script it out to make sure everything goes into the right places, involving multiple different transform maps to handle this data set.

We're going to take this information and put it into the CMDB using IntegrationHub ETL. I've created a data source, just like we would have done with a regular transform, and attached the hardware-and-application-service spreadsheet we're looking at here, completed the appropriate configuration for the data source, and we're in a position now to load in those records. If I come over to IntegrationHub ETL, I've got a few different CMDB applications available right here. Some of these have been loaded in from store apps called Service Graph Connectors — we'll talk about those shortly. I'm going to take a look at the one we've created here, "Hardware to application from Excel". The interface, going through the wizard, is quite straightforward to set up. One of the key things we need is the data source itself — the data source we have defined. We also have to set the discovery source on the CI record, and that discovery source is what feeds into the Identification and Reconciliation Engine to make a determination of whether it can update a record.

Let me come over and quickly touch on that Identification and Reconciliation Engine. The identification part looks for unique identifiers so it can match on existing records in the CMDB. From a hardware perspective, these are the ones we bring to the table out of the box: we look at the serial number and serial number type on the serial number table; if that isn't populated, we fall back to the serial number value on the hardware table; then we go down to the name of the hardware record.
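That lookup order can be sketched as a small fallback function. To be clear, this is only an illustration: the real IRE runs server-side inside the instance against identifier entry records, and the attribute names below simply mirror the out-of-the-box hardware identifiers just described.

```javascript
// Illustrative sketch of the out-of-the-box hardware identifier priority:
// 1. serial number, 2. name, 3. MAC address + name combined.
// The real IRE runs inside ServiceNow; this only mirrors the idea.
function identifyHardware(incoming, existingCis) {
  const rules = [
    (ci) => Boolean(incoming.serial_number) && ci.serial_number === incoming.serial_number,
    (ci) => Boolean(incoming.name) && ci.name === incoming.name,
    (ci) => Boolean(incoming.mac_address) && Boolean(incoming.name) &&
            ci.mac_address === incoming.mac_address && ci.name === incoming.name,
  ];
  for (const matches of rules) {
    const hit = existingCis.find(matches);
    if (hit) return hit;   // matched: IRE would update this record
  }
  return null;             // no match: IRE would insert a new CI
}
```

Each rule is only tried when the incoming payload actually carries the attributes it needs, which is why a feed with no serial number can still match on name.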
And then the final priority is a combination of the MAC address and name. If we wanted to, we could add additional identifier entries to get a little more granular, or to handle a feed of information that may not align to these identifiers, but it's typically good practice, especially on the hardware side, just to leave it as it is — serial number does a good job of acting as a unique identifier.

Then, leveraging that discovery source, this is where we get control over what the different sources can do with that information. I've set up a quick example here saying that discovery from ServiceNow is going to be my highest priority, and I want it to be able to update practically every single attribute on this class. There are 92 attributes on the Windows Server table, and ServiceNow Discovery is going to be the primary populator of that information. I'm also saying I've got an integration with the tool LANDESK, and LANDESK is our source of truth for the assigned-to and location information of our inventory. So what I'm saying here is that any data that comes through from LANDESK can come in and update those values, but LANDESK won't be able to overwrite, say, the operating system information or the memory information populated by ServiceNow. We can prioritize which source we trust the most, and which source gets to update which attributes.

One thing I'll point out: if you're on an instance older than Paris, you may notice something called a data precedence rule, which would typically sit at the bottom. In Paris they merged that into the reconciliation rule, so you now give the priority on the actual reconciliation rule. So if you go into your own instance, look at this, and wonder why it doesn't look the same — that's why.

Yep, exactly. And just while we're here, the data refresh rules: you can specify a time frame so that if a data source hasn't updated the record in, say, 14 days, you can fall back to a lower-prioritized data source, which can then update the record.
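The precedence and refresh behavior just described can be sketched as a small decision function. The sources, priority numbers, and attribute lists below are invented for illustration — in the instance this configuration lives in reconciliation rule records, not in code you write.

```javascript
// Sketch of reconciliation-rule precedence plus a data refresh rule.
// Sources, priorities, and attribute lists are invented for illustration.
// Lower priority number = more trusted source.
const reconciliationRules = [
  { source: 'ServiceNow', priority: 100, attributes: '*' },                        // may write anything
  { source: 'LANDESK',    priority: 200, attributes: ['assigned_to', 'location'] } // limited set
];
const REFRESH_DAYS = 14; // data refresh rule: owner goes stale after 14 days

function mayUpdate(source, attribute, lastWriteBy, lastWriteDaysAgo) {
  const rule = reconciliationRules.find(r => r.source === source);
  if (!rule) return false;                                           // unauthorized source
  if (rule.attributes !== '*' && !rule.attributes.includes(attribute)) return false;
  const current = reconciliationRules.find(r => r.source === lastWriteBy);
  if (!current) return true;
  // a less-trusted source only wins once the current owner has gone stale
  if (rule.priority > current.priority && lastWriteDaysAgo < REFRESH_DAYS) return false;
  return true;
}
```

So LANDESK can touch `location` only after ServiceNow's write has gone stale, and can never touch an attribute outside its list.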
So LANDESK could potentially come and update records that ServiceNow had originally populated, if ServiceNow hadn't touched them in 14 days or whatever time frame you set.

Let's come back over to our ETL, and we'll work through transforming our Excel spreadsheet into an appropriate format for the ServiceNow CMDB. Looking at the preview and prepare data step — this is one of the more important steps going on here. We take in that raw data set, the columns from (in this example) just our spreadsheet, and we want to convert this data into the appropriate format that fits in with the ServiceNow CMDB. I've got this column here for computer name, and it looks quite good in its current format, but having done many of these kinds of imports in the past, sometimes you get data in, say, a fully qualified domain name format, and that doesn't always fit with how you want it stored inside the CMDB.

Inside IntegrationHub ETL we have the Integration Commons for CMDB already associated, and this gives us a number of out-of-the-box transforms that allow us to quickly manipulate the values coming through from the data source. I can cleanse an IP address if it's not in the appropriate format, and the IP version; and there's one in here, like the example I just mentioned, for fully qualified domain names, where I can process the value and parse it out into the appropriate format for where I want to store it inside the CMDB. In addition, you can call script operations, so you can run different script includes across the data set. And if you want a bit more information on some of these conversions, the docs site does a pretty good job of running through what these individual operators do, with examples of the raw data input compared to what the result would be.
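An FQDN-style operation like the one just mentioned behaves roughly like this sketch. The operation names and exact behavior in the product may differ — this only shows the idea of parsing one source column into separate CMDB-ready values.

```javascript
// Rough sketch of an FQDN-style transform operation: split a fully
// qualified domain name into the host name and domain parts, which can
// then be mapped to separate CMDB fields. Illustration only.
function splitFqdn(value) {
  const v = (value || '').trim().toLowerCase();
  const dot = v.indexOf('.');
  if (dot === -1) return { host: v, domain: '' }; // already a short name
  return { host: v.slice(0, dot), domain: v.slice(dot + 1) };
}
```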
Can you build your own brand-new transforms, or are you limited to what ServiceNow has provided?

No — outside of the script, there's the script that's in there, plus some basic steps that aren't full operations, similar to concatenation or trimming and splitting. They cover most of the bases, but for things that may not work you'd probably just have to fall back to the script. One thing I'll point out with the script: you'll want to get used to using another IDE, because the way they did this is similar to the new UI, in that side panel you see there where it says "new transform" — that's the space you get to write your script. Mark, select the script part for a sec — you get a box that big, and it doesn't really expand very well. Now, they added this in Paris; this was not there in Orlando, I'll tell you that much right now. So there have been improvements.

Yeah — Rob and I were part of one of the very first implementations of IntegrationHub ETL at a customer site, and the very early version had a lot of bugs in it. There were some scary moments where you thought everything had disappeared. But it has gotten better, and you can also start doing some cloning activities now, to duplicate existing ETLs so you don't have to reinvent the wheel each and every time.

The other thing I'll say here, too: if you ever start playing with this and you write a script for the first time, do a demo of it and output everything that gets passed into it, because at first it's a little confusing what the batch actually is, what's inside the batch, what the output variables are, and all that different stuff. If you just put a loop in and output everything — do a JSON stringify on it — you'll get a much better idea of how everything actually looks, and it'll make a lot more sense. It took me a bit to figure that out, because it wasn't well documented what gets put in and how you're supposed to modify it, but once you get used to it, it's actually pretty simple to use.
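That debugging advice — loop over the batch and JSON-stringify everything before you try to modify it — looks roughly like this. The batch shape here (an array of row objects) is a plausible stand-in for illustration, not the documented Robust Transform Engine contract, so treat the structure as an assumption.

```javascript
// Hedged sketch of a script-style transform: dump the incoming batch so
// you can see its real shape, then concatenate make + model into one
// display field per row. The batch shape is assumed for illustration.
function transformBatch(batch) {
  // JSON.stringify makes the structure obvious — the debugging trick above.
  console.log(JSON.stringify(batch, null, 2));
  return batch.map(row => ({
    ...row,
    model_display: [row.make, row.model].filter(Boolean).join(' ').trim(),
  }));
}
```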
Yeah. So once we've gone through our steps — and this is an example of using concatenation between the make and the model — once you've gone through and cleansed your data and got it into the appropriate format for storing in the ServiceNow CMDB, we come through and start mapping it to the appropriate class. We have a couple of options: we can simply add the class itself, or we can do a conditional class. A conditional class allows us to define various logic and if-statements to basically say: if the data source value for operating system contains "windows", then I'm going to put it into the Windows Server table. I could string a number of these together if I wanted to look for Linux, for SUSE, for Solaris, whatever it may be, but for this example I'm just falling back to hardware if the value doesn't contain the word "windows".

Then we'll take a look at the mapping itself. The interface we see right here is very similar to what we see over in Flow Designer: we have the source data columns, and from here I can just drag and drop into the various attributes that we find on the Windows Server table inside the CMDB. At the top right, this is going to be our primary identifier feeding into the Identification and Reconciliation Engine, using the serial number. We also have some reference lookups, which might be a different class to the Windows Server class we're working from — this is where we pop in that serial number table, and where we're populating the NIC information — plus anything else I can take from my source and put into the appropriate place. If I needed to add more data, I can just choose any of the columns available from the table I'm working from. So I go through each of the different classes that I want to have populated from the source information, until I've got all of those mapped out.
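The conditional class logic described a moment ago — route to the Windows Server table when the OS column contains "windows", otherwise fall back to hardware — can be sketched as a routing function. The Linux and Solaris branches are the hypothetical extensions mentioned above; the table names are the standard CMDB class tables, but the matching itself is simplified.

```javascript
// Conditional class routing, as described in the demo. The demo only
// used the Windows condition with a hardware fallback; the Linux and
// Solaris branches are the hypothetical extensions mentioned.
function targetClass(row) {
  const os = (row.operating_system || '').toLowerCase();
  if (os.includes('windows')) return 'cmdb_ci_win_server';
  if (os.includes('linux') || os.includes('suse')) return 'cmdb_ci_linux_server';
  if (os.includes('solaris')) return 'cmdb_ci_solaris_server';
  return 'cmdb_ci_hardware'; // fallback class, as in the demo
}
```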
So Mark, before you go forward, one thing I'll call out here: you'll notice things seem to load up really quickly, and that's because when you initially load this up, it actually loads a lot of this stuff into memory. That's great for performance when switching back and forth. What it's not good for: if you do any sort of attribute mapping or anything like that in another window, you'll notice it may not show up, or certain mappings will look like they're missing. So just keep that in mind — it tends to cache a lot of things, and if something weird is ever happening with it, just reload the window and it tends to fix things.

Yeah, and in addition, if you cloned this ETL and came to these mappings and it looked like there was no information there, that again is just because it's not cached and viewable — it hasn't disappeared. But it will give you a brief heart attack, especially if you've got 20 different conditional classes with all the mappings laid out.

So, we've been able to populate the standalone CIs themselves with the data coming through from the data source. The next aspect is building up the relationships and the relationship logic between the different classes that we've put in, and this is really where you start getting into a differentiator: the ease of use of the ETL over the previous individual transforms. From here I can leverage the CSDM reference guide, so I know exactly which relationships I should be using, and I can specify that the parent application service has a "Depends on::Used by" relationship to the Windows server.

Then I'm in a position where I want to run this integration. It's going to take that import set, run it through the Robust Transform Engine, and give me the results. Since this is a small sample size, I can do a cross-reference with the Excel spreadsheet, so I have an understanding of what the incoming data should be, and I can see that I should have three application services associated with two business applications, plus all the other classes we've populated from our source data. It will also surface any errors here — if we had trouble inserting data, or if there was an identification or reconciliation error, it will all be presented in this interface, and I can see the activity log of what actually occurred. So I have full visibility into how my transformation actually applied.

Now, this went quite smoothly because I tested it out previously, but that's not always going to be the case. Some data might get inserted in the wrong place, or you made a mistake when you dragged over the transform mapping. One of the benefits of using the ETL is that when it reaches the "import is complete" step, I can either retain the data or perform a full rollback. I don't have to go searching through six or seven different CI classes deleting the data I erroneously inserted — I can just perform the rollback from this interface. For this example, though, I'm going to retain the data so we've got something to work with.

If I come back to my configuration items and refresh this list, we can see that we have successfully inserted all of the various CIs from that spreadsheet and put them into the appropriate spots inside the CMDB. I can confirm as well, looking at my dependency view, that we've got a CSDM-compliant model: our Payroll business application is associated with Payroll US Prod, and then the different Windows servers used to support that application service. If I approach it using the business application as our primary point, I can see my development application service mapped out with the servers used to deliver that dev instance. Any questions on how we were able to transform this raw data, with just simple columns, into a CSDM-compliant model inside the CMDB?

Hi, this is Michael — a real quick question. For companies that are very highly virtualized, so you have, like, vSphere, where you have Windows VMs and Linux VMs running on a host — is there anything different about that kind of setup compared to what you've shown?

No, there wouldn't be. I got the Windows server from the spreadsheet; if it was virtualized and you ran Discovery, and in addition ran the vCenter collection, you would also have a virtual machine instance CI associated with that Windows server — that's additional visibility we get from that connection. We could build out that same relationship and data point if it's represented in, say, the spreadsheet, but that's one thing I always try to stay away from: highly dynamic data such as the virtualization layer, because you might vMotion a VM over to a different ESX host, and that's not something you want to rely on fairly static data like a flat file for. So I'd look to run the vCenter collection directly against that instance, so you have reliable information brought in.

Thank you.

I really hope that someday soon they take the vCenter integration and put it through this, or put it into a pattern and get it out of a probe, because of all the issues we just talked about.

Yes. And just for everyone else — when we looked at the reconciliation rules right here: the job involved for vCenter doesn't go through the Identification and Reconciliation Engine, so you could potentially generate duplicate data. Say you've already inserted all of your ESX hosts from a flat file, but you didn't specify the MOR ID of each ESX host — then the job isn't likely to match on the existing records, and you'll get duplicate data in the CMDB.

One thing I wanted to show finally: I took that Excel ETL and basically just duplicated it and created one for a REST insert. The only difference is that I changed the data source — everything else remains exactly the same. This is just mimicking another very common data source type, where a third-party tool set is going to push data, or
you might build it out via REST — and I just want to showcase that in action as well. I've got this simple JSON payload, and the situation here is that we've got our Payroll US Prod application service and we've just deployed an additional server to support it. So I'm going to send that over now to the ServiceNow instance, come back while it runs through that transform — let me find my tab right here — and just refresh my dependency map. We can see that we've automatically ingested the information coming through from that third-party source: it's built out the relationships to our Payroll US Prod and added the new server to that model. So if we have a trusted, gold-standard data source that we want to use to populate our CIs and their relationships, we can either set up a scheduled job to retrieve that information, or rely on the source pushing the data to us, and it goes through our transformation engine to build out the right CIs and the relationships between them.

I have a question there. I noticed you just used REST to insert straight into the import set table — so what would your data source be in the event that you were doing that?

Yeah, so the data source that I created just specifies the import set table, and when that gets updated it calls the Robust Transform Engine definition we created — the "hardware to application service from REST" one — and it goes through. That data source was of type File, but it's just going to look for any data source with a matching table name. I used the type File here to generate the columns in the import set: originally I just attached the file and loaded records to build up the import set table, and from there it looks for anything that gets added to that import set and processes it.
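A push of that shape would typically go at ServiceNow's Import Set API (`POST /api/now/import/{stagingTableName}`). Here is a hedged sketch of building such a request — the instance URL, staging table name, and column names are placeholders invented for illustration, and authentication is omitted:

```javascript
// Sketch of pushing one row into an import set staging table via the
// Import Set API (POST /api/now/import/{stagingTable}). The instance,
// table, and column names below are placeholders, not the demo's values.
const instance = 'https://your-instance.service-now.com'; // placeholder
const stagingTable = 'u_hardware_app_import';             // placeholder

function buildImportRequest(row) {
  return {
    url: `${instance}/api/now/import/${stagingTable}`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Accept: 'application/json' },
    body: JSON.stringify(row),
  };
}

// Example row mirroring the demo: a new server supporting Payroll US Prod.
const req = buildImportRequest({
  u_computer_name: 'payroll-web-04',          // hypothetical columns
  u_serial_number: 'SN-0004',
  u_application_service: 'Payroll US Prod',
});
// A third-party tool would then send it, e.g. fetch(req.url, req), with auth added.
```

Once the row lands in the staging table, the ETL picks it up and runs it through the transform exactly as described above.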
This is not really technical, but are there any licensing costs associated with using it?

Yeah, so this is where — I mean, even ServiceNow doesn't have a great handle on it, and I often have to help them out when they're having that conversation. For what we did right there, there was no licensing cost associated. But ServiceNow have released what they call Service Graph Connectors, and there are a number of them for the common third-party sources you'll typically connect into — Microsoft Intune, Jamf, SolarWinds. These are all available on the Store, and they'll essentially create a CMDB application like I've got here for Microsoft Intune. Any data that comes through those is part of the ITOM Visibility licensing: if we end up populating CI records in a subscription-unit-based table, it will consume a subscription unit. Some people might be asking, what is a subscription unit table? If we take a look at ServiceNow's licensing: if I use that ETL for Intune or for SCCM and it generates a server record — or a record in any of the classes extended from server — then it is going to consume a subscription unit, and you require ITOM Visibility to do that. So it's always something to keep in mind.

That's just for the Service Graph Connectors, though — not if you build your own transforms?

Yeah, exactly. What I did via the REST insert, and what I did with the Excel, wouldn't consume subscription units. Essentially, if I come to my CIs updated today, we see the discovery source that I defined — some classes, like network adapter, don't get it populated — and this is what it looks at to determine whether something is licensed or not. These ones here, "REST" and "import set", are not going to be licensed, but the ones coming through from Intune or from SCCM will have a discovery source that is going to be licensed.

The other thing I'll point out here, too: we're showing the GUI front end to this for the CMDB side of the house. They do have plans — and you can actually kind of do it now, in weird ways — to use this for non-CMDB data as well. It's not going to be pretty like the ETL side of the house is right now; I think in the future they're going to add that in, but the Robust Transform Engine is actually something that can be used outside of the CMDB side of this. I've never done it, I've never touched it for that, because there are a ton of tables involved, and honestly, if you don't have the nice front end, there's no point really using it — it's not giving you any value. But in the future, I fully expect that's the direction they're going to end up going with this.

That was going to be my question as well: can we use this for non-CMDB? Like, if you're doing an integration with, say, incident management — instead of using a transform map and data source, can you use the robust transform? But you're saying the GUI is CMDB-specific, so you don't get all the nice front-end stuff to set it all up; you'd have to come in on the back end and do all the stuff manually. Is that what you're saying?

Pretty much, yeah. What Mark's showing you here: when you do anything in that front end, it translates into these tables — there may be some others as well, but these are the main tables you see in the tabs — and this is how it does all that transforming. Without the GUI, it's actually extra work compared to just doing an import set and writing your own script. The whole value of this is the GUI, plus the fact that everything goes through the IRE — because I don't know if anyone here has ever tried to bootstrap an import set and a transform map with the IRE: it's messy, it wasn't done very well, and you could tell it was done afterwards. Before and after scripts don't really work in it; you basically have to have everything mapped through a field mapping for that part to work if you bootstrap it on top of an import set. So the real value here is the low-code/no-code approach, mixed with the fact that everything goes through the IRE. I fully expect — probably not Quebec, but maybe the Rome time frame — they will somehow unstrap this from the IRE as well and give you one for incident or any other table.

Yeah, gotcha. Good to know, thank you.

The other thing I will mention, too: you may end up getting some GUI bugs like I did. This table breakdown being shown here is really important, because you can actually come in here, see duplicates, and remove them. There was a bug in Orlando — they've probably fixed it by now — where if you didn't specifically click the X button when doing a mapping, and instead just selected a different field to map into it, it wouldn't actually remove the old one, and you would get two of them in here. The GUI would show the mapping was there, but when you actually looked at the results it wouldn't pull from that attribute, simply because the RTE field mappings had multiple entries for the specific field being mapped into, and it would randomly pick one. So this part is really important: being able to see what's happening on the back end when those minor day-one bugs happen. I was able to get around most of them, and I'm sure by Paris they've fixed most of those. It's no different than when Flow Designer first came out and had its GUI bugs.

Yeah, and they've released a number of versions since the one we were working on — I think we were on 1.1 — so they've fixed some of the issues we certainly experienced, and made it a little easier to work with and a bit more efficient as well.

I noticed you had, like, two base classes and then a conditional class. Does that mean it's always going to create two different — or a total of three — CIs, one for each base class and one for the conditional class?

It really just depends on what's in the source data; if there's nothing to map, it'll just skip over it.

Okay, but if it's a basic class, does that mean every single row goes to that CI class, or —

Let me come over and take a look. So this is just a basic class — there's no conditional statement. If I select it, I still have everything available on the raw data source, and I just select what I want to move over that's appropriate for this class. This one was quite straightforward: the business application name is going to be the native key, and here I'm just storing some information regarding the application owner — just selecting what I've got available to me from the data set that's going to be applied to this business application.

I guess I was thinking more of: if they were all net new CIs, as opposed to updating existing CIs — in that case, would it insert a CI in both basic classes, or just one of them?

In this example it would insert in both of them, because that's where we're specifying the business application name to go. Now, in our example right here, because I've got an application service and a business application — and sometimes customers don't know the difference and only send one value — you might want to build both of them out of just that one value. So I could leverage the same field, like I did with the app owner, and place it into multiple places in the CMDB.

I see — got it, thank you.

Now, I will say this: it'll take you a bit to get your mind wrapped around it. When we did this for a customer, we had three layers deep that you could go — so you could have one CI, a relationship to another CI, and a relationship to another CI — but when you actually look at the import set table, it's all one flat row. We had, like, an L1 app owner, L2 app owner, and L3 app owner, so each layer had its own attributes, but when you insert it into the table, your L1, L2, and L3 — your whole layer structure — is all on the same line. Then, when you bring that back and map it in, you end up getting that nice little structure.
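That flat-row-to-layered-CIs idea — one import row carrying columns for several layers that become a small relationship chain — can be sketched like this. The column names, class tables, and three-layer shape are invented for illustration; the real mapping is configured in the ETL, not scripted.

```javascript
// One flat import row carries up to three layers (e.g. server ->
// database instance -> database). Blank layers are simply skipped,
// as described above. Column and class names are invented.
function rowToChain(row) {
  const layers = [
    { cls: 'cmdb_ci_server',      name: row.l1_server },
    { cls: 'cmdb_ci_db_instance', name: row.l2_instance },
    { cls: 'cmdb_ci_database',    name: row.l3_database },
  ].filter(l => Boolean(l.name)); // skip anything not set in the row
  // link each CI to the one above it in the chain
  return layers.map((l, i) => ({
    ...l,
    parent: i > 0 ? layers[i - 1].name : null,
  }));
}
```

With the third column blank, the same mapping still produces a valid two-CI chain, which mirrors the "if L3 was blank, it would still work" point below.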
Now, if L3 was blank — for example, we had it so they could send a server, a database instance, and then a database itself — well, if they didn't have the database, it would still work: with the way you end up doing the mappings in here, you would just end up with the server and then the database instance. It's all based on what's actually set, and if something's not in there, it just loads in whatever it possibly can, based on how it matches the IRE and everything. So it takes a little bit to get used to, but once you do one of them, you're like, wow, this is actually pretty easy once you figure out all those little minor things.

I could see that. It seems more intuitive — with transform maps you would have to create a separate map for every single table you want to go to.

Exactly, exactly. That was one of the biggest issues with transform maps, right? It made this very, very difficult to do, or you had to write a massive before or after script to actually insert the data on the loading end of the record. This gives you complete visibility into what's being put in. And I didn't use the final step much when I did this — where you can do your test and see all that stuff. Now that I've seen it in here, I'm like, wow, I really should have used that more the first time, because it gives you all the information you could possibly want to see.

Nice. Similar to what Christasio mentioned earlier, there are Service Graph Connectors that you can install, and again, if these populate data in a table that's part of the subscription units, they will consume subscription units, and you will require ITOM Visibility licensing to enable them. So that's one thing to keep in mind. But if you've done, say, the SCCM connector before, or Jamf, they are quite straightforward, so you can build your own without having to leverage ServiceNow's creation.
And if you take a look at the SCCM one, it is exactly the same as the plugin that has been around for eight or nine years — there's nothing new in there. You can essentially reverse-engineer that JDBC query and just put it through IntegrationHub ETL rather than the transform map.

The one takeaway I want to make sure everybody's aware of, though: we're still limited by the quality of data coming through from those sources. Take, for example, Microsoft SCOM as one of those sources: it doesn't really know much about the CI, so it can't effectively create a rich record. Maybe we'll get the main information — the name — but we're not going to get anything regarding software, and only limited hardware information. SCCM is a good deal better, and we can populate OS information and some hardware information, but again, the key thing missing from those data sources is the relationships that exist between each CI and the other CIs inside the CMDB — so we can't do effective things like root cause analysis or impact analysis when these are all standalone. You always still want to advocate for using ServiceNow's discovery capabilities: you get that robust, complete CI form, and also all of the relationships that exist between CIs inside the CMDB. We can also do on-demand updates to a configuration item — kick off a discovery job and it'll refresh all of that information inside the CMDB — and we can pair it with Discovery's cousin, Service Mapping, where we group all of those CIs together into what they deliver for the business, so we can start determining the risk and priority of certain tasks.

If anybody wants some additional information, there is a course on Now Learning that goes through the creation of an ETL — they provide the spreadsheet, and you go through some of those Integration Commons transformation elements — and I've included a link to the Store app as well, so you can see the release notes for the different versions that come out.

Thanks for taking time out and joining us today. Yeah, we'll talk to you guys later.

View original source

https://www.youtube.com/watch?v=2tVLhE_OwOE