Beers With Cloud Engineers - Episode 23 - Cloud Provider Metrics With ACC-M
When we kick things off... you want to get us started, Mike? Yep, absolutely. Hey everybody, welcome to session 23 of Beers with Engineers. Will and I are super excited to invite you yet again to join us to talk about the intersection of ServiceNow and cloud-native technologies. Will, we'll start off with the usual safe harbor notice: some of the things we talk about today may not come to fruition exactly the way we discuss them, and they may not be fully released or generally available yet, so don't go make stock-buying decisions based on this, and try to keep it on the DL... although we will be publishing this to YouTube, so it's not exactly a secret.

We always try to keep this informal. As people join, we make sure they have the ability to talk, so if you have questions, thoughts, or concerns, please feel free to come off mute and chime in as we go; the more interactive, the better it is for us.

So why are we here? The whole problem Will and I try to solve with this is that ServiceNow customers aren't necessarily talking much about where they are in their cloud-native journey, they aren't talking to each other, and there are a lot of folks who don't even realize ServiceNow has capabilities in the cloud-native space. We felt it was important to build a community where people can talk with each other about where they're at, and also for us to get the word out that ServiceNow has a lot of capabilities in this space, so people can take advantage of them to move down their cloud-native journey.

Let's talk about who we are. I actually used to be at ServiceNow, but I went to the dark side as a customer: I'm now the enterprise applications manager for a company called DRW, out of Chicago. I love any and all things technology and I've been doing it far longer than I care to admit. I'm all about using technology intelligently and efficiently to solve business problems. I'm a big nerd in all things nerdy: I love board games, I love video games, I love jiu-jitsu, and I love spending time with my family. I am again not drinking beer today, because I'm prepping for a jiu-jitsu tournament next month and trying to get down into the right weight class, so water it is for me again. Over to you, Will: what beer are you drinking?

Ah yes. Thankfully I am not training for jiu-jitsu and trying to make weight, so I will be enjoying one of my faves, a Juice Bomb Northeastern IPA... oh, the green screen is blocking it... Juice Bomb Northeastern IPA from Sloop Brewing. A little bit about me: I'm an advisory ITOM architect at ServiceNow, currently working on a new team, formed in January, called the ITOM Center of Excellence. For those of you who may be unaware, starting at the beginning of this year we've taken the ITOM (IT Operations Management) functionality, which used to be handled off to the side by a dedicated team of specialists from a sales and sales-engineering perspective, and graduated it into our core suite of products. So now, when you're talking to somebody from ServiceNow about outcomes having to do with our ITSM solution, for example, that same core team is now also going to be talking about things in the IT operations management space.
The convergence of those two things is something we call Service Operations, or ServiceOps, so if you hear your ServiceNow team talking about ServiceOps, that's really the intersection between IT service management and IT operations management. I've been working in technology for at least three decades now, doing different things around IT operations and server engineering, and of late my keen focus and interest has been automation: finding recurring tasks that can be fraught with human error when done manually and creating nice automated solutions that take them off the plate of our very important technical contributors. In my spare time I like to hang out with my family, play a little pickup hockey, and enjoy some video games.

Okay, now into our tech deep dive for this month. Just to call out: this is the two-year anniversary. We did our first Beers with Engineers on St. Patrick's Day of 2022, so we've now passed two years of doing this, and Mike and I like to say this is our favorite workday of the month, when we get to get together with some folks, talk about some of this cool technical stuff, and hear from you all about what your challenges are, what you've run into, and what you've discovered, enriching everybody as one of the results.

This month we're going to talk about cloud provider metrics and the fact that you can pull them into your ServiceNow instance via our Agent Client Collector. This capability comes with Agent Client Collector - Monitoring, or ACC-M. So if you're a ServiceNow customer and you have, for example, ITOM Visibility, ITOM Enterprise, or ITOM Operator Enterprise or Pro, you're going to have this capability to pull metrics out of your various cloud providers. The way it's implemented is with a group of plugins, with policies out of the box as well as the ability to customize the existing policies, create your own, and so on. They are available in slightly different capacities for AWS, Azure, and GCP, and I'm going to dig into the detail behind that as we progress through our deep dive today.

One underlying theme I saw as I set each of these up and used them to pull metrics from my cloud accounts: they do require credentials. If any of you are discovering AWS today, you are hopefully adhering to the best practice of leveraging something like the assume-role capability. This functionality right now does not leverage that; it wants an old-school credential. In Azure that would be a service principal, in GCP a bearer token, and in AWS an API key and secret, and these plugins all require that you have that today. The one thing I would say is that, because all it's doing is pulling metrics, it's a fairly well-controlled, minimally permissioned service account (or service principal, in Azure) that you would need to create, so from a risk perspective it's pretty minimal. But it is something to be aware of: today, this expects a credential for the API calls it's going to make to be available within your ServiceNow instance.

So what specific metrics can we pull? Let's run down how each of the three cloud providers is monitored with the content we have today. I've got a couple of slides here that break down the policies present in an out-of-the-box installation of ACC-M, and the underlying scripts or executables they use, running on your agent host or hosts, to get the data pulled into your ServiceNow Metric Intelligence metric base.
It's probably worth calling out that when we say it's pulling metrics "into the platform," it's not actually sticking all these metrics into a table in your instance. When you have ITOM Visibility and Metric Intelligence within your platform, you actually have what's called a MetricBase instance running off to the side. It's similar, if you're familiar with Health Log Analytics, to the way that with Health Log Analytics you have some additional dedicated compute spun up next to your ServiceNow instance proper, which is used to process that very specialized log data. We do the same thing with metrics by virtue of this MetricBase, which is essentially a purpose-built database that focuses on time-series data.

All the scripts and executables used to pull the AWS metrics are deployed via an ACC-M plugin called Monitoring Plugin AWS. That is essentially a payload: when you activate these policies within your ServiceNow instance, it automatically distributes the payload in the form of a compressed archive file so that all the requisite programs are present on your agent host. Here's the second page, which runs through the scripts you get out of the box and highlights, down here, the plugin that gets deployed.

A couple of observations I made myself when I turned this on. It does use some resources on your agent host. When I first turned it on, I had the default, rather conservative CPU-protection settings in place on my agent host, and within a couple of iterations of pulling metrics from my AWS account (I had basically everything turned on, because I wanted it to do something) it ran my CPU up to 15% or so. That invoked the automatic agent protection, and my agent immediately stopped collecting, so I did have to adjust that threshold accordingly. If you're looking to do this in production, my recommendation would be to have a dedicated agent or two, set up for redundancy, to pull these metrics from AWS if you've got a reasonably sized AWS footprint.
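Conceptually, each of these checks behaves like a small poller: walk a list of cloud resources, fetch datapoints for each metric, and emit flat records for ingestion. Here's a hypothetical, stdlib-only Ruby sketch of that loop shape; the function and field names are invented for illustration, and the real ACC-M scripts call the cloud APIs (e.g., CloudWatch) and use ACC-M's own output format:

```ruby
require "time"

# Hypothetical sketch of the poll loop a metric check performs: given a
# list of cloud resource IDs and a fetcher callable (the real scripts
# call the AWS SDK with an API key/secret), emit one flat record per
# datapoint, in the spirit of what MetricBase-style ingestion consumes.
def collect_metrics(resources, metric_names, fetcher)
  records = []
  resources.each do |resource_id|
    metric_names.each do |metric|
      # fetcher returns an array of { value:, timestamp: } datapoints.
      fetcher.call(resource_id, metric).each do |dp|
        records << {
          "ci"        => resource_id,
          "metric"    => metric,
          "value"     => dp[:value],
          "timestamp" => dp[:timestamp].utc.iso8601
        }
      end
    end
  end
  records
end

# Usage with a stubbed fetcher (no cloud credentials needed):
fake_fetch = ->(_id, _metric) { [{ value: 42.0, timestamp: Time.at(0) }] }
puts collect_metrics(["i-0abc"], ["CPUUtilization"], fake_fetch).inspect
```

The stub stands in for the real API call, which is also why the transcript's point about agent CPU protection matters: one policy iteration multiplies resources by metrics by datapoints.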
The AWS flavor of this has nothing built in to make customization as easy as it is with the Azure capability I'm going to show you in a couple of minutes. With Azure we provide some built-in hooks where you can essentially customize which metrics for which kinds of CIs you want to pull; with AWS we don't quite have that level of ease, but all of the scripts used are visible. They're Ruby scripts, so you can go in and modify them, which makes it kind of a medium level of customization: they're not compiled executables, so you can get in, see what they're doing, make a copy, and have it do something different. If there's some additional type of cloud resource you want to monitor, that's feasible starting from what you've got out of the box.

Now that we've done the overview, let me switch into demo mode and show you what that looks like within the instance. In my instance, everything that ACC-M does is controlled by these policies, and out of the box you get a couple of AWS metric policies. This one is kind of the main one, and it has multiple checks that run inside it. This is the policy I turned on for my account, and then I designated a specific host running our agent as what's called a proxy agent. We have two fundamental types of content you can run using Agent Client Collector. One runs directly against the host running our agent: our agent runs on Windows, Linux, and macOS, so if you want to monitor endpoints running those operating systems, you've got a large number of checks that execute via the agent directly on that host and give you information about that host. In contrast, when we're talking about things like cloud providers, there's no capability to run our own agent "within AWS," so the way we accomplish monitoring is via what's called a proxy agent. That's essentially just a server running our agent which we designate to run the commands (the AWS commands, in this case) on our behalf, and I've designated that by pointing my policy at my specific agent right here.

What that does, if I look at my agent, which is a Linux server, is start to push the requisite content into the file system on your proxy agent so it can execute the checks that are part of that policy. If I go to the cache directory created by the agent, we can see a subdirectory named after the plugin I called out before as containing all the AWS monitoring content. That's the format we use: we push these things down from the instance by basically sticking them in the work queue for the agent, and when the agent phones home, the instance says, "Hey, I've got some new monitoring plugins for you," and the agent says okay, pulls them down, and stores them in this cache directory. Underneath that, it has a fairly standard-looking directory structure: bin for binaries; a gems subdirectory, because Ruby makes use of a module subsystem that it refers to as gems, to store those Ruby modules; and a lib directory used for shared libraries. When I go into the bin directory I can see the different scripts, which tie back to what's in the check instances for this policy. If I scroll down within the policy, it shows the check instances, which are essentially the individual commands the policy will invoke on your proxy agent.
If I look, for example, at the AWS metrics check, drilling into it shows me the specific command being run, along with some usage information. The nice thing about this being a Ruby script is that it's completely transparent: if you want to see exactly what it's doing, I'm in this directory on my agent and I can just look at the script directly. That's what I meant when I said we don't give you an on-instance way to customize this, but you can start with this content and make your own version that perhaps pulls different metrics than what's in there out of the box. In the case of the AWS content, these scripts use the standard Ruby AWS API library to make their calls, and the specific metrics they pull are actually listed right here. It's a frequent question that comes up: a customer has this capability and would like to know, "Hey, what metrics am I going to get?" In the case of the AWS content, this is how we answer that question: by looking at this content. If you don't have the policy activated, you can always get this content by clicking into the plugin itself. It's stored as an attachment in the ACC plugins table; every plugin that gets distributed to an agent host has a record there. Here's the AWS one, so if I wanted to peruse what scripts it pushes down to my agent, I can just grab the attachment, pull it down to my local machine, unzip it, and browse the scripts to see what they're up to.

And here's an example of a couple of CPU metric records coming in for some EC2 instances that I'm gathering via this policy. This is what's called Insights Explorer; it lets you ad hoc drill into CIs and their data based on the available metrics for a given CI.

Hey Will? Yes sir. Just want to jump in: Dave's got his hand raised. Dave, I'm guessing you want to jump in with a question?
Oh, sorry, I didn't mean to do that, my fault; I was trying to click on something else. No worries. But good day, sir! Good day. It's good to see you guys again. You too. Likewise.

So yeah, Insights Explorer just gives you an easy way to browse the metrics coming in for a given CI type. You can generate charts, and with the plus sign you can create a saved view. When I reload the page it always comes back to an essentially blank slate, an empty canvas, and then there's a pull-down list that remembers the views I've saved off to the side and lets me recall them as I need to. So that's the overview of how the AWS metric capability is put in place.

I did put out an inquiry about making it more like the Azure one that I'm going to show you, where there's more ability to cherry-pick the metrics you want via an interface that's managed entirely within the instance, as opposed to having to create a custom plugin. I'm told by the product team that it's on the backlog and they're actively assessing it; there's other work in place around OpenTelemetry, which overlaps a bit, so they're trying to prioritize across that. I would say: if the concept of collecting AWS metrics via ACC-M, with the additional ability to more easily customize which metrics and which CIs you're interested in, is of interest, provide feedback to your account team. We have our Idea Portal, which is ostensibly the primary front door for asking for enhancements or upvoting existing recommendations. I know I'd heard skepticism in the past about whether those ideas go anywhere, and I had a little of that myself, but our upcoming Store release is actually rolling out a couple of capabilities that were first asked for through the Idea Portal. There's some work around adding service-mapping affinity to MID Servers, in other words, having specific MID Servers aligned to discover certain application services, and that originated on the Idea Portal and is going to be reality, I think with the June Store release, from what I saw. I kind of wasn't 100% sure how the intake of ideas occurred, or whether it occurred at all, so the fact that we've got that feature rolling out as a direct result of enough people expressing interest on the Idea Portal reinforced my impetus to recommend it. It's very easy to enter something, so it's kind of like: why not provide your feedback there?

Yeah, it's happened on multiple occasions that I've seen: stuff comes in as an idea and eventually rolls out in a store app update, so it's definitely worth utilizing. I have a couple of clarifying things. One: you had said that this comes with ITOM Visibility? Ah, good catch, it would be the Health standalone SKU or ITOM Operator Pro. Okay. The other point, just to clarify for me: this is really, effectively, an EC2 instance running inside of your AWS that's your proxy host, and it's basically hitting the API in order to pull these metrics, aggregate them, and funnel them back to the ServiceNow instance. Is that an accurate statement? Okay.

And one thing I haven't wanted to mess with, but I also have questions about, is the cross-account capability. If I have a whack-load of accounts in my org, do I need one of these per account, or can I do this at the cross-org level? Do we use the existing AWS configs we have set up across the orgs? You've got a couple of options there. One of the ways... so, I just did this kind of simplistically: in my policy I just assigned a single proxy, because that was all I had.
I've just got my one test agent, and I only have one AWS account that I can work in; I don't have ownership of an entire AWS org to do that kind of cross-account testing with. So, let me see if I can get back to the policy record here... something is eating my CPU, I guess, because it's refreshing really slowly. I've got the single-proxy-agent option checked, with just my single agent. But you can either use a cluster, or you can select multiple proxy agents via script. The CI type this policy pulls against is just the AWS datacenter, so you can do something like run it against AWS service accounts and then, based on the service account being selected, select a proxy that aligns to that service account. The way I believe this will work, because it's using the API key and secret: you'd have an API key for each sub-account, you'd list those all under a credential alias, and the check is just going to tick down each credential in the alias and execute against it. Got it, so you don't necessarily have to point it at a given sub-account, because that's kind of integral to the API key itself. Yeah. Again, I derived that by looking at the way the code is put together and how it runs; I don't have the ability to run that myself. But that's also what the docs said: if you put multiple credentials in the alias, it'll just run through each one and pull the metrics that that API key and secret are entitled to pull. That's nice. Yeah, I think they just walked the line of not wanting to require that you have a specific setup, like Discovery with assume roles and all that, in place just to pull this.
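Will's description of the multi-account behavior, the check just ticks down each credential in the alias and pulls whatever that key is entitled to, can be sketched as a simple loop. This is a hypothetical stdlib-only Ruby illustration (the credential shape and function names are invented; the real plumbing lives in the ACC-M scripts and the instance's credential alias):

```ruby
# Hypothetical sketch: iterate every credential stored under an alias
# and run the same metric pull with each one. Per the docs Will cites,
# each API key/secret only returns what its own (sub)account is
# entitled to, so the union of results covers all accounts without
# any assume-role setup.
def pull_for_alias(credentials, &pull)
  credentials.flat_map do |cred|
    begin
      pull.call(cred)
    rescue StandardError => e
      # One bad key shouldn't stop collection for the other accounts.
      warn "metric pull failed for #{cred[:account]}: #{e.message}"
      []
    end
  end
end

# Usage with a stubbed pull (no real AWS calls):
creds = [{ account: "dev" }, { account: "prod" }]
puts pull_for_alias(creds) { |c| ["cpu@#{c[:account]}"] }.inspect
```

The rescue-and-continue choice is an assumption on my part; whether the real check skips or aborts on a failing credential isn't stated in the session.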
So they just kept it off to the side: a simple API key and secret. And honestly, there's no requirement from our perspective that this proxy agent run in AWS. A lot of organizations are going to require that by virtue of their network traversal rules, including ServiceNow: ServiceNow has network traversal rules that block API calls to AWS from outside AWS, so I had to put my MID Server in AWS just so it could make the API calls. But we don't explicitly require that; if you had an on-prem agent, as long as it had the ability to run AWS CLI commands, you could use it to pull the metrics if you wanted to. Makes sense. Any other questions before we shift gears to the next one?

I had one. I missed the beginning, but is this going to be supported in GCP also? There is a GCP capability, and that's a nice segue, because that's the next flavor I'm going to go into. Okay, awesome.

All right, so GCP. The way it's put together for GCP, for whatever reason, is that right now it's just an option baked into a kind of one-size-fits-all, one-to-many executable called common checks, which is available as part of the Monitoring Plugin Modules plugin, a bit of a general-purpose plugin that deploys a lot of different supporting scripts and executables. This is the least transparent of the three cloud provider flavors, in that it's a compiled executable: you can't just look at the contents to see what metrics it's pulling. It's focused right now on GCE, Google Compute Engine: the VM information. Another slight difference: this one has a dependency on Discovery. You have to discover your Google Cloud VMs in order for it to construct the query it hits GCP with to pull the metrics. That's something it shares with the Azure approach, which does the same thing: it basically iterates through the list of resources of a certain type that you've discovered.
It iterates the resources that are in your CMDB and then specifically queries against each of those CIs. So you've got that prerequisite: you need to populate your CMDB with whatever it is you want metrics for, and then you get those metrics based on what's in your CMDB. And does it split those? So for the VMs under one org, it understands it can query those against that org, but there's a different org for another set of the GCP VMs? Okay, awesome.

So the policy that turns this on is called GCP GCE Metrics. I guess the plus side to slow-loading browser pages is that it gives me an opportunity to have a sip of my beer. Again we've got the proxy agent settings. And again, this isn't a base requirement of the product, but within our specific corporate infrastructure I need to query GCP from within GCP; I can't just spin up an on-prem standalone agent host and query our GCP stuff, because we've got some security around that. Then, if I click into the check instance to see what command it's actually running under the covers, it's running this common checks command, which is kind of a wrapper that has different modules compiled inside it. One of those is metric gcp, and this gets run against each Google Compute Engine VM found in my CMDB. That's done by virtue of this monitored-CI option in... how did I get back to AWS? Whoops. Part of rolling this capability out included providing the ability to specify the monitored CIs as the result of a script. Looking at the GCE policy: instead of what we typically see with a lot of the out-of-the-box content, a simple filter where we specify a CI class and then apply one or more filter settings to pick the list of CIs to be monitored, in this case we're actually using JavaScript to query the Google datacenter record and return the corresponding Google projects that need to get queried via the metric API call.

Then if I go to my GCP proxy agent... our mandatory idle timeout on SSH sessions is not a friend to those of us who have to do demonstrations. Here's my... no, that's the MID Server. I've got my Google Cloud MID Server, and again, there's no requirement at the platform level that it be collocated in the cloud you want to pull metrics from; it's really just a function of what the network traversal rules within your environment mandate. If I go to the cache folder the agent creates, in my Monitoring Plugin Modules directory I've got just the common checks executable. So there's significantly less flexibility with the GCE content: if you wanted to pull other metrics, you'd really have to create your own script or executable more or less from scratch and generate the query. And then, to make sure the data was in a format you could ingest, you'd want to look at this Cloud Metrics check type within the instance, which is what controls how that data is parsed. Based on the JavaScript that underpins this check type (there's going to be JavaScript under the covers parsing that data), in theory you could back into getting other metrics from Google that match that format, and then ingest those via a custom check. But on the spectrum, that's pretty close to the highest level of difficulty for getting additional custom Google metrics out of GCP. So I would say, if that's something of interest: again, future direction is determined so much by customer voices. I can't emphasize enough that if something would be of value, get that funneled back through your account team.
Put something in the Idea Portal, or upvote it, because that's really what drives the future direction and where the resources go.

So that brings us to what we provide with Azure, and the Azure metrics are the most recently updated capability. I break it down in the slides: we've got kind of a mix. Some pieces are similar to the AWS approach, self-contained Ruby scripts that pull specific metrics, and that covers things like various Azure DB instances and Azure Kubernetes clusters. Then we've got a more generic check program called Azure Metrics Collector, and this is the most advanced metrics collector, because we've broken out the actual metrics and the CIs to be collected and parameterized them into JSON payloads that the collector just reads to determine what metrics it collects. That's all delivered via a separate monitoring plugin called Azure Metric Collector. When I get back into the instance, I'll run through what we provide out of the box, which is a few policies that make use of this and that you can use as a starting point.

Where this stands above the other two cloud provider metric capabilities is that we include in our documentation a process for adding new metric policies, and it doesn't require a custom plugin. You just create a new JSON file, upload it as an attachment, and create a new policy, and then you'll be pulling in new metrics that previously weren't included out of the box. This does require that you put your agent framework into developer mode; that's in the documentation, and I also call it out here. There's a property called sn_agent.dev_mode that you set to true, and that allows you to add those externally produced JSON files that direct the metric collector. Similar to the GCP metric collector, it does have a prerequisite that you discover your Azure estate and populate the CMDB with your Azure resources. I think I already touched on the broad strokes of the process: you create a JSON file (there are examples in the documentation) where you essentially say, "here's the resource type I'm interested in, and here are the metrics I want to see for this resource type," and then you create a new policy corresponding to that set of CIs and metrics you're interested in.

As I went through this process, I made a couple of observations I wanted to share. It has a pretty good debugging capability: you can adjust the check definition to add a parameter called -no_log, set it to false, and it puts copious logging into the agent log while it's pulling these metrics. That was actually really handy for me, because the test I was doing was running it against Azure VMs, and the initial filter I used to select the target VMs was a little too broad: when I tried to run it, it was actually pulling in VMs as well as scale sets. One of the boundary conditions for pulling these metrics, listed down here (this bullet right here... oops, I always want to click on the slides and then that advances them), is that all the resources in a given batch have to be the same resource type. So I was getting no metrics for my VMs, because the batch contained both VMs and scale sets matching my imperfect filter. I discovered that by using the debugging capability, I fixed my filter, and then everything was good.
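The custom-policy JSON file described a moment ago, resource type on the top line, metric names listed beneath, might be generated like this. To be clear, the field names below are purely illustrative assumptions (the documented schema is in the ServiceNow ACC-M docs); only the overall shape, one resource type plus a list of metrics, comes from the session:

```ruby
require "json"

# Illustrative only: builds a payload in the shape the walkthrough
# describes for a custom Azure metric policy. "resourceType" and
# "metrics" are hypothetical key names, not the documented schema.
def azure_metric_payload(resource_type, metric_names)
  JSON.pretty_generate({
    "resourceType" => resource_type,  # e.g. an Azure resource type string
    "metrics"      => metric_names    # the Azure Monitor metric names to pull
  })
end

puts azure_metric_payload(
  "Microsoft.Compute/virtualMachines",
  ["Percentage CPU", "Available Memory Bytes"]
)
```

Per the process above, a file like this would be uploaded as an attachment and referenced from a new policy, with sn_agent.dev_mode set to true.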
The other limitations are that all the resources in a batch have to be in the same subscription, and they all have to be in the same region. So when the script that runs as part of the policy selects the list of CIs, it makes sure it's bunching them together so they meet those requirements.

Here's what that looks like on platform. The nice thing about this one is that you don't have to go onto the agent or download the plugin to see what metrics are being pulled; it's all handled on platform. Azure VM Metrics is the policy I turned on. Let me get this a little more readable... it's still refreshing... so basically the filter I set up just goes against my VM instance table, looking for an object ID that matches the Azure syntax denoting an Azure VM. And I'm just trying to get visibility to the script it uses to pull down the VMs... oh, that might be on the check instance tab, actually. What the documentation has you do, if you're doing a custom policy, is create a JSON file listing the resource type and the metrics to pull. Since this is an out-of-the-box policy, we've got that JSON file already available, and it's passed as a parameter right here. Then, under Agent Client Collector, there's a big file table, and what it actually does under the covers is this: the check instance that runs on the agent host pulls this file from the platform, reads it, uses it to generate a list of monitored resources, and then stores that in the form of this second file underneath it. That file underneath is dynamic and gets refreshed every time the check runs, if need be, as new CIs are added to the CMDB. The syntax for this is documented.
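The batching rules above, every resource in a batch must share resource type, subscription, and region, amount to a group-by. A minimal Ruby sketch (the CI hash keys are hypothetical; the real policy script works against CMDB records):

```ruby
# Sketch of the batching constraint: bucket the CIs selected from the
# CMDB so each batch is homogeneous in resource type, subscription,
# and region before the collector queries Azure for it.
def batch_resources(cis)
  cis.group_by { |ci| [ci[:type], ci[:subscription], ci[:region]] }
     .values
end

# Usage: a too-broad filter that matched VMs *and* scale sets yields
# two separate batches rather than one mixed (and rejected) batch.
cis = [
  { id: "vm1", type: "virtualMachines",         subscription: "sub-a", region: "eastus" },
  { id: "vm2", type: "virtualMachines",         subscription: "sub-a", region: "eastus" },
  { id: "ss1", type: "virtualMachineScaleSets", subscription: "sub-a", region: "eastus" }
]
puts batch_resources(cis).map { |b| b.map { |c| c[:id] } }.inspect
```

This is also the mechanics behind the filter anecdote earlier: a batch mixing VMs and scale sets violates the same-type rule, which is why the too-broad filter produced no VM metrics until it was tightened.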
also just download a copy as a starting point and take a look at what it looks like. Okay, so it's a little bit of an eye chart, but essentially you've got a fairly basic JSON payload: the top line tells it what resource type to target, and then there's a list underneath that just lists out all of the metrics you want to pull. Looks like that's based on the Azure definition for the Azure Monitor metrics platform. Yep. Cool. And when we send out the slides, I've included a link to the underlying Azure API call, which provides some additional information on the rules of the road for what you can query from that Azure batch metrics capability. Actually, I had a question from a customer recently, because AWS now has this concept of a metric stream, where you can make CloudWatch metrics available in the form of basically something like a Kafka stream. One customer expressed interest in whether we were going to move in that direction from an AWS metric ingestion perspective, and that is one of the options being considered for the next generation of what we pull out of AWS. So if there's interest there, again, provide that feedback through whichever channel you prefer, so we can make sure the product team is aware of what our customers are most interested in. So just to round it out: the result of all of that is you get the ability to pull metrics from your various cloud providers. And then what can you do with those? We actually added a little scoped app which is super powerful. It gives you the ability to very easily establish thresholds for these metrics and then feed those into Event Management. There's a scoped app called Metric Rules now, and if you're an ITOM Health or ITOM Pro customer, you can create rules against any of these metrics that
are coming in, and use those rules to trigger events, which then turn into alerts when a threshold is passed. And this is on top of the ML-driven establishment of thresholds based on the metric baselining that occurs: when you start pulling those metrics in, the system will start baselining them and presenting candidates for alerts. The problem with that is there's a ramp-up time for it to learn what's normal and then establish the machine-generated thresholds. If you want to skip that part of the process, at least for stuff that's well established, you can go through this scoped app, which presents a nice streamlined user experience for selecting CI types and metrics, then setting a threshold and some basic rules for escalating, whether it's a single sample, an average of samples over time, or all sample values over time. And the result is that you get these alerts when, for example, a CPU threshold is exceeded. Hey Will, what's the default polling interval on the metrics? I think out of the box it's 60 seconds. Yeah, it looked like it was one minute in that JSON payload you just pulled up; the very bottom said. Oh, that's right, yeah, it had the time grain. Okay, cool, thank you. But I guess there are probably two things in play there: that's probably the granularity with which it's pulling the metric data, and then there's also the polling frequency within the policy, which controls how often it pulls the data. So you've got a couple of dials available there: you can adjust the granularity of the data itself as well as how frequently you poll to pull the data down. I would love to know what's slowing my computer down so much, like to kill some processes, but I'm afraid of killing the wrong thing and terminating the entire webinar. Yeah, that might be bad. Are you guys gonna do
another Beers with Engineers panel or session at Knowledge? Still open to that; right now I don't have any funding. Last year the planets aligned nicely and the cloud observability folks wanted to have a session where they could dig into the newly announced OTel Service Graph Connector, and so far nothing like that has presented itself. Yeah, we were looking at AIOps and implemented the OTel Service Graph Connector for discovery, so we're still wading through that to understand how good some of the data is versus how not-as-useful some of the other data is. Gotcha. But yeah, I was going to ask about that, and also whether you have any recommendations for good sessions at Knowledge. I have not seen the catalog yet, but I will give it a perusal, and if things jump out I'll include them in the wrap-up notes when we send out the slides. Excellent. I saw one that I'm interested in; I think it's for the App Engine Studio piece, but it's basically recreating Marvel Snap within ServiceNow. If you've never played that game, Marvel Snap, it's a mobile game, silly and fun, but I thought, that sounds like a blast, if nothing else just to go and build a game in the platform. Yeah, that's pretty cool. And since so much of what we do is underpinned by App Engine, it really behooves anybody who works in the platform to get familiar with App Engine, Flow Designer, all that stuff. Yeah. So it looks like the default interval for collecting metrics is 60 seconds, but you've certainly got the ability to tune that. Hey, just really quickly, we're five minutes over, so I don't know if we want to wrap up the formal session and then shift gears into questions. That's everything I had prepared, so unless there are any more questions about the metrics stuff, we can go into round-table mode. No, before
you go to that: how does ACC-M work in conjunction with HLA? A couple of different ways. I guess the most pronounced way it functions directly with HLA is that, if you have HLA, you can use the Agent Client Collector to actually collect endpoint logs and send them directly into HLA. The other way it works with HLA is that if HLA is raising alerts against CIs, or against CIs related to alerts coming in from this, those all get collected together, so that you have them all in one place for one person to analyze, instead of a bunch of different alerts potentially getting assigned to different people without awareness that something bigger is going on. They both sit within the standard Event Management framework, so they both get the benefit of the correlation and noise reduction that occurs within that capability. Okay, and how is this different from traditional ITOM Health? How would it benefit from this new ACC-M development you're introducing; what feature would differentiate it, per se? Well, it augments what you can do with Health, because Health had traditionally been a receiver of events. You would have external point-solution tools like SolarWinds, and in AWS you'd have CloudWatch, and those would each be monitoring different pieces of the infrastructure. They would each have their own set of thresholds, and when a threshold was passed they would send an event into ServiceNow, which would fall into Event Management and be turned into an alert, which would then get correlated with other alerts coming in from those various sources. What this gives you the ability to do is potentially reduce that tool sprawl and get those cloud metrics directly into ServiceNow, at which point they can generate alerts directly without having to go through a separate tool. So a common
scenario that we'll see is where a customer has this capability and also has a legacy point tool they've used to monitor aspects of their compute environment, and they're seeking to reduce that tool footprint and get rid of some of the tools they've had. This gives them the ability to do that: they can just use ServiceNow to pull in their cloud metrics. For example, they don't have to manage thresholds via CloudWatch alarms anymore; they can manage them all on platform, and reduce the toil involved with maintaining that. That's generally the sweet spot for this. Oh, gotcha, so get rid of the intermediate layer in between. Yes. Okay, thanks for that. Yep. And it moves the intelligence around what constitutes a problem out of a third-party tool and into the ServiceNow platform. So now we're ingesting the raw metric data from these cloud components into MetricBase within the ServiceNow platform, and then we can utilize the AI baselining as well as the threshold building directly within the platform, as opposed to, like Will said, replicating and duplicating those efforts in some other tool and then ingesting the data from that other tool into ServiceNow. Good, thanks for that. No problem. Cool. All right, well, thanks everybody for joining the session. As always, Will, thank you for the amazing tech deep dive. We'll go ahead and shut down the recording now and then open it up for our standard post-session question-and-answer round table. So thanks everybody.
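To make the check-definition file discussed in the session concrete, here is a hedged sketch of a payload with the shape described (a target resource type at the top, then the list of metrics to pull, plus the time grain spotted at the bottom of the file). The resource type and metric names follow Azure Monitor conventions, but the exact field names and file layout here are illustrative, not the documented ACC-M schema:

```python
import json

# Illustrative check-definition payload: a resource type to target,
# the metrics to collect for it, and the collection granularity.
# "PT1M" is Azure's ISO 8601 notation for a one-minute time grain.
check_definition = {
    "resourceType": "Microsoft.Compute/virtualMachines",
    "metrics": [
        "Percentage CPU",
        "Network In Total",
        "Network Out Total",
        "Disk Read Bytes",
        "Disk Write Bytes",
    ],
    "timeGrain": "PT1M",
}

print(json.dumps(check_definition, indent=2))
```

The session noted you can download the out-of-the-box file from the Agent Client Collector file table as a starting point; the documented schema there is the authoritative reference.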
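The batching constraints covered in the session (every batch must share one resource type, one subscription, and one region, which is why a filter that mixes VMs and scale sets yields no metrics) can be sketched as a simple grouping step. The field names (`subscription`, `region`, `resource_type`) are illustrative, not the actual CMDB column names:

```python
from collections import defaultdict

def batch_resources(resources):
    """Group CIs so every batch shares one subscription, one region,
    and one resource type, mirroring the constraints on the Azure
    batch-metrics call described in the session."""
    batches = defaultdict(list)
    for r in resources:
        key = (r["subscription"], r["region"], r["resource_type"])
        batches[key].append(r)
    return dict(batches)

resources = [
    {"id": "vm-1", "subscription": "sub-a", "region": "eastus",
     "resource_type": "Microsoft.Compute/virtualMachines"},
    {"id": "vm-2", "subscription": "sub-a", "region": "eastus",
     "resource_type": "Microsoft.Compute/virtualMachines"},
    # A scale set pulled in by an overly broad filter ends up in its
    # own batch rather than silently mixing with the VM batch.
    {"id": "vmss-1", "subscription": "sub-a", "region": "eastus",
     "resource_type": "Microsoft.Compute/virtualMachineScaleSets"},
]

batches = batch_resources(resources)
print(len(batches))  # → 2 (one VM batch, one scale-set batch)
```

This also shows why the `-no-log` debugging output was useful: the symptom of a too-broad filter is a mixed batch, which is easy to spot once the selected CIs are logged.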
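The three escalation styles mentioned for the Metric Rules scoped app (trigger on a single sample, on the average of samples over time, or only when all samples exceed the threshold) can be illustrated with a small sketch. The function name and mode strings are hypothetical, not the app's actual API:

```python
def threshold_breached(samples, threshold, mode="single"):
    """Evaluate a window of metric samples against a threshold using
    one of the three escalation styles described in the session."""
    if mode == "single":
        # Any one sample over the threshold is enough to trigger.
        return any(s > threshold for s in samples)
    if mode == "average":
        # The window's average must exceed the threshold.
        return sum(samples) / len(samples) > threshold
    if mode == "all":
        # Every sample in the window must exceed the threshold.
        return all(s > threshold for s in samples)
    raise ValueError(f"unknown mode: {mode}")

cpu = [72.0, 95.0, 70.0, 68.0]  # one-minute CPU % samples
print(threshold_breached(cpu, 90, "single"))   # True: one spike suffices
print(threshold_breached(cpu, 90, "average"))  # False: average is 76.25
```

The "average" and "all" modes are the ones that suppress one-off spikes, which is the usual reason to pick them over per-sample triggering for noisy metrics like CPU.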
https://www.youtube.com/watch?v=fiY9moFHSYM