NJP

Getting ready for the AIOps project

Import · May 16, 2024 · video

Host: Hello everyone, and welcome to our discussion today with industry experts. We're going to talk about AIOps today, and we have two great technology leaders with us. We welcome Alexander Ljungström and Akif Baser, who are joining us from Einar & Partners, right from Amsterdam. So welcome, guys. How are you?

Alexander: Hey, thank you for having us here. We're doing great. Springtime is knocking on the doorstep here in Amsterdam, so good vibes, and a little bit of sun for a change. All good with us, thank you.

Host: Awesome. Let's dive in, but before we get into the content, let's get you guys introduced, especially Einar & Partners, because Alexander, you've been building this practice for a couple of years and we have seen it grow. So tell us a little bit about it, and then pass it to Akif to introduce himself as well.

Alexander: Sure, happy to. Our pitch is pretty simple: we do one thing and one thing only, and that is ServiceNow ITOM and AIOps. One of the primary reasons why I started Einar & Partners was that, in my previous line of work, I saw that when AIOps and ITOM transformations fail, it is rarely due to technical aspects but almost always due to cultural or people aspects. So that's what we work on a lot: not only the technology side, but also the operating model side and embedding the cultural components of AIOps. Today we have clients across the globe and, just like you said, we're happy about the growth. But I'm no one-man army; I also have Akif with me. So Akif, why don't you say a few words about yourself and what you do?

Akif: My name is Akif Baser, and I'm the R&D lead for AIOps and data science at the Einar & Partners research unit. Our mission is to know the unknown in ITOM and AIOps.

Host: Awesome. We really appreciate you guys joining at this late hour, and we're going to have a great session. So let's dive in right away. Alexander, let's start with you. One of the things we also see at ServiceNow is, of course, that AIOps has seen tremendous growth, and customers are always asking us what the right approaches are to get AIOps kicked off. So let's talk a little bit about the first few steps, maybe three steps, to get your planning started for AIOps.

Alexander: Yeah, sure. What we've seen, just like you said, is that there's a dire need to demystify what is required on that AIOps journey. Of course there are many things that can be said here, but in a nutshell we see three core pillars that are important for the success of an AIOps project or transformation.

One of those pillars is to sketch out, at an early stage, the future operating model of AIOps. What does that mean? We often try to categorize it in four areas. The first one is: where are we currently? For example, assessing the current maturity, preferably with brutal honesty. Are our monitoring teams very old school, or are we already into observability and site reliability engineering? Once we understand where we are, we have: what do we want to achieve? Here I think a lot of organizations fail to tie AIOps initiatives to clear business outcomes, and defining these success criteria is super important. Then we have: what can we use? AIOps doesn't work in a vacuum; we actually need to inventory critical data sources, their functions, and how they relate to the future operating model. And finally, governance: how can we optimize governance and plan for a future governance model related to AIOps, for example one where observability is included? So that's the first pillar: start sketching out that future operating model at an early stage.

The second one: we see a lot of companies that view AIOps as a tool, and we always say to view it more as a business function. If you start viewing AIOps as a business function, then you can also start formalizing some requirements that you can give to the business. For example, what requirements do we put on application teams if they want to be onboarded to observability? What do they need to prepare? Things like SLOs and SLIs, so availability targets and health indicators. The idea is to have this business function help and support service owners, for example. And speaking of services, not all services are equal: some are very critical, some are less critical. So within this business function it can also be really valuable to define things like maturity models. At the highest maturity level you put a lot of requirements on the data that needs to be provided, but a lower maturity level for less critical applications doesn't need as much data and preparation. That's what we mean by a business function: to view it not just as a tool, but as a way to communicate with the business.

And then the third pillar, which we call "AI is easy, Ops is hard". What we mean by that is, if we look at the overall trends of AIOps, then the algorithms, the anomaly detection, the clustering and so forth tend to work pretty well, actually. But the operational part is often where we still see organizations struggle. You can have great definitions of what metrics you want to measure, and what sorts of logs and traces, and all of this is great. But without the Ops part, the AI and all the anomalies sit in a vacuum and don't provide the full value they should. So, among many other things, those are the three pillars that we see people should be approaching with a little more intention at an early stage.

Host: These are great. In fact, I love it, because there is a sense of the current way of doing things and the future, and then using some of the things like AI to its advantage, but also making sure that you've got the right ingredients in place, which you call the operating model. And it's not a tool. That's amazing, right? A business function is more of an ongoing thing that you're serving for a business, and that's exactly how we take it from our standpoint as well. Now I want to dive into some of the things you said, and I think it goes back to the operating model a little bit: building the right team. What kind of team should there be, and what kind of sponsorship? I think that's a really critical point as companies start to invest in AIOps and want to make it successful.

Alexander: Absolutely. Interestingly enough, we've seen a tremendous change there just in the past couple of years. Before, AIOps was maybe a novelty project coming from the monitoring or observability department, but now we see more and more that AIOps is coming from the executive level. Getting executive sponsorship is maybe not super critical, but it sure helps a lot. And depending on the company, if there is an outspoken AI strategy in place, attaching to that can already make life much, much easier. So having that executive sponsorship creates a lot of benefits, let's say.

Another thing which we see tends to be underestimated is security and compliance. The nature of AIOps is that we start sharing a lot of data and connecting a lot of data points, and that can sometimes be a sensitive thing. That just means that security teams need to do their due diligence. If you are in finance or something similar, there might be compliance teams; we're working with some public sector clients, et cetera. Involving security and compliance teams early on the AIOps journey has also been critical for success.

Those aside, we have also seen the value of deliberately involving service and application owners, so that they also understand a little bit about the AIOps concepts. They don't need to be super technical; they can even be more on the business side. But bringing them closer gives a huge boost to the success of AIOps projects, because then it's not just a technical novelty project but something which can be correlated more to the business.

And finally, we always recommend creating what we call an AIOps task force or an AIOps team of excellence. It can be small in the beginning, including things like SRE teams, observability teams and monitoring teams, and sometimes ML engineers or data scientists. But keep it small scale in the beginning, maybe setting up proofs of concept on a short-term basis. We've seen that this is a more efficient model for slowly generating buy-in than a big-bang approach.

Host: Yeah, I love this, because when we look at the analyst community and analyst research reports, it rounds up to what you're saying. Of course you need the right team, but building the right team also means knowing the right places to investigate the use cases, which we're going to get into in a bit. And it's not domain specific either, right? More and more we're finding that, yes, having a specific domain and expertise is great, but for the longer-term success of the business you really need a well-rounded approach. So that's tremendous. We talked about the bigger pillars and objectives, we talked about the team, but then I think naturally you've got to start looking into the right use cases. So let's get into those.

Alexander: Yeah, absolutely. That's the holy grail, right? Getting the use cases right. That can look different depending on the organization, of course, but what we always say is: try to get out of the weeds. Again, it's not just about some cool new features for monitoring teams to work with, like anomalies or predictions. We need to lift it a couple of levels higher.

It often tends to start with monitoring. So depending on where we are, the as-is situation, some use cases can well be found within the domain of monitoring and observability teams. There it could be things like: we want to spend less time on thresholds and updating thresholds all the time. It could be that we want to create a use case where we provide more IT intelligence to various stakeholders, so what is the actual state and nature of a particular application, for example. And also use cases that don't just look at reactive alerts, but start working more with behaviors and states, having a look at seasonalities, trends, et cetera.

But the use cases I just mentioned are still very much down in those weeds, within the monitoring teams. If we take it up a notch, we've seen a huge benefit, also for buy-in purposes and for making it successful, in correlating AIOps to tangible business metrics. We work with some banks who used AIOps with a very simple mission: we need to reduce P1 or major incidents, and that was all it was about. We now also work with a lot of clients where we see that, especially here in Europe, new regulations put a lot of requirements on operational resilience, for example the DORA act. That is something else where AIOps can speak on behalf of a broader audience within the business.

It can also sometimes be about efficiencies, for example reducing the onboarding time of engineers. New engineers tend to be expensive, but if you have a good AIOps operating model, then engineers don't need to spend one or two months learning how all the thresholds should be configured and all the historical things. You reduce the time to onboard them, essentially.

So these are some examples of those use cases. But irrespective of which ones are selected, we've often seen that it's better to select two or three golden use cases and focus on them rather than being stretched too thin and trying to fulfill 10, 15 or 20 use cases. Having a really clear and crisp storyline on where our efforts should go with the use cases and what the success criteria are, and then having a feedback loop from people, is super important there as well. So this is some food for thought around use cases.

Host: Yeah, that makes a lot of sense. I've got to ask you this, because this is a question we are starting to get a lot from our customers: how do you embed generative AI into these use cases? What are your thoughts?

Alexander: Personally, my thought is that generative AI is the natural next step. What we spoke about before is that AI is easy and Ops is hard, and one of those big things which tends to be a challenge is recycling old incident resolutions: seeing, at the tip of your fingers, the solution to a particular issue. You don't want those things to be highly technical or highly cryptic; you want them to be relevant and easy to interpret. That's where we see that gen AI can really augment and boost the overall intelligence of AIOps. So it is quite related to the Ops part, which is what we have seen it being pitched and used for.

Host: Awesome. Great stuff, and there's a lot to unpack here, so we're going to take a pause. Thanks, Alexander. We're going to move to Akif, who has done a lot of research around AIOps. So Akif, why don't you share some of the findings of this research in a nutshell?

Akif: Yes, indeed. Based on our recent survey analyzing the trends in data models and the CMDB, we see that it becomes more and more important when it comes to AIOps, because it's mainly underestimated how much valuable business context a mature data model can provide. I always say AIOps is very data driven, so if you can't provide the business context, it's more like mindless analytics. So we see the importance of data models and the CMDB growing in our recent survey research.

Another thing is, when we look at predictions, and even if we go into self-healing and outsourcing decision making to AI within the AIOps context, the models behind it are large, complex models. So it's difficult to understand why they came to a certain prediction, or why they came up with a certain recommendation. So we still see some, let's say, not mistrust, but some questions around why a model predicted a certain outcome. That's why we also argued last year during a conference that we should use more explainable AI models, to mitigate this distrust in AI outcomes.

And another thing we observed is that AIOps itself should be part of a broader AI strategy. It should be defined why you should use AI, because it helps a lot when AIOps is integrated into the general AI strategy within the enterprise or business. It helps a lot in explaining why you should adopt and scale AIOps within the IT organization. These are the three really key insights we got from our recent research and survey from the research unit.

Host: That's great. I 100% agree with you that data is such an important part of it, getting the data right in your configuration management systems. Now, what are the steps that customers are taking, with event correlation techniques or anomaly detection based on the data, to inject more speed into triaging? Are you seeing any insights coming from the things that customers are doing?

Alexander: Yeah, I can speak to that a little bit, because no one wants to start an AIOps project if it means we first have to spend three years on creating a perfect CMDB. What we see is that having some degree of CMDB, so we can do some sort of correlation and so forth, is an added advantage. But more and more we also see clients leveraging things like clustering-based methodologies, based on tags, for example. There can also be other sorts of metadata, such as timestamps. There is a natural domino effect when a major outage happens, and the algorithms that are commercial off-the-shelf today are pretty good at determining correlations just based on very tight intervals and timestamps, for example. So it doesn't all have to be a perfect data model. It helps if it's of decent quality, but combining that with some of the mechanisms I just mentioned is what we see more and more clients trying to leverage.

Host: Awesome. All right, so we are almost at the end. Again, there's a lot to unpack, and we'll get you guys back in here in the near future to unpack more. But let's leave our listeners with some tangible things: what are a couple of outcomes or benefits that teams should be looking forward to as they adopt AIOps?

Alexander: For me, I'm going to say a simple thing here: culture and becoming resilient. It's not just about a cool technical feature; this is a deep cultural philosophy and mindset, and that is one of those outcomes that we want to embed on a long-term basis.

Host: Thanks, Alexander and Akif, for being here today. We had a great discussion and we're going to have a lot more coming up. So thank you guys for being here, and we will see you again in the near future.

Alexander: Our pleasure, we're looking forward to it. Thank you for having us this morning.

Akif: Thank you, looking forward to the next session.

Host: And thank you to our listeners for tuning in today. If you have more questions, you can reach out to Alexander and Akif, and you can also reach out to your account teams to learn more about some of the new things that ServiceNow is doing. Bye for now.
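
As a rough illustration of the timestamp-based correlation Alexander mentions (alerts that arrive within very tight intervals during an outage are likely related), the idea can be sketched in Python. This is a minimal sketch under stated assumptions, not how ServiceNow or any commercial product implements correlation: the `Alert` record, the `cluster_by_time` function and the 60-second gap are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Alert:
    # Hypothetical alert record; the field names are illustrative only.
    source: str
    timestamp: datetime


def cluster_by_time(alerts, max_gap=timedelta(seconds=60)):
    """Group alerts so that each alert is at most max_gap after the previous one in its group."""
    clusters = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        # Extend the current cluster if this alert follows closely enough;
        # otherwise start a new cluster.
        if clusters and alert.timestamp - clusters[-1][-1].timestamp <= max_gap:
            clusters[-1].append(alert)
        else:
            clusters.append([alert])
    return clusters


if __name__ == "__main__":
    t0 = datetime(2024, 5, 16, 9, 0, 0)
    alerts = [
        Alert("db-01", t0),
        Alert("api-gateway", t0 + timedelta(seconds=20)),
        Alert("web-frontend", t0 + timedelta(seconds=45)),
        Alert("batch-job", t0 + timedelta(minutes=30)),
    ]
    for group in cluster_by_time(alerts):
        print([a.source for a in group])
```

In this made-up data the first three alerts fall into one group (the "domino effect" of a single outage) and the late `batch-job` alert forms its own group. Real tooling would typically combine such time windows with tags or CMDB relationships rather than rely on timestamps alone.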

View original source

https://www.youtube.com/watch?v=W6uB-M8lpj0