Now on Now: Turning AI governance risk into reward with the AI Control Tower
- Welcome to our Now on Now session on turning AI governance risk into reward with the AI Control Tower. By way of quick introduction, I'm Girish Srinivasan, VP of Data and Analytics. I've been with the company about five years in the DT organization, and I've been in the data science and analytics space for over 20 years now. - And I'm Brian Hoffman, Senior Director of Enterprise AI. I've been with ServiceNow almost 13 years now, and like Girish, I've been in the data science and data analytics area for more than 20 years. - Let's get right into it. Previously, most organizations that invested in AI did so in a limited and sometimes very siloed manner. Take, for example, marketing using AI and ML techniques to drive greater marketing effectiveness, or sales using AI and ML techniques to drive prospecting and targeting. This was no different for us at ServiceNow. However, over the last couple of years, more departments in our company have started to develop their own AI and ML models to accelerate their individual outcomes. As that happens, these models and the signals they generate are developed in an increasingly decentralized manner. That's great for an organization, but as the scale has grown and digital workflows have started to intersect, all these signals are connecting and crossing paths, and organizations really need a way to track all these AI models and the signals they leave behind in a consistent, coherent, trackable manner. For example, if the HR or people organization is getting ready to deploy an AI model, it's no longer just them doing so; they will have to involve our legal, privacy, and ethics and compliance teams as well. Likewise, if marketing is getting ready to deploy certain AI models, it's important that they involve the legal teams as well as potentially our sales teams, since those models generate signals that then influence sales downstream. And so, as a result of more and more interest across all these departments, we just have more questions to answer. Things that apply to normal business processes apply even more when AI is helping run our business. You're likely to get a lot of these stomach-churning questions. At any point in time, we are gonna be asked, "What goes into these models? What variables are being used? What outcomes are they generating? Have you gone through all the approvals? Did you think about the architectural implications? Have you thought about bias and how that affects the quality of your output and your recommendations? How do you know if your models are performing well? Are they underperforming? And ultimately, are they delivering value? What's the ROI for all the time and effort spent, and is it really driving the outcomes the company needs? Did we get the necessary authorization before we actually launched these models?" And while we get all these questions internally, this is also the same buzz that we are hearing outside in the marketplace. A number of articles are asking the same questions about governance around AI, the risk it poses, but also the opportunity and the value it can generate.
And so if I were to summarize the nature of the questions into four distinct categories, they are: "Do we have a registry, and do we have a way of maintaining our inventory of AI signals and models? Do we have a way of managing that lifecycle, so that all of our privacy, security, and legal concerns are addressed before we go live with these models? How efficient are we being in gathering these signals, pushing them into workflows, and driving action? Are we doing this ad hoc or are we being really efficient about this?" And then, last but not least, "Do we have a record of the value that these AI signals are actually delivering? And is that value increasing or decreasing over time?" These are the four fundamental questions being posed to all enterprises, as well as to us at ServiceNow. And so, to answer those four questions, I invite Brian to talk us through our solution, developed on the Now platform through the Now on Now efforts. Brian? - Thanks, Girish. So Girish mentioned those four critical areas, the challenges that face enterprise AI teams, not just at ServiceNow, but anywhere. And since there was no commercial solution available that addressed those, I'm proud to say our enterprise AI team took matters into their own hands. They took the Now platform and built out tools to help alert, monitor, and notify us when there are gaps in model execution or other areas we should pay attention to. So this is what the AI Control Tower looks like. At the top we see summary information that helps us figure out where we need to go and take a deeper look. Below that is the model registry, a model-by-model list of what has been built in our group and in other groups within ServiceNow. With more than 100 models in production, it can be a challenge to track these and figure out whether there are areas where we need to go back and reinvest, whether there are problems we need to fix, or whether a model is performing as expected so we can continue to invest in new models. Here we see the sections that we monitor on a regular basis. Approvals are the mechanisms by which we reach out to other teams like legal, privacy, or security. And to be frank, our data scientists hated submitting approvals, because for each group they didn't know which form to use or to whom to send it, and then, once something was approved, keeping track of those approvals could be daunting. So they really love this aspect of the tool. Using ServiceNow approvals is easy, the forms are customizable, and so now it's as simple as clicking a button, entering some text, and they have their approval request submitted. And once it's approved, it's easy to go back and say, "Here are the approvals I got for this model." Once we've gone through the rigor of building these models, interviewing stakeholders, figuring out what data's available, matching the right algorithm to the job, and then launching the model, that's really just the first half of the process, because from that point on, we need to monitor whether that model actually scored on the cadence it was supposed to. Some models score every 15 minutes, some score every hour, some score once a year. But we needed a mechanism to say, "Hey, did that really happen? Are the end users really getting the fresh scoring data they expected, or did something prevent that model from scoring?" That's what execution monitoring does for us. For every model, we set a target accuracy.
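To make the execution-monitoring idea described above concrete, here is a minimal sketch, not the actual AI Control Tower implementation: it assumes a hypothetical registry where each model records its expected scoring cadence and last scoring time, and flags models that have gone quiet. All field and model names are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical registry entries: each model records how often it should score
# and when it last produced scores. Names and fields are illustrative only.
MODEL_REGISTRY = [
    {"name": "case_routing",    "cadence_minutes": 15,     "last_scored": "2024-05-01T09:45:00+00:00"},
    {"name": "renewal_risk",    "cadence_minutes": 60,     "last_scored": "2024-05-01T06:00:00+00:00"},
    {"name": "annual_forecast", "cadence_minutes": 525600, "last_scored": "2023-06-15T00:00:00+00:00"},
]

def missed_executions(registry, now=None, grace=1.5):
    """Return models whose last scoring run is older than their expected
    cadence (times a grace factor), i.e. candidates for an alert."""
    now = now or datetime.now(timezone.utc)
    late = []
    for model in registry:
        last = datetime.fromisoformat(model["last_scored"])
        allowed = timedelta(minutes=model["cadence_minutes"] * grace)
        if now - last > allowed:
            late.append((model["name"], now - last))
    return late

if __name__ == "__main__":
    for name, age in missed_executions(MODEL_REGISTRY):
        print(f"ALERT: {name} has not scored in {age}; users may be seeing stale predictions")
```

In a platform-based solution like the one described here, a check of this kind would presumably read from the tables backing the registry and route alerts through standard notifications; the sketch only shows the cadence comparison itself.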
And so that's how we evaluate whether these models are really doing their job: are they meeting or exceeding the target accuracy? Since we've set the target, we have mechanisms to monitor the actual accuracy for each model, and that helps us figure out whether there's a case in which we need to go back, figure out what's going wrong, and address the accuracy. And then, even more important, for every model we set value targets. This is actual dollars saved or dollars created: anything we can express with a mathematical formula that says, "Every time a user uses this tool and gets this benefit, it's worth X dollars." We can aggregate that up and see the actual versus target value for each model. Data drift helps us figure out if data is changing over time and therefore impacting the models. So let's take a little bit of a deeper look. If we actually go into the Control Tower and dig into one of the model's details, underneath the hood we have all the kinds of information we might expect to have about a model, built out into the ServiceNow tables that house it. We see the model development lifecycle: we can track these models by stage, and for every stage there are mandatory entries for the model, marked by an asterisk here. That allows us to figure out whether a model is ready to go to the next stage or whether we're still waiting on something. Something that customers have really loved about what we track here is that, in addition to the description, who reviews the models, who the team is, and who the business owners are, there are entries that show where the model surfaces its results. Is it in a dashboard? Is it in a database? And what is the link to find more information about the model? These models are typically complex, so end users really wanna know what kind of features are used in the model, how the model is built, and when it was last updated. Those kinds of things can be accessed in the underlying collateral. So the risk and governance piece here comes from being able to monitor these factors over time. Did the model execute? Is it as accurate as we thought it would be? Is it generating value? Is the data changing over time and causing changes in the model itself? That's what we mean when we refer to AI risk and governance. But the sheer list of models itself represents some element of governance, because organizations like ServiceNow often have multiple teams working on models, and perhaps one of the biggest tragedies would be if two separate teams built the same kind of model to address the same challenge. We can avoid that by using the model registry. If we dig deeper into what that looks like in terms of risk, the elements of approvals and execution are not the only elements at play here. Change is the only constant at large organizations, and so the features the models use represent risk themselves. A problem with one feature can often affect multiple models, and it can be very, very difficult to monitor what is happening with one feature across all of its downstream models. But that's easy using the tools the team built out. If there's a problem with one of the features, say sales territory changes, or a change in the source data behind how our plugin counts are calculated, and models use those features, it's easy for the data scientists to notice that risk and open stories to fix the predictor and address it in the model directly.
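The data-drift monitoring mentioned above can be computed in several ways; the Control Tower's exact method isn't described here, so as one hedged example, this sketch uses the population stability index (PSI), a common drift metric, with illustrative data and thresholds.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a feature's current distribution against the distribution it
    had when the model was trained. Larger values mean more drift."""
    # Bin edges come from the training-time (expected) distribution.
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_counts, _ = np.histogram(expected, bins=edges)
    act_counts, _ = np.histogram(actual, bins=edges)

    # Convert to proportions, avoiding zeros so the log stays defined.
    exp_pct = np.clip(exp_counts / exp_counts.sum(), 1e-6, None)
    act_pct = np.clip(act_counts / act_counts.sum(), 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Illustrative use: flag a feature whose PSI crosses a chosen threshold.
rng = np.random.default_rng(0)
training_sample = rng.normal(0, 1, 10_000)      # feature values at training time
current_sample = rng.normal(0.4, 1.2, 10_000)   # feature values scored this week

psi = population_stability_index(training_sample, current_sample)
if psi > 0.2:  # 0.1-0.2 is often read as moderate drift, above 0.2 as significant
    print(f"Data drift detected (PSI={psi:.2f}); downstream models may need review")
```

Because one feature can feed many models, a drift flag like this would typically be joined back to the registry to list every downstream model that uses the affected feature.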
That's a great detailed level of monitoring and alerting that really helps make sure our models function as accurately as possible. From an operations standpoint, building the models is the easy part, frankly; monitoring all of these other aspects of how the model is performing, and knowing when we need to go back and address elements, is really important. The way the team does that is by investigating within the dashboard, looking at things like accuracy and value, and finding or receiving alerts on those, so they can go in and dig deeper into the model output. When we look at the details for a particular model, we can evaluate the model and the value its output is providing. Like Girish mentioned, if value is going up, that's great; perhaps we're able to invest even more in similar models. If it's declining, then we need to look at the other factors below it. Is the model executing on time? Is the accuracy what we thought it would be? Is it adopted as much as we had hoped, or are there gaps? And what kind of user feedback is there for the model? Finally, we see the prediction history and whether there are any gaps. Over time we would hope to be predicting more and more observations for our models as our business grows, just like any other organization. And then the value piece is critical. If there's one single thing I can't emphasize enough to my team, or to you, or to our customers, it is this: put this in place and track how your models are trending over time with regard to value. If we dig back into one of those models, we can see that value plotted over time and ask whether we need to go and look into model accuracy or execution, because the value these models deliver is really a function of adoption, accuracy, execution, and drift. And so the ability to look at these factors in one single pane expedites how we spend our time in terms of continuing to focus on new models or going back to remediate ones that might have had some change in adoption or accuracy or execution. So these four elements are the challenges the team addressed. Making sure we have the list of all the models throughout the organization in one place, where everybody can go look, inspect, learn about what models have been built or are being built, and figure out, if they're considering a model, how it fits within the portfolio. The risk and governance pieces, in terms of submitting approvals, help keep my team out of trouble with regard to what kinds of data they can use, what kinds of models can be deployed, and what infrastructure pieces they use. The operations piece helps us figure out, if value's dropping, is it because adoption may have changed? And if adoption changed, is it because accuracy may be in question? User feedback is a great way to check that out. And then value management really helps steer the investments, not only the ones we make in models, but in AI teams in general. Does it pay off? It's a bit early in terms of our deployment of the AI Control Tower, but we're already seeing clear benefits: an 85% improvement in AI model availability. When you launch a model, you expect it to execute on a cadence, like I mentioned, every 15 minutes or every hour, whatever you have defined with the stakeholders, and if it fails to execute, that can affect everything downstream. So improvements in that availability make a big difference. And 87% less time preparing and tracking approvals.
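The value-management approach described above, a per-use dollar formula agreed with stakeholders, aggregated per period and compared against a target, can be sketched in a few lines. The model names, per-use values, and targets below are made-up illustrations, not ServiceNow figures.

```python
from collections import defaultdict

# Hypothetical per-use dollar values agreed with stakeholders, e.g.
# "every auto-routed case saves $4 of manual triage effort".
VALUE_PER_USE = {"case_routing": 4.00, "renewal_risk": 25.00}
QUARTERLY_TARGET = {"case_routing": 120_000, "renewal_risk": 200_000}

# Usage events as (model, period, times the benefit was realized).
usage_events = [
    ("case_routing", "2024-Q1", 18_000),
    ("case_routing", "2024-Q2", 26_500),
    ("renewal_risk", "2024-Q1", 7_400),
    ("renewal_risk", "2024-Q2", 6_100),
]

realized = defaultdict(float)
for model, period, uses in usage_events:
    realized[(model, period)] += uses * VALUE_PER_USE[model]

for (model, period), value in sorted(realized.items()):
    target = QUARTERLY_TARGET[model]
    status = "on track" if value >= target else "investigate adoption/accuracy/execution"
    print(f"{model} {period}: ${value:,.0f} vs target ${target:,.0f} -> {status}")
```

Tracking this actual-versus-target series per model is what makes a declining trend visible early enough to dig into adoption, accuracy, execution, or drift.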
We really want our data scientists focused on creating models, not on going through the processes (ones that are really important, but processes nonetheless) to make sure we have the right approvals to build the models. And then, calculating the ROI for models can sometimes take quite a bit of time, especially as your model portfolio grows. The team has done a great job: they encode all of the information needed to calculate that in the form of SQL statements, so that every quarter, when we go back and evaluate how all our models are performing, that process is now easy. Our team, again, can focus on building the next model instead of all of these other elements that might have sucked up their time. So, in a nutshell, this is how our team built a tool to help with all of those challenges and help us identify which models need review and remediation versus which ones are on the positive track and running as expected. This is the tool that they built on the Now platform. - Thank you, Brian. So to summarize, this was our Now on Now story on how we leverage the capabilities of the Now intelligent platform. We used things like platform analytics and basic workflow-related tables to create what we are starting to use internally as the AI Control Tower. The call to action is to learn more. Use the QR code and you'll see a lot of useful information and videos on this solution and others. If there's interest, reach out to your sales rep; they know how to get in touch with the Now on Now practitioners. And in addition, there are a whole lot of other Now on Now stories that we encourage you to look at on our website.
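As a closing illustration of the "value formula encoded as SQL" approach Brian describes above, here is a minimal sketch in which each registry entry stores a SQL statement that computes its own quarterly value, so the periodic review is just a loop over stored queries. The table names, columns, and dollar figures are assumptions for illustration, not the team's actual schema.

```python
import sqlite3

# A tiny stand-in database of usage events; in practice this would be the
# store where the models' usage telemetry lands.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE case_routing_usage (quarter TEXT, auto_routed INTEGER);
    INSERT INTO case_routing_usage VALUES ('2024-Q2', 26500);
""")

# Hypothetical registry rows: each model carries the SQL that encodes its
# agreed value formula (here, $4 saved per auto-routed case).
MODEL_VALUE_SQL = {
    "case_routing": """
        SELECT quarter, SUM(auto_routed) * 4.0 AS value_usd
        FROM case_routing_usage
        GROUP BY quarter
    """,
}

# Quarterly review: run every model's stored statement and report the result.
for model, sql in MODEL_VALUE_SQL.items():
    for quarter, value_usd in conn.execute(sql):
        print(f"{model} {quarter}: ${value_usd:,.0f} of realized value")
```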