Embrace AI faster and responsibly with robust AI Governance that builds trust

Unknown source · May 12, 2024 · video

- Hi. My name is AnnMarie Fernandez. I'm the senior TPSM for risk and the banking business.

- Hi, everyone. I'm Anushree, and I'm part of the risk product team, and I manage our IT risk, compliance, and audit product.

- So, Anushree, have you heard of the latest hot topic, AI?

- Of course. I think AI is everywhere. In fact, I used our employee gen AI photo booth, and look what it transformed me into. A warrior princess. Yeah, that's right. And I don't know, I'm like a Harry Potter sorcerer or something. Either way, I think we're for sure gonna be in the next "Star Wars" movie.

- For sure. So you see what happens when AI starts hallucinating, right? (crowd laughing) All right. But seriously, AI is here to stay, AI is here to transform enterprises, AI is here to increase productivity and accelerate agility. But the only thing that is growing faster than AI is the worry about AI. So let's talk about how we help organizations reduce their worry and put governance around AI.

- Absolutely. So, let's take an example. We've got Dana, who is the e-commerce director of Contoso, an outdoor sporting goods company. What she wants to do is use AI to increase sales on the website, and her idea is to use AI within the chat bot. So we're going to walk you through an end-to-end lifecycle, starting with project onboarding. Then we're gonna go through the transition to production. And then we'll show you how we can monitor AI in real time to make sure it's doing what we want it to do.

So let's start with project onboarding. What's important at this phase is that we work with the right stakeholders, so I'm gonna walk you through three key people that you'll need to know. Here, we've got Eileen. Dana reaches out to Eileen. Eileen is the product owner. From the strategic planning workspace, she can see all the work that's planned against the website. You can see she's got demands, epics, and projects, big and small, and it looks like she can take on more work and is happy to open the project. She creates this project to integrate AI, using prompt flows, into the chat bot. Once she creates the project, she can check whether they've got enough capacity to deliver it. And from the resource planning, we can see that there is a point in time where they might be short some very expert resources, so they need AI developers to come in and finish off this project. So what she's gonna have to do is reach out and find out if they have any approved vendors that can easily bring some AI developers on board.

This is Adam, the third party risk manager. He's got a list of the approved third parties. We can see three here, and they're all active. We're gonna start with the first one, AI Powered Consulting, because we know that they've done good work and they're familiar with our responsible AI policies. From the vendor risk management workspace, all we need to do is create an engagement. This will let AI Powered Consulting go ahead and start to bring that resource on board.

Next, it's really important that we engage the rest of the people who need to do an architecture review on the project plans and diagrams. So she's gonna go ahead and submit an architecture review. That review then goes to a group of folks: enterprise architecture, security, model risk management, compliance, privacy, and business continuity. All these people will look at it from their perspectives, and one of the big decisions that they have to make here is which LLM they're going to use.
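[Aside: a minimal sketch, in Python, of the sign-off gate just described. The review-group names come from the session; the data structure and function are illustrative assumptions, not the product's actual workflow.]

# The six review groups named in the session.
REVIEW_GROUPS = [
    "enterprise architecture",
    "security",
    "model risk management",
    "compliance",
    "privacy",
    "business continuity",
]

def review_complete(approvals: dict[str, bool]) -> bool:
    """The project moves forward only when every group has signed off."""
    return all(approvals.get(group, False) for group in REVIEW_GROUPS)

# Example: five groups have approved, but security has not, so the gate stays closed.
approvals = {group: True for group in REVIEW_GROUPS}
approvals["security"] = False
print(review_complete(approvals))  # -> False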
So that now moves us to the operational phase. Let's say the development is done and they're ready to go live. But before we go live, there are a couple of steps we want to make sure we do. A lot of organizations will skip this step, but we say it's gonna be key to success. One of the first things we wanna do is ensure that we update the CMDB. As you can see here, we've got three LLMs that are already in play, and the architecture review board happened to approve the Azure LLM to go ahead and integrate with the chat bot. So you can see we've got the application mapped to the chat bot, which is also mapped to the e-commerce website. Once all that's in play, what we can do is gather the risk information. So we'll do an internal risk assessment. Once we gather that information, the system can recommend the right risks to relate to the e-commerce website CI. Once all those risks are created, we determine the right controls to mitigate those risks. And again, the system can recommend these to you based on the information there, so it makes it a little bit easier. Whoops. So, now that we've got our risks and controls mapped to our integration in the website, we're ready to go live with AI. Whoops, sorry.

- Yay.

- Yay!

- We're live.

- Yeah, all right, you go.

- All right. So we're live with our AI project. And from here onwards, everything is operationalized, so the last thing we're gonna do is monitor AI in real time. Let's look at the demo now. What I have here is the Contoso Outdoor Company website, which is live. This is a website where you can go and buy your outdoor activity gear. You can see tents, backpacks, and hiking clothing. You can also see the camping stoves and sleeping bags. On this website, we have an AI, or gen AI, chat bot, and this chat bot will help customers search for the exact product they want, which will basically increase sales for the website. I've gone ahead and asked the chat bot some questions. I'm planning to have a camping trip right after Knowledge next weekend to detox, so I've gone ahead and asked questions like, "What kind of tents do they have?" The chat bot has gone ahead and answered my question and given me the products that are available. I would also like to have some pillows, so let's go ahead and ask that question. And the chat bot has given me the pillow that is available. But I want my trip to be a little more comfortable, so I want mats as well. So why not add sleeping mats? Let's go ahead and ask gen AI a question about the sleeping mat and see what they have. So the gen AI chat bot has gone ahead and pulled up a product, a mat, which is actually not available on the website. It's pulling up information that is not available, which means it's hallucinating. And this is a huge risk, because the customers will start losing trust and sales are gonna decline. Since we are monitoring these controls, and AnnMarie already showed you that we have risks that have been identified and the controls to monitor these risks, the continuous monitoring will determine what the issue is and trigger an email to Dana, who is the owner of the e-commerce website. So Dana will receive this email, and she will see the failure, which is on the groundedness AI metric. The groundedness AI metric is nothing but the truthfulness of the AI. So I can go ahead and check what the issue is. So I can click into it.
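[Aside: before following the click-through, here is a minimal sketch, in Python, of the kind of threshold check such a continuous-monitoring control might run. The constants, the pass-score cutoff, and the notify_owner helper are illustrative assumptions, not a ServiceNow or Azure API.]

FAILURE_TARGET = 0.20  # the demo's target: at most 20% of responses may fail
PASS_SCORE = 3         # groundedness is scored 1-5; treating scores below 3 as failed is an assumption

def notify_owner(email: str, subject: str, body: str) -> None:
    # Stand-in for whatever notification mechanism the platform actually uses.
    print(f"to={email} subject={subject}\n{body}")

def check_groundedness_control(scores: list[int], owner_email: str) -> bool:
    """Return True if the control passes; notify the service owner when it fails."""
    failures = sum(1 for score in scores if score < PASS_SCORE)
    failure_rate = failures / len(scores)
    if failure_rate > FAILURE_TARGET:
        notify_owner(
            owner_email,
            subject="Control failed: gen AI groundedness",
            body=f"Groundedness failure rate {failure_rate:.0%} exceeds the "
                 f"{FAILURE_TARGET:.0%} target; the chat bot may be hallucinating.",
        )
        return False
    return True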
Once I click into it, it gives me the details of the issues, and it also gives me the results, or the evidence, of why this particular control has failed. Before we go deeper into this, let's step back a little and see how all of this impacts the compliance posture of your organization. Here I have the compliance workspace, where Dana and the compliance managers can come and monitor the compliance posture. Here, I have a regulation, the EU AI Act, and I can see that the compliance score is 25% and there is one high priority issue against it. I can also see the AI governance policy with a lower compliance score and an issue against it. You can also see the various business applications and business services that you might be monitoring for the same. Here, I can see the e-commerce website as well as the customer support chat bot, both of which are affected by the same issue, and the compliance score is really low.

So let's see what's going on with this e-commerce website. This is a 360 view of my e-commerce website, where you can see the list of risks associated with it. There's a risk of data security, loss of availability, risk of information handling and retention, as well as loss of confidentiality and data privacy. Along with all these other risks, I can also see a risk associated with gen AI, for inaccuracy. So, if I look at this particular risk, we have a list of risks that we need to make sure we are mitigating. We wanna make sure we have the controls in place, and I can see that there is a list of controls here. One of the controls, which is on the e-commerce website, is around the gen AI accuracy that you can see here. This particular control is marked as non-compliant, which means there's an issue against that particular control. And there are other controls on the customer support chat bot as well. One of them is truthfulness, and you have fluency, coherence, and relevance. The truthfulness metric is basically the one that is actually checking for hallucination. And it looks like it has failed and is marking the control non-compliant.

So let's dig into the control here. This is a metric for groundedness, and you can see that it's failed. And it takes us right back to the same issue we were looking at. This is the evidence of the failure, and you can see a value of 44% failed. The target value that you see on the right-hand side lets you set the target; say you want to be 80% compliant in this case. So the failure target is 20%, which basically means you don't want the chat bot to fail more than about 20% of the time. But in this case, the 44% is above 20%, which means we're failing our target value. So if I go inside it, it gives us the actual evidence of the failure. Here, I have the groundedness metric, which ranges on a score from one to five, one being the worst score and five being the best. I see that 53 results have come back as failed and 67 have passed, which, if you calculate it, comes to about a 44% failure rate. Now, if I wanna see what's going on here and fetch the actual evidence, I can go and click on this link, which will take me right into my Azure LLM portal. This is where you can see the actual evidence: which metric is failing and where it is failing from. You can continuously monitor it, and you can also present these results to your auditors and be audit ready. So this helps you with continuously monitoring your controls.
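[Aside: the arithmetic behind that result, as a quick sketch in Python. The counts come straight from the demo; the variable names are just for illustration.]

failed, passed = 53, 67                    # evaluation results shown in the demo
failure_rate = failed / (failed + passed)  # 53 / 120, about 0.44
target_failure = 0.20                      # an 80% compliance target allows 20% failure
print(f"failure rate: {failure_rate:.0%}")               # -> 44%
print("within target:", failure_rate <= target_failure)  # -> False, so the control is non-compliant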
But let's see how this affects risk. What we have here is an automated risk assessment that we've sent out, in which we've set up a couple of automated factors. One of them is groundedness count, and another is fluency. It's actually checking the AI metrics for groundedness and fluency and automatically calculating the risk rating. We can perform the control assessment, and these are the two controls that have failed or are non-compliant, so I'm just marking them as non-effective. And when I mark them as non-effective, my residual risk is going to go high. You can see the residual score is high here. So this basically gives us the assessment of the risk and where it is going.

And the next step is, how do you remediate this? Here is a risk response, where you can see the deviation: a particular control is not compliant, or the AI policy is not being followed. And the cause of this is improper segregation of data. The prompt flows are not approved to pull from the mixed data, but the chat bot is actually pulling from your training data as well and giving you incorrect results. So the remediation, of course, is that you need to work with your AI team, retrain your model on the proper data, and make sure there is no risk of hallucination there. That's how you would go ahead and mitigate the failure of your control and make sure that your risk is not high.

And while we can't prevent issues from happening, organizations can be better positioned to monitor and respond effectively. What we see here is an operational resilience dashboard for a business application or a business service. In this case, it is for the Contoso e-commerce website. Dana, who's the service owner of this application, can come here and monitor all the aspects of risk and compliance, and other operational metrics as well. Here, you can see the new issues that are created, the failed controls, and the risks. Even the privacy cases can be monitored from here. And apart from that, you can also look at the incidents, change requests, outages, vulnerabilities, and security incidents. All of it can be managed and monitored here. So Dana can monitor everything around a business service, not only reactively but also proactively.

- Awesome. Great demo, Anushree. So now we're gonna wrap it up. Dana is excited. She's already seeing revenue up, with the AI chat bot recommending products to customers that they wouldn't normally find through search results or filters. She's confident in their AI program: the AI policies are built into their plan, build, and run workflows, and the right people are engaged. And as you can see, we can monitor these risks and controls in real time, which gives her the confidence to expand AI to other parts of the company. So, that wraps up our demo, guys. If you have any questions, we've got Anushree and we've got Anish here, and we've actually got a little bit of time. Or you can hang out, too. Join us in zone two. We've got the risk floor, where you can ask questions. There are also topic tables and 15-minute demos that you can sign up for, and the knowledge bar, where you can speak with a product expert. And then today at four o'clock, we've got the cybersecurity and resilience keynote. So we hope to see you guys there.

- Thank you so much.

- Thanks, guys.
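[Closing aside: a minimal sketch, in Python, of the residual-risk logic shown in the demo, assuming a simple qualitative scale. The scale and the step-down rule are illustrative assumptions, not the product's actual calculation.]

def residual_risk(inherent: str, controls_effective: bool) -> str:
    """Non-effective controls provide no mitigation, so residual risk stays at the inherent level."""
    scale = ["low", "moderate", "high"]
    if controls_effective:
        # Effective controls step the rating down one level.
        return scale[max(0, scale.index(inherent) - 1)]
    return inherent

# The demo marks both controls non-effective, so the residual score stays high.
print(residual_risk("high", controls_effective=False))  # -> high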

View original source

https://players.brightcove.net/5703385908001/zKNjJ2k2DM_default/index.html?videoId=ref:SES2961-K24