Platform Sensitive Data Discovery and Real Time Data Anonymization HRSD
All right, everyone, I think we're going to get started. I know we have folks still trickling in, and I know there are a lot of conflicts this morning, but let's get started. As I said, good morning, good afternoon, good evening. This is our Academy session as part of platform security, and today's topic is platform sensitive data discovery and anonymization. I'll talk about where that fits into the broader scheme of everything we offer as part of the platform security team, but most importantly I want to talk about what's new in this release for detecting and protecting sensitive employee and customer data.

Quick introduction: my name is Andrew Vlo. I'm a principal outbound product manager on the platform security team. I formerly had the opportunity to lead an inbound team, but recently made the switch to outbound to help get our products and platform security components into the hands of customers, because security is not a feature, it's an expectation, and we have fantastic foundational and premium products that we offer. You're probably familiar with the name Forat Saleem. He's unable to make it. I can't go into much detail for privacy reasons, but we're wishing him and his family all the best. He has some exciting developments happening. I can't say more, but I know he was really excited to be here, and he spent a lot of time getting ready for this presentation. You might know Forat: he worked on Virtual Agent, Live Agent, NLU, Employee Center, and AI Search, and for the last couple of years he's been hosting these Academy sessions. He'll be back soon, but he's not going to be with us today. I have someone wonderful joining us instead, and I'm going to introduce Bara Baia; I hope I pronounced that right. She's my partner in crime on the inbound side for data privacy, so she's the person that's
working with engineering and prioritizing the roadmap. She has graciously joined, not just to listen in: she's actually going to give a demo of some of the things I'm going to talk about, and then she'll be around for Q&A. Bara, do you want to say hello to our participants?

Hello, everyone. Thank you, Andrew, for such a warm welcome. Yes, I'm super excited for what's happening with Forat and looking forward to hearing more from him. I'm very excited to be here today and also excited to demo what's new in Washington in data privacy.

Awesome, Bara, thank you. Just the normal Safe Harbor notice: by looking at this screen and attending this Academy session, you acknowledge that we're going to make forward-looking statements that may influence our quarterly reports and findings. Most of the things we're going to talk about today are what's new in Washington, so these are things that are GA, but occasionally, especially with Bara here, she might get excited and talk about some of the things we're working on. So just be aware this is confidential, sensitive information and should not be disclosed without checking with your governance team.

With that out of the way, let's talk about the agenda. We're going to do a quick overview of what ServiceNow data privacy is, talk about some of the new features in the Washington release, go through a demo, and talk about some of the ways you can stay engaged; we actually have a blog post that went live today. Then we'll turn it over to Q&A, and Bara and I will try to answer some of your questions.

With that said, let's talk about ServiceNow data privacy. The theme is that we have purpose-built capabilities, both foundational to the platform and part of our premium bundles, that are there to safeguard sensitive data on ServiceNow instances. What that means is our target in terms of who our
customers are: ServiceNow customers and partners that host both employee data and their customers' data. So if the company interfaces at all with, say, a B2B or B2C model, that is in scope. Some of the sensitive data could be from employees, like start of employment and Social Security number, or it could be from consumers, where they enter their credit card information. Our team, as part of platform security, specifically focuses on that persona, building the products and tools for customers to properly handle sensitive data.

Now, I always like to tee things up with: why does this matter? If you've recently read some of the headlines about continued data leaks, privacy is very much in the eyes of consumers. The reason that matters is that consumers are effectively employees too. I'm a consumer of a bunch of products, for example Amazon, but at the same time I'm a ServiceNow employee. If I'm concerned about how my sensitive data is being handled, that applies to my employee workflow profile as well as my consumer profile for online shopping and things like that. And the sentiment is pretty clear: 87% of consumers say they will not do business with a company if they have concerns about its security practices. I think we all know that's the essence of why we have Vault. More importantly, consumers would not buy from an organization if they didn't trust it with their data. And lastly, businesses reported that customers would not buy from them if their data was not properly protected. By the way, this is from a study that PwC did in 2017, so I sense that if the study were done again today, those numbers would be a lot higher. That's because, with more data flowing, more IoT devices, and even the monetization of data, it's very much top of mind for consumers and employees alike. So that's the
problem statement, and I know from the conversations we have that our customers are very interested in taking steps on this. There is a desire from a company perspective to address customer needs; that's product management 101. But even if there is an interest from our customers to invest in properly safeguarding sensitive data, there are still all kinds of compliance standards and regulations that businesses have to pay attention to. So the key takeaway here is that data privacy is really a top priority in business strategy, from the chief legal officer to the CISO to the CEO and CMO, and then risk and compliance.

Just to give some examples of the kinds of challenges our customers face when it comes to data privacy: one is data governance for personal information of employees who are no longer part of the organization. What's the retention span, depending on what compliance and governance is in place, and how do you properly handle that information and anonymize it? Another example is GDPR compliance, including what is also called the right to be forgotten. You need to know where the PII data resides and create an inventory of it so that you can properly classify it, handle it, and dispose of it as needed. HIPAA, of course, in medical. One of the common use cases that Bara and I have seen with all of our data privacy tools is that we can't have sensitive data in test and dev environments, because it poses a risk of exposure, so the data is anonymized before it gets put into those environments. And last but not least, privacy policies. I think a lot of customers, especially with GenAI (hint, hint, I'm giving you a preview of what we're going to be talking about at Knowledge), have a growing desire to make sense of data: we've acquired all of this data, and we want to share it and make sense of it. That's fine, there are all kinds of different reasons for that,
but there are privacy policies about what can and cannot be done with data. So before it can be used by contractors, third parties, or even internally, you have to review that and then, more importantly, properly normalize or handle the data before being able to work with it. So that's me talking about the importance of data privacy and what it means to our customers and partners.

Next, a brief overview: data privacy is just one of the controls in the holistic need our customers have for securing their data and assets on the ServiceNow platform. We're going to be talking about data privacy today, which is available as part of our Vault bundle, but it's also available as a standalone SKU. So if, for example, a customer is maybe not quite ready for Vault but wants to take advantage of the data privacy capabilities, they can do so. That really has to do with the size of the organization and where they're at; we're trying to meet them there, and that's why we have a standalone SKU. But for a really comprehensive solution, just really quickly: we need to think about whether it makes sense to anonymize the data, or whether the right answer is that the data should be encrypted. Then there's code signing, making sure that MID Servers have not been tampered with. We have Log Export Service for sending logs out for review. Secrets management, again, for sensitive keys and maintaining API keys and secrets on the platform. And the newest is Zero Trust Access, which is all about advanced authentication controls based on who you are, narrowing the frame of access depending on data patterns. So, really exciting. I'll talk a little bit about other upcoming Academy sessions that go deeper on those.

But back on topic, let's do a quick recap of what ServiceNow data privacy is. It's made up of three components. One of them is foundational to the platform, meaning
customers can start using it today, and that is classifying your data. Before you can decide how you want to act on your data, it's really important to classify it and understand what kind of data it is. Is this sensitive data? Is this data that could be very risky and damaging to our organization if it leaks to an unintended external party? What about internally? It might not even be that the data is sensitive to leave the organization, but internally, should, for example, someone in HR have access to someone's codebase? It's not a perfect example, but it's about data segregation and segregation of duties. That's where classification comes in. Let's say you do have all of these different compliance drivers: you can't really get started on meeting them until you go through and figure out what is sensitive and what's not. Customers can start doing that today, but what we've been hearing, and what we're trying to do, is help customers save time and get back to focusing on their core business. That's why we have the data discovery component and the data anonymization component as add-ons that we think are going to save them a lot of time and resources.

So let's say you're classifying your data, but you have so much data that you're having a hard time knowing where sensitive data could exist. That's where data discovery comes in. It allows you to formulate data patterns and look for particular kinds of data, such as Social Security numbers, credit card numbers, phone numbers, etc. That helps you discover the sensitive data and then properly classify it, so those two really go hand in hand.

Now, okay, you've classified the data and you've discovered it. Sometimes the whole point, and the reason why we
want to mitigate the risk of data being leaked, is that even if we know it's sensitive data, storing it in plain text is not an option. We want to make sure that if an unauthorized, unintended party did access the data, the company would be safe from the data being extracted. That's where data anonymization comes into place. I'll show you some examples: that's like overwriting a Social Security number or a phone number with values that cannot actually be consumed by someone looking at the data, both at the application layer and at the database layer. So if someone got into your database, they're not going to be able to do anything with that Social Security number, because it's been properly anonymized. That's the overview of data privacy.

Let's get to the meat and potatoes of this presentation and talk about what's new in Washington. The first thing we're going to talk about is data discovery and an enhancement we've made to data pattern matching for partial anonymization. What does that mean? In this example you can see a screenshot with the test data input. You see that there's a Social Security number that the user would like to change, and then "here's my phone number." The partial anonymization that we've added for data discovery allows you to anonymize the sensitive information here, which is the Social Security number and the phone number, without losing the context of what the conversation or description is about. Think about an incident with a description field. It's important for agents to have the context that the customer would like a change, and to be able to review that, but they also don't want that sensitive information stored on the back end in the database. This is where partial data anonymization comes into place: if someone was to
consume and look at that ticket, they would not see that sensitive information. It's possible that one of the agents would have been able to see this information, but the moment it becomes anonymized, no one else can see it. So you're really narrowing the blast radius of that sensitive data being exposed. That's very important for our customers. It's been on the wish list for a while, and it has come out as part of the Washington release.

The next one I'm going to talk about is data discovery and keyword support in data patterns. This one's really interesting and cool. For those of you who haven't attended these sessions before, we already had the ability to use regular expressions to detect sensitive data and properly handle it. Now, what do you do in the scenario (I'm going to see if I can hover my mouse around this) where you have something like a date, the date is in a specific format, and there are multiple dates in that format? That could be a date of birth, or it could be an employee's start date. How do you target the specific piece of information that you want to discover and anonymize? That's where this capability, with keyword pairing and proximity detection, comes into place. It allows you to define a key for what you're looking for. Let's say that instead of the employee start date you're looking for the date of birth; the key is date of birth. So if in a particular database it was "DOB," you would add the keyword "DOB"; if it was "date of birth," you would add "date of birth." That way, when we go through and discover the sensitive data, we're able to target that particular key-value pair and then properly handle that piece of data. It's very powerful, and it allows you to be much more precise and targeted
with your discovery of sensitive data on a ServiceNow instance. That's one I'm really excited about, and I think we have several customers that are already getting their hands on it today.

Then the third one, which I think is the most exciting; again, I'm going to have Bara talk us through the demo, and we have a lot of exciting thoughts on where we're going with this. For the first time ever, we are introducing data privacy APIs on our platform. We're calling them real-time APIs because as soon as you call the API, we perform the action, and these APIs apply to both data discovery and data anonymization. Why is this important? In the example here we have the short description and the description, and again we have some sensitive data there. Maybe an agent entered that information, maybe the customer entered it. The moment the update or commit has occurred, those APIs can be called, for example from a business rule, and the data can be anonymized right at the time of entry. If you can imagine an incident, that's really powerful. It's similar to the use case we talked about before: handling the data as it enters the instance and the perimeter. So effectively we're creating an API that acts like a data privacy firewall, giving you the opportunity to handle the data as soon as it enters the instance so that it is properly anonymized. That also helps with the keyword support and things like that.

So those are the three key features of the Washington release. I know that me talking about it is just part of it, so at this point I'm going to turn it over to Bara, and she's going to talk through a scenario and do a quick demo of how we can put all of these things into motion. Bara, I'm going to stop sharing my screen. I think you
got it, and over to you.

Awesome, thank you so much, Andrew. That was a lot of information, but very nice information. So, have you ever come across a scenario where you have a lot of incidents reported by your employees or your customers and, maybe accidentally, or maybe because it was needed to get the incident resolved, they added some PII, some sensitive information, to the incident? All of us know it's not safe to have sensitive information floating around in incidents, because a lot of people have access to them. So there is a legal liability, a legal risk of sensitive data exposure. In that scenario, what do you do? How do you mitigate it?

On this instance I have HR Service Delivery installed, and I also have Data Discovery installed. Data Discovery is a store app, so I have that installed along with the Data Privacy store app. First let's go to Data Discovery. Under Data Discovery, the first step is always to look at all the data patterns. Right now I have a few data patterns; some are shipped out of the box, and a few are custom. What's new in Washington is that every data pattern has an associated privacy technique. Here we have Credit Card American Express, which is an out-of-box data pattern, and by default it has the privacy technique "selective replace with X" associated with it. The data privacy admin does have permission to change the privacy technique associated with a data pattern. What this association essentially does is that wherever Credit Card American Express is discovered, only that particular pattern is anonymized with the selective replace. That keeps the complete business context alive: you still have the business context, and we are only anonymizing the sensitive data.

Now let's
look at a custom data pattern that I have created, which is date of birth. If I go to date of birth: there isn't really any difference between the format of a date of birth and a date of hire, so what I am doing here is date of birth with keywords. I have the regular expression for MM DD YYYY, and along with that I'm searching for keywords like "DOB" and "date of birth." Wherever I find this expression along with these keywords within a keyword proximity of around 20 characters around the expression, that's where I will confidently say that yes, this is a date of birth, and this is how I will discover it. The keyword support and the keyword proximity are new in Washington, and they give more confidence in the data pattern, because regex can be complicated and regex can result in false positives as well. With keywords we are just making it more precise. Once I have all the data patterns here, this is essentially the first step, and it is not a step the admin will do over and over again. You set everything up in the beginning, and after that it's mainly just configuration.

Next I'm going to go to active data patterns. This is the list of data patterns that I want to be scanned, either as part of the next job or as part of real time. What's new in Washington is that every data pattern has a priority, or order. In case there is a conflict between two regular expressions, the order decides which regular expression takes priority and is anonymized first. If I go to edit here, I see the available list, and I can move patterns to the selected list. With the up and down arrows I can pick the order of the data patterns.

Now I'm going to go to the second store app, which is Data Privacy. Under Data Privacy we have classification and we have anonymization. Under anonymization there are
multiple techniques that are shipped out of the box. Everything that you see here marked as base, like data pattern anonymization, selective replace with X, and selective replace, is a base technique. Along with these base techniques I have created some custom techniques, for example "replace date of birth." If I go to replace date of birth and click on edit, this is using the base technique of selective replace. For date of birth I want to keep the format, so in this case I am excluding the hyphen and replacing the sensitive characters with an asterisk. You also have the option of anonymizing either fully or partially. In the case of a credit card number, you may want to keep the last four digits, and maybe the first digit as well to understand whether it comes from Visa, Mastercard, or Amex, so we can have the start index as 2 and the end index as 12. But for now I just want to keep the format; I do not want to keep any sensitive data, and that's the reason I am starting from 1. Similarly, for "selective replace SSN" I have done the same. I'm using the selective replace technique; if I go to next, I'm going to start with 1, exclude the dash, and replace with the asterisk.

Now, what we have done in Washington is come up with data privacy APIs, which are basically two different APIs: one is the data discovery API and the other is the anonymization API. These APIs can be called either from business rules or from Flow Designer. The beauty of these APIs is that they give you a lot of flexibility: you can discover data on insert, or you can discover and anonymize data after a particular condition is met. In this scenario I have created a business rule in which I am sanitizing the description and short description associated with the HR case. Here I'm going to monitor the HR case table. The rule is active and advanced, and I'm going to trigger this business rule as soon as the state of the HR case is
changed to Closed Complete. Now think about a scenario in which you have sensitive data in an incident, but you still want to keep that sensitive data in order to resolve the incident. This is when you would use the state: you want to keep the sensitive data, but as soon as the incident is changed to Closed Complete you want to anonymize it, so that's when you would call the business rule. This is where I've specified when to run. If I go to Advanced, I literally have a three-line script for description and short description in which I'm just calling the APIs to discover and anonymize the sensitive data for everything that was in the active data pattern list.

With that, let's go and see how it works. Here I have HR cases. Let's look at an HR case which ends in 10. If I open it, right now the state is Ready. Under short description I don't have any PII; it just says "update SSN and date of birth." But under description I do see a lot of PII: "unable to update SSN to XYZ," then the date of birth, "in the HR portal." So this user is probably a new user who is not able to update their personal information in the HR portal. In this scenario, in order to resolve the incident, we need to have the SSN and date of birth so that we can update them in the HR profile. That's the reason we are going to keep them until the incident is closed. Under description I have the SSN, the date of birth, the email ID, and also the phone number. Let's say the agent goes ahead and updates the HR profile with all the info. I change the state to Closed Complete, and once I've done that I click on Update. I go back to the HR case, and under description it is all anonymized. The SSN has been anonymized, but the format of the SSN is still intact, based on the technique that we used. Similarly, the date of birth, the email ID, and the phone number are all anonymized. Once
sensitive data is anonymized, it is actually changed in the database. There is no way to get it back, and this is what makes it so unique and special, because from a privacy standpoint you do not want the legal liability of keeping sensitive data you no longer need. In this scenario we needed the sensitive data just to update the HR portal. Once the HR portal is updated, we don't really need to keep the sensitive data. We still need to keep the incident and the business context for audit and record purposes, but we don't need to keep the sensitive data. That's the reason we've anonymized the sensitive data but kept the business context: when the incident was opened, when it was closed, what the user wanted. We have all the business context here without oversharing and without any sensitive data leakage. Thank you. With that, Andrew, I'll pass it over to you.

Sounds good, thank you, Bara. At this point I'm going to turn it over to questions after that demo. I know we had one, and feel free to use the Q&A feature of Zoom. We had someone ask... oh, Bara, I see you're typing an answer already. Do you want to just answer it live?

Yeah, I can just answer live. The question is: does data discovery run on all tables, or can you select where you may not want it to obfuscate the data, like HR case management? Something I missed showing in the demo is target tables. Under target tables you can pick the tables that you want the discovery to run on. For example, you can pick just the incident table, just the task table, or the HR table, and the discovery and anonymization will run only on those tables, keeping the sensitive data in other tables intact without even touching it. Thank you.

Thank you, Bara, that's wonderful. I think the next question we have is about just helping
customers get started. Let's say that today they might be familiar with some level of classification, but maybe they're not feeling like they're doing enough. What's the best way to go from just getting familiar with classification, and when do you introduce data discovery to your suite of, quote unquote, tools in your portfolio?

Yeah, that's a very interesting question. Data classification is installed by default; the plugin is installed by default, excuse me, as part of the platform capability, so anyone can start using classification right away. Classification really helps to create an inventory of data. There isn't really a NIST standard for data classification, and it depends on the data governance of every organization how they want to classify the data. I have seen customers classify with just one single layer, which says sensitive, restricted, public. There are other customers who classify based on workflows, like "this is the HR workflow," and under the HR workflow they can have subclasses as well. So it's really up to the customers what works best based on their legal practices and their data governance practices. Classification by itself can help you if you know where you have sensitive data. For example, under the HR profile I know I have a column for name, and I just want to classify it; I can simply go ahead and classify that. But what if you have sensitive data in places you don't know about? It can be dark data that is where it should not be, or it can be where it should be, but I just did not know about it. That's when data discovery comes into the picture: it can help you discover and uncover these different areas. Once you've done that, you have the option of either deleting or anonymizing that data if you don't need it, or you have the option of classifying
the data, so that you have a clean inventory and a good way to start the protection mechanism, or to come up with the whole protection strategy for your organization.

Wow, that was very succinctly said, Bara, thank you. We had one other question, and I think this is a follow-on to data discovery running on tables. There's a question about handling the audit history. I don't have any more context than that; is that enough for you to be able to answer?

Yes. As of today, we cannot discover and anonymize sensitive data in the audit history, mainly because we want to keep the audit history immutable for compliance purposes. But I'm not surprised by this question; it has come up a few times. It would be great to understand a bit more about the use cases: why you would like to anonymize sensitive data in the audit history, and how we would handle compliance thereof. This is an item I'm working on as part of the roadmap. I do not have any commitment for this. At this stage I am just trying to understand more use cases and the different areas where anonymizing audit history would be valuable.

Yeah, that sounds great. I think whoever posted that was an anonymous attendee, so hopefully you have a way to reach us, and we're happy to sit down and have a chat with you if that's a requirement for your customer. I don't see any other questions in our backlog. Bara, do you have any final parting thoughts or anything you would like to leave us with? Otherwise I'm going to cover more resources and wrap this up.

Yeah, so why don't we cover resources. The other thing I would like to add is that you can try both data discovery and data anonymization today in your sub-prod. For prod, yes, you definitely need entitlement, but for testing in sub-prod you do not need entitlement. So test it out and see what use cases it works for you,
what different use cases it will work for, and we'll be happy to get any feedback from you. Any concerns, questions, feedback, or roadmap items, we'll be more than happy to discuss further.

Awesome, thank you, Bara. As I mentioned, there are several QR codes here. You can go through our list of platform privacy and security recordings on YouTube; this one is specifically for our data privacy track, which goes deeper into classification, discovery, and anonymization. There's also our product documentation, where you can get more details on activating the plugins and all the different bells and whistles in the product. And last but not least, we have our community blog and site, and today we actually launched a blog post about our real-time API and real-time discovery and anonymization, so definitely check that out. That goes hand in hand with this Academy session.

The only other thing I wanted to mention is that we will have additional Academy sessions. This is our first one of the year, so welcome back. The next one will be on Zero Trust Access session validation. We're going to talk about Access Analyzer, which has been growing tremendously, and specifically user comparison; platform encryption; and then in August, in the summertime, we're going to get into access controls, which I know is a big top-of-mind item for our customers.

Last thing: I'm not sure if anyone on the webinar is going to be at Knowledge, but definitely check out some of our talks there, specifically "Safeguarding sensitive data where it lives with ServiceNow Vault," mainly through the encryption angle, though sensitive data handling also applies to data privacy. Then Bara and I are going to be talking about "Navigating AI privacy with confidence"; that's a big topic at
Knowledge. And last but not least, we're going to talk about how, overall, we address the various security challenges customers have with ServiceNow Vault, and that's going to be with our VP/GM, our outbound leader, and our field security team leader.

With that, I'd like to thank everyone so much for their time. We look forward to interfacing with you at the next Academy session. Hopefully you learned something new about data privacy today, and don't hesitate to reach out to Bara and me. We are here to answer your questions, and we really want to do right by our customers. With that, thank you so much, have a wonderful rest of your day or good night, and we'll talk to you all soon. Thank you.
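
Editor's note: the keyword-proximity matching and the "selective replace" technique described in the demo can be illustrated with a small, platform-independent sketch. This is not the ServiceNow data privacy API; the session does not show its names or signatures. Everything below (`DataPattern`, `selective_replace`, `keyword_nearby`, `anonymize`, the sample regular expressions, and the choice to count index positions over every character including separators) is a hypothetical stand-in written in plain Python. It only mirrors the behavior as described: a regex plus optional keywords within roughly 20 characters, a 1-based start and end index, excluded characters that preserve the format, and patterns applied in list order.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DataPattern:
    """Hypothetical stand-in for a data pattern plus its privacy technique."""
    name: str
    regex: str
    keywords: Tuple[str, ...] = ()  # empty: the regex alone is trusted
    proximity: int = 20             # characters searched on each side of a match
    start: int = 1                  # 1-based index of the first character to mask
    end: Optional[int] = None       # 1-based index of the last one (None: to the end)
    exclude: str = "-"              # characters kept as-is to preserve the format
    mask: str = "*"


def selective_replace(value: str, p: DataPattern) -> str:
    """Mask characters in positions start..end, skipping excluded characters."""
    end = len(value) if p.end is None else p.end
    return "".join(
        p.mask if p.start <= i <= end and ch not in p.exclude else ch
        for i, ch in enumerate(value, start=1)
    )


def keyword_nearby(m: re.Match, p: DataPattern) -> bool:
    """True if the pattern has no keywords, or one occurs near the match."""
    if not p.keywords:
        return True
    lo = max(0, m.start() - p.proximity)
    window = m.string[lo:m.end() + p.proximity].lower()
    return any(k.lower() in window for k in p.keywords)


def anonymize(text: str, patterns) -> str:
    """Apply patterns in list order (assumed to mirror the active pattern order)."""
    for p in patterns:
        text = re.sub(
            p.regex,
            lambda m, p=p: selective_replace(m.group(0), p)
            if keyword_nearby(m, p) else m.group(0),
            text,
        )
    return text


# Illustrative patterns only; real out-of-box patterns are not shown in the session.
ssn = DataPattern("SSN", r"\b\d{3}-\d{2}-\d{4}\b")
dob = DataPattern("Date of birth", r"\b\d{2}-\d{2}-\d{4}\b",
                  keywords=("dob", "date of birth"))
card = DataPattern("Card", r"\b\d{16}\b", start=2, end=12)

description = ("Hire date 03-04-2020 is already on file. "
               "Please set my date of birth to 01-02-1990, SSN 123-45-6789.")
print(anonymize(description, [ssn, dob]))
```

In this sketch the hire date is left alone because neither keyword falls within 20 characters of it, while the date of birth and the SSN are masked with their hyphens intact, so the sentence keeps its business context. On the platform, the equivalent call would be made from a business rule or Flow Designer when the configured condition is met, such as the state changing to Closed Complete.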
https://www.youtube.com/watch?v=-6CX0IbWQvU