NJP

Salary Survey Analysis Tool: Part 1

Import · Oct 16, 2020 · video

[Music] All right, we are doing something a little bit different tonight. Of course, I've been working on processing the salary survey results. No, I will not be sharing the results or the raw data on this stream, but I have been working on a tool to help me analyze the data, and I'm a little ways into it at this point. It is not on ServiceNow; I've tried to keep it on my local server, again for data protection purposes. But I figure it's code, and we might have something worth showing here, so why not walk through it.

Just to catch you guys up on what I've been doing: I have spun up an nginx server and I've stuck this little salary survey analysis tool on it. It's basically just going to be a static site. (I don't know why that guy's down there.) Anyway, I've got my index.html here, I've got an app.js file, and I'm basically using ES6 modules to run this whole thing. So it's a little different from what we usually end up showing on ServiceNow. UI-wise this is probably a lot closer to the UX framework, although this is not the UX framework, I want to emphasize that, but the pattern is much more similar to what you would expect to see in the UX framework.

So let's see what I have developed so far. The first thing that I built was this little variance library here. Just to give you guys an idea of what I'm looking at: the analysis I'm looking at performing is very similar to a regression tree. It's a machine learning technique, and it basically allows you to take data sets and split up those data sets. Trying to find a good illustration of it... [Music] Medium, here we go. So here's an example data set. I've actually been using this particular site a little bit as I've been building this out, and in fact it's where my dummy data comes from. You can see the raw data here: outlook, temperature, humidity, whether it's windy or not, and how many hours were played. Is this golf
or something like that? I'm not sure, but basically we are trying to predict what the hours played will be, given the outlook, temperature, humidity, and windy. This is very similar to what I'm trying to do in terms of predicting salary based on the responses to the survey: if a given person answers a certain way, what is their most likely salary?

So this is how decision trees work and how they split out. Standard deviation, of course, is just the square root of variance, and variance is one of the metrics we can use to determine where the splits should happen. A split is basically just identifying an important criterion for the salary.

In building this out I had to build some statistics functions. Over here I've got calculate the mean (the average), calculate population variance, and calculate sample variance. I plan on using the population one, but just in case, I threw sample in there. We've also got calculate probability. Basically, we calculate the reduction in variance when we make the split into sunny, overcast, and rainy, and we multiply each of those individual variances by the probability that rainy appears, or sunny appears, or overcast appears. So there's a little bit of statistics in this, a little bit of math going on, but I have built out a basic library here to handle a lot of the mathematical aspects.

I have also created a store, which really is just a poor man's Redux. I'm trying to keep this one dependency-free, and I've used this store mechanism in a couple of other projects, so I'm going to continue to use it here. The basic functions of it are dispatch and subscribe, and of course the store takes a reducer, much like Redux, that it will process actions through. There are a couple of differences.
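For readers following along, here is a minimal sketch of what a dependency-free store built around dispatch and subscribe might look like. The stream never shows the store's source, so `createStore`, `getState`, the `EXAMPLE_EVENT` action type, and the counting reducer are all assumptions made for illustration, not the code on screen.

```javascript
// Hypothetical Redux-style store: names and shape are assumptions,
// not the store shown on the stream.
function createStore(reducer, initialState) {
  let state = initialState;
  const subscribers = [];

  // Run the reducer, keep the new state, then notify every subscriber.
  function dispatch(action) {
    state = reducer(state, action);
    subscribers.slice().forEach((listener) => listener(state, action));
    return state;
  }

  // Register a listener; the returned function removes it again.
  function subscribe(listener) {
    subscribers.push(listener);
    return function unsubscribe() {
      const index = subscribers.indexOf(listener);
      if (index !== -1) subscribers.splice(index, 1);
    };
  }

  function getState() {
    return state;
  }

  return { dispatch, subscribe, getState };
}

// Illustrative reducer: counts how many EXAMPLE_EVENT actions it has seen.
function countReducer(previousState, action) {
  if (action.type === 'EXAMPLE_EVENT') {
    return { dispatched: previousState.dispatched + 1 };
  }
  return previousState; // default: always return something
}

const store = createStore(countReducer, { dispatched: 0 });
const seen = [];
const unsubscribe = store.subscribe((state, action) => seen.push(action.type));

store.dispatch({ type: 'EXAMPLE_EVENT' }); // listener is notified
unsubscribe();
store.dispatch({ type: 'EXAMPLE_EVENT' }); // listener no longer notified
```

The reducer stays a plain function of previous state and action, which is what makes it easy to test without any UI in place.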
There's a lot of folks that'll use different types of plugins on Redux to handle asynchronous server calls and stuff like that. I'm actually using something that is a little closer to [Music] the inspiration for Redux, which is a tool called Elm, so I have the ability to do asynchronous dispatches within reducers. I don't know if I'll actually end up using that in this particular project. In one that I call UI Flow, which hopefully I'll be revealing publicly soon, the dispatch async is used heavily, but in this one we'll probably just be using dispatch and subscribe. But again, that's my poor man's Redux.

Then we've got some actions. All I've created so far is just the app loaded action, and yeah, it's looking real exciting right now. I think we've got a couple of people viewing in here. How's it going, guys? Then of course there's our action creator, which is just a function that returns the object, and I'll give a quick preview of that one in just a moment. And then we've got the reducer. So again, we're following a React-Redux style but eliminating all of the boilerplate for it.

For those just joining: this project is not actually on ServiceNow, and I'm doing that for data protection purposes. I'm doing it on just a local server that I'm running here on my machine. But some of the patterns we're using are going to be similar to what you'll end up doing in the UX framework, from my experiences with it so far, so the skills are at least transferable.

The last thing I've got here is my raw data. This is not the actual survey data; this is some dummy data that I got from the site that I had up a moment ago. All it is is an array of objects where each property is a column in the table. So we've just got a handful of pretend survey responses, and our goal is essentially to calculate, when we split the data set on given properties, which
ones most influence our target value, which in this case is hours played. When I actually plug the survey responses in here, our target value is going to be salary, and what we're looking for is which of the survey answers reduce variation the most in terms of our salaries. So in this case we're going to use all of these fields as our quote-unquote survey responses, and this one is going to stand in for our salary.

I'm going to continue building out some of this stuff tonight. When I am ready, I hope to have a tool where I can plug these values in and have it spit out the variances, so that I can select the ones with the lowest variance, and also hopefully output a histogram. The histogram is just going to help me visualize it, because part of the whole machine learning side of this is having a subject matter expert evaluate the data that comes out of it. Without that evaluation you're just throwing numbers around.

So with that, I'm going to do a quick inspect element here and drag this over, and I'll show you how my store is working right now. Actually, let me get rid of a comment real quick. That'll be in app.js. I've got the store subscribe, and that's outputting a console log; I'm going to get rid of that one for now. Let's shrink that one this way and this one that way, and we'll try to get a little bit more on the screen here. I may end up rotating these around a little bit.

All right, so if I clear out my console down here and reload the page... oh, it crashed. We'll restart that. Where is it? This guy. I am running it as a Docker container, and I am using VS Code's remote tools, which I've got to say I have absolutely loved working with. Okay, there we go: local simple web server. Now I've got to go to the survey analysis tool. There we go. All right, so the
survey analysis tool has loaded up, and you can see I've got a couple of logs coming out here. These logs are coming from my poor man's Redux. Anytime I call the dispatch function, the finally section outputs the action that was dispatched and the resulting state. This is a really great way of logging it. Of course, if I was using Redux I could use Redux tools and some other fancy stuff, but again, I'm trying to minimize dependencies here, and I wanted to understand how Redux worked under the hood, so I wrote my own for the fun of it.

You can see here that I dispatched app loaded, and my new application state has a series of responses. This is coming from my responses reducer here. app.js is using that reducer (you can see the import up here), and we are using that responses reducer to build our application state. So basically, anytime I dispatch anything, this function here gets run and rebuilds a new application state. This is going to give us real good separation between our actions, rebuilding the state, and then ultimately outputting it to the UI, which I haven't quite gotten to yet.

So with that, I've got my responses working; it's outputting my data. Now the first thing that I want to do is use these variance calculations to compute, for each property other than my target property, its reduction in variance if I split the data set on that property. So I'm going to come over to my app, create a new file, and call it variance split. For this one I'm going to need to import something, I'm not quite sure what yet, from my lib folder, and I'll figure out what that import needs to be in a minute. Let's go. We'll call this... we'll
call this property split variances. It's going to take the previous state and the action, and it needs to return a new state. One thing I like to do right from the jump with this method is to default to returning the previous state if we don't do anything else. That way we're always returning something. [Music]

Now, in addition, I'm going to jump over here and import... actually, wait a second, I need to export this one first. Personally, I like to do exports at the top. I've seen that most folks using ES6 seem to do inline exports like that. I like to be able to go to a single line up at the top and identify everything that the module is exporting. Maybe that's just my preference from the revealing module pattern coming into play, I don't know, but that is a personal preference of mine. So now we've got the export, and I need to import that in here from dot slash.

I will say this approach using ES6 is weird in some ways. I feel like it's a lot more verbose. There's a lot more typing and a lot more bouncing around compared to what I usually do in ServiceNow, especially with AngularJS providers. But at the same time, for all the jumping around and importing and exporting, I feel like the code structure is so much more effective. I feel like I've got so much more power at my fingertips in terms of where I put stuff.

Let me just show an example. This here is some code from something I've been working on called UI Flow; I mentioned it a little bit earlier. Where's the big one? There's an Angular provider in here. This guy. This is just a single Angular provider, and it's broken up; you can see I've got different modules on different lines. If I was writing this in ES6, every single one of these would be a different file. The code organization that I had to go through to get this whole thing to
actually make sense was insane, and that was on ServiceNow in an AngularJS provider. ES6 makes this structuring so much better as far as I'm concerned, but there is quite a bit more typing.

Property split variances is going to be... actually, I want to rename that one, because I want to keep reducer at the end of the name just as a convention. Dot property split variances, pass the action, and then we'll come back here and rename this guy. All right, now if I refresh, hopefully there are no errors, and in my new application state I should have two properties. There we go: we've got responses, and now we've got property split variances.

I will say there is a lot of work in using the React-Redux style and going through all of the reducers. But especially as projects grow, and I've had a lot of projects that started out seeming simple and then grew to really large sizes, the more I've messed with this Redux style, the more I've kept a separation between commands and queries, and the more I've structured the code in this way, the better off I've been in the long run. So even though it's a lot of typing, I do like it a lot better.

Let's see, property split variances. Ultimately, what I want the property to look like in the end is something like this, where we've got property names. What does my data show again? Let's look at the data. Okay, so we've got outlook and some number, temp and some number, humidity and some number, and windy and some number. This is what I'm hoping the outcome will end up looking like. I want this reducer to build an object that looks like this, in place of this currently undefined object. It's going to iterate through each of these objects' properties, and for that property it's going to
calculate the change in variance if the data set was split on that property. For those joining: when we're talking about splits and variance, I am talking about decision trees and using the reduction in variance as a metric for determining which property to split on. This is going to help us in the salary survey because it's going to identify which questions in the survey have the highest level of impact, or the greatest reduction in variance, on the resulting salaries.

All right, so first I need to get a collection of property names. I'm going to make a naive assumption, because it's my project and I can: the first object in our responses array is going to contain all of the properties. So basically we're never going to leave a property out. In order to build that... actually, I think I can simplify this, because I think we can say Object.keys and use previous state dot responses. Down here we'll have to change that one, because I'm going to pass the whole state into this one. So Object.keys of previous state dot responses should give us each of the properties in there. I do need to ignore, at some point... [Music] You know what, I don't think I like this approach.

Let's do it this way: let's make it part of our configuration. Response properties is equal to, yeah, we'll do it here, Object.keys of responses zero. That should give us a list of those response properties. And we'll make one of these the target property; in this case we'll say hours played is the target property. Let's see, array remove in JS, I think... no, filter. Array filter is what I want. So response properties dot filter, and we'll return property is equal to target property. So that should exclude the target
property, and then I'm going to export target property and response properties, because I have a feeling I'm going to need both of them elsewhere.

All right, so now we'll come back over here and pull in an import. We're going to do [Music] response properties from, and that's actually going to be two up, from data slash responses dot js. Okay, so now we've got our response properties, and what we're going to say here is response properties dot forEach. We want to iterate through each of the properties.

Now let's take a look at our variance library to see which function we can use. I think I've got the function: this calculate two attribute variance. It takes a list of samples, a predictor attribute, a target attribute, and a boolean indicating whether to use sample variance or population variance. I'm going to leave that one blank to default to population variance. I think this is going to give me what I need, because the return on this one is the variance. So let's give this a shot. Variances, forEach, okay. We're going to say that the variances entry for that property name is equal to... and I need this function from that library, so let's add another import.

Hey Andrew, how's it going? We've got Andrew coming in from Australia saying it's a perfect time to do some streaming. [Laughter] I am glad that I can accommodate. Welcome to the party, Andrew. The work I'm doing today is not actually on ServiceNow. I'm trying to keep the survey data out of the cloud for obvious reasons, so I'm running this on my local machine and building it out as a static site using just ES6 modules. Good to hear. He says it's going okay: Friday afternoon at work, getting ready for the weekend.

Let's see, so I want lib slash variances... no, just variance.js. All right, so yes: JAM style, Jace
would be proud. Yeah, totally going JAMstack on this one. I like it both because it has some familiarity to the UX framework, which I haven't gotten enough time to play with yet, so it makes me feel like I'm doing something in that direction, and because it makes me feel like I might actually still have skills as a generic web developer. Hey, hey, peanut gallery back there: you're either in the stream or you're not, you can't just jump in randomly. I'm getting made fun of by Sarah back there. I don't trust you. [Laughter]

All right, so I need to pass the samples in, which is going to be previous state dot responses. Our responses are, well, dummy data now, but they're going to be the survey responses, because we're not sharing the raw data publicly. I feel like I have to say that, and if I say it enough times, folks might believe it.

Predictor attribute... ah, Andrew says he started a side project in Vue.js and Cloudflare. I've heard a lot of good things about Vue.js. I've had a number of people recommend it to me, and I just haven't tried it yet. I learned AngularJS just as it died and became Angular, and aside from Service Portal still using AngularJS, I have felt completely irrelevant for learning the wrong tool at the wrong time.

Let's see. My predictor property is going to be the property I'm currently on. My target property is going to be the one from data responses, target property, so I need to import target property as well. That's my target property, which will be salary in the salary survey. And the last one I don't need. All right, so return variances. For now I'm not going to care about the action on this one; we'll just calculate it no matter what the action is, at least for now. You know what, I changed my mind. Oh, and I don't need that import statement anymore because I didn't use it. So Andrew, what's the side project that you're working on in Vue?
Care to share the details? What am I looking for... oh, actions. So I need action types from the actions module.

It's a bottle management app for your whiskey collection? Nice. How big is this collection? Shelves with pictures of the bottles, virtually? Man, you definitely need to share that one out when you get it going. I want to see this collection.

Actions: import action types from... there we go. Man, I'm a bit sloppy with this. actions.js, it's in the same one. There we go. Okay, so if action.type is equal to action types dot app loaded... whoa, 40 bottles, maybe more? Nice, I want to come visit you. Okay, so return... we'll just return a blank object otherwise. Well, no, because we do need to do previous state dot variances. There we go. Now that guy's ready to go. "Door is always open." Nice.

All right, let's see: import that reducer from app slash... oh, I already have it, I don't need to do anything there. I do need to change this one, though, because we're passing the whole state in on that one now. Okay, reload, and everything breaks. "Cannot read property length of undefined" in calculate two attribute variance. So let's go to our lib variance. Samples: there's an issue with our samples. I see what the problem is.

"Best part of having lots of bottles open is you can share a tiny bit of heaps of bottles and the collection doesn't go down by much." Just a little off of each one, and you get to keep them all.

All right, so here's what we're going to do. Let's get rid of that and do it this way: data responses, responses, since it wants to be stubborn. Basically, what was happening here was that I was calculating the responses and the property split variances in the same step, so when I passed the previous state there was no responses array to deal with. Instead, I'm going to import it from my data and see how that goes. Reload. There we go. Let's see what we've got. Oh, nice. Okay, so hours played... well, it's not supposed to calculate
hours played, so we'll have to deal with that one. But before I dive into why that is popping up when it shouldn't, you can see that what we've got here is a calculation of the variances for each of these properties. Let me pull something up, and maybe we can diagram some of this out. Yes, that's fine.

All right, so basically we're starting out with the complete data set, and we can split this data set up on any of these properties. For example, if we were to split on the outlook, the outlook can be rainy, overcast, or sunny, so in the decision tree we end up with a split that looks something like this. When we split our decision tree on outlook, this is what our data set breaks down into.

Now, you have one, two, three, four, five out of 14 that are rainy. So we calculate the variance of the data set on rainy. That would be the variance of the hours played: how much the hours played varies within just the rainy subset. We multiply that by five over fourteen, which is the probability that rainy occurs. We do the same for overcast, we do the same for sunny, we add all of that up, and that is the variance for that data split. Then we compare that against the variance of humidity, the variance of temperature, and the variance of windy. So this data set can be split this way, or it could be split by temperature, which would be hot, mild, and cool.

Andrew says he's glad he didn't do statistics at university. Yeah, when I was in the Marine Corps I did a whole lot of statistics-type stuff with Lean Six Sigma. They were real big into it at that time. I don't know if it's still a big thing for
them or not, but it was when I was in. But basically, we can divide this data set up by any of these properties, and we can calculate the variances of each value of that property. Doing that gives us a calculation of how much the variance of this set is, and we can compare that variance to the variance of this set. The lowest variance indicates that that particular property likely had the strongest influence on the target attribute, hours played, or, when we plug the salary data into this, on the salary outcome. So in this case we can see that the outlook had the lowest variance, so if you wanted to predict hours played based off of these criteria, it's most likely that the outlook is going to have the greatest influence on where it's going to land.

Then we can turn around and split rainy and rerun these calculations for rainy, and split overcast and rerun these calculations for overcast, and what you end up with is a whole decision tree that says: okay, first, what was your outlook? The outlook was rainy; then what?

In terms of salary, what that means is we can look at the data set and say: okay, the most important attribute here is your geographic location. After that, if you're in Australia, the next one is years of experience; in the United States, the next most influential characteristic might be something else. In that way we can determine which attributes have influence. If you remember from taking the salary survey how many different questions and answers there were in it, you might understand why I need an automated tool to process all of this. Building the tool is probably going to be faster than trying to do the whole thing manually, because again, as you split these trees out, you have to recalculate for each and every one.
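The weighted-variance calculation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the variance library from the stream: the function names (`calculateMean`, `calculatePopulationVariance`, `calculateSplitVariance`, `propertySplitVariances`) and the four sample rows are made up for the example, and population variance is used throughout.

```javascript
// Hypothetical helpers; the stream's library is not shown in full.
function calculateMean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function calculatePopulationVariance(values) {
  const mean = calculateMean(values);
  return values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
}

// Variance of `target` after splitting on `predictor`: each group's
// variance is weighted by the probability of that group occurring.
function calculateSplitVariance(samples, predictor, target) {
  const groups = new Map();
  for (const sample of samples) {
    const key = sample[predictor];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(sample[target]);
  }
  let weighted = 0;
  for (const values of groups.values()) {
    const probability = values.length / samples.length;
    weighted += probability * calculatePopulationVariance(values);
  }
  return weighted;
}

// One split variance per property, excluding the target itself.
function propertySplitVariances(samples, target) {
  const result = {};
  Object.keys(samples[0])
    .filter((property) => property !== target)
    .forEach((property) => {
      result[property] = calculateSplitVariance(samples, property, target);
    });
  return result;
}

// Made-up rows for illustration only (not the stream's dummy data).
const samples = [
  { outlook: 'rainy', windy: false, hoursPlayed: 20 },
  { outlook: 'rainy', windy: true, hoursPlayed: 30 },
  { outlook: 'sunny', windy: false, hoursPlayed: 40 },
  { outlook: 'sunny', windy: true, hoursPlayed: 50 },
];

const totalVariance = calculatePopulationVariance(
  samples.map((s) => s.hoursPlayed)
); // 125
const splitVariances = propertySplitVariances(samples, 'hoursPlayed');
// { outlook: 25, windy: 100 }
```

With these made-up rows, splitting on outlook drops the variance from 125 to 25 while splitting on windy only drops it to 100, so outlook would be picked as the first split. Building the rest of the tree means repeating the same calculation on each subset.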
You know, there's almost an exponential number of calculations. Andrew says, "Excel isn't an automated tool?" I'm better in JavaScript. [Laughter] I used to do some fancy stuff with Excel, but it's been a long time, and for these types of tree processing I wouldn't remember where to begin in Excel.

But let's see if we can figure out why it's putting hours played in there. Oh, I see why: filter is not an in-place filter. We have to assign the result back.

Andrew asks what libraries I'm using in this one. None so far. All of the JavaScript I have written myself, even down to the poor man's Redux that I'm using here. I wrote my own version of a Redux-style store. It doesn't do plugins, it doesn't do anything fancy, but it has the exact configuration I like, and I'm using it in a couple of other projects, so it works. "Hardcore." Yeah, I tend to do things the hard way. But the good thing is that, especially with some of these variance calculations, I can run the tests myself and validate what it calculates and how it calculates it, and I feel a lot more comfortable when I'm the one writing the tests. There's a time to use frameworks and libraries, and I guess for me this one just wasn't one of them.

Let's see. Okay, so that gives me the... "Going to plug in D3 or something to export pretty graphs?" The pretty graph that I planned on using was going to be a histogram, and I might use a charting library, Highcharts, D3, something along those lines. I may pull one of those in to plot the histogram. One of the things I have to take a look at is how some of those libraries handle histograms, because I want to make sure that I calculate the right bins, the right buckets, and the right frequency distributions on that histogram, and I don't know if I'll have to do those calculations outside the charting library or inside it. So I'll
have to figure that one out as I go. But that's the visualization I'm going for: the good old-fashioned histogram. I love the way it shows the distributions, and especially in terms of displaying that reduction in variance, I'll be able to show not just... oh, I did it wrong... not just "here's the number" but "here's what that reduction in variance looks like."

And actually, I need that to be not equal to, because if you look here, it limited my variances to only hours played, which was the exact opposite of what I wanted. Reload, and now we've got... there we go, everything but hours played. Perfect, that's exactly what I was hoping for. All right, so that gives me the split.

Now, I am getting towards a stopping point here soon. It's ten o'clock my time, and I do like to try to pretend I'm sleeping before midnight. I always try to go to bed early and it never actually works out.

Let's see, so where do we want to go next? I've got the variances being calculated, and that will allow me to display them out here. What else do I need? The next thing I wanted to be able to do was target a specific node. Currently this is going to calculate the split for the whole data set. What I want to be able to do...

Andrew says his kids wake him up around five or six, so he needs to be in bed around ten. Yeah, I'm going to be honest, the kids wake Sarah up. I am not the early bird. I am usually the last one down and she's the first one up. That's biting her in the rear end right now, but when the second-born gets old enough, that's going to bite me in the rear end. Our second-born is just full of energy, and I just know he's going to be a night owl like me.

All right, so what I want to do is... where's that sample group? That's the variance one. Variance split, this one. Right now I'm passing responses into that calculate two attribute variance. What I would like to do is filter that primary data set so that I can
select specific nodes. We can probably use the array filter function; the question is where I'm going to get that filter.

Andrew says he is last down and first up. Man, you are a kind soul, sir. I am not an early bird by any stretch of the imagination.

Let's see, working through this one: where do I want to get my inputs? Well, I guess right now it doesn't entirely matter what my inputs are, because I can just build the actions. So let's create a new action type, and let's call it responses filtered. I do like to use past tense; I use the event approach as opposed to the command approach. You'll see different folks debate that back and forth. From my standpoint, I like to treat it as though this event already happened and then output the result. So next we need a function, responses filtered, and this is my action creator: return action types dot responses filtered. Get rid of those quotes.

"Just as long as you use tabs and not spaces." [Laughter] I do, I am a tabber. [Music]

So I'm going to need to pass in... hmm. I'm going to commit a bad, and I'm going to pass a function. Usually with these actions you want them to be something that can be serialized, so an encoded query would be really good to pass in. It is usually not recommended to pass in functions, which I guess kind of could be serialized into a string, but I think that's probably against best practices. I'm going to not care about that right now, because this is more about being able to process the data and less about code purity. We'll deal with the best practices video later.

All right, so we need to export that action creator, and I want to test it, so I'm going to come back here. This is another thing that I really like about this Redux style: it makes testing so much easier, and you can basically build functionality
without even concerning yourself with what the ui will actually look like you can just treat it from a pure code standpoint just how is the data going to move how is your application state going to change so i can import responses filtered and where do we want to pass this in i'm going to pass it in here and then we're going to go to the variance split and i'm going to add an additional parameter called filter function and what we're going to do is say responses dot filter filter function so now that should return an array that's filtered and now what we should be able to do is uh do a store uh dispatch responses filtered i was hoping it would automatically but since it won't i'll just copy it and what we want to do is give it a function and the function because it's a filter has to accept a response this uh this will represent the uh individual response in our array um andrew just asked that uh my post said i that i have a he's asking uh if i got a new pc because i mentioned that i had a new battle station yes uh it's a it's a fairly new one i did the this is my first custom build in since like 20 12 maybe earlier now 2012 was a refresh it's been a long time since i've done a uh a computer build so i had to like ask all the guys at glidefast what i needed in the machine i had to ask them all the questions and get help every step of the way on it but it is running a ryzen 7. 
um i forget what video card i got but i got like 32 gigs of ram in it um i love it i love it i i got a dual dual monitor uh dual 27-inch monitors along with it 1440 i am just i am in heaven compared to what i used to use which was my 15-inch laptop so huge improvement my productivity is going through the roof um but i've also been working on much harder projects recently so my output has stayed about the same uh so let's see so return okay so on this one let's say that we want to split on rainy and we want to calculate the rainy node so response dot outlook is equal to rainy now let's try refresh so if i look at our variances and then i look at the next variances undefined oop i forgot to update my reducer if action type is loaded ah or action.type is equal to action types dot responses filtered andrew says he's gone through the complete opposite just bought an intel nuc i7 oh nice i was looking at getting one of uh one of those as well when i still had the laptop i was looking at getting the uh nuc as like a a home lab server for running running well frankly stuff like what i'm doing right now um instead i've got what is this thing hp hp z420 i i bought a used z420 um for a couple hundred bucks a few months back and uh that's where i run all of my all of my local stuff now and it's kind of funny because if we had this same conversation like six months ago i would sound much less like an actual tech guy um i've learned so much just here recently from from the guys at glidefast coaching me through all this stuff okay maybe maybe ah close that new state i don't feel like that did it because the variances are all still the same oh my gosh sorry andrew just said that he splurged and got a 49 inch dell ultra wide wow hold on all right i just got sidetracked i want to see this thing oh my word you could put the whole world on that thing dude that's beautiful it's got the little side ah that's so cool that's so cool okay okay distraction over [Laughter] andrew that was 
that that that is a beautiful monitor all right so property split variances now response.outlook is equal to rainy okay well let's come into here and let's do a quick console.log and let's say samples let's just see what we get when i do that nothing got a whole lot of nothing on that there we go all right let's see all right so 14 responses 14 responses that doesn't seem right all right so i am passing the filter function rain maybe i'm not i am not passing the filter function i am passing garbage um okay my bad let's flip back here i see the issue now so originally i thought i would be passing the filter function through that way and i'm not i am actually passing it through the action and i'm going to call this filter and then we will do in here action dot filter and then we will reload there we go well undefined is not a function okay so we need to enhance a little bit we're going to say uh var filtered responses and we're going to say if action dot filter we're going to do filtered responses is equal to filtered responses dot filter action dot filter and basically on the initial run there is no filter on every run after that there is so we're going to add some conditional logic we'll default to responses if there's a filter then we will apply the filter and down here we will pass our filtered responses instead reload errors go away and you can see that we are getting 14 responses the first time we calculate and then when i dispatch these the responses filtered action it filters down to just the five that are rainy so now i can get rid of this console.log we solved that issue save oh yeah i i pr uh andrew i say i see what you're saying there about the the filter sort for each functions i love those functions um i love them so much more than than just the basic for loops and such not enough folks take advantage of them reload okay so now we should have what we're looking for so if i look at the property split variances before and after you'll see that humidity 
was at 81 had a variance of 81 on the total data set after i filtered it it now has a variance of 20. outlook went from 67 down to 60 temp went from 79 down to 19 and windy went from 83 down to 50. so basically if we go back to our tree here on the first pass on the first pass we split by uh outlook outlook was the lowest so outlook won and the outlook was rainy sunny okay so our first split was on outlook like this data set here then what we did was we filtered down to where it was only raining which basically means we selected the rainy node and then we recalculated and when we recalculated we got uh now of course we are not going to split on outlook again because we've already split on outlook so we're just going to ignore that one uh although obviously the variance is going to reduce because we cut out that's actually odd because if i look at the responses ah those are i'm gonna have to double check some of my calculations let's see because technically the variance on that one for outlook should in theory be zero i think because we filtered down to where the outlook was only rainy so there should be oh no sorry sorry even among rainy the hours played varied that's what it was okay so yeah it recalculated and the variance was lower okay so the lowest though looks to be temperature all right so basically what we would say is that rainy the next most important criteria after you've selected rainy would be temperature so we would go back to the temperature and that's either hot mild or cool and so what we would end up with in this decision tree was hot mild and cool and so once again using using this method using this decision tree style approach um you know as opposed to your your typical salary surveys which you know will give you a uh salary range you know they'll tell you what the salary range is the goal here was to take that a step further the goal here was to have the salary survey done in such a way that we could not only tell okay if you're this job 
description the range is this what we really wanted to do was say not only is the range this but you have a higher probability of fitting into this range and if you happen to be in australia or the united states or canada or india or you know or have a certain certification you know what attributes influence uh the variability in your in your salary and so if i plug in do you have an itsm certification and the variance just doesn't go down much because we can also calculate the variance at the top level and if the variance just doesn't go down much uh then we can reasonably assume that there's not much significance in that particular variable uh so in that way you know for the folks that turn around and say that oh you know certifications don't don't have an impact well we'll actually have data that shows whether or not it has an impact and you know we'll have slightly more certainty as to to what we're talking about on it um now with from an app standpoint i now have two different actions i have the app loaded action uh let's see andrew says the data is like if you are in australia your name is andrew and you're the only person to fill the survey in you earn not enough um trying to remember i want to say there were a couple uh a couple of entries from australia but one thing i will tell you andrew and this is something that uh from a target variable standpoint and this is the first that i'm actually revealing this information but um from the start i had a plan built into this salary survey to attempt to deliver more value to international folks um you know i feel like other salary surveys have done a good job of capturing you know some data for you know u.s specific and uh there's a little less information for you know from a global perspective um and also in a lot of the responses that i got folks gave me feedback that they felt you know folks in india were saying you know we feel that even when even if you were to factor for cost of living um we feel that we're 
paid less than our us and western-based counterparts and so i did some research and i found something called the uh ppp what was that um um gosh what was what was what's that acronym for ppp uh purchasing power parity and what purchasing power parity is is it's a metric that um eurostat and the oecd got together and they basically compared cost of living in different countries and they uh basically uh came up with a conversion ratio for that country's currency to u.s dollars that instead of just being a uh equivalent value you know this this many uh australian dollars is equal to this many us dollars in conversion instead the purchasing power parity is a ratio that converts to us dollars and back but from a standpoint of cost of living so a basket of goods in uh you know how much money would it take to uh cover the same basket of goods in both countries so my hope is to be able to show the distribution not just from a not just from a geographic split of saying well this is how this is what the split looks like in india this is what the split looks like in the united states this is what it looks like in australia and i have no idea how to com how to compare these countries to one another i'm hoping to use this uh uh purchasing power parity conversion factor uh in order to do some global comparisons from country to country uh to see if you know if when you factor for cost of living uh do some of those biases still exist and i you know i'll be honest i suspect that they do um i wouldn't be i wouldn't have gone through all the effort of trying to find this conversion factor uh if i thought that uh the the world was a fair place overall um but so i'm hoping to do some of those comparisons uh not just in a geographic isolated fashion uh but also to try and do kind of a a worldwide comparison and i will say that from some of my just initial looking over uh the the processed raw data i will say that a lot of the western countries fall into a very very similar range 
when you factor out the currency differences and the cost of living differences when you adjust that that purchasing power parity uh the ranges are really really similar for for a lot of countries and then there are uh other countries like india that really do seem to pile up towards towards the bottom of the ratios now of course we will get the variances and we will get take the more scientific approach here but um i do suspect that when we adjust for those factors that we are going to see some of those biases in play and i'm curious to see what other biases exist both globally and locally again part of building this automated tool is to try and set something up where i can quickly you know click through uh and i you know maybe do like a tree view sort of thing or something like that where i can you know quickly run different filters and draw comparisons and uh you know try and uh come up with draw some conclusions from the uh distributions and the charts that get produced and try and put together some some narrative and comparisons and you know write uh write a bit of a white paper on this showing what i have found so uh it'll be interesting to see what it says honestly if anything because who knows we i may do all this work and run all these numbers and all the variances might be absolute garbage um i really don't know um but we'll see uh so let's see so at this point the next thing i wanted to show on the app was that you know we have two actions and those actions generate new application states when we apply them and you know you can see how and i love i i love this little event logging that i built into my little poor man's and it kind of gives you what the uh redux tools uh the the react redux devtools gives you uh where you can kind of see what the app state is at different parts it's really helpful for troubleshooting and also as i'm building the application it kind of lets me build it ui free i haven't done anything for the the ui tool or for the 
ui side of the tool yet but i'm still able to do some validations that stuff is happening it's happening the way i expect it to and now i can turn around and i can set up different things where you know me if i uh you know if in a tree view if i click on rainy for example it could generate the filter function in the background and dispatch a responses filtered action and now my application state gets rebuilt and i can create web components that subscribe to the store and respond to those changes in application state so it should make building the components easier as well fingers crossed um and yes andrew i am i'm definitely having fun with this uh this i'll be honest when i first started building this tool i was expecting to hit more uh brick walls than i have i expected the statistics to absolutely destroy me but i was able to validate my functions against uh some other demo data out there and those functions were working as designed and uh the redux poor man's redux store uh has been working well and it's all surprisingly coming together and i'm almost scared to say that it's working but it does seem to be um i mean that like i said that that filter the the i've basically created the ability that when i click on a a breakdown or apply some kind of filter i can basically step i can move through this step by step i think the next thing that i want to tackle is going to be building the histogram data because then i can generate the histogram visualization off of that and gosh um once i have that i'm probably gonna be pretty close to being able to start on the ui side and once i got the ui side then i can plug the salary data in and i've only been working on this one for like two hours now so this has gone faster than i expected frighteningly faster modern web is cool like this is the first time i've actually gotten to sit down and use es6 modules in an actual for real project i was working on this is forget ie we all need that mess this is awesome um you know and 
what's cool about what's really cool about it to me is the way that it kind of imports this stuff in the back end if you look at my html um i haven't touched the html yet because the only thing i'm importing is that app.js there's no babel there's no webpack there's no there's nothing it's just it's just running yeah yeah andrew says that it's a lot to uh wrap your head around though it it is it is and i've done uh the two hours worth of work to get to this point is built on a foundation of probably the last year of trying to understand what an es6 module even was um and how those imports worked and uh and such but from from what i'm seeing here i think that if i were and this is just i'm not going to do it for this project but this is just me forward thinking for other projects if i needed to introduce a babel or a webpack uh type thing to build uh ie compatibility you know backwards compatibility type stuff um i think i could pretty much use the structure i've already got with the es6 modules i've already got and i should be able to just kind of plug those additional tools into a build process i wouldn't even use it for the dev process if i could avoid it um because if you've ever tried to install the create react app like holy cow that thing takes forever um but i should just be able to attach those as a build tool and most of everything should still stay the same and that's crazy like the new stuff that they have built into browsers to be able to support this to be able to automatically re resolve these imports and and make javascript a truly modular language i i mean that's incredible and there may be some of y'all that have been using this for a while that are like uh congratulations old man tolson wait way to catch up way to join the program but uh you know for me finally stepping out of angularjs land um and you know using a simple gulp pipeline on uh you know usually i would have used angularjs and i would have used a gulp pipeline to just concat uh 
everything into a single js file and then maybe run a minify on it and be done so this is this is mind-blowing this is cool um i'm actually going to pack it up at this point uh for tonight next time around andrew says github action cicd oh man i have i have not gotten into that stuff yet um i'll get there i'm gonna have to get you to do a a live stream to show me how to do that stuff um i have never messed with the the github actions or the ci cd i'm lucky i can spell ci cd um but next time around i think i'm gonna start working on tackling the histogram data and i do occasionally plunk on this stuff during the day definitely not in meetings um never in meetings um maybe in meetings but uh i do plunk on this a little bit so i may end up doing some things behind the scenes and i'll kind of catch you guys up when i get back to streaming it or you know if things are you know i i don't know what meetings i've actually got tomorrow and what work i've got tomorrow i got to figure all that out but sometimes i'll write a lot sometimes i'll write a little bit of code i'll write a quick module or something but uh if i do i'll catch you all up next time and if not we'll pick up where i left off but next the plan is to work towards the histogram data um so yeah thanks a lot for joining appreciate y'all making making this evening a little less boring and you know not having me sit here and just talk to myself the whole time and i hope to see you all next time have a good one see y'all next time
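
The filtering step walked through on the stream (a past-tense "responses filtered" action that carries a filter function, a reducer that only applies the filter when one is present, and a small store that logs each new state) can be sketched roughly as below. This is a minimal sketch and not the code from the stream: the names `actionTypes`, `appLoaded`, `responsesFiltered`, `reducer`, and `createStore`, the state shape, and the sample rows are all assumptions made for illustration. In the actual tool these pieces would presumably live in separate ES6 modules and be exported; they are kept in one plain script here so the sketch runs on its own.

```javascript
// Hypothetical action type names; the stream only names "responses filtered"
// and an "app loaded" action.
const actionTypes = {
  APP_LOADED: 'APP_LOADED',
  RESPONSES_FILTERED: 'RESPONSES_FILTERED',
};

// Action creators named in the past tense (event style rather than command style).
function appLoaded(responses) {
  return { type: actionTypes.APP_LOADED, responses };
}

// Carrying a function in an action means the action cannot be serialized.
// That shortcut is taken on the stream for convenience and mirrored here.
function responsesFiltered(filter) {
  return { type: actionTypes.RESPONSES_FILTERED, filter };
}

// Assumed state shape: the full data set plus the currently selected subset.
const initialState = { responses: [], filteredResponses: [] };

function reducer(state = initialState, action) {
  switch (action.type) {
    case actionTypes.APP_LOADED:
    case actionTypes.RESPONSES_FILTERED: {
      const responses = action.responses || state.responses;
      // Default to every response; only narrow down when a filter was passed.
      let filteredResponses = responses;
      if (action.filter) {
        filteredResponses = filteredResponses.filter(action.filter);
      }
      return { ...state, responses, filteredResponses };
    }
    default:
      return state;
  }
}

// A "poor man's" store: holds state, logs every action, notifies subscribers.
function createStore(reducerFn) {
  let state = reducerFn(undefined, { type: '@@INIT' });
  const listeners = [];
  const log = [];
  return {
    getState: () => state,
    getLog: () => log,
    subscribe(listener) {
      listeners.push(listener);
    },
    dispatch(action) {
      state = reducerFn(state, action);
      log.push({ type: action.type, state });
      listeners.forEach((listener) => listener(state));
    },
  };
}

// Made-up rows, not the data set shown on the stream.
const store = createStore(reducer);
store.dispatch(appLoaded([
  { outlook: 'rainy', hoursPlayed: 10 },
  { outlook: 'sunny', hoursPlayed: 30 },
  { outlook: 'rainy', hoursPlayed: 20 },
  { outlook: 'overcast', hoursPlayed: 40 },
]));
store.dispatch(responsesFiltered((response) => response.outlook === 'rainy'));
console.log(store.getState().filteredResponses.length);
```

A tree view component could build the filter function when a node such as rainy is clicked and dispatch `responsesFiltered` with it, while other components subscribe to the store and redraw from the new state.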

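The "property split variances" idea from the stream (group the responses by each attribute, measure how much the target still varies inside each group, and prefer the attribute that leaves the least variance) can be sketched roughly as follows. This is a sketch under stated assumptions, not the variance library from the stream: it uses the population variance and a size-weighted average of the group variances, and the function names `variance`, `splitVariance`, and `propertySplitVariances` as well as the sample rows are hypothetical. The figures quoted on the stream may have been computed with a different measure, so this is not expected to reproduce them.

```javascript
// Population variance of a list of numbers.
function variance(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
}

// Variance of `target` that remains after splitting `responses` on `attribute`:
// the variance inside each group, weighted by that group's share of the rows.
function splitVariance(responses, attribute, target) {
  const groups = new Map();
  for (const response of responses) {
    const key = response[attribute];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(response[target]);
  }
  let weighted = 0;
  for (const values of groups.values()) {
    weighted += (values.length / responses.length) * variance(values);
  }
  return weighted;
}

// One entry per candidate attribute, so the lowest value can be picked as the split.
function propertySplitVariances(responses, attributes, target) {
  const result = {};
  for (const attribute of attributes) {
    result[attribute] = splitVariance(responses, attribute, target);
  }
  return result;
}

// Made-up rows, not the data set shown on the stream.
const sampleResponses = [
  { outlook: 'rainy', windy: false, hoursPlayed: 10 },
  { outlook: 'rainy', windy: true, hoursPlayed: 20 },
  { outlook: 'sunny', windy: false, hoursPlayed: 30 },
  { outlook: 'sunny', windy: true, hoursPlayed: 40 },
];

const totalVariance = variance(sampleResponses.map((r) => r.hoursPlayed));
const splitVariances = propertySplitVariances(
  sampleResponses,
  ['outlook', 'windy'],
  'hoursPlayed'
);
console.log(totalVariance, splitVariances);
```

With these made-up rows, splitting on `outlook` leaves less variance than splitting on `windy`, so `outlook` would be chosen as the first split. Running the same functions on a filtered subset, for example only the rainy rows, gives the next level of the tree.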
View original source

https://www.youtube.com/watch?v=Lag14qxWYdU