2023 Dec – Data Warehouse 101 Webinar

In the “Data Warehouse 101” webinar presented by SchoolBI, Ryan Bowse, founder and CEO, introduced the concept of data warehouses, highlighting their importance in organizing and analyzing data for schools. He explained how data warehouses consolidate data from various sources, using processes like Extract, Transform, and Load (ETL) to make data accessible and actionable for reporting and analysis.

Hey, everyone. Welcome to the Data Warehouse one zero one webinar put on by SchoolBI.

I’ll do an introduction to myself in a second, but it’s great to see so many familiar, attend these names in there and also some new names And today’s webinar is really an introduction to what data warehouses are and how they can impact, your school or how they could maybe help your data, visualization effort.

So just a couple housekeeping to get out of the way. The content that I prepared for today is about twenty minutes, and then it’ll be about five minutes, at the end for q and a. If you wanna ask a question, the best thing to do is to ask in the Q and A section of Zoom. I will try to answer them in real time if I it, if not, all this answered at the end.

So feel free to type in there. I am recording this session, and I will send the recording out to those that attended as well those that are CP, but we’re not able to make it. So let’s get started. I do have a poll today, which I’m gonna kick off in one second.

So for those that don’t know, my name is Ryan Bowse. I’m the founder and CEO of SchoolBI. I’ve been in this business quite some time right now, now, and it’s kind of an embarrassing to say how many years I’ve actually been in it.

But it’s been about fifteen to sixteen years that I’ve been in the EdTech business. And so, I’ve worked at companies like, who was part of the founding team at Ravena, worked at, if you’ve been around long enough, you might know Whipple Hill, community brands ran a couple divisions and been consulting and starting companies in the space for quite some time. So I won’t go into any real great detail about my background. I won’t mention any of my favorite to Lasso episodes.

Or which John Steinbeck book I’m in, but, ultimately, I thought I’d give you a quick, screenshot of me and who I am. So before we get too far, I am gonna kick up that poll. So ultimately, we love data here at school BI, we’re a data driven company. So I’m gonna launch this poll.

And, when you launch it, you should get a question on your screen and I’ll read out the, results as we go through it. So, just so everybody is aware, you should get a pop up on your screen, and there should be this simple question. How familiar are you with data warehouse strategy and technology. Very familiar, familiar, slightly familiar, not familiar at all.

Pretty straightforward. So I’m just gonna let a little folks good. Interesting. Yep.

Awesome. So we’ve got a decent split here. We’ve got about, I’d say, fifty percent are either familiar or very familiar and the other fifty percent are slightly familiar. So this is gonna be a great webinar.

And for those that are, very familiar, ask questions, feel free to make some comments, maybe, how you’re using a data warehouse, but ultimately most of the content will be geared towards folks that are only slightly familiar or maybe not familiar, with the warehouse technology. So great. Thanks for filling that out and, we’ll get started. So quick before I go into the what is a data warehouse I’m gonna give you a quick explanation of what school BI is and why we’re talking about warehouse strategies and, technology today.

And school BI is a beautifully easy data analytic platform. We are a data warehouse that’s designed to enable schools to become truly data informed.

And so a lot of times people say, well, what does that look like in reality? And, this is just a couple ways that we help schools We have created a series of, data connectors, which reach out to the core data systems that a lot of schools use, whether that’s an enrollment management product, whether it’s Black bought Veracross Google sheets, razor’s edge, and we extract that data and pull it into a warehouse and put just purpose built for private independent schools.

And that is an important data point, which I’ll mention if you’re ever interested in knowing why I bring that up. And then we produce a series of libraries and metrics for dashboards. So folks can have, dashboards. They can right click and save as and make their own. So we’re trying to kind of, accelerate the ability and make it easier for schools to get to the data they need, make the analytical assessment, explore, and produce kind of a really powerful data informed or data driven culture.

So, that’s a little bit about school BI, and now we’re gonna talk about what is a data warehouse.

Now for those that are kind of familiar or slightly familiar to data warehouses, this is gonna be really, an easy kind of concept for you to get. But ultimately, if you have questions or get tripped up with any nomenclature that I say, just please ask in the chat.

So first and foremost, I wanna talk about why someone might want a data warehouse. Right? So we’ll get to the what is a data warehouse specifically But a lot of times, schools, and personally I’m on a board of a school, and I’m an active alum, and I’ve said some of these things about getting access to data. But schools oftentimes get stressed or they feel underprepared to make a decision because they don’t have access to the data that they need at any given time. They oftentimes feel like there’s too many roadblocks to get to the data I need to do my job.

They’ve gotta talk to too many people. They’ve gotta get too many, you know, consolidate too many sets of data. And a lot of folks are when there has to be a better way. Now the good news is there is.

And I’m here to talk a little bit about that. But one of the problems why this is so hard for schools is The amount of data and the sources they have is only increasing.

School data lives in silos and isn’t easy accessible. You might have Vericross. You might have BlackPA. You might have school admin, you might have, clarity or SSS. And pulling all this data together is really cumbersome.

To let alone, you know, there’s a lot of schools spend a lot of time and money. In getting the data to flow nicely through those systems, but to get it in a reporting prepared format is actually hard.

I mentioned how the data is messy in all these systems. The the data is not structured for reporting purposes. A lot of times, this is kind of an eye opening experience for schools when I talk about this, but The data you might be pulling on a Black Bot or Vericross or Clarity or whatever vendor might be more operationally focused rather than it’s not data that has been cleansed or standardized for reporting purposes.

And then the last thing that schools are finding when they kind of go through this process is that many of them are in the early stages of building out their data strength. I call that their data capacity or their data culture. Right? They’re that muscle yet isn’t there. I joke, but oftentimes, I’ll ask a school. How many people on staff, across your whole organization have, like, something on their CV or resume that says, expert in looker studio or expert in Power BI.

There’s not many. There are folks, but not as many as you would think. And I think ultimately or, you know, in a now it’s you have to know how to use office software, you have to know how to use all these different other technologies like BlackBOD, but it’s not too distant in our future that everyone will probably need how to use the need to know how to use those BI tools.

So, I’m gonna tell a quick anecdotal story, which I think is really funny. And the reason why data warehouses are becoming more and more interesting is that For anyone out there that’s a super geek around space travel or space exploration, you might know the Voyager story story. Well, the Voyager story is the following. Voyager spacecrafts were created, many, many, you know, several decades ago to, travel out into the outer distance space solar systems, our solar system, and beyond, to be a beacon of information about earth.

And so One of the things they put on the Voyager Crafts, there were two of them. They put something called the Golden Record. And the Golden Record was a highly curated sampling of what earth was supposed to be. It had art on it, it had music, it had words, it had scribbles, it had maps, it had drawings, What’s interesting about this, golden record is not so much what was on it in this particular conversation, but the concept was it was a curated list of data or of data, a collection of data that was put on there for purposes of consumption.

It didn’t just, you know, record all this and and make it kinda messy. It actually structured it and it was ready. And it went out on these face grafts for, you know, extraterrestrial life to find them and get a glimpse of what was important in what was going on with Earth. Right?

It’s a snapshot of humankind. Well, one of the funny things is that data lakes and warehouses, and I’ll put a pin in data lakes in right now for a second, but data warehouses are actually a solution to the age old problem that schools have been trying for. School have been putting a lot of pressure on their core providers to become the golden record. I need all things x y z so I can get to the data.

Well, the reality is is the warehouse, if you structure your data warehouse, it actually provides the benefit and the value of the golden record.

But you don’t actually have to touch your core systems. I’m gonna talk a little bit about that in a second. Now I mentioned data lakes. Now, in the industry, this is a big, not just a private school industry, not just a school industry, data lakes and warehouses are a big industry across all organizations.

There are slight differences between a data lake and a data warehouse. For all intents and purposes today, I’ll be talking strictly about data warehouses.

It’s one of those things that if you wanna get into detail and peel apart, I’m happy to get into, but for all intents and purposes, we’re gonna make them the same for this discussion. Today. And if you’re interested in this, I might do a data lake discussion later, but, just keep that in mind.

So what is a data warehouse? A data warehouse is a unified, efficient way to store manage and retrieve data for reporting and analysis purposes.

It’s a really big, in, kind of detail you should keep in mind. This central repository stores huge volumes.

Of structured and unstructured data from various sources.

And it’s specifically designed and architected for business intelligence, mean reporting and analytics.

So this is not a system that is used for operationally to do something, meaning pull a report, change a field, submit it back. Nope. This is actually, much different. It’s much more for creating analysis.

And business intelligence, analytics.

So it’s a really important thing. That is industry dependent. I mean, I’m sorry, non industry dependent. That is the general kind of guidance that all industries, no matter if you’re an education software banking retail, what a date data warehouse does.

So on a real practical level, and please forgive me I created this, you know, for this presentation, it is not my best work, but hopefully it gets the point across on a visualization.

Many of you on the call probably have some sort of configuration like this. You have, a bunch of core systems that you use. It could be back Vericross Black pod PowerSchool.

You name it, up in a cloud, running your school.

You might have a few on campus tools that have data stored and located and maybe on a server or on a system on campus. You might use office software like, power be I’m sorry, not power, but Google, for education, Office three sixty five to help you with your, day to day kind of office suite of products, whether it’s email, file sharing, the like. You might have a separate cloud over here, which you have payroll and accounting. Now look at this poor soul right here.

Right? If they need to actually do a report that might say something like, show me all the current faculty members who are giving that, are, that like the color blue. Right? So that might be pulling fundraising data, might be pulling, you know, a current roster of the the folks that you, have on staff, and pulling in some sort of attributes from, say, a custom CSE file on somebody’s desktop.

And this poor person actually with a lot of brute force can get these to work in those BI tools, but it’s really painful.

So Moving to a data warehouse is slight, but it’s an important shift. And that is think of the data warehouse sitting up in most data warehouses these days are in the cloud. If you’ve been around long enough, like, I have data warehouses could be stored locally on-site, but that is not that’s not what, folks are doing now. You create a data warehouse. And in that data warehouse, you have these commands or or plans that actually go out to all these different data sources and pull them into that structured data element.

And then the reporting, individual, whoever that person is trying to make heads or tails of all this actually only interacts with the data warehouse.

Now here’s the other really important term that you should get to know for data warehousing in this approach. It’s the term extract, transform, and load.

It’s ETL. A lot of people use that acronym. ETL is where the magic happens.

ETL is where the data warehouse goes out to these different core systems, pulls the data in in his raw form extracts it and still works extract. It puts it through a set of business rules and concepts that allows the, go back one. Sorry.

It puts the extract, data through a transformational business set of rules, the school set of rules, do this to the data and then load it ready for business intelligence analysis. So that whole process is actually rather, straightforward, but can also be a little bit complex.

And I’ll talk a little bit about that in a use case. So if you wanna see extract transform and load in real time with the school, here’s a fantastic example.

So you have two sources of data for a student named Abby. You have Abby’s admissions record, say that’s in Ravena, for example, and she has a name and the sending school and an ethnicity, self identified ethnicity or race there. And I’m calling that ethnicity one. Abbie then becomes a student. She has a name, now she has an advisor, and she has a different self identified ethnicity because maybe you pull the students after they’ve been there for a year to ask which they self identified for. Now if you were trying to build a report off that, it would be rather challenging. Which ethnicity do I use, for example, because she has two.

So in that extract process, we build out software from a data warehouse pulls it from Raven, pulls it from, say, in this particular case Blackbaud, and in that transform section, you create a rule, a process a config that says I have two ethnicities. I want one ethnicity, and that’s gonna be moving forward what it looks like in the data warehouse. So in the data warehouse, the field might say name, sending school advisor, ethnicity.

And if I’m building a report, I don’t have to worry about whether my ethnicity is the right In that transform section, we’ve created set set of rules and transformation, configurations that’s going to pick the one. Now this is a really simple example.

The other example is let’s say, this is another really common one I don’t have a visual for this. But let’s say, oftentimes, maybe you change the term that you use in one of your source records. So, let’s say you decide to oh, here’s a great one. Let’s say you decide, the the the the race, the dazzled definitions of races have changed over the years.

And maybe in your core system, you used previous used race naming, and now you wanna use a different naming. So for one would be Caucasian and white. You can envision that maybe in your core system, you have you know, ten years of individuals who are labeled as Caucasian. And now the the race that identified term for races, white, for that one is white.

And so you can envision you could have a rule or transformation engine that would convert that to the new race and then you would have it in your data and be loaded for consumption.

Now, I’m gonna just say that That is a really basic high level explanation of a data warehouse.

So you have, the ability for the data warehouse to collect from silo data sources, that technology and configuration to pull that directly in. You have that the extract level, and then you have the transform level, which is that, processing, kind of magic and business rules that you wanna imply. And then you have the load which is putting it up and getting it ready and servicing a business intelligence tool. Now what’s interesting about this is that we don’t do all that. Just to be cool and to use high-tech solutions for this. We use it so that we can tell meaningful data stories.

So I always wanna bring it back to the reality is that we do all that work so that folks can tell meaningful data stories. It’s not about technology for technology c. So this is a great example. If you’re really curious, I’m happy to send a link out, One of the most impactful data stories ever told is, the inconvenient truth by Al Gore. You can argue whether his data was accurate or not accurate, but talk about a story. This image is from him that he uses two or three amazing storytelling components, to highlight that we are exponentially seeing more c o two concentrations in our atmosphere than ever before.

He wouldn’t have been able to do this without a data warehouse. Because all that data over the years has to be normalized, then he has to be able to build off of and then put in models, And so I’m just bringing it up in the sense that, you know, behind every great data story is a data analytic platform, whether it’s a data warehouse, data lake, or a ton of work.

So I’m gonna finish up with this look to the future. And for those that are on the call, this may or may not be held but this is from kind of like my forty thousand foot high level assessment of being in the data business for quite some time. So At private schools, data collection is exponentially increasing across all school types and sizes, Some folks used to say, oh, this is a big school thing. They’re gonna figure it out.

The truth is no matter what size school you are, or what type of school you are, you are going to use data, and there’s gonna be more data available to you than there has been in the last There’s it’s been more available in the last two years than last hundred years. So, we’re gonna have a lot more data to kind of sift through. I do believe that the solution providers in this space are playing catch up to the, what I call, the commercial market, to the larger technology market, making and enhancing their data to be easily accessible and interoperable for data warehouse technology. So that’s a really great sign.

And I’m, you know, looking at the two big ones, I’m talking about Blackbaud and Veracross, they are investing heavily into allowing schools to get to their data. More often. That only helps companies like school BI, but also schools like yourselves on the phone. My prediction or our prediction is that within the next eighteen months, over fifty percent of the NIS, like, schools.

Meaning, I don’t know if every NIS member will be in this particular case, but a lot of the NIS, like, schools will be integrating at least starting a data warehouse project.

In other words, right now, there’s no data warehouse checkbox like there is for academics, advancement, admissions. I believe in the next eighteen to twenty four months that’s going to change. That a lot of these schools will be implementing a data warehouse like product.

The other thing which is really important, and there’s a lot of, and I’m gonna wrap up here and go to Q and A. I saw a couple questions come in.

What folks haven’t really talked a lot about, and I’m a huge advocate for is that There’s a lot of discussion around AI, generative AI in the classroom and helping students and figuring out. One of the things that I’m more focused on and school BI is more focused on is how do we get advanced modeling and machine learning and AI to help with efficiencies, program, support, professional development, how do we help a school run with AI rather than maybe delivering in the classroom. So what I’m gonna say is that, to do that, you need a data warehouse And I I like to say that to do a machine learning advanced modeling, approach you can’t do it in those core systems.

You need a a workspace or a play space to actually take data out model it, put it somewhere, remodel it, and store it, and data warehouse provides that. So it’s one of those things where as folks start to talk about, oh, we’re gonna do all this advanced modeling. We’re gonna do using machine learning and AI on all of our data. The truth is to do it right, Schools are gonna need to implement a data warehouse.

And then last but not least, I firmly believe, and this is one of my biggest beliefs is that the data warehouse in schools are going to solve the golden record.

And I can’t tell you how exciting this is because I’ve worked in those core systems and how hard it is to be all things to all people. And what really is interesting is schools are going to solve for the golden record methodology using a warehouse and be more more open to picking point solutions to run their school. Right? They’ll still have to get the data to flow, you know, an admissions kid has to actually get into the core academic product.

But as far as putting pressure on all those providers. You gotta do this. You gotta just it actually gives you more freedom. And so I’m really excited about that because as a volunteer, a board member, and someone who cares about this market deeply, You know, we were evolving banging our head against the wall trying to think, oh, you know, Blackbaud’s gonna do everything or Veritas is gonna do everything, and I gotta get all my reports out of there.

Well, the reality is is, yeah, they should do a bunch of things, but being this golden record should more than likely be in the data warehouse. Okay.

That is a quick overview.

After the I’m gonna just take a quick sip of water, and I’ll answer a couple questions.

There will be a post webinar feedback form. Please fill it out. It really provides me personally some feedback about the content I provided You know, was it helpful? Was it relevant?

Was it not enough? And you can make suggestions of future content. We also feel free to reach out to me on LinkedIn. Love connecting with new individuals.

I saw, you know, a quite a few number of new individuals on here that I don’t know.

We have a a sign up for, our monthly newsletter called Databuzz. It’s for us, it’s a great way to get all of this new exciting information out. Do have a data maturity assessment and guide. So if you’re interested in taking that, it’s a, you know, maybe a seven minute assessment.

You self take the assessment, you answer it best can. And out of that, you get a guide on where you fall in the data maturity journey. And, you know, it’s it’s just a way for you to kind of learn more about how a school could mature its data culture and, strategy. So I’m gonna take a quick sip of water and I’m gonna answer the questions.

I’m about four minutes over.

Yeah. Great question. Someone asked are there other providers, aka the core providers Black Bot or Veracross providing data warehouse options?

As of today, no. They’re they’re trying to figure out the reporting issues that a lot of the schools are having.

And they may be creating data warehouses to enable better reporting for their schools but they’re not providing a platform, so to speak, for you to manage your data. So, like, for example, if you’re a black bot school and you’re on the phone, you know, they have their they have this, good data platform which sits behind and that you can go to a report and click a button and kinda toggle some things, but you don’t have access to good data to create your own data warehouse and set up your own rules. So, no, that’s it. Yep.

Okay.

Yep. Yep. So someone asked if I go back to this slide, they asked a little bit about the extract transform and load, and they asked about, the transform section.

So there are things they wanted to know. Can are there standard ones and then ones you can create? So in school behind, the idea is that we will have ones that we roll out to all schools. Like, for example, when you ingest data, if this ethnicity had a space behind it, and we know we don’t want a space behind it. We would, cleanse and standardize that data. If we found that there was fields in that, save the data that comes in, like, maybe a thousand rows that are completely blank. We just we it trimmons.

So if there are some standard things that you do in the transform that all warehouses would do And then there are certain things depending on what you’re trying to report on, we will have individual rules. So it’s an exciting place, and that’s what school BI is trying to do. And that’s our kind of charter is to help schools, of all shapes and sizes get to this, you know, I I joke, but get to this golden record. Now that’s a journey.

We all know that You know, sometimes you just need a better admissions report, which we can help with or a board report, which are a board dashboard we can help with. But ultimately, you know, creating a platform from schools to, use is what we’re trying to do. Yep. Hope that helps.

Alright. I’ll take two more questions.

One came in.

Yeah, if you’re interested, yeah, they asked about really what is the difference between Data Lake in data warehouse.

It’s one of those things where I was trying to, describe to a non technical friend the other day. It’s kind of like We were joking, but what what what makes up a taco, right? Is does a taco have to be made with a corn tortilla? Does a taco just have to be a little piece of bread with some ground beef and and wrapped around it. Right? And so everyone has small nuances. But the industry has kind of agreed on what the difference is between a data lake and a data warehouse.

And, the data lakes are typically used in more advanced settings and data warehouses are used more broadly across all kind of commercial entities.

Really, one of the biggest things that makes them different is the fact that it’s how they actually how and when they write the data to a database and if it’s a database.

So, this is gonna this person sounds a little bit technical. So I’ll give you an example.

When the data is, grabbed, is it going to be entered into a non relational database or into a relational database and how it’s stored? And there’s benefits and reasons why you might wanna do one or the other. So, I won’t go into that detail, but all ultimately, for the vast majority of folks, and you can have both. It’s not like both, and actually not to go into too much detail, but school BI actually is designed around the concept of a data lake and data warehouse technology. So we we are doing this specifically for schools, because we see the value I’m saying folks we do data warehouse because it’s just easier for them to kind of comprehend and understand. But, there’s there’s value to vote. Hopefully, that helps.

Okay. So that’s it for today. I’m gonna wrap up. I really appreciate you all being on the call.

Please take the survey. I can’t tell you enough about how much feedback, helps us get better at what we do. I hope to connect with you in person or on Zoom sometime in the future and, hope you all have a great afternoon or evening or day wherever you are. Thanks.

Author