Summary
Machine learning models have predominantly been built and updated in a batch modality. While this is operationally simpler, it doesn't always provide the best experience or capabilities for end users of the model. Tecton has been investing in the infrastructure and workflows that enable building and updating ML models with real-time data to allow you to react to real-world events as they happen. In this episode CTO Kevin Stumpf explores the benefits of real-time machine learning and the systems that are necessary to support the development and maintenance of those models.
Announcements
- Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
- Your host is Tobias Macey and today I'm interviewing Kevin Stumpf about the challenges and promise of real-time ML applications
- Introduction
- How did you get involved in machine learning?
- Can you describe what real-time ML is and some examples of where it might be applied?
- What are the operational and organizational requirements for being able to adopt real-time approaches for ML projects?
- What are some of the ways that real-time requirements influence the scale/scope/architecture of an ML model?
- What are some of the failure modes for real-time vs analytical or operational ML?
- Given the low latency between source/input data being generated or received and a prediction being generated, how does that influence susceptibility to e.g. data drift?
- Data quality and accuracy also become more critical. What are some of the validation strategies that teams need to consider as they move to real-time?
- What are the most interesting, innovative, or unexpected ways that you have seen real-time ML applied?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on real-time ML systems?
- When is real-time the wrong choice for ML?
- What do you have planned for the future of real-time support for ML in Tecton?
- @kevinmstumpf on Twitter
- From your perspective, what is the biggest barrier to adoption of machine learning today?
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Tecton
- Uber Michelangelo
- Reinforcement Learning
- Online Learning
- Random Forest
- ChatGPT
- XGBoost
- Linear Regression
- Train-Serve Skew
- Flink
[00:00:10] Unknown:
Hello, and welcome to The Machine Learning Podcast. The podcast about going from idea to delivery with machine learning.
[00:00:20] Unknown:
Your host is Tobias Macey, and today, I'm interviewing Kevin Stumpf about the challenges and promises of real time ML applications and the work that he's doing at Tecton to help support them. So, Kevin, can you start by introducing yourself?
[00:00:31] Unknown:
Yeah. For sure. Happy to, and thanks for having me. I am the cofounder and CTO here at Tecton. And with Tecton, we help organizations get real time machine learning applications into production with our feature platform. Before starting Tecton, I was over at Uber where I helped build the central machine learning platform team called Michelangelo.
[00:00:57] Unknown:
And do you remember how you first got involved working in ML?
[00:01:01] Unknown:
Yeah. The first time where I got really deep into it, after taking grad school classes and whatnot on the topic, was roughly 10 years ago now when I cofounded a trucking startup. We were matching long haul truck drivers with shipments that shippers or brokers needed a truck driver for. And we used machine learning to predict prices, because oftentimes we didn't have accurate pricing information. One of the things truckers are most curious about is, well, how much am I gonna get paid for this load? And so we tried to help them out by having an ML algorithm that predicts the prices. And then through that startup, I eventually ended up at Uber, where I, as I just mentioned, was part of the central machine learning platform team, Michelangelo, which helps pretty much all the different ML teams at Uber get machine learning into production.
And that has everything from predicting ETAs when you open up the Uber app, to recommending which restaurants you may wanna purchase from on Uber Eats, to pricing predictions, etcetera. Like, there's a whole long class of different ML problems that are solved with Michelangelo.
[00:02:22] Unknown:
And in terms of the focus for the conversation today on real time ML, I'm curious if you can describe a bit about what that is and how you might differentiate it from other approaches to machine learning that folks are using.
[00:02:36] Unknown:
Yeah. Real time machine learning should really be contrasted with batch machine learning, also sometimes called offline machine learning. Real time machine learning, I'd classify as any application that uses machine learning where the machine learning prediction affects the end user immediately, right, in the moment where that prediction is being made and not hours later, days later, or whatever it is. And common examples here are, like, real time transaction fraud detection where somebody's swiping a credit card somewhere, and it's either immediately denied or accepted, or it's real time recommendations where the items that you've just clicked on in Amazon affect the next recommendations that you get on the page that you're gonna browse a couple seconds later, or dynamic pricing systems, which react to the movements of supply and demand literally in the moment. Like, if you look at the ride sharing applications out there, real time bot detection, all that type of stuff. So all of these are applications where the prediction is made and the end user is immediately affected by it. And when you look at offline machine learning, a common example would be maybe you run a model in batches, like once a day, once a week or something, where you look at all of your customers and you predict their likelihood of churning, and then you may send them an email with a coupon code or something like that.
[00:04:10] Unknown:
And in terms of the kind of utility of real time ML, I'm curious if there are any common categories of problem that it lends itself well to or particular industries that have a higher predominance of using real time ML and just some of the ways that people should be thinking about whether or not it is a suitable fit for the problem that they're trying to solve?
[00:04:33] Unknown:
The best thing to think about, and you can honestly use your human intuition for this, is where you have most of the information that will help you make a prediction about something you wanna predict, like, say, a recommendation or whether you wanna detect fraud in a transaction. Is that information very new, very fresh? Is it information that's basically just come into existence a couple seconds ago, a couple minutes ago, a couple hours ago? If you personally, with infinite time, had to make that recommendation without a computer, is that the type of information you would use in order to make that human prediction, so to speak? If that type of information has only come into existence in the last couple of minutes, hours, down to milliseconds ago or up until right now, then that's typically a good use case for real time machine learning.
If on the other hand, you'd say, hey, 99% of the wealth of information is just buried in what I know about the user from the big bang up until a week ago, and I only need to react to that information in batches once a day, once a week, or something like that, then that's probably best served by batch machine learning.
[00:05:58] Unknown:
And as far as the technical and organizational requirements for being able to actually take on this real time ML workflow, I'm curious what are some of the additional complexities, or maybe even in some cases simplicities, that get incorporated into the supporting infrastructure and architecture to be able to actually start thinking about these real time predictions?
[00:06:23] Unknown:
Yeah. When we look at the operational and organizational requirements, before we look at the scope and the architecture of these real time ML applications, at the end of the day, real time machine learning really covers four different parts. One is it touches the operational system. That's the system in which your microservices are running, or the other production applications like your back end system, your Postgres database. All the stuff that cannot go down, that's mission critical, is the operational stack. That's one. Then there's the data pipelines, right? Like, somebody needs to actually calculate machine learning features, run training pipelines, make predictions, and that typically sits next to or close to your organization's data engineering efforts and other data pipelines, whether they're batch or stream. Then it also touches the observability parts of your organization because you need to know whether your model is going off the rails or not.
And then finally, of course, data science is involved because somebody needs to come up with the ideas for how to train a model, what features to implement, etcetera, and then finally train the model. And so you've got these four things: operational system, data pipelines, observability, data science. And your organization needs to be set up to cater to all of those for real time machine learning. And so typically what you'd find is that you either have trios of people, like a data scientist, a data engineer, and a software engineer working together, or some lucky organizations have them all baked into one in the form of a machine learning engineer.
Some platforms, like, obviously, Tecton, are trying to make that significantly easier so you don't need as many folks with different backgrounds to actually power real time machine learning. But good indicators that your organization is actually ready for real time ML are that you can already spin up APIs and services with ease, you follow DevOps best practices already, and all that stuff is not very foreign to you.
[00:08:21] Unknown:
And real time as a term in the technology industry is always interesting because it's so fraught and overloaded with concepts and different applications of use. And so far, what we've been talking about with real time ML is focused on being able to take data as input that was generated recently and then generate a prediction from it. There are also elements of ML, or applications of ML, that might be considered real time, as far as things like reinforcement learning, where you're actively updating the model itself with data as it is generated, or kind of streaming ML, where you're constantly retraining the model and you never actually go back and do a big bang retrain of the model and deploy it, and so you're constantly evolving the model as it sees new data. I'm wondering if you can talk through maybe some of the nuance of these three different categories of how you might think about real time in this context of an ML application.
[00:09:16] Unknown:
Yeah. At the end of the day, it really just comes down to terminology and what you include and what you exclude. And personally, I include in real time machine learning anything where you have a model that's making a prediction online, even if it's just acting on precomputed batch features. It doesn't have to act on real time data or streaming data. It's totally fine if it just acts on precomputed batch features, for instance, to make a prediction in the operational online system, and that has been most helpful in conversations with our customers to differentiate it from batch machine learning. Somewhat related to reinforcement learning is online learning, where you're continuously updating the model as it's running in production. I would generally classify that also as an online system. It's certainly acting online. That's in the operational world, where it's continuously acting on new information and updating itself. To be frank, it's something that I basically never see in production used by anyone. It's super heavily touched upon in the research field, but I think this is an area where research is many, many years ahead of industry.
And there are some companies who of course use it. Google, and I know Twitter has some online learning systems and whatnot, but by and large, the vast, vast majority of real time machine learning applications don't leverage these mechanisms yet.
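To make the online learning pattern concrete, here is a minimal sketch, assuming scikit-learn and numpy are available; the incremental model, the simulated event stream, and the update hook are all illustrative, not any particular production system.

```python
# A minimal online-learning sketch (assumed: scikit-learn and numpy installed).
# The model is updated incrementally as labeled events arrive, instead of
# being retrained from scratch in a batch job.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()        # a linear model trained via stochastic gradient descent
classes = np.array([0, 1])     # all classes must be declared for partial_fit

def on_new_event(features: np.ndarray, label: int) -> None:
    """Update the model in place with a single observed outcome."""
    model.partial_fit(features.reshape(1, -1), [label], classes=classes)

# Simulated stream of (features, label) pairs standing in for production events.
rng = np.random.default_rng(42)
for _ in range(1000):
    x = rng.normal(size=3)
    on_new_event(x, int(x.sum() > 0))

print(model.predict(rng.normal(size=(1, 3))))
```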
[00:10:42] Unknown:
And as far as the actual model architecture of what you're deploying to be able to do these kind of low latency predictions, I'm curious how those constraints and requirements influence the, I guess, styles of machine learning algorithms that people are able to employ. So thinking in terms of, like, stochastic models versus deep learning, etcetera, as well as the size of the model that is kind of viable to be used in this context? So, again, thinking in terms of, you know, a random forest versus ChatGPT or something.
[00:11:17] Unknown:
Yeah. Whether you look at decision trees like your XGBoost, or linear models, or deep learning models, at the end of the day, they can all be used, and they all are heavily used, in real time systems. Yes, there is certainly a very big difference in the inference cost and whether you need a GPU cluster under the hood, or whether a small CPU is enough to make the inference quickly enough, but they can all be used for real time applications. The size of them is, of course, different, and your ChatGPT-style model is gonna be significantly larger than your good old linear regression. But they can all be used; the operational complexity is just significantly higher the more complex the model is.
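As a rough illustration of that inference-cost difference, here is a sketch, assuming scikit-learn, that times single-row prediction for a linear model against a larger ensemble; the data, model sizes, and resulting numbers are illustrative and will vary by hardware.

```python
# Timing single-row inference: linear model vs. random forest
# (assumed: scikit-learn and numpy installed; numbers are hardware-dependent).
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X = np.random.rand(10_000, 20)
y = (X[:, 0] > 0.5).astype(int)

models = {
    "logistic regression": LogisticRegression().fit(X, y),
    "random forest (500 trees)": RandomForestClassifier(n_estimators=500).fit(X, y),
}

row = X[:1]  # one "request" arriving at the serving endpoint
for name, model in models.items():
    start = time.perf_counter()
    for _ in range(100):
        model.predict(row)
    per_call_ms = (time.perf_counter() - start) / 100 * 1000
    print(f"{name}: {per_call_ms:.3f} ms per single-row prediction")
```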
[00:12:12] Unknown:
And the other interesting element of using machine learning in this online mode versus a batch workflow is the potential for things to go wrong, where, you know, obviously there are some notable examples, like the Tay chatbot, where it's continually learning and trying out new things, but also just in the case of, you know, maybe I have a bad batch of data that made it into my feature store, and so now I'm giving erroneous predictions or recommendations. I'm just curious how the kind of failure modes and guardrails around machine learning models need to be thought about in this online versus batch context.
[00:12:49] Unknown:
Yeah. One of the biggest failure modes here is train-serve skew, and train-serve skew, for those who are not familiar with it, basically means that you're training your model on data that looks different in its shape or form from the data that you make predictions with in the production system. If there's a skew, then your model basically is not set up for success because it sees completely different shapes and forms of data in the production system. It doesn't know what to do with it, so the prediction is gonna be a flip of a coin, basically. And it's very hard to notice that because you're not gonna get an exception, say, from the model running in the production system. It's not gonna know that it's never seen anything like it. It's just gonna do its best and make a prediction, and it's very hard to identify these types of issues. And train-serve skew is very easy to introduce with real time machine learning, significantly easier than with your common batch or offline machine learning application. One funny example that I have here: we talked to one company, a big bank, where one of the large consulting firms came in and trained a model for them. I think it was a fraud detection model. The model was trained, put in production, and then it just ran in production for literally years, and it was never retrained and updated. And nobody actually looked at whether this model is going off the cliff or whether you're leaving a ton of money on the table by not retraining it, which is almost guaranteed to be the case.
And the reason why I'm bringing this anecdote up is that with real time machine learning, it's extremely easy to just not continuously retrain your model. You just train it once, throw it behind an endpoint, and then you let it run in production, rather than retraining it frequently. Versus with batch ML applications, what I most commonly see is that a lot of companies actually retrain the entire model before they make batch predictions. And it's all the same Jupyter notebook or the same Python file that's running on Airflow: just retrain the whole thing and make a prediction. And as long as it's not cost prohibitive to do the retraining step, it's actually not a bad idea to do that and to include the information that you have up until the point in time at which you're making the predictions. Besides retraining, another thing that's important to keep in mind is that with real time ML applications, one of the biggest challenges is that you need to calculate the features both for your training pipeline, which runs offline, and for your inference pipeline, which runs online. And so you need to be able to provide the feature values to your real time machine learning model that's running in the production system at super low latency. Right? Like, you need to be able to provide this feature in a handful of milliseconds with high availability, while offline, you just crunch through a ton of data in huge spikes as you generate the training data. So to make this a bit more concrete, what you would typically do if you have a real time feature or a streaming feature is, in the production system, you may run a Flink streaming job or a Spark Structured Streaming job, or maybe just some Rust code that's executing your transformation in real time.
While offline, you reimplement everything using, say, Snowflake SQL because all of your data is in your data warehouse. And now you have to make sure that the two implementations are identical because if they're not, you introduce train-serve skew. That's very hard to debug, and your model performance may be awful or not as good as it possibly could be. So those are two very common failure modes that are very easy to introduce with real time machine learning and best avoided, of course, with something like a feature platform.
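One way to avoid that dual-implementation trap is to define the transformation once and call the identical function from both the offline training pipeline and the online serving path. A minimal sketch, assuming the same Python code can run on both sides; the function and field names are hypothetical.

```python
# Shared feature logic used by both the batch backfill and the online service.
# (Field names like "ts" are hypothetical.)
from datetime import datetime, timedelta

def txn_count_last_30m(transactions: list[dict], as_of: datetime) -> int:
    """Count a user's transactions in the 30 minutes before `as_of`."""
    window_start = as_of - timedelta(minutes=30)
    return sum(1 for t in transactions if window_start <= t["ts"] < as_of)

# Offline: replay history at each training example's timestamp, so the
# training feature matches what serving would have seen at that moment.
def build_training_row(user_history: list[dict], label_time: datetime) -> dict:
    return {"txn_count_30m": txn_count_last_30m(user_history, label_time)}

# Online: apply the identical function to recent events at request time.
def build_serving_row(recent_events: list[dict]) -> dict:
    return {"txn_count_30m": txn_count_last_30m(recent_events, datetime.utcnow())}
```

The point of the sketch is that there is exactly one definition of the window logic, so a fix or change cannot silently diverge between training and serving.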
[00:16:28] Unknown:
And as far as that kind of train-serve skew challenge, it also feeds back into the question of data quality, particularly in an ML context where the quality of the data is a huge predictor of the quality of the output model. And I'm curious how teams are addressing the question of how to identify the critical elements of what quality data looks like for a given model, and some of the kind of monitoring and ongoing validation that needs to be performed as new data is fed into the platform, so that you can, for instance, shunt bad data into a dead-letter queue to be reconsidered or factored into updated experimentation. And some of the aspects of kind of ongoing validation and maintenance of the model as it continues to serve, without having to take it down for maintenance or have it be spitting out bad predictions for some period of time without being aware of it.
[00:17:16] Unknown:
When it comes to observability and monitoring with machine learning, I'd broadly look at three different categories. One is data quality monitoring, the next is prediction monitoring, and the next is operational health monitoring. Operational health monitoring is what we already know from how we run microservices in production. Like, what do the serving latencies look like? What does the uptime look like? In the case of a feature store or feature platform, what does the feature freshness look like that I'm continuously serving? All of that I would bucket under operational health, where if things go off, oftentimes you have, say, a provisioning issue or you're running on an unhealthy node or something like that. That's super important to monitor because if that's going off the rails, then you're not gonna get good predictions. Then there is model prediction monitoring, where you wanna make sure that the predictions you're making are still having a positive impact on the business, where you wanna create value. Like, are my recommendation predictions still serving the business? Am I still selling as many items as I expect to? What's my click through rate on the recommended items? And if the click through rate is suddenly falling off the cliff, then you have a problem which may very well be related to your model. And then the third is data quality monitoring, which is: what does the quality of the features look like that are being fed into the ML model? Here, you can again bucketize, and you can either say, well, do I have full on outages, where maybe there's no data coming in, or the distribution of my ML features, really the statistics, now look completely off. Maybe I'm suddenly multiplying the user's age by 100, and that's no longer within the range and the schema that the model expects.
So those would be full on outages. A separate category under these data quality issues is drift, a common problem in machine learning where the data may still be real and representative of reality, but it's shifted over time. Maybe your users' preferences have just shifted over time, and you really ought to retrain your ML model because your model just hasn't seen these types of distributions, these types of means and standard deviations or categories of your ML features before, so it needs to be retrained. A sophisticated solution should monitor all of these elements. Of course, that's quite a lot, and that's why platforms and tooling help you. But those are the different areas where you wanna have visibility in order to make sure that your model isn't going off the rails.
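As a sketch of what the outage and drift checks described above can look like in code, assuming numpy and scipy are available; the age feature, the thresholds, and the simulated pipeline failure are all illustrative.

```python
# Outage and drift checks on one feature (assumed: numpy and scipy installed).
import numpy as np
from scipy.stats import ks_2samp

def check_feature(train_values, serve_values, valid_range):
    alerts = []
    lo, hi = valid_range
    # "Full-on outage" style check: values outside the expected range/schema,
    # e.g. an age feature suddenly multiplied by 100.
    if ((serve_values < lo) | (serve_values > hi)).mean() > 0.01:
        alerts.append("more than 1% of serving values are out of range")
    # Drift check: has the serving distribution shifted away from training?
    stat, p_value = ks_2samp(train_values, serve_values)
    if p_value < 0.01:
        alerts.append(f"distribution drift detected (KS statistic {stat:.3f})")
    return alerts

train_age = np.random.normal(35, 10, 5000)
serve_age = np.random.normal(35, 10, 5000) * 100  # simulated broken pipeline
print(check_feature(train_age, serve_age, valid_range=(0, 120)))
```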
[00:20:02] Unknown:
And for teams who are trying to build these types of systems, I'm curious what are some of the gaps in knowledge or understanding that you see them commonly encounter, and maybe some of the useful patterns that they have built up to be able to account for some of these error cases and failure modes, so that there is kind of a general team understanding of how to address these systems and build them robustly, versus the kind of cowboy case of, hey, I built this model and put it in production. Isn't it swell? And then everything just breaks.
[00:20:34] Unknown:
Yeah. I think the worst things here are the silent errors. The loud errors, you know, even if you don't think about them, you'll realize when they're in production. The loud errors I'd classify as: suddenly your latencies are going through the roof, and so you're not able to make the item recommendation within the 200 milliseconds that the rest of your system needs it in. Or maybe the prediction isn't returning anything useful, and you're just clearly recommending garbage. Those would be the loud failure modes, which are easy to detect. The silent ones, where you're silently introducing train-serve skew because you're not properly monitoring the quality of your data, those are really, really tricky to detect, so you better have a good data quality monitoring solution in place. And again, here it's not enough to just have a data quality monitoring solution, because that may just show you that your batch data that you use for training is of good quality and doesn't change, but you also need to make sure that you are, at prediction time, calculating your features the exact same way.
And that's why, of course, I would recommend to anybody to use a feature store, a feature platform, to minimize the risk of introducing train-serve skew simply by incorrectly calculating the feature values or having divergent implementations between serving and training. And so the takeaway here should be: data quality monitoring, super important. And then second, have a feature platform or a system where it's guaranteed that the implementation of your serving features and your training features is identical, because if you don't have these systems, silent errors will eventually smuggle themselves into your system that are really, really hard to detect and very hard to debug.
[00:22:21] Unknown:
And in terms of the kind of featurization strategy, what are some of the useful tactics in terms of being able to actually build the features in such a way that they are debuggable and composable? So that you don't have some giant complex feature that you can only do a surface level validation of: hey, this gave me a feature, and it's within a range that I think I expect. But then as time goes on and you undergo the inevitable data drift, because the world moves on, you kind of waste a lot of time debugging things until you get to the point of, hey, it's this feature, and I'm putting too many different conflated requirements into it.
[00:22:59] Unknown:
Yeah. Two things I'd say here. One is, it's important to not only monitor the quality of your predictions and the output of your model to see, hey, are they suddenly looking super different? Are the distributions of my predictions suddenly shifting entirely? But it's extremely important to look at the inputs to the model, because typically, for every single prediction, you have on the order of tens, hundreds, or thousands of features, and you wanna know which one of those is changing, which may, at the end of the day, affect the quality of your prediction. And so you may even have some leading indicators here in your features before you see it in your model's performance. And so that's where modularity, or really more granular monitoring, comes in. You just wanna break it up and look at the feature quality and not just at the prediction quality. And then apart from that, you touched on modularity in feature engineering. And here, of course, it's important to try to use a system where you reuse individual building blocks in your feature engineering system, where everybody doesn't reimplement a customer lifetime value or a 30 minute transaction count feature on their own, but they all have the same common understanding of what the definition should look like and share it and reuse it. And then similarly, if you have derivatives of features, they should depend on the original feature implementation rather than have a redundant copied implementation of the upstream computation.
And so you should manage those features in code. Feature engineering transformations should be managed in code, backed by Git, etcetera, and use common software engineering principles to keep it DRY, to not repeat yourself, and to reuse whatever you can.
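A framework-agnostic sketch of that idea, where a derived feature references the upstream definition instead of copying its logic; the Feature class and all the names here are hypothetical, not any specific feature platform's API.

```python
# Features declared once in code; derived features reference the upstream
# object, so fixes propagate instead of living on in stale copies.
# (The Feature class and names here are hypothetical.)
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Feature:
    name: str
    compute: Callable[[dict], float]

# One shared definition of customer lifetime value.
clv = Feature("customer_lifetime_value",
              lambda row: sum(row["order_amounts"]))

# A derivative depends on `clv` itself rather than re-implementing it.
clv_per_order = Feature(
    "clv_per_order",
    lambda row: clv.compute(row) / max(len(row["order_amounts"]), 1),
)

row = {"order_amounts": [30.0, 45.0, 25.0]}
print(clv.compute(row), clv_per_order.compute(row))  # 100.0 33.33...
```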
[00:24:53] Unknown:
And I guess, in terms of the experiences that you've had of working with customers who are starting to go down this path of building machine learning systems, wanting to be able to react promptly to input data, what are some of the kind of pitfalls that they have encountered and some of the useful mistakes that you have seen people make that you and others have been able to learn from?
[00:25:19] Unknown:
Yeah. The most common mistake is just forgetting the train-serve skew issue, or starting with a system where you say, ah, it's gonna be good enough. I don't need to think too much about my foundation. I'm just gonna hack something up and put it in production. And then you've got the stickiness effect of your application that's running in production. It's not implemented using good practices, and then it's gonna be pretty hard to replace later on with good practices where you can avoid train-serve skew. Another thing that's quite interesting is when you look at real time machine learning, the real time data processing industry as a whole is actually still far behind the batch processing industry.
Right? Like, when you look at the Sparks of the world, the Snowflakes of the world, we're really good now at conveniently processing large amounts of data. Over the last 10, 20 years, we've come a long way from the MapReduces of the world to just being able to write a simple SQL query that can easily process terabytes, petabytes of data, and we kind of forget what that all looked like 10, 15 years ago when you had to implement individual Java applications to do any MapReduce-type stuff. Streaming today is still pretty hard, and it is a significantly harder challenge because you're dealing with fundamentally unbounded datasets. Like, a batch dataset is bounded. It's limited. You know exactly how many rows there are in there. Of course, it can be growing, but you can arbitrarily slice it, while streams are unbounded. And so when you think about things like stream aggregations, stream joins, etcetera, those are hard problems.
And there are a lot of different players out there who are trying to make stream processing easier, but those aren't really deeply proliferated yet. And that makes it harder to introduce, specifically, streaming features in machine learning applications. Of course, at Tecton, we have our own answers to make it simpler for customers to express common streaming machine learning features. But as a whole, that has just been a pretty interesting realization, after having come out of Uber where we had a pretty sophisticated streaming stack, how immature the industry generally still is.
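To make the unbounded-data point concrete, here is a minimal sketch of a 30 minute sliding count maintained incrementally, with no bounded dataset to slice; a real stream processor such as Flink or Spark Structured Streaming layers event-time semantics, late-data handling, and fault tolerance on top of this basic idea.

```python
# A 30-minute sliding count over an unbounded event stream.
from collections import deque
from datetime import datetime, timedelta

class SlidingCount:
    def __init__(self, window: timedelta):
        self.window = window
        self.events: deque = deque()

    def add(self, ts: datetime) -> int:
        """Ingest one event timestamp and return the in-window count."""
        self.events.append(ts)
        cutoff = ts - self.window
        while self.events and self.events[0] < cutoff:
            self.events.popleft()  # evict events that slid out of the window
        return len(self.events)

counter = SlidingCount(timedelta(minutes=30))
base = datetime(2023, 1, 1, 12, 0)
for minutes in (0, 5, 10, 40, 41):
    print(counter.add(base + timedelta(minutes=minutes)))
# prints 1, 2, 3, 2, 2 -- older events expire as the window slides forward
```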
[00:27:40] Unknown:
And in your experience of building Tecton and helping customers through this journey of being able to build and deploy these real time ML systems, what are some of the most interesting, innovative, or unexpected ways that you have seen real time ML applied?
[00:28:06] Unknown:
is just a really good cause that can actually save lives, which I really like. And apart from that, I'm also always excited, and sometimes surprised, when I see safety applications of machine learning. So it's not just about fraud detection, but literal end user safety, whether that's the safety of a rider on an Uber trip or the safety of someone who's playing a game, where you can literally detect whether somebody could potentially be in a harmful situation that you may wanna do something about. And those use cases aren't all too commonly talked about, but I love seeing those. And then maybe finally, one thing that I also find interesting when I look at customers who use feature stores is that oftentimes, what comes before you have a machine learning model running in production is a human implemented heuristic, a rule that's making a decision based on input data, which can actually get you quite far as well.
And these rule based systems also need to act on data, whether that's streaming data or batch data or real time data. And we see customers actually use Tecton to power some of their rule based systems before they're ready to upgrade those to machine learning driven systems.
[00:29:25] Unknown:
In your experience of working in this space, I'm curious what are some of the notable evolutions in terms of kind of capability and functionality, and some of the future potential that you envision once we do get to a point where these kind of streaming data systems are more on par with batch capabilities?
[00:29:45] Unknown:
Yeah. The future that I have in mind is one where you can look at an organization's data systems, and you basically see this giant web of nodes and edges that connect the different nodes. Every node is a data consumer or a data producer, and the edges are the actual data processing. And it doesn't matter whether it's streaming or batch, but you've got the full lineage graph. You see exactly where the data is coming from and where it's going to, what's failing, what's succeeding. It doesn't matter whether it's analytics feeding into a dashboard or whether it's a machine learning feature that's feeding into your fraud application.
You see exactly how things are shared. That's the vision that I and others have in mind that we're all working towards, where you just have this fully integrated view of all your different data producers, consumers, and transformers. And we're, of course, far from that, where even today, within the same team, you oftentimes don't have this visibility, let alone across teams in the same organization. And maybe at some point down the road, we can even have this type of visibility across organizations and not just within the same organization.
[00:31:06] Unknown:
And for people who are considering whether and how to apply ML to their problems, what are the cases where a real time approach is the wrong choice?
[00:31:16] Unknown:
Real time ML is not the right choice if you don't care about the information that happened in the seconds or minutes or hours leading up to the prediction. Like, if it's one of those use cases where it's more about, hey, everything from the big bang until yesterday, and that'll always stay this way. You don't think that you'll get much more juice out of the prediction by bringing in more recent real time information. If that's the case, coupled with you making these predictions and acting on them in batch, like once a day or once a week, then that is a batch machine learning use case, and there's no real need to force real time machine learning complexity on top of that. If, on the other hand, you need information up until the moment where you're making a prediction, or if you are acting immediately on the prediction as you make it and it affects the end user, then real time machine learning is a good choice.
[00:32:05] Unknown:
Are there any other aspects of the kind of real time ML ecosystem and applications, and the ways that you're working to help support those capabilities, that we didn't discuss yet that you would like to cover before we close out the show? I think we went pretty broad and comprehensive here. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team at Tecton are doing, and the kind of future direction of real time ML applications, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest barrier to adoption for machine learning today.
[00:32:50] Unknown:
The biggest barrier for adopting real time machine learning is typically that companies don't yet have the right tooling and the right platforms in place to actually make it easy to develop these real time machine learning models and the features that can be consumed by ML models that are running in production. That's, of course, why, self-servingly, we're going after this market, and we're trying to make it significantly easier for enterprises to have a system that makes it easy for them to develop real time machine learning applications.
[00:33:21] Unknown:
Alright. Well, thank you very much for taking the time today to join me and share your experiences of working in the space of real time machine learning, some of the ways that it can be applied, and how teams should be thinking about its potential for their problem domains. I appreciate all the time and energy that you and your team are putting into making this a more tractable and approachable problem. So thank you again for that, and I hope you enjoy the rest of your day. Thank you. Thanks for having me.
[00:33:54] Unknown:
Thank you for listening, and don't forget to check out our other shows, the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Hello, and welcome to The Machine Learning Podcast. The podcast about going from idea idea to delivery with machine learning.
[00:00:20] Unknown:
Your host is Tobias Massey, and today, I'm interviewing Kevin Stumpf about the challenges and promises of real time ML applications and the work that he's doing at Tekton to help support them. So, Kevin, can you start by introducing yourself?
[00:00:31] Unknown:
Yeah. For sure. Happy to, and thanks for having me. I am the cofounder and CTO here at Tekton. And with Tekton, we help organizations get real time machine learning applications into production with our feature platform. Before starting Tekton, I was over at Uber where I helped build the central machine learning platform team called Michelangelo.
[00:00:57] Unknown:
And do you remember how you first got involved working in ML?
[00:01:01] Unknown:
Yeah. The first time where I got really deep into it after, like, taking grad school classes and whatnot on the topic was, roughly 10 years ago now when I, cofounded a trucking startup. We were actually matching long haul truck drivers with shipments that shippers or brokers, needed a truck driver for. And we used machine learning to predict prices, because oftentimes we didn't have accurate pricing information. And so we 1 of the most curious things that truckers are interested in is, well, how much am I gonna get paid for? It slowed. And so we tried to help them out by, having an ML algorithm that predicts the prices. And then through that startup, I eventually ended up at Uber, where I, as I just mentioned, was part of the central machine learning platform team, Michelangelo, which helps pretty much all the different ML teams at Uber get machine learning into production.
And that has everything from predicting ETAs when you open up the Uber app to recommend excuse me, recommending which restaurants you may wanna purchase from an Uber Eats, pricing predictions, etcetera. Like, there's a whole long, class of different ML problems that are solved with Michelangelo.
[00:02:22] Unknown:
And in terms of the focus for the conversation today on real time ML, I'm curious if you can describe a bit about what that is and how you might differentiate it from other approaches to machine learning that folks are using.
[00:02:36] Unknown:
Yeah. Real time machine learning should really be contrasted to batch machine learning or also sometimes called offline machine learning. Real time machine learning, I'd classify as any application that uses machine learning where the machine learning prediction affects the end user immediately, right, in the moment where that prediction is being made and not hours later, days later, or whatever it is. And common examples here are, like, real time transaction fraud detection where somebody's swapping a credit card somewhere, and it's either immediately denied or accepted, or it's real time recommendations where the items that you've just clicked on in Amazon affect the next recommendations that you get on the page that you're gonna browse a couple seconds later, or dynamic pricing systems, which react to the movements of supply and demand literally in the moment. Like, if you look at, like, the the ride sharing applications out there, real time bot detection, all that type of stuff. So all of these are applications where the prediction is made and the end user is immediately affected by it. And when you look at the offline machine learning, that's typically where, like, a a common example would be maybe you run a a model in big spikes in batches, like once a day, once a week or something where you look at all of your customers and you predict their likelihood of churning, And then you may send them an email with a coupon code or something like that.
[00:04:10] Unknown:
And in terms of the kind of utility of real time ML, I'm curious if there are any common categories of problem that it lends itself well to or particular industries that have a higher predominance of using real time ML and just some of the ways that people should be thinking about whether or not it is a suitable fit for the problem that they're trying to solve?
[00:04:33] Unknown:
The best thing to think about is actually, like, where and you can use your human intuition honestly for this, which is where do you have most of the information that will help you make a prediction about something that you wanna predict, like, say, a recommendation or whether you wanna detect fraud in a transaction. Is that information very new, very fresh? Is that information that's basically just come into existence a couple seconds ago, a couple minutes ago, a couple hours ago? Is that the type of information that you, if you personally had to decide and personally with infinite time, had to make a recommendation without a computer? Is that the type of information you would use in order to make that human prediction, so to say? If that type of information has only come into existence in the last couple of minutes, hours, up to milliseconds ago or up until right now, then that's typically a good use case and a good case for, real time machine learning.
If on the other hand, you'd say, hey. The like, 99% of the wealth of information is just buried in what the user, what I know about the user since the big bang up until a week ago, then, and I only need to react to that information in batches once a day, once a week, or something like that, then that's a good case that, hey, that's probably that is is best served by batch machine learning.
[00:05:58] Unknown:
And as far as the technical and organizational requirements for being able to actually take on this real time ML workflow, I'm curious what are some of the additional complexities or maybe even in some case, simplicities that get, incorporated into the supporting infrastructure and architecture to be able to actually start thinking about these real time predictions?
[00:06:23] Unknown:
Yeah. When we look at the operational and organizational requirements before we we we look at the the the scope and the architecture, of these real time ML applications, at the end of the day, real time machine learning really covers, 4 different parts. 1 is it it touches the operational system. That's the system in which your microservices are running or the other production applications like your back end system, your postgres database, all the stuff that cannot go down that's mission critical is the operational stack. That's 1. Then there's the data pipelines, right? Like somebody needs to actually calculate machine learning features, run training pipelines, make predictions, and that typically sits next to your organization, or close to your organization's data engineering efforts and other data pipelines, whether they're batch or stream. Then it also touches the observability parts of your organization because you need to know whether your model is going off the rails or not.
And then finally, of course, data science is involved because somebody needs to come up with the ideas to how to train a model, what features to implement, etcetera, and then finally train the model. And so you've got these 4 things, operational system, data pipelines, observability, data science, and your organization needs to be set up to cater to all of those for real time machine learning. And so typically what you'd find is that you either have, like trios of people like a data scientist and, data engineer and a software engineer work together or some lucky organization have them all baked into 1 in the form of a machine learning engineer.
Some platforms, like, obviously, Tekton, is trying to make that significantly easier so you don't need as many folks with different backgrounds and whatnot to, to actually power real time machine learning. But good indicators that your organization is actually ready for real time ML is that you can already spin out APIs and services with ease, and you follow DevOps best practices already, and all that stuff is is not very, very foreign to you.
[00:08:21] Unknown:
And real time as a term in the technology industry is always interesting because it's always so fraught and overloaded with concepts and kind of, different applications of use. And so far, what we've been talking about with real time ML is focused on being able to take data as input that was generated recently and then generate a prediction from it. There are also elements of ML or applications of ML that might be considered real time as far as things like reinforcement learning where you're actually actively updating the model itself with data as it is generated or kind of streaming ML where, again, you're constantly kind of retraining the model, and you never actually go back and do a big bang, retrain the model, deploy it, and and so you're you're constantly evolving the model as it sees new data. I'm wondering if you can talk through maybe some of the nuance of these kind of 3 different categories of how you might think about real time in this context of an ML application.
[00:09:16] Unknown:
Yeah. And and at the end of the day, it really just comes down to terminology and what you include and what you exclude. And personally, I include in real time machine learning anything where you have a model that's making a prediction online, even if it's just acting on precomputed batch features. It doesn't have to act on, like, real time data or streaming data. It's totally fine if it just acts on pre computed batch features, for instance, to make a prediction in the operational online system, and that has been most helpful in having conversations with our customers, to to differentiate it from batch machine learning. Reinforcement learning are, like, somewhat related is like online learning where you're continuously updating the model as it's running in production. I would generally classify that also as an online system. It's certainly acting online. That's in the operational world, where it's continuously acting on new information and updating itself. To be frank, it's something that I basically never see in production used by anyone. It's super heavily touched upon in the research field, but I think there's a scenario where research is, many, many years ahead of actually industry.
And there are some companies who of course use it. Google, I know Twitter has some, online learning systems and whatnot, but by and large, the the vast, vast majority of real time machine learning applications don't leverage these,
[00:10:42] Unknown:
these mechanisms yet. And as far as the actual model architecture of what you're deploying to be able to do these kind of low latency predictions, I'm curious how those constraints and requirements influence the, I guess, styles of machine learning algorithms that people are able to employ. So thinking in terms of, like, stochastic models versus deep learning, etcetera, as well as the size of the model that is, kind of viable to be used in this context? So, again, thinking in terms of, you know, a random forest versus chat GPT or something.
[00:11:17] Unknown:
Yeah. Actually, the the the whether you look at the decision trees like your XGBoost or the linear models or the deep learning models, At the end of the day, that can all be used and they all are heavily used in real time systems. Yes. There is, certainly a very big difference in the inference cost and whether you need a GPU cluster under the hood, or whether a a small CPU is enough to quickly enough, make the inference, but they can all be used for real time applications. The size of them is, of course, different, and your JET GVT model is gonna be significantly larger than your good old, linear regression. But they're yeah. They they can all be used, but the operational complexity is, of course, a little bit more significantly higher the more complex the model is. And the other interesting element of using
[00:12:12] Unknown:
machine learning in this online mode versus a batch workflow is the potential for things to go wrong where, you know, obviously, there are some notable examples of, like the Tay chatbot where it's continually learning and and trying out new things, but also just in the case of, you know, maybe I have a bad batch of data that made it into my feature store, and so now I'm giving erroneous predictions or or recommendations. I'm just curious how the kind of failure modes and guardrails around machine learning models need to be thought about in this online versus batch context.
[00:12:49] Unknown:
Yeah. 1 of the biggest failure modes here is train serve SKU, and train serve SKU for those who are not familiar with it basically means that you're training your model on data that looks different in its shape or form from the data that you make predictions with in the production system. If there's a skew, then your bottle model basically is not set up for success because it sees completely different shapes and forms of data on the production system. So it doesn't know what to do with it. So the prediction is gonna be a a flip of coin basically. And it's very hard to notice that because you're not gonna get a exception, say from the model running the production system. It's not gonna know that it's never seen anything like it. It's just gonna do its best and make a prediction, and it's very hard to identify these types of issues. And so trained service queues is very easy to introduce with real time machine learning and significantly easier than with your common batch or, offline machine learning application. And, 1 funny example that I have here is we talked to 1 company where 1 of the large consulting firms came in and it was a big bank, and they went in and they trained a model for them. I think it was a fraud detection model. The model was trained, put in production, and then it just ran in production for literally years, and it was never retrained and updated. And nobody actually looked at whether this model is, going off the cliff or whether you're leaving a ton of money on the table by not retraining it, which is almost guaranteed to be the case.
And the reason why I'm bringing this anecdote up is that with real time machine learning, it's extremely easy to just not continuously retrain your model. Just train it once, you throw it behind an endpoint, and then you let it run-in production, rather than retraining it frequently. Versus with batch ML applications. What I most commonly see is that a lot of companies actually, they retrain the entire model before they make batch predictions. And it's just all the same Jupyter notebook or the same Python file that's running on an airflow. Just retrain the whole thing and make a prediction. And as long as it's not cost prohibitive to do to do the retraining step, it's actually not a bad idea to do that and to include the information that you have up until the point in time in which you're making the predictions. Besides retraining, another thing that's important to keep in mind is that with real time ML applications, 1 of the biggest challenges is that you need to calculate the features both for your training pipeline, which runs offline, and for your inference pipeline, which runs online. And so you need to be able to provide the feature values, to your real time machine learning model that's running in the production system at super low latency. Right? Like, you need to be able to provide this feature in a handful of milliseconds with high availability while offline, you just crunch through a ton of data in huge spikes as you generate the training data. So to make this a bit more concrete, what you would typically do if you have a real time feature or a streaming feature is in the production system, you may run a Flink streaming job or a Spark structure streaming job, or maybe just run some Rust code that's executing your transformation in real time.
While offline, you reimplement everything using, say, Snowflake SQL because all of your data is in your data warehouse. And now you have to make sure that the 2 implementations are identical because if they're not, you introduce train search skew. That's very hard to debug, and your model performance may be awful or not as good as it possibly could be. So those are 2 very common failure modes that are very easy to introduce with real time machine learning and best avoided, of course, with something like a a feature platform. And as far as that
[00:16:28] Unknown:
kind of train, serve, skew challenge, it also feeds back into the question of data quality, particularly in an ML context where the quality of the data is a huge predictor of the quality of the output model. And I'm curious how teams are addressing some of the question of how to, you know, identify the critical elements of what quality data looks like for a given model and some of the kind of monitoring and ongoing validation that needs to be performed as new data as is fed into the platform so that you can, for instance, maybe, you know, shunt bad data into a dev letter queue to be reconsidered or, you know, factored into updated experimentation and to some of the aspects of kind of ongoing validation and maintenance of the model as it continues to serve without having
[00:17:16] Unknown:
to kind of take it down for maintenance or have it be spitting bad, you know, bad predictions for x period of time without being aware of it. When it comes to observability and monitoring with machine learning, I'd broadly look at 3 different categories. 1 is just data quality monitoring, the next 1 is prediction monitoring, and the next 1 is operational health monitoring. Operational health monitoring is what we already know from how we run microservices in production. Like, what are the what are the serving latencies look like? What does the uptime look like? In the case of a feature store or feature platform, what's the what does the feature freshness look like that I'm continuously serving? All of that, I would bucket under operational health where if things go off, oftentimes you have a, say, a provisioning issue or you're you're running on an unhealthy node or something like that. So that's super important to monitor because if that's going off the rails, then your predictions are not gonna you're not gonna get good predictions. Then there is model prediction monitoring where you wanna make sure that the prediction that you're making is still having a positive impact on the business situation where where you wanna create value? Like, am I are my recommendation predictions, are they still serving the business? Like, am I still selling as many items as I expect to? Like, what's my click through rate on the recommended items? And if those are suddenly if the click through rate is suddenly falling off the cliff, then you have a problem which may very well be related to your model. And then the third is data quality monitoring associated to it, which is like the what does the quality of the features look like that is being fed into the ML model? Then here, you can again bucketize, and you can either say, well, do I have full on outages where my maybe there's no data coming in or the distribution of my ML features is now, or the really, the the statistics now look completely off. Maybe I'm suddenly multiplying the user's age by a 100, and that's not anymore within the range the and the schema that the model expects.
So those would be full on outages. A separate category under these data quality issues is drift, a common problem in machine learning where the data may still be real and representative of reality, but it's shifted over time. Maybe your users' preferences have just shifted, and you really ought to retrain your ML model because it just hasn't seen these types of distributions, these types of means and standard deviations or categories of your ML features before, so it needs to be retrained. And so a sophisticated solution should monitor all of these elements. Of course, that's quite a lot, and that's why platforms and tooling help you. But those are the different areas where you wanna have visibility in order to make sure that your model isn't going off the rails.
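As a rough illustration of the statistics-based checks described here, below is a small Python sketch that flags a feature whose live mean has drifted far from its training-time baseline. The threshold and function names are invented for the example, not recommendations.

```python
# A rough sketch of per-feature drift checks: compare live feature
# statistics against a baseline captured at training time.
import numpy as np

def drift_report(baseline: np.ndarray, live: np.ndarray,
                 mean_tolerance: float = 3.0) -> dict:
    """Flag a feature whose live mean has moved more than
    `mean_tolerance` baseline standard deviations from the training mean."""
    base_mean, base_std = baseline.mean(), baseline.std()
    shift_in_sigmas = abs(live.mean() - base_mean) / max(base_std, 1e-9)
    return {
        "baseline_mean": float(base_mean),
        "live_mean": float(live.mean()),
        "shift_in_sigmas": float(shift_in_sigmas),
        "drifted": bool(shift_in_sigmas > mean_tolerance),
    }

# Example: a feature accidentally multiplied by 100 in production
# (the "user's age times 100" outage above) is flagged immediately.
baseline = np.random.normal(35, 10, 10_000)   # ages seen at training time
live = baseline[:1_000] * 100                  # broken production pipeline
print(drift_report(baseline, live))            # drifted: True
```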
[00:20:02] Unknown:
And for teams who are trying to build these types of systems, I'm curious what are some of the gaps in knowledge or understanding that you see them commonly encounter, and maybe some of the useful patterns that they have built up to be able to account for some of these error cases and failure modes, so that there is a general team understanding of how to address these systems and build them robustly versus the cowboy case of, hey, I built this model and put it in production. Isn't it swell? And then everything just breaks.
[00:20:34] Unknown:
Yeah. I think the worst things here are the silent errors. The loud errors, you know, even if you don't think about them, you'll notice them when they're in production. The loud errors I'd classify as: suddenly your latencies are going through the roof, and so you're not able to make the item recommendation within the 200 milliseconds that the rest of your team needs it in. Or maybe the prediction isn't returning anything useful, and you're just clearly recommending garbage. Those would be the loud failure modes, which are easy to detect. The silent ones, where you're silently introducing train-serve skew because you're not properly monitoring the quality of your data, those are really, really tricky to detect, so you better have a good data quality monitoring solution in place. And again, here it's not enough to just have a data quality monitoring solution, because that may just show you that your batch data that you use for training is of good quality and doesn't change, but you also need to make sure that you are, at prediction time, calculating your features the exact same way.
And that's why, of course, I would recommend to anybody to use a feature store, a feature platform, to minimize the risk of introducing train-serve skew simply by incorrectly calculating the feature values or having divergent implementations between serving and training. And so the takeaway here should be: data quality monitoring, super important. And then second, have a feature platform or a system where it's guaranteed that the implementation of your serving features and your training features is identical, because if you don't have these systems, silent errors will eventually smuggle themselves into your system that are really, really hard to detect and very hard to debug.
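One way a team might hunt for this silent skew, sketched below under stated assumptions: log the feature values actually served online, recompute the same features offline with the training pipeline, and diff them. The helper is hypothetical and ignores real-world concerns like sampling and timestamp alignment.

```python
# A sketch of surfacing silent train-serve skew: compare features logged
# at serving time against an offline recomputation for the same entity
# and timestamp. Names and tolerances are illustrative only.
def skew_check(served_log: dict[str, float],
               offline_recomputed: dict[str, float],
               rel_tolerance: float = 1e-6) -> list[str]:
    """Return the names of features whose served value disagrees with
    the offline recomputation."""
    mismatched = []
    for name, served in served_log.items():
        offline = offline_recomputed[name]
        denom = max(abs(offline), 1e-12)
        if abs(served - offline) / denom > rel_tolerance:
            mismatched.append(name)
    return mismatched

# Example: the online path rounds while the offline path does not.
served = {"txn_count_30m": 4.0, "avg_amount_30m": 12.0}
offline = {"txn_count_30m": 4.0, "avg_amount_30m": 12.37}
print(skew_check(served, offline))  # ['avg_amount_30m']
```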
[00:22:21] Unknown:
And in terms of the featurization strategy, what are some of the useful tactics in terms of being able to actually build the features in such a way that they are debuggable and composable, so that you don't have some giant complex feature that you can only do a surface level validation of: hey, this gave me a feature, and it's within a range that I think I expect. But then as time goes on and you undergo the inevitable data drift, because the world moves on, you waste a lot of time with debugging things until you get to the point of, hey, it's this feature, and I'm putting too many conflated requirements into it.
[00:22:59] Unknown:
Yeah. 2 things I'd say here. 1 is, it's important to not only monitor the quality of your predictions and the output of your model to see, hey, are they suddenly looking super different? Are the distributions of my predictions suddenly shifting entirely? But it's extremely important to look at the inputs to the model, because typically for every single prediction, you have on the order of tens, hundreds, thousands, maybe even more features, and you wanna know which 1 of those is changing, which may, at the end of the day, affect the quality of your prediction. And you may even have some leading indicators here in your features before you see a degradation in your model's performance. And so that's where modularity, or really more granular monitoring, comes in. You just wanna break it up and look at the feature quality and not just at the prediction quality. And then apart from that, you touched on modularity and feature engineering. Here, of course, it's important to try to use a system where you reuse individual building blocks in your feature engineering system, where everybody doesn't reimplement a customer lifetime value or 30 minute transaction count feature on their own, but they all have the same common understanding of what the definition should look like, and they share it and reuse it. And then similarly, if you have derivatives of features, they should depend on the literal feature implementation rather than have a redundant copy of the upstream computation.
And so you should manage those features in code. Feature engineering transformations should be managed in code, backed by Git, etcetera, and use common software engineering principles to keep it DRY, to not repeat yourself, and to reuse whatever you can.
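As a small illustration of that DRY principle applied to features, the sketch below has a derived feature call the shared upstream definition instead of copying its logic. Both functions and the bucket thresholds are invented for the example, not a real feature-store API.

```python
# A sketch of feature definitions managed as shared code: the derived
# feature calls the upstream definition instead of re-implementing it.

def lifetime_value(purchases: list[float]) -> float:
    """The single, shared definition of customer lifetime value."""
    return sum(purchases)

def lifetime_value_bucket(purchases: list[float]) -> str:
    """A derived feature that reuses lifetime_value rather than copying
    its computation, so a fix to the upstream definition propagates."""
    ltv = lifetime_value(purchases)
    if ltv < 100:
        return "low"
    if ltv < 1_000:
        return "medium"
    return "high"
```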
[00:24:53] Unknown:
And I guess, in terms of the experiences that you've had of working with customers who are starting to go down this path of building machine learning systems, wanting to be able to react promptly to input data, what are some of the pitfalls that they have encountered and some of the instructive mistakes that you have seen people make that you and others have been able to learn from?
[00:25:19] Unknown:
Yeah. The most common mistake is just forgetting the train-serve skew issue, or to start with a system where you say, ah, it's gonna be good enough. I don't need to think too much about my foundation. I'm just gonna hack something up and put it in production. And then you've got the stickiness effect of your application that's running in production. It's not implemented using good practices, and then it's gonna be pretty hard to replace later on with good practices where you can avoid train-serve skew. Another thing that's quite interesting is, when you look at real time machine learning, the real time data processing industry as a whole is actually still far behind the batch processing industry.
Right? Like, when you look at the Sparks of the world, the Snowflakes of the world, we're really good now at conveniently processing large amounts of data. Over the last 10, 20 years, we've come a long way from the MapReduces of the world to just being able to write a simple SQL query that can easily process terabytes, petabytes of data, and we kind of forget what that all looked like 10, 15 years ago, when you had to implement individual Java applications to do any MapReduce type stuff. Streaming today is still pretty hard, and it is a significantly harder challenge because you're dealing with fundamentally unbounded datasets. A batch dataset is bounded. It's limited. You know exactly how many rows there are in there. Of course, it can be growing, but you can arbitrarily slice it, while streams are unbounded. And so when you think about things like stream aggregations, stream joins, etcetera, those are hard problems.
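To make the bounded-versus-unbounded point concrete, here is a toy Python sliding-window counter: unlike slicing a finite table, it has to continuously evict expired events as new ones arrive. A real stream processor such as Flink or Spark Structured Streaming additionally handles out-of-order events, state checkpointing, and scale-out, none of which this sketch attempts.

```python
# A toy sliding-window aggregation over an unbounded event stream.
from collections import deque

class SlidingWindowCount:
    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self.events = deque()  # event timestamps, assumed ascending

    def add(self, timestamp: float) -> None:
        self.events.append(timestamp)

    def count(self, now: float) -> int:
        # Evict everything that has fallen out of the window; this
        # incremental state maintenance is what a batch "slice" avoids.
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
        return len(self.events)

counter = SlidingWindowCount(window_seconds=1800)  # 30-minute window
for t in (0, 600, 1200, 2000):
    counter.add(t)
print(counter.count(now=2100))  # 3: the event at t=0 has expired
```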
And there are a lot of different players out there who are trying to make stream processing easier, but those aren't really deeply proliferated yet. And that makes it harder to introduce streaming features specifically in machine learning applications. Of course, at Tecton, we have our own answers to make it simpler for customers to express common streaming machine learning features. But as a whole, that has just been a pretty interesting realization after having come out of Uber, where we had a pretty sophisticated streaming stack, how immature the industry generally still is.
[00:27:40] Unknown:
And in your experience of building Tecton and helping customers through this journey of being able to build and deploy these real time ML systems, what are some of the most interesting, innovative, or unexpected applications of real time ML that you have seen?
[00:28:06] Unknown:
is just a really good cause that can actually save lives, which I really like. And apart from that, I'm also always excited, and sometimes surprised, to see safety applications of machine learning. So it's not just about fraud detection, but literal end user safety, whether that's the safety of a rider on an Uber trip or the safety of someone who's playing a game, where you can literally detect whether somebody could potentially be in a harmful situation that you may wanna do something about. And those use cases aren't all too commonly talked about, but I love seeing those. And then maybe finally, 1 thing that I also always find interesting when I look at customers who use feature stores is that oftentimes, what comes before you have a machine learning model running in production is that you actually have a human implemented heuristic, a rule that's making a decision based on input data, which can actually get you quite far as well.
And these rule based systems also need to act on data, whether that's streaming data, batch data, or real time data. And we see customers actually use Tecton to power some of their rule based systems before they're ready to upgrade those to machine learning driven systems.
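A minimal sketch of that heuristic-before-ML pattern, with invented feature names and thresholds: the rule consumes the same fresh feature values a future model would, so upgrading later means swapping the decision function, not the data plumbing.

```python
# A hand-written rule acting on the same fresh feature values an
# eventual ML model would consume. All names here are hypothetical.
def rule_based_fraud_decision(features: dict[str, float]) -> bool:
    """Flag a transaction using human-authored thresholds."""
    return (features["txn_count_30m"] > 10
            or features["amount_vs_user_avg"] > 5.0)

# Later, the same features dict can feed model.predict(...) instead.
print(rule_based_fraud_decision(
    {"txn_count_30m": 12, "amount_vs_user_avg": 1.2}))  # True
```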
[00:29:25] Unknown:
In your experience of working in this space, I'm curious what are some of the notable evolutions in terms of capability and functionality, and some of the future potential that you envision once we do get to a point where these streaming data systems are more on par with batch capabilities?
[00:29:45] Unknown:
Yeah. The future that I have in mind is 1 where you can look at an organization's data systems, and you basically see this giant web with nodes and edges that connect the different nodes. Every node is a data consumer or a data producer, and the edges are the actual data processing. And it doesn't matter whether it's streaming or batch, but you've got the full lineage graph. You see exactly where the data is coming from and where it's going to, what's failing, what's succeeding. It doesn't matter whether it's analytics feeding into a dashboard or whether it's a machine learning feature that's feeding into your fraud application.
You see exactly how things are shared. That's the vision that I and others have in mind that we're all working towards, where you just have this fully integrated view of all your different data producers, consumers, and transformers. And we're, of course, far from that, where even today, within the same team, you oftentimes don't have this visibility, let alone across teams in the same organization. And maybe at some point down the road, we can even have this type of visibility across organizations and not just within the same organization.
[00:31:06] Unknown:
And for people who are considering whether and how to apply ML to their problems, what are the cases where a real time approach is the wrong choice?
[00:31:16] Unknown:
Real time ML is not the right choice if you don't care about the information that happened in the seconds or minutes or hours leading up to the prediction. Like, if it's 1 of those use cases where it's more about, hey, everything from the big bang until yesterday, and that'll always stay this way. You don't think that you'll get much more juice out of the prediction by bringing in more recent real time information. If that's the case, coupled with you making these predictions and acting on them in batch, like once a day or once a week, then that is a batch machine learning use case, and there's no real need to force real time machine learning complexity on top of that. If, on the other hand, you need information up until the moment where you're making a prediction, or if you are
[00:32:05] Unknown:
acting immediately on the prediction as you make it and it affects the end user, then real time machine learning is a good choice. Are there any other aspects of the real time ML ecosystem and applications, and the ways that you're working to help support those capabilities, that we didn't discuss yet that you would like to cover before we close out the show? I think we went pretty broad and comprehensive here. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team at Tecton are doing and the future direction of real time ML applications, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest barrier to adoption for machine learning today. The biggest barrier
[00:32:50] Unknown:
for adopting real time machine learning is typically that companies don't yet have the right tooling and the right platforms in place to actually make it easy to develop these real time machine learning models, and features that can be consumed by ML models that are running in production. That's, of course, why, self servingly, we're going after this market, and we're trying to make it significantly easier for enterprises to have a system that
[00:33:21] Unknown:
is easy for them to develop real time machine learning applications with. Alright. Well, thank you very much for taking the time today to join me and share your experiences of working in the space of real time machine learning, some of the ways that it can be applied, and how teams should be thinking about its potential for their problem domains. I appreciate all the time and energy that you and your team are putting into making this a more tractable and approachable problem. So thank you again for that, and I hope you enjoy the rest of your day. Thank you. Thanks for having me.
[00:33:54] Unknown:
Thank you for listening, and don't forget to check out our other shows: the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction and Guest Introduction
Kevin's Journey into Machine Learning
Understanding Real-Time Machine Learning
Applications and Industries for Real-Time ML
Technical and Organizational Requirements
Real-Time ML vs. Batch ML
Model Architecture and Constraints
Failure Modes and Guardrails
Data Quality and Monitoring
Common Pitfalls and Best Practices
Feature Engineering and Modularity
Customer Experiences and Lessons Learned
Future of Streaming Data Systems
When Real-Time ML is the Wrong Choice
Final Thoughts and Contact Information