Summary
In this episode of the AI Engineering Podcast Ron Green, co-founder and CTO of KungFu AI, talks about the evolving landscape of AI systems and the challenges of harnessing generative AI engines. Ron shares his insights on the limitations of large language models (LLMs) as standalone solutions and emphasizes the need for human oversight, multi-agent systems, and robust data management to support AI initiatives. He discusses the potential of domain-specific AI solutions, RAG approaches, and mixture of experts to enhance AI capabilities while addressing risks. The conversation also explores the evolving AI ecosystem, including tooling and frameworks, strategic planning, and the importance of interpretability and control in AI systems. Ron expresses optimism about the future of AI, predicting significant advancements in the next 20 years and the integration of AI capabilities into everyday software applications.
Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. Visit aiengineeringpodcast.com/cognee to learn more and elevate your AI apps and agents.
- Your host is Tobias Macey and today I'm interviewing Ron Green about the wheels that we need for harnessing the power of the generative AI engine
- Introduction
- How did you get involved in machine learning?
- Can you describe what you see as the main shortcomings of LLMs as a stand-alone solution (to anything)?
- The most established vehicle for harnessing LLM capabilities is the RAG pattern. What are the main limitations of that as a "product" solution?
- The idea of multi-agent or mixture-of-experts systems is a more sophisticated approach that is gaining some attention. What do you see as the pro/con conversation around that pattern?
- Beyond the system patterns that are being developed, there is also a rapidly shifting ecosystem of frameworks, tools, and point solutions that plug in to various points of the AI lifecycle. How does that volatility hinder the adoption of generative AI in different contexts?
- In addition to the tooling, the models themselves are rapidly changing. How much does that influence the ways that organizations are thinking about whether and when to test the waters of AI?
- Continuing on the metaphor of LLMs and engines and the need for vehicles, where are we on the timeline in relation to the model T Ford?
- What are the vehicle categories that we still need to design and develop? (e.g. sedans, mini-vans, freight trucks, etc.)
- The current transformer architecture is starting to reach scaling limits that lead to diminishing returns. Given your perspective as an industry veteran, what are your thoughts on the future trajectory of AI model architectures?
- What is the ongoing role of regression style ML in the landscape of generative AI?
- What are the most interesting, innovative, or unexpected ways that you have seen LLMs used to power a "vehicle"?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working in this phase of AI?
- When is generative AI/LLMs the wrong choice?
Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
- Kungfu.ai
- Llama open generative AI models
- ChatGPT
- Copilot
- Cursor
- RAG == Retrieval Augmented Generation
- Mixture of Experts
- Deep Learning
- Random Forest
- Supervised Learning
- Active Learning
- Yann LeCun
- RLHF == Reinforcement Learning from Human Feedback
- Model T Ford
- Mamba selective state space
- Liquid Network
- Chain of thought
- OpenAI o1
- Marvin Minsky
- Von Neumann Architecture
- Attention Is All You Need
- Multilayer Perceptron
- Dot Product
- Diffusion Model
- Gaussian Noise
- AlphaFold 3
- Anthropic
- Sparse Autoencoder
[00:00:05]
Tobias Macey:
Hello, and welcome to the AI Engineering podcast, your guide to the fast moving world of building scalable and maintainable AI systems. Seamless data integration into AI applications often falls short, leading many to adopt RAG methods which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead.
Visit aiengineeringpodcast.com/cognee, that's c-o-g-n-e-e, today to learn more and elevate your AI apps and agents. Your host is Tobias Macey. And today, I'm interviewing Ron Green about the wheels that we need for harnessing the power of the generative AI engine. So, Ron, can you start by introducing yourself?
[00:01:10] Ron Green:
Yeah. I'm Ron Green. I'm cofounder and chief technology officer of KungFu AI.
[00:01:15] Tobias Macey:
And do you remember how you first got started working in the ML and AI space?
[00:01:19] Ron Green:
I do. I remember vividly. I was actually a computer science major at the University of Texas at Austin. And I was in my last semester, and it was, you know, just such a grind going through all of those, you know, really deeply technical courses. And I remember my last semester, heading into it, I was pretty burned out. And I remember thinking to myself, I don't know what I'm gonna do professionally, but it's probably not gonna involve software. Because I was at that point just I was thinking I needed a big change. And I had, like, one elective left, and I took an introduction to artificial intelligence.
And instantly, I mean, within, like, 2 weeks, I knew exactly what I wanted to do with the rest of my life. And, what's funny about this is that this is in the nineties, and, I remember thinking, oh, man. I'm too late. They've they've got everything figured out. You know, there were textbooks and all these, you know, really deep theorems and perspectives, and I thought, you know, shoot. Am I smart enough to do this? That probably doesn't even matter. I'm too late to do this anyway. You know, little did I know it was gonna be a, you know, like, another 20 years before it really took off, but that's how I got involved.
[00:02:40] Tobias Macey:
Yeah. It's definitely funny that the cycles that the industry goes through of we hit a certain peak and we think, oh, we've done as much as we can here, and we're on a glide path and then everything stalls out. And we have to go through another cycle of discovery to realize, oh, this actually was just a local maxima. There's a whole other mountain range to climb.
[00:03:03] Ron Green:
That's exactly right. I mean, funny story is I did, I did a master's in artificial intelligence at the University of Sussex in England. And I remember probably in, like, 2005, I was talking to a colleague. And I'd gotten out of AI at this point because, you know, in 2005, there was almost nothing happening outside academia. And I was talking to a professor a colleague of mine and mentioned that I'd specialized in artificial intelligence. And he his reaction was like, oh, you know, oh, tough, tough choice, man. That was a a a big waste of time. And in, o five, it kinda felt like that might have been the case.
[00:03:41] Tobias Macey:
Now here we are where AI is on the tips of everyone's tongues. It has bridged the divide and has now entered into the consumer arena where everybody is talking about AI in different contexts. And I'm wondering given the current arena of AI in the space of generative AI and LLMs, what you see as the main shortcomings of those LLMs as a standalone solution to basically anything?
[00:04:08] Ron Green:
Yeah. The the biggest the biggest problem we're having right now with large language models as a production tool is control. If you are using a chatbot, if you're inter you know, if you're using ChatGPT or you're using, you know, Llama or something like that, and you're interacting with it, if you are prompting it, looking at the output, assessing it, it works great. And the same thing goes for code assistant tools. Let's say, for example, Copilot or Cursor. You can prompt it to, you know, refactor something or generate some some code from scratch. But in all of these instances, nobody at this point in time would just take the output and and use it sight unseen. Right? You wouldn't have it write an email, generate code, and just push it commit it and push it up. And so control is the issue. And it and it's not it's it's not just, like, hallucinations, which I think are probably the biggest risks.
But we've we've done many generative AI production engagements at Kung Fu AI. And the main challenge is you may want to steer the model away from certain domains. Right? As a company, you may, I won't name the company we're working with, but we're doing this generative AI project for, sort of a photo, site where you could put together, you know, scrapbooks and and photo books and things like that. And the generative solution was able to auto organize the photos, understand what was in the photos, put them together, in a sort of themed flows, and then caption those with, you know, really great, examples. I remember there was one. There were a bunch of dogs on the page, and it and the the the caption it put was, like, furry friends forever. And that's just terrific.
But there were photos that us the people had taken, like, on vacation in Europe, and they they were around, churches and mosques, and it started outputting content around religion. And, you know, understandably, the client was like, I don't wanna touch that. Let's completely steer away from religion. And getting models aligned where they will say what you want and steer away from things you don't want is the hardest problem right now with LLMs in production.
[00:06:31] Tobias Macey:
I think that in many ways, they are effectively very talkative 5 year olds where they'll say lots of things. You can get them to do interesting stuff, but they're also gonna say things that you never saw coming.
[00:06:44] Ron Green:
That's exactly right. And so what we we we always tell our clients is generative AI, incredibly powerful, but at this stage, it really should be viewed as a human augmenter. So you can take things and transform content or generate new content, whatever it may be. But there almost always needs to be either a human in the loop or another model in the loop performing some type of assessment on that, because the lack of control, the lack of explicit hard control, is the challenging part about putting generative AI into production.
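A minimal Python sketch of the "another model in the loop" pattern Ron describes: screen each generated caption against topics the client wants avoided before anything ships. The topic list, prompts, and function names here are illustrative assumptions, not details from the engagement he mentions.

```python
RESTRICTED_TOPICS = ["religion", "politics", "medical advice"]  # illustrative list

def screen_output(generated_text: str, moderator_llm) -> bool:
    # moderator_llm: any callable that maps a prompt string to a completion string.
    prompt = (
        "Does the following text touch on any of these topics: "
        f"{', '.join(RESTRICTED_TOPICS)}? Answer YES or NO.\n\n{generated_text}"
    )
    return moderator_llm(prompt).strip().upper().startswith("NO")

def caption_or_escalate(photo_description: str, captioner_llm, moderator_llm) -> str:
    caption = captioner_llm(f"Write a short, upbeat caption for: {photo_description}")
    # Anything the moderator flags goes to a human instead of shipping automatically.
    return caption if screen_output(caption, moderator_llm) else "[needs human review]"
```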
[00:07:22] Tobias Macey:
And in preparing for this conversation, I came across a blog post that you have on your site that framed this in the metaphor of LLMs are very powerful engine that are in search of a vehicle to add that sense of control and steering. And so given that framing, the most established vehicle that we have for putting that LLM engine into a guiding context is the rag stack. And I'm curious what you see as some of the main limitations or shortcomings of that as a product and production oriented solution.
[00:07:56] Ron Green:
Yeah. The RAG approach, you know, retrieval augmented generation, has been fantastically successful. It's it's probably the number one generative approach that companies are taking out there in the world as sort of their first step into AI. And it it works, I would say, holistically pretty well most of the time. The the biggest challenges are, one, it's really only as good as the data you have. And so we will occasionally work with clients who, you know, may may have a little bit of misconceptions based upon their use of, like, ChatGPT and things like that. They may not understand that, you know, if the data is outdated or incomplete or poorly organized within their own infrastructure, you know, that is not something that a RAG pipeline can fix. Another limitation is large documents, I mean, really, really large documents can still present a problem. Because the context windows for LLMs are growing, but they struggle beyond certain sizes. And so if you're dealing with documents that have, you know, hundreds of thousands of words within them, that that can present a problem. And, of course, hallucinations, you know, and control, like we talked about, even with RAG, you do you do not have certitude that everything the model produces, even if it produces it with citations, will be accurate. We're finding this is actually really interesting. We're finding there's more and more evidence that using richer content markup is more effective. So for example, if you have HTML documents, use them as is. Don't don't pre process them into text. That additional formatting structure, there's increasing evidence, actually improves the outputs of these RAG pipelines.
So it's early days, but, and there are challenges there, but I would definitely recommend for most companies some type of RAG solution is a great, first entry into AI.
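For listeners who want the retrieval augmented generation pattern made concrete, here is a minimal Python sketch of the core loop: embed documents, retrieve the most similar chunks for a question, and ground the model's answer in them. The embedding function, vector store, and llm callable are placeholders for illustration, not tools named in the episode.

```python
import numpy as np

def embed(texts):
    # Stand-in embedding: hash-seeded random vectors. A real pipeline would
    # call an embedding model here instead.
    vectors = []
    for text in texts:
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        vectors.append(rng.normal(size=384))
    return np.array(vectors)

class SimpleVectorStore:
    """Toy in-memory store: cosine similarity over document embeddings."""
    def __init__(self, documents):
        self.documents = documents
        self.vectors = embed(documents)

    def top_k(self, query, k=3):
        q = embed([query])[0]
        sims = self.vectors @ q / (
            np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(q) + 1e-9
        )
        return [self.documents[i] for i in np.argsort(-sims)[:k]]

def answer_with_rag(question, store, llm):
    # Retrieve relevant chunks, then ask the model to answer only from them.
    context = "\n\n".join(store.top_k(question))
    prompt = (
        "Answer using only the context below. If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)  # llm: any callable that maps a prompt string to a completion
```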
[00:10:09] Tobias Macey:
And another more sophisticated approach to that guidance system for the LLMs is the idea of multiagent or mixture of experts where you have multiple LLMs working in concert to try and keep each other in check, which conceptually sounds reasonable. And it sounds like it would be effective, but still is subject to the challenge of hallucinations where if one of those models does go off the rails, then maybe it acts as a compounding factor to bring the whole system further afield than it would have gone on its own. And I'm wondering how you see the pro versus con conversation happening around that pattern and also the way that it exists in conjunction with that rag pattern.
[00:10:54] Ron Green:
Yeah. That's a great question. I'm I'm really excited about multi agent and mixture of expert approaches. This also obviously is sort of very close to the the momentum that is growing around agentic AI. So, you know, the pros are if you can if you can deal with sort of a mixture of expert or a multi agent scenario, you do get improved performance in those individual agents because you're essentially not asking for for one model to be good at everything. You can specialize and have experts or agents that are refined on just performing one set of tasks really, really well. It also means that you can scale a little bit more easily because it reduces the computational overhead and latency associated with that. It can be cost effective because each of those smaller models will cost less to train, will cost less on inference. And this matters probably less, but you do get improved interpretability if you need to because each of those smaller models could be designed in isolation to maximize or even to be explicitly interpretable.
And that can that can vary. If you're dealing with, like, product recommendations, it's probably not really critical. If you're dealing with loan decisioning, you might have regulatory requirements around, you know, explainability, interpretability. The the cons of this approach are, you know, the complexity. It's it's hard to orchestrate these complex systems. Latency sometimes can become an issue too because you you have all this task routing and this inter agent communication. Because the agents themselves are typically pretty lightweight, that's not going to be a deal breaker.
And then the last one is it's a little bit more of a wildcard, but, you know, you do have emergent behavior risk. Orchestration is complicated, and you are also dealing with agents acting, you know, potentially in unpredictable ways. And this, you know, kinda comes full circle to our original topic, which is we're at this really interesting stage of AI where the systems are incredibly powerful, but the fact that they're kinda black box and the fact that they do have these very impressive emergent behaviors makes control a little bit more difficult. And so I'm excited about this move, but it's definitely early days still.
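A minimal Python sketch of the routing idea behind a multi agent or mixture of experts setup as described above: a lightweight router hands each request to a narrow specialist, and a separate reviewer acts as the second model in the loop. The agent names, keyword router, and prompts are illustrative assumptions rather than any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Agent:
    name: str
    system_prompt: str
    llm: Callable[[str], str]  # any callable that maps a prompt string to a completion

    def run(self, task: str) -> str:
        return self.llm(f"{self.system_prompt}\n\nTask: {task}")

def route(task: str, agents: Dict[str, Agent], default: str = "general") -> Agent:
    # Toy keyword router; a real system might use a classifier or another LLM call.
    for keyword, name in [("caption", "captioning"), ("refactor", "coding")]:
        if keyword in task.lower() and name in agents:
            return agents[name]
    return agents[default]

def handle(task: str, agents: Dict[str, Agent], reviewer: Agent) -> str:
    draft = route(task, agents).run(task)
    # Second model in the loop: check the draft before anything is shipped.
    verdict = reviewer.run(
        "Review this draft for policy violations or restricted topics. "
        f"Reply APPROVE or REJECT with a reason.\n\nDraft:\n{draft}"
    )
    return draft if verdict.strip().upper().startswith("APPROVE") else "[escalated to a human]"
```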
[00:13:39] Tobias Macey:
Another aspect of the current ecosystem that we're in is that there's all this excitement around generative AI of, oh, it's so powerful. It will solve all of my problems that I think it also causes us to overlook a lot of the more well established ML patterns, whether that is something like a linear regression or a random forest or even deep learning in favor of these transformer models. And I also am curious how you're seeing some of the challenges around the technical and organizational sophistication required for generative AI, maybe leapfrogging the organization's actual capabilities where they never actually established that capacity for some of the earlier generations of ML.
[00:14:24] Ron Green:
I this is actually a a topic I'm pretty passionate about because I'm a big believer in the power of generative AI. I absolutely think it's a transformative capability. But I personally think at this stage in our maturation, most companies should be looking at what what I call domain specific AIs. And I I you know, it's really kind of immaterial whether you're you're looking at, like, as you said, deep learning or random forest or or, hell, you know, even even linear regression or something like that. The bigger point is that generative AI, as powerful as it is, is, as we've talked about, more difficult to control. And so the investment can be quite high aside from, you know, sort of like rag type systems.
What we typically advise our companies to do is really look at domain specific AIs. So for example, very, very often the best first step that companies can take into adopting artificial intelligence is build a capability that is very narrowly focused with a really high ROI. Like, we'll advise our companies, you know, at Kung Fu AI, we do just a ton of, you know, custom engineering and strategy development. We won't recommend any projects that don't have at least a 10x ROI. So, for example, we built a loan decision system for one of our clients. It went live earlier this summer. That thing is now trading 60% of their 2.6 billion in transactions per month, and that's all it can do. It's very narrow. It knows how to do one thing. It's, you know, it's not generative.
It's not broadly capable. It has no emergent capabilities, but this is going to transform their business. Their stock is up, like, 36% since that system was released. This is a publicly traded company. And so I would encourage everybody that's listening to this to absolutely explore generative AI approaches, but don't miss out on the opportunity for more narrow domain specific AI that will, frankly, cost less to implement and operate and may deliver many, many times more ROI than some type of LLM approach.
[00:16:51] Tobias Macey:
The other aspect of the generative AI ecosystem beyond the models and their capabilities and the patterns around them is also the ecosystem of tooling and frameworks and point solutions to the various problems in productionizing these LLMs. And I'm curious how you're seeing that volatility in the market, the current lack of maturity for some of those solutions, and the rapid pace of change influencing the ways that organizations are thinking about adoption of generative AI or their willingness to actually invest in a more generalized framework for LLM usage versus just let me just pay company x to do it all for me.
[00:17:39] Ron Green:
Right. Right. It's a it's a great question also. We deal with companies every day that are, you know, sort of early in their AI adoption curve. And we see a lot of the same things that you might expect, sort of decision paralysis. Like, where do you even start? Like, how do you assess how do you assess AI products? How do you assess the feasibility of different potential initiatives that the company might take on? How do you even how do you even figure out what AI initiatives might be feasible? And so one of the things we really recommend is a strategy to holistically look at your business and make assessments that are geared to the domain and the context that your company's within, and come up with a roadmap. And so, you know, at Kung Fu AI, we've been around 7 years now. I'd say in the first 5 years, we were mostly working with really early adopters, you know, people that were on the cutting edge, who had specific problems that they wanted to try to solve.
More and more now, we're talking with companies, and they say things like, we need some AI. Don't care what it is, but we've gotta have some AI to, you know, make Wall Street happy. And and that's a dangerous perspective. So start with strategy, look holistically, and be aware that, you know, AI products can be difficult to integrate. And the the reason is that almost all of the dominant powerful techniques within AI right now are deep learning based supervised learning algorithms. And so that requires, you know, strong data. And, you know, one of the one of the, challenges with with current AI is that, you know, garbage in, garbage out as far as data. And so it can take quite a bit to productionalize systems, if your data if your data story, if your data context is not very clean. And so custom solutions are very often the way to go initially on versus some productized solutions where you might be sold something that actually cannot quite live up to the hype, if that makes sense.
[00:20:03] Tobias Macey:
Given that context of supervised learning and your point about the challenges of data for these systems, what do you see as the viability of using the LLMs more for that data labeling, synthetic data generation method, feeding into maybe a deep learning system that is the actual production unit that you deploy, where you're using the engine to power your tooling system and run your assembly line so that you can build a bicycle?
[00:20:33] Ron Green:
Yeah. I love that. I love that idea. That is a powerful approach. It can actually work quite well for I'll give you some examples that I that I think everybody would you maybe enjoy hearing about. So for example, we've seen situations where companies are dealing with datasets that are really heterogeneous, and they need they they literally had, you know, hundreds and hundreds of different predicate rules that they had to manage and and keep up to date. We were able to build an LLM and then fine tune that, and it can make contextual decisions on extracting information and formatting the data, not only in all the situations that they were able to explicitly, you know, state with predicates before, but for analogous situations or situations they'd never even seen before. And that's, you know, again, the power of these AI techniques is that they will generalize.
And so that that, you know, really makes a a big difference. The other really interesting thing about these language models is that as you train them, like you said, you can use them to kind of bootstrap yourself into a more powerful net net solution. But you can also do that with a technique called active learning, where you you take a model, it may know nothing. You point it at a bunch of data, and you have a user evaluate the model's predictions on that data. And so imagine you're, you're trying to detect fraud as an example. And so the model will start off, you know, with just nothing more than a random guess. And as the user corrects the model's predictions, the model will then retrain on the feedback that the humans have given it and then go run its predictions on the dataset. And this is where the clever part happens.
It will go look at the entire dataset and find all of the inputs where it has the most uncertainty, where it's like 50/50. It'll flip a coin like, I'm 50% sure that's fraud, and I'm 50% sure it's not. And it has the highest entropy. And it will ask the humans to label those. And then by doing that with this active learning sort of feedback loop, you're essentially maximizing the amount of information that the model's learning on. And you can speed through datasets like that, and you're essentially bootstrapping the model. And it can auto label more and more of that dataset as it learns to generalize with the human.
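A minimal Python sketch of the uncertainty-sampling loop Ron just walked through, using a generic scikit-learn-style classifier on a fraud-like binary task. The dataset, the oracle function standing in for the human labeler, and the batch sizes are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def entropy(probs):
    # Shannon entropy per example; highest where the model is closest to 50/50.
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def active_learning_loop(X, oracle_label, n_rounds=5, batch_size=20, seed=0):
    # oracle_label stands in for the human reviewer correcting predictions.
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X), size=batch_size, replace=False))
    labels = {i: oracle_label(X[i]) for i in labeled}  # seed batch should contain both classes
    model = LogisticRegression(max_iter=1000)

    for _ in range(n_rounds):
        model.fit(X[labeled], [labels[i] for i in labeled])
        probs = model.predict_proba(X)
        # Ask the human about the examples the model is least certain of.
        ranked = np.argsort(-entropy(probs))
        new = [int(i) for i in ranked if int(i) not in labels][:batch_size]
        for i in new:
            labels[i] = oracle_label(X[i])
        labeled.extend(new)
    return model, labeled
```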
[00:23:07] Tobias Macey:
That aspect of bringing the models into the process of building the models is interesting. I'm also seeing some of that being applied in the data engineering context of using the models to understand how to build the pipelines that feed into the data that powers the model. So it it's turning into the the Ouroboros system where I I also see some of the challenge there too of any error in those models act as a compounding factor where you need to be able to identify early on in the process where it's starting to go wrong because, otherwise, it's going to amplify that problem. And I know that I'm seeing that in the training of these transformer models and the foundation models too where as we're consuming more of the web to power the data that goes into the models, a lot of the data on the web is now being generated by those models. And so the models are sort of working together to make themselves dumber.
[00:24:03] Ron Green:
Yeah. Yeah. The, you know, this this whole idea of, of, you know, the entire world being drowned in synthetic data and the models, you know, kinda losing their way. I I'm I'm largely optimistic there on the on the grand scale because I I think that I think what we're gonna see and you you it's exactly what you articulated is we are finally at the stage with AI where models can now be used to train the next generation. And we you know, we've seen things analogous to this in technology before. Like, if you look at CPUs, you know, you know, the CPUs from the previous generations could be used to design more powerful chips.
And there was this sort of positive feedback loop. Every generation of chip was more powerful and allowed us to design more powerful chips. The difference with AI is that it can do this in a much, much tighter loop, and it can do it to itself. So these AI systems can actually be used to train the next AI system without a human in the loop. And, for example, you know, the Meta open source models, the Llama open source models from Meta: Llama 1 was used to train Llama 2, Llama 2 was used to train Llama 3, Llama 3.1 was used to train Llama 3.2. And right now, Yann LeCun, the chief AI scientist at Meta, actually just last week said they're training Llama 4 right now.
And, I remember this, he said 100,000 H100 GPUs. That's somewhere on the order of about $2,000,000,000 worth of GPUs. But much of that knowledge and guidance, especially on the reinforcement learning with human feedback phase, the sort of alignment phase after the pretraining phase, a lot of that is gonna be done by the previous models, by the Llama 3 family of models. It's amazing.
[00:26:10] Tobias Macey:
And to that point, beyond just the tooling and the frameworks, the models themselves are in a rapid state of flux with either larger models or more specific models being introduced constantly. How does that change the way that businesses are thinking about whether and when to invest in that AI capability, because of the fact that, oh, well, whatever model I select now is going to be outdated by next week?
[00:26:38] Ron Green:
It's true. And we see it even with the techniques, meaning problems that we solved 3 years ago that might have taken us 4 months, we could approach with, you know, an entirely new class of algorithm or modeling techniques. And not only achieve much, much better accuracy at the top line, but we probably could have done it much more quickly and more easily. You know, I think this is a classic sort of technological progression question, which is like, when is it too late to jump in? When is it too early? The way I think about it is there's gonna be a certain amount of investment that you have to write off in the long term just simply because things are moving too fast as a business. And I think businesses have to think about it that way because the benefits you're gonna get in the short term are gonna be more than sufficient to accommodate that write off. And the other fact is, you know, if you ignore AI, your competitors aren't. And so that is gonna put you sort of at a massive competitive disadvantage. And, again, this is the reason I would encourage people, don't jump in and just do something in AI because you feel like you have to or you feel like there's so much pressure. Be really thoughtful about it and make sure that there is a really, really strong ROI associated with any initiative. Because most companies haven't done anything, there's an enormous amount of low hanging fruit for almost any company to embrace AI in a way where it will really be immaterial if you have to go and replace some modeling system 3 years from now; you won't care because the return on that investment would have been so high. And I would just encourage companies to go in open eyed like this and move forward with the understanding that it's a rapidly advancing field.
[00:28:30] Tobias Macey:
And to that point of where we are in the timeline of AI and bringing us back around to that metaphor of needing vehicles, where do you see us on the timeline of the automobile? Are we at the point of the Model T yet? Or are we before that? Are we past that? I feel like to some degree, we're maybe at the point where we're at the Model A where everybody's building their own special hot rods.
[00:28:53] Ron Green:
I I think I think you're right about that. I don't think we're at the Model T yet. And the reason is that, you know, like we like we said at the beginning, I've been doing this a long time. And I get asked every now and then. There have been 2 AI winters. You know, why am I so confident that there won't be a 3rd winter? And it's and it's really simple. It's because it's a few things. One is we were always we were always overpromising on what AI could do before. We we would we would get good results, and we would extrapolate out, but the the curves didn't hold. And so then we would end up having overpromised and underdelivered. And you do that too many times with investors and adoption just stops.
We finally now have AI systems that can operate at the human level or superhuman level across almost all the tasks that you might care to think about, whether it's vision or speech or generative capabilities across almost any domain. Right? So we're not going back. That said, we're basically day 0 because there are really simple things we haven't even tried yet. Like, if you take the transformer architecture, it's got this quadratic computational complexity, which is really powerful, but it is not gonna scale. We're not gonna get to, context sizes in the trillions with that that type of architecture.
And there are simpler approaches coming out almost daily that are showing really, really great capabilities, like, I think, Mamba with its sort of selective state space model approach. And so the lack of control that we've mentioned as well, I think, is the reason. We'll be at the Model T stage once we've sussed out these control and interpretability issues, and then things are really gonna take off. And I genuinely think that most people have no idea how much AI is gonna mature in the next 20 years. It's going to be mind blowing. And to take one example, software development: it will be baked into every piece of software. Right? Because why would you not wanna have the ability for the tools you're working with to understand speech and have sophisticated vision capabilities and all that stuff? Right now it's the exception; it will become the rule. Just like every piece of software now has networking, Internet capabilities baked in and it would be silly to think that they would operate in isolation, we're gonna see the same thing with AI adoption. So I believe we're really at the early stages.
[00:31:26] Tobias Macey:
To that point of transformers being the dominant architecture for this current generation of Gen AI models, I know that we have been seeing a lot of reports recently of starting to hit the scaling limits of that transformer architecture where feeding more data, feeding more tokens is having diminishing returns in terms of the successive capabilities of those models. And given your perspective as somebody who's been in this industry for a while and seeing the successive generations of machine learning techniques and architectures, what are your thoughts on some of the future trajectory of AI model architectures? Are we going to continue trying to push those limits of the transformer architecture by throwing better hardware at it, or are we at an inflection point where we need to be looking at other approaches? I'm thinking in particular in terms of the liquid network techniques that came out of MIT recently.
[00:32:23] Ron Green:
You know, I'm I'm not convinced we're at the end of the scaling. I think I think it's I think we're seeing some slowdown, but it's not clear to me exactly how much slowdown and where we're on that curve. I I I I could be wrong. My guess is we're probably gonna see one more order of magnitude increase before we really have the slope shift downward. The the the things that I'm really excited about right now, though, and the reason I think that we're we're gonna continue to see really big performance improvements are we are just just now starting to look at sort of inference time, investments. So to date, it's all been about how big can we make these models, how much data can we pump into these models, and the scaling laws have held for about 10 orders of magnitude. You can go back over 20 years, and the scaling laws hold hold pretty well, hold pretty predictably.
Just in the last, you know, 18, 24 months have we started looking at the inference time and started focusing on and exploring the idea of, like, well, what if the model's inference compute wasn't fixed? What if the model was able to use techniques like chain of thought where, you know, you can think of it almost like the model's talking to itself, producing output, assessing whether it's on the right track, altering approaches, and iterating in that inference time compute cycle in ways that will allow it to improve itself and not just have some sort of, you know, fixed finite deterministic output. And the early results we're seeing from probably the leader on this, OpenAI, with their o1 models: the preview models are already showing much improved reasoning capabilities, and OpenAI claims the full o1 model will be staggeringly capable on that side. And, again, it's early days there. We've barely begun exploring this part of the spectrum.
So I think we're gonna see, if anything, modest slowdowns on the scaling, at least probably for the next 2 or 3 years before before we need to go back to the well.
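A minimal Python sketch of the kind of inference time loop described above: draft an answer, have the model critique it, and revise, spending more compute per query instead of a single fixed pass. The prompts and stopping rule are illustrative assumptions, not a description of how any particular lab implements this.

```python
def iterative_answer(question: str, llm, max_rounds: int = 3) -> str:
    # llm: any callable that maps a prompt string to a completion string.
    answer = llm(f"Think step by step, then answer:\n{question}")
    for _ in range(max_rounds):
        critique = llm(
            "Check the reasoning below for mistakes. Reply 'OK' if it is sound, "
            f"otherwise explain the flaw.\n\nQuestion: {question}\nAnswer: {answer}"
        )
        if critique.strip().upper().startswith("OK"):
            break  # the model judges its own answer to be on the right track
        answer = llm(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite a corrected answer."
        )
    return answer
```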
[00:34:44] Tobias Macey:
Another interesting aspect of all of the conversations that happen around AI is the language that we use to talk about how it operates, where you use the concept of reasoning in that example of chain of thought where there's also a lot of debate around the level of actual understanding or sentience or etcetera, whatever terminology you want to use to anthropomorphize these models. What are some of the challenges that that imposes in terms of how we actually think about applying these models where because we want to anthropomorphize things, we say, oh, well, the model understands the input that I'm giving it, so it gives me this output where, really, it's just sophisticated statistics, and the model has no concrete understanding of it in the way that we think about our understanding of the world around us. And so there have been investments in terms of things like cognitive AI where we start with maybe a more simplistic model, but we use means of trying to generate these contextual maps of the environment that it's executing in, the idea of GraphRAG where you have an underlying knowledge graph for being able to give some sort of semantic framing of the context that is being fed, the idea of memory being bolted onto the models in terms of the runtime to be able to contextualize things a bit better. And I'm wondering how you see some of those aspects of cognitive science and conceptual understanding being folded back into the ways that the models are built versus being just a bolt on to the runtime environment.
[00:36:27] Ron Green:
I love that question. I personally think that these large language models are hands down the most important scientific discovery of the 21st century. And what I mean by that is the emergent behavior that we get out of these large language models, which, again, you know, all they were trained to do is given some input, predict the next, you know, token, predict the next word. I don't think there's anybody on the planet who anticipated the type of capabilities we would see that that are that emerge at scale in these large language models. In fact, I have colleagues I've worked with, like, when the GPT 2 paper came out in 2020, didn't believe it. Thought, you know, some of the few shot examples within the within the paper were were impossible. It just couldn't be true. And so I say I say that I think that this is the most important scientific discovery of the 21st century because the emergent capabilities weren't predicted. And I think it tells us a lot about intelligence.
You know, if I say that the model is, you know, quote, unquote reasoning during inference time, I don't really mean that it's reasoning exactly in the same way we do. But that presupposes we even know how we reason, and we don't. And, you know, if you go back to the history of AI, it's really kinda funny. You know, in the fifties and sixties, they thought, oh, if we could build a computer, and that computer could play chess at, you know, the grand master level, it would certainly have, you know, AGI capabilities. And it turned out not to be true. We we solved that problem in the nineties, and things that we didn't think were complicated, things that we took for granted like our vision systems or our speech systems and our auditory systems, we just thought were relatively simple problems to solve. In fact, Minsky famously in the sixties gave, like, an undergrad at MIT a summer project to build a computer vision system because they didn't think it was that complicated.
And the reason is that we can't introspect our cognitive processes. And so, you know, our visual cortex is unbelievably complicated. So the point I'm trying to make is this. We don't really know how we see in a deep way. We can't introspect our consciousness or our thought process. So I don't know exactly how my own brain works. So it's kinda hard to speak deeply about the differences in what might be consciousness or what might be intelligence, what might be reasoning within AI when we can't even speak deeply about it with humans. All I know is that it is absolutely stunning that large language models have these emergent capabilities at scale, and I think we should keep exploring that and see how far we can push this.
[00:39:26] Tobias Macey:
And another pressure that AI is having on the world that we live in is in terms of the computing systems that we build where for a long time, we've had the Von Neumann architecture that has served us well. And now with the growth of AI both on the training side, but in particular on inference, which is from a distribution perspective, more ubiquitous, everybody needs to be able to do inference and particularly as we start to push things into the edge and on mobile devices. And I'm wondering how you see the engine of AI forcing us to rethink how we construct the drive train to be able to actually harness that power and some of the effect that it's having on the systems architecture at the compute level and how we think about actually building our computing systems?
[00:40:19] Ron Green:
That is a really difficult question to answer. There are all kinds of examples within AI right now where the techniques bend to accommodate the hardware. And then there are instances where the hardware will be modified to specialize in optimizing for some algorithmic advancement, the transformer being, you know, the best example of that. Right now, you know, it is absolutely fair to say that deep learning is the dominant approach, and within deep learning, transformers are the dominant approach. And if you look at a transformer, you know, one of the funny jokes is that the famous paper that the transformer architecture came out of was called Attention Is All You Need. But if you actually look at the amount of parameters within any transformer model, most of the parameters are still in the multilayer perceptrons that are at the end of each of the attention blocks. And so that's just linear algebra. That's just matrix multiplication. And so, I think for at least the foreseeable future, the bottleneck within AI is going to be that ability to do dot products at scale. And I think we're gonna see companies like NVIDIA just pouring more and more money and, you know, resources and time into seeing how much they can scale up and move to concurrent, parallel computation of these enormous, you know, matrix operations.
Beyond that, candidly, I just don't I just don't have a lot of visibility.
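A quick back-of-the-envelope check on that point about where the parameters live: for a textbook transformer block with model width d and an MLP expansion factor of 4, attention contributes roughly 4 d^2 weights while the feed-forward MLP contributes roughly 8 d^2, so the plain matrix multiply side dominates. Exact layouts vary by model; this Python sketch is illustrative, not a count for any specific architecture.

```python
def transformer_block_params(d_model: int, mlp_ratio: int = 4) -> dict:
    # Attention: Q, K, V, and output projections, each d_model x d_model (biases ignored).
    attention = 4 * d_model * d_model
    # MLP: up-projection to mlp_ratio * d_model, then back down to d_model.
    mlp = 2 * mlp_ratio * d_model * d_model
    total = attention + mlp
    return {"attention": attention, "mlp": mlp, "mlp_share": mlp / total}

# For example, transformer_block_params(4096) puts about two thirds of the
# block's weights in the MLP, matching the observation above.
```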
[00:42:00] Tobias Macey:
In your work of investing in this ecosystem of generative AI and helping organizations figure out how best to harness that motivating force of the LLM as engine, what are some of the most interesting or challenging or innovative ways that you have seen people try to conceive of the ways that those LLMs are able to have a transformative force on their organization or on the ecosystem in which they're operating?
[00:42:33] Ron Green:
Okay. I think probably the the thing that I'm most excited about within sort of that domain, the way you described it there, are not just pure LLMs, but these sort of multimodal language models. So these large language vision models. And we're seeing more and more examples of sort of multimodal models that are conditioned in a way that allow them to provide outputs and capabilities that, you know, frankly, it just seems like magic to me. So I'll give you maybe a couple examples. We're seeing companies take multimodal language models and condition them on 3D, sort of CAD-space-like problems. And then you can literally write in English, in text, what you want the CAD to generate and manipulate it with really pretty high success, you know, these AI generated meshes.
We're we're seeing this also at the intersection of health care on health care data for assessing that. There was there was actually, an article just, like 2 days ago in the New York Times talking about how LLMs were dramatically beating doctors in this relatively small case study of doing patient assessment. And even when, even when the doctors were paired with the language models and they were able to collaborate with them, language models actually outperform the doctors. And in their sort of, like, post evaluation of why, it was because the doctors came in with some preconceptions.
And when the language models pointed out flaws in that, they basically ignored it. And another example and again, this is why I say we're at, like, day 0. We are very early into this. Is, you know, there are now these multimodal models that that are capable of on the fly game generation. So there was an example of, like, a sort of a Minecraft generation game that you can type in and build the world, but its world model is really weak. So, like, if you're looking at a view and you turn and you do a 360, when you come back, it's changed. Like, in the moment, right, its world view is just very ephemeral. But it was conditioned on those Minecraft, contexts and and can generate, you know, at a at a high frame rate, you know, this imagined world already.
So I think that those are probably maybe the most radical examples, and you'll notice all those are kinda mostly toys still, and that's because it's just really, really early days.
[00:45:06] Tobias Macey:
And in your own work of navigating this space and trying to grasp the current phase of AI that we're in, what are some of the most interesting or unexpected or challenging lessons that you've learned personally?
[00:45:20] Ron Green:
I think that I am continually surprised at the power of the diffusion approach. I think that may be the thing that I'm most excited about right now overall. You know, the diffusion approach, just for our listeners, is this idea of taking some input and adding some perturbation to it, typically noise. And so if you take maybe the canonical example of images, you take an image, you gradually add, let's say, Gaussian noise, and you train a model to be able to remove that noise at different stages of that process.
And at the end of the process with images, you know, you've just got an image that's just full noise. There's nothing there that's even remotely recognizable. But you've conditioned that model throughout this whole process on a text input that was embedded in such a way that the model can learn what the image contains semantically. And at the end of this, you can literally take a text string of something you wanna create that maybe has never existed in the universe and give that model an image with just pure noise and that string describing what you want, and you lie to the model. And you say, this actually is that image, it's just got a bunch of noise in it, and it will denoise it. That approach, we're seeing that work in robotics. We're seeing that work in protein folding. For example, AlphaFold 3, which is the just breathtakingly powerful computational biology model released by Google DeepMind this year. In fact, DeepMind CEO, Demis Hassabis, and John Jumper both won Nobel Prizes in chemistry for this work. It uses a diffusion model. What they do is they basically put in coordinates of the atoms, the different atoms within protein molecules, and they perturb it. And what this allows them to do is use what's called a Pairformer, a variation on a transformer, to generate the potential proteins that amino acid sequences will generate and then use the diffusion models to refine them, and they're getting fantastic accuracy on this. And so we're gonna be able to do, you know, genetic therapies, drug therapies, infectious disease therapies that are all going to be AI generated approaches, each one of which might have been a PhD dissertation. Right? You would have spent maybe 5 years trying to figure out how that protein folded. Now you can enter the amino acid sequence and go get a cup of coffee and come back and have the answer. So I think the diffusion approach right now is the most important thing happening within sort of architectural advancements within AI.
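A minimal Python sketch of the forward noising and training objective behind the diffusion approach just described: gradually add Gaussian noise to an input and train a model to predict that noise, conditioned on a description. The noise schedule and the placeholder denoising network are illustrative assumptions; real systems, including AlphaFold 3's diffusion module, are far more elaborate.

```python
import numpy as np

def make_noise_schedule(n_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Cumulative product of (1 - beta_t), the standard DDPM-style schedule.
    betas = np.linspace(beta_start, beta_end, n_steps)
    return np.cumprod(1.0 - betas)

def noisy_sample(x0, t, alphas_cumprod, rng):
    # Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise
    noise = rng.normal(size=x0.shape)
    a_bar = alphas_cumprod[t]
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise, noise

def diffusion_training_step(model, x0, condition, alphas_cumprod, rng):
    # Pick a random timestep, noise the input, and ask the model to predict the
    # added noise given the noisy input, the timestep, and the text condition.
    t = int(rng.integers(len(alphas_cumprod)))
    x_t, noise = noisy_sample(x0, t, alphas_cumprod, rng)
    predicted = model(x_t, t, condition)      # placeholder denoising network
    return np.mean((predicted - noise) ** 2)  # simple noise-prediction MSE loss
```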
[00:48:17] Tobias Macey:
Given all of the excitement and fervor over generative AI as a solution to whatever problem domain you want to introduce it to, what are the cases where you would advise against the application of generative AI or LLMs?
[00:48:34] Ron Green:
Anytime you need absolute certitude, I would I would say you need to be very careful. Now if you're willing to have a human in the loop, which I would argue, you absolutely should with almost almost any generative approach right now then you're fine. But, you know, you you you definitely would not wanna live in a world where, you know, the doctor comes to you and says, well, we need to perform surgery. And you say, why? And the doctor says, I don't know. But, you know, the AI model told me that's what we need to do. So generative language models, etcetera, incredibly powerful.
At this stage, treat them as human augmentations, and you can go to town. You you can you can build really, really powerful systems. Just avoid them as sources of truth at this point because we're still struggling with control.
[00:49:26] Tobias Macey:
Are there any other aspects of LLMs and the vehicles that we need to build for them or the aspects of control and challenges around that, or just your experience working in this space, that we didn't discuss that you would like to cover before we close out the show?
[00:49:41] Ron Green:
Probably the probably the only area that we didn't discuss that I'm pretty excited about is interpretability. And in particular, I think the work from Anthropic over the last year has been fascinating. They're using sparse autoencoders to really dig in and try to understand how these large language models are representing different concepts inside the parameter space. And they have the famous example where they were able to isolate, like, a concept like the Golden Gate Bridge in San Francisco. And they found some really fascinating things. One, that that concept was spread out across many neurons within the model.
Two, that it didn't matter what language you were operating in, whether it was English or Korean or Russian, the same representation was used across those languages, including images. So for a multimodal language model, they found that the Golden Gate Bridge image capabilities also use the same neurons. And then lastly, they did this I just think this is so fascinating. What they did is they asked the model, you know, to describe what it looked like physically. And the model said, well, you know, I have no physical form. I'm an AI program, etcetera. And then they manipulated the model, and they took the neurons that they'd learned encoded the concept of the Golden Gate Bridge, and they forced those to output at 10 times their normal level.
And asked the question again. And the model came back and said, oh, I'm the Golden Gate Bridge, and I have, you know, this shape and this form and this color. And so you could manipulate the model to say things you want. And so this is, I think, a very, very major step forward in interpretability and explainability, and I think that this will bear fruit over the next 5 years in a big way. And it will allow us to not only get around some of the control issues we're seeing right now, but it will also make these models much more likely to be used in domains where explainability and interpretability, like medical cases, are just, you know, nonnegotiable.
It absolutely has to be there. So I'm I'm super excited about that stuff.
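A minimal Python sketch of the sparse autoencoder idea behind that interpretability work: train an overcomplete autoencoder with an L1 sparsity penalty on a model's internal activations so that individual learned features line up with concepts, then amplify a feature to steer behavior, in the spirit of the Golden Gate Bridge example. Dimensions, hyperparameters, and the steering helper are illustrative assumptions, not Anthropic's actual implementation.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_activation: int, d_features: int):
        super().__init__()
        # Overcomplete: many more learned features than activation dimensions.
        self.encoder = nn.Linear(d_activation, d_features)
        self.decoder = nn.Linear(d_features, d_activation)

    def forward(self, x):
        features = torch.relu(self.encoder(x))  # sparse, mostly-zero feature activations
        return self.decoder(features), features

def train_step(sae, activations, optimizer, l1_weight=1e-3):
    # activations: a batch of hidden-state vectors captured from an LLM's forward pass.
    reconstruction, features = sae(activations)
    loss = ((reconstruction - activations) ** 2).mean() + l1_weight * features.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def amplify_feature(sae, activation, feature_index, scale=10.0):
    # Once a feature is identified with a concept, scale it up and decode the
    # edited activation, which can then be fed back into the model's forward pass.
    _, features = sae(activation)
    features = features.clone()
    features[..., feature_index] *= scale
    return sae.decoder(features)
```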
[00:51:57] Tobias Macey:
Yeah. The visibility into the internal state, I think, is definitely a very important area of investment that we need to dig into. So I'm excited to see more progress in that space. So, yeah, definitely excited to see where things go from here when we get to the point of the Model T and when we progress to the point where we actually have some of the current generation of vehicles where they have all of the bells and whistles of safety features, and it knows when I'm about to park too close to the guardrail or what have you and starts beeping at me. So
[00:52:31] Ron Green:
Exactly. Yeah. The the these models are really powerful and smart. We need to we need to, get them to be a little more reliable.
[00:52:40] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gaps in the tooling, technology, or human training that's available for AI systems today?
[00:52:57] Ron Green:
I think the biggest limitations are the 2 things we've hit on, which are control and interpretability. And they are not deal breakers, but they are, I think, limiting the velocity of adoption in different domains where we really need them. But I'm absolutely optimistic that we'll figure that out. It is not an exaggeration to say that I think as a part of this journey towards understanding in a deeper way the way these large deep learning systems work, and as we make them less of a black box, we are simultaneously probably going to start understanding how our own brain works. It'll probably go in tandem. And even though, you know, we can build jets and they don't flap their wings, you know, there are many different ways to fly. I think that's also true with intelligence, but I think we'll probably be surprised to find there are going to be a lot more overlaps than we initially suspected.
[00:54:05] Tobias Macey:
Well, thank you very much for taking the time today to join me and share your experience and expertise in the space and your perspective on where we are in the journey of AI adoption and AI capabilities and some of the areas of investment that we need to make to improve the operability of these models. So thank you again for taking the time and for the work that you're doing to help organizations tackle those problems, and I hope you enjoy the rest of your day.
[00:54:30] Ron Green:
Thank you so much. This was a really, really fun conversation.
[00:54:38] Tobias Macey:
Thank you for listening, and don't forget to check out our other shows, the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Hello, and welcome to the AI Engineering podcast, your guide to the fast moving world of building scalable and maintainable AI systems. Seamless data integration into AI applications often falls short, leading many to adopt RAG methods which come with high costs, complexity, and limited scalability. Cogni offers a better solution with its open source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cogni enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data and LLM apps without unnecessary overhead.
Visit aiengineeringpodcast.com/cognee, that's c-o-g-n-e-e, today to learn more and elevate your AI apps and agents. Your host is Tobias Macey. And today, I'm interviewing Ron Green about the wheels that we need for harnessing the power of the generative AI engine. So, Ron, can you start by introducing yourself?
[00:01:10] Ron Green:
Yeah. I'm Ron Green. I'm cofounder and chief technology officer of KungFu AI.
[00:01:15] Tobias Macey:
And do you remember how you first got started working in the ML and AI space?
[00:01:19] Ron Green:
I do. I remember vividly. I was actually a computer science major at the University of Texas at Austin. And I was in my last semester, and it was, you know, just such a grind going through all of those, you know, really deeply technical courses. And I remember my last semester, heading into it, I was pretty burned out. And I remember thinking to myself, I don't know what I'm gonna do professionally, but it's probably not gonna involve software. Because I was at that point just I was thinking I needed a big change. And I had, like, one elective left, and I took an introduction to artificial intelligence.
And instantly, I mean, within, like, 2 weeks, I knew exactly what I wanted to do with the rest of my life. And what's funny about this is that this was in the nineties, and I remember thinking, oh, man, I'm too late. They've got everything figured out. You know, there were textbooks and all these, you know, really deep theorems and perspectives, and I thought, you know, shoot, am I smart enough to do this? That probably doesn't even matter. I'm too late to do this anyway. You know, little did I know it was gonna be, you know, like, another 20 years before it really took off, but that's how I got involved.
[00:02:40] Tobias Macey:
Yeah. It's definitely funny that the cycles that the industry goes through of we hit a certain peak and we think, oh, we've done as much as we can here, and we're on a glide path and then everything stalls out. And we have to go through another cycle of discovery to realize, oh, this actually was just a local maxima. There's a whole other mountain range to climb.
[00:03:03] Ron Green:
That's exactly right. I mean, funny story is I did a master's in artificial intelligence at the University of Sussex in England. And I remember, probably in, like, 2005, I was talking to a colleague. And I'd gotten out of AI at this point because, you know, in 2005, there was almost nothing happening outside academia. And I was talking to a professor, a colleague of mine, and mentioned that I'd specialized in artificial intelligence. And his reaction was like, oh, you know, tough choice, man. That was a big waste of time. And in '05, it kinda felt like that might have been the case.
[00:03:41] Tobias Macey:
Now here we are where AI is on the tips of everyone's tongues. It has bridged the divide and has now entered into the consumer arena where everybody is talking about AI in different contexts. And I'm wondering given the current arena of AI in the space of generative AI and LLMs, what you see as the main shortcomings of those LLMs as a standalone solution to basically anything?
[00:04:08] Ron Green:
Yeah. The biggest problem we're having right now with large language models as a production tool is control. If you are using a chatbot, if you're using ChatGPT or you're using, you know, Llama or something like that, and you're interacting with it, if you are prompting it, looking at the output, assessing it, it works great. And the same thing goes for code assistant tools, let's say, for example, Copilot or Cursor. You can prompt it to, you know, refactor something or generate some code from scratch. But in all of these instances, nobody at this point in time would just take the output and use it sight unseen. Right? You wouldn't have it write an email or generate code and just commit it and push it up. And so control is the issue. And it's not just, like, hallucinations, which I think are probably the biggest risks.
But we've done many generative AI production engagements at KungFu AI. And the main challenge is you may want to steer the model away from certain domains. Right? As a company, you may, I won't name the company we're working with, but we're doing this generative AI project for sort of a photo site where you could put together, you know, scrapbooks and photo books and things like that. And the generative solution was able to auto organize the photos, understand what was in the photos, put them together in sort of themed flows, and then caption those with, you know, really great examples. I remember there was one. There were a bunch of dogs on the page, and the caption it put was, like, furry friends forever. And that's just terrific.
But there were photos that the people had taken, like, on vacation in Europe, and they were around churches and mosques, and it started outputting content around religion. And, you know, understandably, the client was like, I don't wanna touch that. Let's completely steer away from religion. And getting models aligned where they will say what you want and steer away from things you don't want is the hardest problem right now with LLMs in production.
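In practice, this kind of topic steering is often approximated with a lightweight check on the model's output before it reaches a user. Here is a minimal sketch under stated assumptions: `generate_caption` is a hypothetical stand-in for whatever LLM call produces the caption, and the keyword blocklist is purely illustrative, not the approach used on the project Ron describes (a production system would typically use a trained classifier or a second model as the judge).

```python
# Minimal sketch of steering generated captions away from sensitive topics.
# The blocklist and fallback behavior are illustrative assumptions only.

BLOCKED_TOPICS = {"religion", "church", "mosque", "temple", "politics"}

def generate_caption(photos: list[str]) -> str:
    # Placeholder for an LLM call that captions a themed page of photos.
    return "Furry friends forever"

def safe_caption(photos: list[str], fallback: str = "Memories from our trip") -> str:
    caption = generate_caption(photos)
    lowered = caption.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        # Steer away: fall back to a neutral caption or route to human review.
        return fallback
    return caption

if __name__ == "__main__":
    print(safe_caption(["dog1.jpg", "dog2.jpg"]))
```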
[00:06:31] Tobias Macey:
I think that in many ways, they are effectively very talkative 5 year olds where they'll say lots of things. You can get them to do interesting stuff, but they're also gonna say things that you never saw coming.
[00:06:44] Ron Green:
That's exactly right. And so what we always tell our clients is generative AI is incredibly powerful, but at this stage, it really should be viewed as a human augmenter. So you can take things and transform content or generate new content, whatever it may be. But there almost always needs to be either a human in the loop or another model in the loop performing some type of assessment on that, because the lack of control, the lack of explicit hard control, is the challenging part about putting generative AI into production.
[00:07:22] Tobias Macey:
And in preparing for this conversation, I came across a blog post that you have on your site that framed this in the metaphor of LLMs as a very powerful engine in search of a vehicle to add that sense of control and steering. And so given that framing, the most established vehicle that we have for putting that LLM engine into a guiding context is the RAG stack. And I'm curious what you see as some of the main limitations or shortcomings of that as a product and production oriented solution.
[00:07:56] Ron Green:
Yeah. The RAG approach, you know, retrieval augmented generation, has been fantastically successful. It's probably the number one generative approach that companies are taking out there in the world as sort of their first step into AI. And it works, I would say, holistically pretty well most of the time. The biggest challenges are, one, it's really only as good as the data you have. And so we will occasionally work with clients who, you know, may have a little bit of misconceptions based upon their use of, like, ChatGPT and things like that. They may not understand that, you know, if the data is outdated or incomplete or poorly organized within their own infrastructure, that is not something that a RAG pipeline can fix. Another limitation is large documents, I mean, really, really large documents can still present a problem. The context windows for LLMs are growing, but they struggle beyond certain sizes. And so if you're dealing with documents that have, you know, hundreds of thousands of words within them, that can present a problem. And, of course, hallucinations, you know, and control, like we talked about. Even with RAG, you do not have certitude that everything the model produces, even if it produces it with citations, will be accurate. We're also finding, and this is actually really interesting, there's more and more evidence that using richer content markup is more effective. So, for example, if you have HTML documents, use them as is. Don't preprocess them into text. That additional formatting structure, there's increasing evidence, actually improves the outputs of these RAG pipelines.
So it's early days, and there are challenges there, but I would definitely recommend for most companies some type of RAG solution as a great first entry into AI.
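For readers who have not built one, the basic shape of a RAG pipeline is fairly compact. The following is a minimal sketch, with a toy bag-of-words retriever purely for illustration; a production system would use an embedding model and a vector store, and `call_llm` is a hypothetical stand-in for whatever model API you actually use.

```python
import math
from collections import Counter

DOCS = [
    "Our return policy allows refunds within 30 days of purchase.",
    "Support hours are 9am to 5pm Central, Monday through Friday.",
    "Enterprise plans include single sign-on and audit logging.",
]

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Real systems use dense embedding models.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    # Hypothetical model call; swap in your provider's client here.
    return f"[LLM answer grounded in:\n{prompt}]"

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

print(answer("When can I get a refund?"))
```

The key design point is that the model only ever sees the retrieved context, which is why the quality of the underlying documents bounds the quality of the answers.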
[00:10:09] Tobias Macey:
And another more sophisticated approach to that guidance system for the LLMs is the idea of multiagent or mixture of experts where you have multiple LLMs working in concert to try and keep each other in check, which conceptually sounds reasonable. And it sounds like it would be effective, but still is subject to the challenge of hallucinations where if one of those models does go off the rails, then maybe it acts as a compounding factor to bring the whole system further afield than it would have gone on its own. And I'm wondering how you see the pro versus con conversation happening around that pattern and also the way that it exists in conjunction with that rag pattern.
[00:10:54] Ron Green:
Yeah. That's a great question. I'm really excited about multi agent and mixture of experts approaches. This also obviously is very close to the momentum that is growing around agentic AI. So, you know, the pros are, if you can deal with sort of a mixture of experts or a multi agent scenario, you do get improved performance in those individual agents because you're essentially not asking for one model to be good at everything. You can specialize and have experts or agents that are refined on just performing one set of tasks really, really well. It also means that you can scale a little bit more easily because it reduces the computational overhead and latency associated with that. It can be cost effective because each of those smaller models will cost less to train and will cost less on inference. And this matters probably less, but you do get improved interpretability if you need it, because each of those smaller models could be designed in isolation to maximize or even to be explicitly interpretable.
And that can vary. If you're dealing with, like, product recommendations, it's probably not really critical. If you're dealing with loan decisioning, you might have regulatory requirements around, you know, explainability and interpretability. The cons of this approach are, you know, the complexity. It's hard to orchestrate these complex systems. Latency sometimes can become an issue too, because you have all this task routing and this inter agent communication. Because the agents themselves are typically pretty lightweight, that's not going to be a deal breaker.
And then the last one is a little bit more of a wildcard, but, you know, you do have emergent behavior risk. Orchestration is complicated, and you are also dealing with agents acting, you know, potentially in unpredictable ways. And this kinda comes full circle to our original topic, which is we're at this really interesting stage of AI where the systems are incredibly powerful, but the fact that they're kinda black box and the fact that they do have these very impressive emergent behaviors makes control a little bit more difficult. And so I'm excited about this move, but it's definitely early days still.
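As a rough illustration of the orchestration being described here, the following is a minimal sketch of a router that dispatches a request to a specialized agent and then has a second check assess the draft before it is released. The agent functions, the keyword router, and the trivial critic are all hypothetical placeholders; real multi agent systems add retries, tool use, shared state, and LLM-based routing and evaluation.

```python
from typing import Callable

# Hypothetical specialist "agents"; in a real system each would be its own
# model, prompt, or service rather than a plain function.
def billing_agent(task: str) -> str:
    return f"Billing answer for: {task}"

def tech_support_agent(task: str) -> str:
    return f"Troubleshooting steps for: {task}"

AGENTS: dict[str, Callable[[str], str]] = {
    "billing": billing_agent,
    "support": tech_support_agent,
}

def route(task: str) -> str:
    # Toy keyword router; real routers are usually an LLM call or a trained classifier.
    billing_words = ("invoice", "refund", "charge", "payment")
    return "billing" if any(w in task.lower() for w in billing_words) else "support"

def critic(draft: str) -> bool:
    # Stand-in for a second model (or human) that assesses the draft before release.
    return bool(draft.strip())

def handle(task: str) -> str:
    draft = AGENTS[route(task)](task)
    return draft if critic(draft) else "Escalated to a human reviewer."

print(handle("I was double charged on my last invoice"))
```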
[00:13:39] Tobias Macey:
Another aspect of the current ecosystem that we're in is that there's all this excitement around generative AI, of, oh, it's so powerful, it will solve all of my problems, that I think it also causes us to overlook a lot of the more well established ML patterns, whether that is something like a linear regression or a random forest or even deep learning, in favor of these transformer models. And I'm also curious how you're seeing some of the challenges around the technical and organizational sophistication required for generative AI maybe leapfrogging the organization's actual capabilities, where they never actually established that capacity for some of the earlier generations of ML.
[00:14:24] Ron Green:
This is actually a topic I'm pretty passionate about, because I'm a big believer in the power of generative AI. I absolutely think it's a transformative capability. But I personally think at this stage in our maturation, most companies should be looking at what I call domain specific AIs. And, you know, it's really kind of immaterial whether you're looking at, as you said, deep learning or random forests or, hell, even linear regression or something like that. The bigger point is that generative AI, as powerful as it is, is, as we've talked about, more difficult to control. And so the investment can be quite high aside from, you know, sort of RAG type systems.
What we typically advise our companies to do is really look at domain specific AIs. So, for example, very often the best first step that companies can take into adopting artificial intelligence is to build a capability that is very narrowly focused with a really high ROI. At KungFu AI, we do just a ton of, you know, custom engineering and strategy development, and we won't recommend any projects that don't have at least a 10x ROI. So, for example, we built a loan decision system for one of our clients. It went live earlier this summer. That thing is now handling 60% of their $2.6 billion in transactions per month. And that's all it can do. It's very narrow. It knows how to do one thing. It's, you know, it's not generative.
It's not broadly capable. It has no emergent capabilities, but this is going to transform their business. Their stock is up, like, 36% since that system was released. This is a publicly traded company. And so I would encourage everybody that's listening to this to absolutely explore generative AI approaches, but don't miss out on the opportunity for more narrow, domain specific AI that will, frankly, cost less to implement and operate and may deliver many, many times more ROI than some type of LLM approach.
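By way of contrast with the generative systems discussed above, a narrow, domain specific model is often just a well scoped classifier. The following is a minimal sketch using scikit-learn on invented features and synthetic data; the feature names and labels are illustrative assumptions only and have no connection to the client system mentioned here.

```python
# Hypothetical, minimal domain-specific model: approve/decline loan applications.
# All data and feature definitions are synthetic, for illustration only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.normal(650, 60, n),         # credit score
    rng.normal(0.3, 0.1, n),        # debt-to-income ratio
    rng.normal(55_000, 15_000, n),  # annual income
])
# Synthetic label: higher score and lower DTI make approval more likely.
y = ((X[:, 0] - 600) / 100 - X[:, 1] * 2 + rng.normal(0, 0.5, n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))
```

The point of the sketch is scope, not sophistication: the model answers exactly one question, which is what makes it cheap to validate, monitor, and operate.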
[00:16:51] Tobias Macey:
The other aspect of the generative AI ecosystem, beyond the models and their capabilities and the patterns around them, is the ecosystem of tooling and frameworks and point solutions to the various problems in productionizing these LLMs. And I'm curious how you're seeing that volatility in the market, the current lack of maturity for some of those solutions, and the rapid pace of change influencing the ways that organizations are thinking about adoption of generative AI, or their willingness to actually invest in a more generalized framework for LLM usage versus just, let me just pay company X to do it all for me.
[00:17:39] Ron Green:
Right. Right. It's a great question also. We deal with companies every day that are, you know, sort of early in their AI adoption curve. And we see a lot of the same things that you might expect, sort of decision paralysis. Like, where do you even start? How do you assess AI products? How do you assess the feasibility of different potential initiatives that the company might take on? How do you even figure out what AI initiatives might be feasible? And so one of the things we really recommend is a strategy to holistically look at your business and make assessments that are geared to the domain and the context that your company's within, and come up with a roadmap. And, you know, at KungFu AI, we've been around 7 years now. I'd say in the first 5 years, we were mostly working with really early adopters, you know, people that were on the cutting edge, who had specific problems that they wanted to try to solve.
More and more now, we're talking with companies, and they say things like, we need some AI. We don't care what it is, but we've gotta have some AI to, you know, make Wall Street happy. And that's a dangerous perspective. So start with strategy, look holistically, and be aware that, you know, AI products can be difficult to integrate. And the reason is that almost all of the dominant, powerful techniques within AI right now are deep learning based, supervised learning algorithms. And so that requires, you know, strong data. One of the challenges with current AI is that it's garbage in, garbage out as far as data. And so it can take quite a bit to productionalize systems if your data story, if your data context, is not very clean. And so custom solutions are very often the way to go initially, versus some productized solutions where you might be sold something that actually cannot quite live up to the hype, if that makes sense.
[00:20:03] Tobias Macey:
Given that context of supervised learning and your point about the challenges of data for these systems, what do you see as the viability of using the LLMs more for that data labeling or synthetic data generation method, feeding into maybe a deep learning system that is the actual production unit that you deploy, where you're using the engine to power your tooling system and run your assembly line so that you can build a bicycle?
[00:20:33] Ron Green:
Yeah. I love that idea. That is a powerful approach. It can actually work quite well, and I'll give you some examples that I think everybody would maybe enjoy hearing about. So, for example, we've seen situations where companies are dealing with datasets that are really heterogeneous, and they literally had, you know, hundreds and hundreds of different predicate rules that they had to manage and keep up to date. We were able to take an LLM and fine tune it, and it can make contextual decisions on extracting information and formatting the data, not only in all the situations that they were able to explicitly, you know, state with predicates before, but for analogous situations or situations they'd never even seen before. And that's, you know, again, the power of these AI techniques is that they will generalize.
And so that, you know, really makes a big difference. The other really interesting thing about these language models is that as you train them, like you said, you can use them to kind of bootstrap yourself into a more powerful solution. But you can also do that with a technique called active learning, where you take a model, and it may know nothing. You point it at a bunch of data, and you have a user evaluate the model's predictions on that data. So imagine you're trying to detect fraud, as an example. The model will start off, you know, with nothing more than a random guess. And as the user corrects the model's predictions, the model will then retrain on the feedback that the humans have given it and then go run its predictions on the dataset. And this is where the clever part happens.
It will go look at the entire dataset and find all of the inputs where it has the most uncertainty, where it's, like, 50/50. It's essentially flipping a coin: I'm 50% sure that's fraud, and I'm 50% sure it's not. Those are the examples with the highest entropy, and it will ask the humans to label those. And then by doing that with this active learning feedback loop, you're essentially maximizing the amount of information that the model's learning on. And you can speed through datasets like that, and you're essentially bootstrapping the model. And it can auto label more and more of that dataset as it learns to generalize with the human.
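The uncertainty sampling loop described here can be written down in a few lines. Below is a minimal sketch with a logistic regression on synthetic data, where the human labeler is simulated by an oracle derived from the data generating process; the specifics are illustrative, not a production active learning setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 5))
true_w = rng.normal(size=5)
y_true = (X @ true_w + rng.normal(0, 0.5, 5000) > 0).astype(int)  # oracle ("human") labels

labeled = list(rng.choice(len(X), size=20, replace=False))        # tiny seed set
unlabeled = [i for i in range(len(X)) if i not in set(labeled)]

model = LogisticRegression()
for round_ in range(10):
    model.fit(X[labeled], y_true[labeled])            # retrain on what the "human" labeled
    probs = model.predict_proba(X[unlabeled])[:, 1]
    uncertainty = -np.abs(probs - 0.5)                # closest to 50/50 = highest entropy
    query = [unlabeled[i] for i in np.argsort(uncertainty)[-25:]]  # most uncertain examples
    labeled.extend(query)                             # the oracle (human) labels them
    unlabeled = [i for i in unlabeled if i not in set(query)]
    print(f"round {round_}: {len(labeled)} labels, accuracy {model.score(X, y_true):.3f}")
```

Each round, the model spends its labeling budget where it is most confused, which is why accuracy tends to climb faster than it would with randomly chosen labels.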
[00:23:07] Tobias Macey:
That aspect of bringing the models into the process of building the models is interesting. I'm also seeing some of that being applied in the data engineering context of using the models to understand how to build the pipelines that feed into the data that powers the model. So it's turning into an Ouroboros system, where I also see some of the challenge that any error in those models acts as a compounding factor, where you need to be able to identify early on in the process where it's starting to go wrong, because, otherwise, it's going to amplify that problem. And I know that I'm seeing that in the training of these transformer models and the foundation models too, where as we're consuming more of the web to power the data that goes into the models, a lot of the data on the web is now being generated by those models. And so the models are sort of working together to make themselves dumber.
[00:24:03] Ron Green:
Yeah. Yeah. You know, this whole idea of the entire world being drowned in synthetic data and the models, you know, kinda losing their way. I'm largely optimistic there on the grand scale, because I think what we're gonna see, and it's exactly what you articulated, is that we are finally at the stage with AI where models can now be used to train the next generation. And we've seen things analogous to this in technology before. Like, if you look at CPUs, you know, the CPUs from the previous generations could be used to design more powerful chips.
And there was this sort of positive feedback loop. Every generation of chip was more powerful and allowed us to design more powerful chips. The difference with AI is that it can do this in a much, much tighter loop, and it can do it to itself. So these AI systems can actually be used to train the next AI system without a human in the loop. For example, with the Meta open source models, the Llama family: Llama 1 was used to train Llama 2, Llama 2 was used to train Llama 3, and Llama 3.1 was used to train Llama 3.2. And right now, Yann LeCun, the chief AI scientist at Meta, actually just last week said they're training Llama 4 right now.
And, I remember this, he said on 100,000 H100 GPUs. That's somewhere in the order of about $2 billion worth of GPUs. But much of that knowledge and guidance, especially on the reinforcement learning with human feedback phase, the sort of alignment phase after the pretraining phase, a lot of that is gonna be done by the previous models, by the Llama 3 family of models. It's amazing.
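The "models training the next generation" idea can be sketched at a toy scale as distillation: a stronger teacher labels unlabeled data, and a student trains on those synthetic labels. This is only an analogy to the frontier model pipelines mentioned above, not their actual recipe, and the models and data here are illustrative stand-ins.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X_labeled = rng.normal(size=(500, 10))
y_labeled = (X_labeled[:, 0] + X_labeled[:, 1] > 0).astype(int)   # small human-labeled set
X_unlabeled = rng.normal(size=(20_000, 10))                       # large unlabeled pool

# "Previous generation" teacher learns from the human labels...
teacher = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_labeled, y_labeled)
synthetic_labels = teacher.predict(X_unlabeled)                   # ...then labels new data

# "Next generation" student trains on the teacher's synthetic labels.
student = LogisticRegression().fit(X_unlabeled, synthetic_labels)
print("student agreement with teacher:", student.score(X_unlabeled, synthetic_labels))
```

The compounding risk from the earlier question shows up directly here: any systematic error the teacher makes is inherited, at scale, by the student.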
[00:26:10] Tobias Macey:
And to that point, beyond just the tooling and the frameworks, the models themselves are in a rapid state of flux, with either larger models or more specific models being introduced constantly. How does that change the way that businesses are thinking about whether and when to invest in that AI capability, given the fact that, oh, well, whatever model I select now is going to be outdated by next week?
[00:26:38] Ron Green:
It's true. And we see it even with the techniques, meaning problems that we solved 3 years ago that might have taken us 4 months, we could approach with, you know, an entirely new class of algorithm or modeling techniques, and not only achieve much, much better accuracy at the top line, but we probably could have done it much more quickly and more easily. You know, I think this is a classic sort of technological progression question, which is, like, when is it too late to jump in? When is it too early? The way I think about it is there's gonna be a certain amount of investment that you have to write off in the long term simply because things are moving too fast, and as a business I think you have to think about it that way, because the benefits you're gonna get in the short term are gonna be more than sufficient to accommodate that write off. And the other fact is, you know, if you ignore AI, your competitors aren't, and that is gonna put you at a massive competitive disadvantage. And, again, this is the reason I would encourage people, don't jump in and just do something in AI because you feel like you have to or you feel like there's too much pressure. Be really thoughtful about it and make sure that there is a really, really strong ROI associated with any initiative. Because most companies haven't done anything, there's an enormous amount of low hanging fruit for almost any company to embrace AI in a way where that write off will really be immaterial. If you have to go and replace some modeling system 3 years from now, you won't care, because the return on that investment will have been so high. And I would just encourage companies to go in open eyed like this and move forward with the understanding that it's a rapidly advancing field.
[00:28:30] Tobias Macey:
And to that point of where we are in the timeline of AI, and bringing us back around to that metaphor of needing vehicles, where do you see us on the timeline of the automobile? Are we at the point of the Model T yet? Or are we before that? Are we past that? I feel like, to some degree, we're maybe at the point of the Model A, where everybody's building their own special hot rods.
[00:28:53] Ron Green:
I think you're right about that. I don't think we're at the Model T yet. And the reason is that, you know, like we said at the beginning, I've been doing this a long time. And I get asked every now and then, there have been 2 AI winters, so why am I so confident that there won't be a 3rd winter? And it's really simple. It's a few things. One is we were always overpromising on what AI could do before. We would get good results, and we would extrapolate out, but the curves didn't hold. And so then we would end up having overpromised and underdelivered. And you do that too many times and investment and adoption just stop.
We finally now have AI systems that can operate at the human level or superhuman level across almost all the tasks that you might care to think about, whether it's vision or speech or generative capabilities across almost any domain. Right? So we're not going back. That said, we're basically at day 0, because there are really simple things we haven't even tried yet. Like, if you take the transformer architecture, it's got this quadratic computational complexity, which is really powerful, but it is not gonna scale. We're not gonna get to context sizes in the trillions with that type of architecture.
And there are simpler approaches coming out almost daily that are showing really, really great capabilities, like, I think, Mamba, with its state space model approach. And the lack of control that we've mentioned as well, I think, is the other reason. We'll be at the Model T stage once we've sussed out these control and interpretability issues, and then things are really gonna take off. And I genuinely think that most people have no idea how much AI is gonna mature in the next 20 years. It's going to be mind blowing. And to take one example, in software development, it will be baked into every piece of software. Right? Because why would you not wanna have the ability for the tools you're working with to understand speech and have sophisticated vision capabilities and all that stuff? Right now it's the exception; it will become the rule. Just like every piece of software now has networking and Internet capabilities baked in, and it would be silly to think that it would operate in isolation, we're gonna see the same thing with AI adoption. So I believe we're really at the early stages.
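The quadratic cost mentioned above comes straight from the shape of the attention score matrix: every token attends to every other token, so compute and memory grow with the square of the sequence length. A tiny numpy illustration of that scaling, with projection weights omitted to keep the sketch short:

```python
import numpy as np

def naive_attention(x: np.ndarray) -> np.ndarray:
    """Single-head self-attention with learned projections omitted, to expose the n x n term."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                         # n x n matrix: the quadratic part
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

_ = naive_attention(np.random.default_rng(0).normal(size=(64, 32)))  # small demo run

for n in (1_000, 10_000, 100_000):
    print(f"sequence length {n:>7,}: score matrix holds {n * n:,} entries")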
[00:31:26] Tobias Macey:
To that point of transformers being the dominant architecture for this current generation of Gen AI models, I know that we have been seeing a lot of reports recently of starting to hit the scaling limits of that transformer architecture, where feeding more data, feeding more tokens, is having diminishing returns in terms of the successive capabilities of those models. And given your perspective as somebody who's been in this industry for a while and seen the successive generations of machine learning techniques and architectures, what are your thoughts on some of the future trajectory of AI model architectures? Are we going to continue trying to push those limits of the transformer architecture by throwing better hardware at it, or are we at an inflection point where we need to be looking at other approaches? I'm thinking in particular in terms of the liquid network techniques that came out of MIT recently.
[00:32:23] Ron Green:
You know, I'm not convinced we're at the end of the scaling. I think we're seeing some slowdown, but it's not clear to me exactly how much slowdown and where we are on that curve. I could be wrong. My guess is we're probably gonna see one more order of magnitude increase before we really have the slope shift downward. The things that I'm really excited about right now, though, and the reason I think that we're gonna continue to see really big performance improvements, are that we are just now starting to look at inference-time investments. So to date, it's all been about how big can we make these models, how much data can we pump into these models, and the scaling laws have held for about 10 orders of magnitude. You can go back over 20 years, and the scaling laws hold pretty well, pretty predictably.
Just in the last, you know, 18 to 24 months have we started looking at inference time and started exploring the idea of, well, what if the model's inference compute wasn't fixed? What if the model was able to use techniques like chain of thought, where, you know, you can think of it almost like the model's talking to itself, producing output, assessing whether it's on the right track, altering approaches, and iterating in that inference-time compute cycle in ways that will allow it to improve itself and not just have some sort of, you know, fixed, finite, deterministic output? And the early results we're seeing from probably the leader on this, OpenAI with their o1 models, the preview models are already showing much improved reasoning capabilities, and OpenAI claims the full o1 model will be staggeringly capable on that side. And, again, it's early days there. We've barely begun exploring this part of the spectrum.
So I think we're gonna see, if anything, modest slowdowns on the scaling, at least probably for the next 2 or 3 years, before we need to go back to the well.
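A very rough sketch of the inference-time loop described above: draft, critique, and revise until a check passes or a compute budget runs out. `call_llm` and `looks_good` are hypothetical stand-ins for a model API and a verifier, and real reasoning models fold this behavior into training rather than running it as an external loop like this.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical model call; replace with your provider's client.
    return f"draft response to: {prompt}"

def looks_good(answer: str) -> bool:
    # Stand-in for a verifier: a second model, unit tests, a schema check, etc.
    return "draft" not in answer

def solve(task: str, budget: int = 4) -> str:
    answer = call_llm(task)
    for _ in range(budget):                      # spend extra inference-time compute
        if looks_good(answer):
            break
        critique = call_llm(f"Critique this answer to '{task}': {answer}")
        answer = call_llm(f"Revise the answer using this critique: {critique}")
    return answer

print(solve("Plan a 3-step migration from Postgres 12 to 16"))
```

The design choice being explored industry-wide is exactly the `budget` parameter: letting harder problems consume more forward passes instead of giving every prompt the same fixed amount of compute.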
[00:34:44] Tobias Macey:
Another interesting aspect of all of the conversations that happen around AI is the language that we use to talk about how it operates, where you used the concept of reasoning in that example of chain of thought, and where there's also a lot of debate around the level of actual understanding or sentience or whatever terminology you want to use to anthropomorphize these models. What are some of the challenges that that imposes in terms of how we actually think about applying these models, where, because we want to anthropomorphize things, we say, oh, well, the model understands the input that I'm giving it, so it gives me this output, where, really, it's just sophisticated statistics, and the model has no concrete understanding of it in the way that we think about our understanding of the world around us? And so there have been investments in terms of things like cognitive AI, where we start with maybe a more simplistic model but we use means of trying to generate these contextual maps of the environment that it's executing in; the idea of GraphRAG, where you have an underlying knowledge graph for being able to give some sort of semantic framing of the context that is being fed; and the idea of memory being bolted onto the models in terms of the runtime to be able to contextualize things a bit better. And I'm wondering how you see some of those aspects of cognitive science and conceptual understanding being folded back into the ways that the models are built, versus being just a bolt-on to the runtime environment.
[00:36:27] Ron Green:
I love that question. I personally think that these large language models are hands down the most important scientific discovery of the 21st century. And what I mean by that is the emergent behavior that we get out of these large language models, which, again, you know, all they were trained to do is, given some input, predict the next token, predict the next word. I don't think there's anybody on the planet who anticipated the type of capabilities that emerge at scale in these large language models. In fact, I have colleagues I've worked with who, when the GPT-3 paper came out in 2020, didn't believe it. They thought some of the few shot examples within the paper were impossible, that it just couldn't be true. And so I say that I think that this is the most important scientific discovery of the 21st century because the emergent capabilities weren't predicted. And I think it tells us a lot about intelligence.
You know, if I say that the model is, quote, unquote, reasoning during inference time, I don't really mean that it's reasoning exactly in the same way we do. But that presupposes we even know how we reason, and we don't. And, you know, if you go back to the history of AI, it's really kinda funny. In the fifties and sixties, they thought, oh, if we could build a computer, and that computer could play chess at, you know, the grandmaster level, it would certainly have, you know, AGI capabilities. And it turned out not to be true. We solved that problem in the nineties, and things that we didn't think were complicated, things that we took for granted like our vision systems or our speech and auditory systems, we just thought were relatively simple problems to solve. In fact, Minsky famously in the sixties gave an undergrad at MIT a summer project to build a computer vision system, because they didn't think it was that complicated.
And the reason is that we can't introspect our cognitive processes. Our visual cortex is unbelievably complicated. So the point I'm trying to make is this. We don't really know how we see in a deep way. We can't introspect our consciousness or our thought process. I don't know exactly how my own brain works. So it's kinda hard to speak deeply about the differences in what might be consciousness or what might be intelligence or what might be reasoning within AI when we can't even speak deeply about it with humans. All I know is that it is absolutely stunning that large language models have these emergent capabilities at scale, and I think we should keep exploring that and see how far we can push this.
[00:39:26] Tobias Macey:
And another pressure that AI is having on the world that we live in is in terms of the computing systems that we build where for a long time, we've had the Von Neumann architecture that has served us well. And now with the growth of AI both on the training side, but in particular on inference, which is from a distribution perspective, more ubiquitous, everybody needs to be able to do inference and particularly as we start to push things into the edge and on mobile devices. And I'm wondering how you see the engine of AI forcing us to rethink how we construct the drive train to be able to actually harness that power and some of the effect that it's having on the systems architecture at the compute level and how we think about actually building our computing systems?
[00:40:19] Ron Green:
That is a really difficult question to answer. There are all kinds of examples within AI right now where the techniques bend to accommodate the hardware. And then there are instances where the hardware will be modified to specialize in optimizing for some algorithmic advancement, the transformer being, you know, the best example of that. Right now, it is absolutely fair to say that deep learning is the dominant approach, and within deep learning, transformers are the dominant approach. And if you look at a transformer, one of the funny jokes is that the famous paper that the transformer architecture came out of was called Attention Is All You Need, but if you actually look at the number of parameters within any transformer model, most of the parameters are still in the multilayer perceptrons that are at the end of each of the attention blocks. And that's just linear algebra. That's just matrix multiplication. And so I think for at least the foreseeable future, the bottleneck within AI is going to be that ability to do dot products at scale. And I think we're gonna see companies like NVIDIA just pouring more and more money and, you know, resources and time into seeing how much they can scale up and move to concurrent, parallel computation of these enormous matrix operations.
Beyond that, candidly, I just don't have a lot of visibility.
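The point about where the parameters live is easy to check with back-of-the-envelope arithmetic: in a standard transformer block with model width d and the common feed-forward expansion factor of 4, the two MLP matrices hold roughly 8d squared parameters against roughly 4d squared for the attention projections, so the MLP is about two thirds of the block. A quick sketch (ignoring biases and assuming the usual head-dimension bookkeeping):

```python
def block_params(d_model: int, ff_mult: int = 4) -> tuple[int, int]:
    attention = 4 * d_model * d_model            # Q, K, V, and output projections
    mlp = 2 * d_model * (ff_mult * d_model)      # up-projection and down-projection
    return attention, mlp

for d in (768, 4096, 8192):
    attn, mlp = block_params(d)
    share = mlp / (attn + mlp)
    print(f"d_model={d}: attention {attn:,} vs MLP {mlp:,} ({share:.0%} of block in the MLP)")
```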
[00:42:00] Tobias Macey:
In your work of investing in this ecosystem of generative AI and helping organizations figure out how best to harness that motivating force of the LLM as engine, what are some of the most interesting or challenging or innovative ways that you have seen people try to conceive of the ways that those LLMs are able to have a transformative force on their organization or on the ecosystem in which they're operating?
[00:42:33] Ron Green:
Okay. I think probably the thing that I'm most excited about within that domain, the way you described it there, is not just pure LLMs, but these sort of multimodal language models, so these large language vision models. And we're seeing more and more examples of multimodal models that are conditioned in a way that allows them to provide outputs and capabilities that, you know, frankly, just seem like magic to me. So I'll give you maybe a couple examples. We're seeing companies take multimodal language models and condition them on 3D, sort of CAD-space-like problems. And then you can literally write in English, in text, what you want the CAD to generate and manipulate it with really pretty high success, you know, these AI generated meshes.
We're seeing this also at the intersection of health care, on health care data, for assessment. There was actually an article just, like, 2 days ago in the New York Times talking about how LLMs were dramatically beating doctors in this relatively small case study of doing patient assessment. And even when the doctors were paired with the language models and were able to collaborate with them, the language models actually outperformed the doctors. And in their sort of post evaluation of why, it was because the doctors came in with some preconceptions.
And when the language models pointed out flaws in that, they basically ignored it. And another example, and again, this is why I say we're at, like, day 0, we are very early into this, is, you know, there are now these multimodal models that are capable of on-the-fly game generation. So there was an example of, like, a sort of Minecraft generation game that you can type into and build the world, but its world model is really weak. So, like, if you're looking at a view and you turn and you do a 360, when you come back, it's changed. Like, in the moment, right, its world view is just very ephemeral. But it was conditioned on those Minecraft contexts and can generate, you know, at a high frame rate, you know, this imagined world already.
So I think that those are probably maybe the most radical examples, and you'll notice all those are kinda mostly toys still, and that's because it's just really, really early days.
[00:45:06] Tobias Macey:
And in your own work of navigating this space and trying to grasp the current phase of AI that we're in, what are some of the most interesting or unexpected or challenging lessons that you've learned personally?
[00:45:20] Ron Green:
I think that I am continually surprised at the power of the diffusion approach. I think that may be the thing that I'm most excited about right now overall. The diffusion approach, just for our listeners, is this idea of taking some input and adding some perturbation to it, typically noise. And so if you take maybe the canonical example of images, you take an image, you gradually add, let's say, Gaussian noise, and you train a model to be able to remove that noise at different stages of that process.
And at the end of the process with images, you know, you've just got an image that's full noise. There's nothing there that's even remotely recognizable. But you've conditioned that model throughout this whole process on a text input that was embedded in such a way that the model can learn what the image contains semantically. And at the end of this, you can literally take a text string of something you wanna create that maybe has never existed in the universe, give that model an image of pure noise and that string describing what you want, and you lie to the model. You say, this actually is that image, it's just got a bunch of noise in it, and it will denoise it. That approach, we're seeing that work in robotics. We're seeing that work in protein folding. For example, AlphaFold 3, which is the just breathtakingly powerful computational biology model released by Google DeepMind this year. In fact, DeepMind CEO Demis Hassabis and John Jumper both won the Nobel Prize in chemistry for this work. It uses a diffusion model. What they do is they basically put in the coordinates of the atoms, the different atoms within protein molecules, and they perturb them. And what this allows them to do is use what's called a Pairformer, a variation on a transformer, to generate the potential structures that amino acid sequences will produce, and then use the diffusion model to refine them, and they're getting fantastic accuracy on this. And so we're gonna be able to do, you know, genetic therapies, drug therapies, infectious disease therapies that are all going to be AI generated approaches, each one of which might have been a PhD dissertation. Right? You would have spent maybe 5 years trying to figure out how that protein folded. Now you can enter the amino acid sequence, go get a cup of coffee, and come back and have the answer. So I think the diffusion approach right now is the most important thing happening within sort of architectural advancements within AI.
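The training objective described here fits in a few lines: corrupt the input with Gaussian noise at a random step and train the model to predict that noise. The sketch below is unconditioned and uses a toy MLP on 2-D points instead of a U-Net on images or anything like AlphaFold's architecture, purely to show the shape of the loop; it assumes PyTorch is installed.

```python
import torch
import torch.nn as nn

T = 100                                          # number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative noise schedule

model = nn.Sequential(nn.Linear(2 + 1, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

data = torch.randn(4096, 2) * 0.3 + torch.tensor([1.0, -1.0])  # toy 2-D "images"

for step in range(200):
    x0 = data[torch.randint(0, len(data), (128,))]
    t = torch.randint(0, T, (128,))
    noise = torch.randn_like(x0)
    a = alphas_bar[t].unsqueeze(1)
    xt = a.sqrt() * x0 + (1 - a).sqrt() * noise              # forward process: add noise
    pred = model(torch.cat([xt, t.unsqueeze(1) / T], dim=1))  # predict the added noise
    loss = ((pred - noise) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final denoising loss:", loss.item())
```

Text conditioning, as described in the interview, amounts to feeding an embedding of the prompt into the model alongside the noisy input so the denoising is steered toward what the text describes.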
[00:48:17] Tobias Macey:
Given all of the excitement and fervor over generative AI as a solution to whatever problem domain you want to introduce it to, what are the cases where you would advise against the application of generative AI or LLMs?
[00:48:34] Ron Green:
Anytime you need absolute certitude, I would say you need to be very careful. Now, if you're willing to have a human in the loop, which I would argue you absolutely should have with almost any generative approach right now, then you're fine. But, you know, you definitely would not wanna live in a world where, you know, the doctor comes to you and says, well, we need to perform surgery. And you say, why? And the doctor says, I don't know, but the AI model told me that's what we need to do. So generative language models, etcetera, are incredibly powerful.
At this stage, treat them as human augmentations, and you can go to town. You can build really, really powerful systems. Just avoid them as sources of truth at this point, because we're still struggling with control.
[00:49:26] Tobias Macey:
Are there any other aspects of LLMs and the vehicles that we need to build for them, or the aspects of control and the challenges around that, or your experience working in this space, that we didn't discuss that you would like to cover before we close out the show?
[00:49:41] Ron Green:
Probably the only area that we didn't discuss that I'm pretty excited about is interpretability. And in particular, I think the work from Anthropic over the last year has been fascinating. They're using sparse autoencoders to really dig in and try to understand how these large language models are representing different concepts inside the parameter space. And they have the famous example where they were able to isolate a concept like the Golden Gate Bridge in San Francisco. And they found some really fascinating things. One, that that concept was spread out across many neurons within the model.
Two, that it didn't matter what language you were operating in, whether it was English or Korean or Russian, the same representation was used across those languages, including images. So for a multimodal language model, they found that the Golden Gate Bridge image capabilities also use the same neurons. And then lastly, they did this, and I just think this is so fascinating: they asked the model, you know, to describe what it looked like physically. And the model said, well, you know, I have no physical form. I'm an AI program, etcetera. And then they manipulated the model, and they took the neurons that they'd learned encoded the concept of the Golden Gate Bridge, and they forced those to activate at 10 times their normal level.
And they asked the question again. And the model came back and said, oh, I'm the Golden Gate Bridge, and I have, you know, this shape and this form and this color. And so you can manipulate the model to say the things you want. And so this is, I think, a very, very major step forward in interpretability and explainability, and I think that this will bear fruit over the next 5 years in a big way. And it will allow us to not only get around some of the control issues we're seeing right now, but it will also make these models much more likely to be used in domains, like medical cases, where explainability and interpretability are just, you know, nonnegotiable.
It absolutely has to be there. So I'm super excited about that stuff.
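The interpretability work referenced here trains sparse autoencoders on a model's internal activations so that individual learned features line up with human-interpretable concepts. Below is a minimal sketch of just the objective, run on random stand-in "activations" with an L1 penalty to encourage sparsity; it shows the training loop only, not Anthropic's actual setup, data, or scale, and it assumes PyTorch.

```python
import torch
import torch.nn as nn

d_act, d_dict = 512, 4096                        # activation width, feature dictionary size
encoder = nn.Linear(d_act, d_dict)
decoder = nn.Linear(d_dict, d_act, bias=False)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

activations = torch.randn(10_000, d_act)         # stand-in for residual-stream activations

for step in range(300):
    batch = activations[torch.randint(0, len(activations), (256,))]
    features = torch.relu(encoder(batch))        # sparse, overcomplete feature activations
    recon = decoder(features)
    # Reconstruction error plus an L1 sparsity penalty on the feature activations.
    loss = ((recon - batch) ** 2).mean() + 1e-3 * features.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("fraction of features active:", (features > 0).float().mean().item())
```

The "clamping" experiment described in the interview corresponds to scaling one of these learned feature directions up before decoding it back into the model's activations.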
[00:51:57] Tobias Macey:
Yeah. The visibility into the internal state, I think, is definitely a very important area of investment that we need to dig into, so I'm excited to see more progress in that space. And, yeah, definitely excited to see where things go from here, when we get to the point of the Model T, and when we progress to the point where we actually have some of the current generation of vehicles with all of the bells and whistles of safety features, and it knows when I'm about to park too close to the guardrail or what have you and starts beeping at me.
[00:52:31] Ron Green:
Exactly. Yeah. These models are really powerful and smart. We just need to get them to be a little more reliable.
[00:52:40] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gaps in the tooling, technology, or human training that's available for AI systems today?
[00:52:57] Ron Green:
I think the biggest limitations are the 2 things we've hit on, which are control and interpretability. They are not deal breakers, but they are, I think, limiting the velocity of adoption in different domains where we really need them. But I'm absolutely optimistic that we'll figure that out. It is not an exaggeration to say that I think, as a part of this journey towards understanding in a deeper way how these large deep learning systems work, and as we make them less of a black box, we are simultaneously probably going to start understanding how our own brain works. It'll probably go in tandem. And even though, you know, we can build jets and they don't flap their wings, you know, there are many different ways to fly, I think that's also true with intelligence. But I think we'll probably be surprised to find there are going to be a lot more overlaps than we initially suspected.
[00:54:05] Tobias Macey:
Well, thank you very much for taking the time today to join me and share your experience and expertise in the space and your perspective on where we are in the journey of AI adoption and AI capabilities and some of the areas of investment that we need to make to improve the operability of these models. So thank you again for taking the time and for the work that you're doing to help organizations tackle those problems, and I hope you enjoy the rest of your day.
[00:54:30] Ron Green:
Thank you so much. This was a really, really fun conversation.
[00:54:38] Tobias Macey:
Thank you for listening, and don't forget to check out our other shows, the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction to AI Engineering Podcast
Interview with Ron Green: AI Beginnings
AI Cycles and Industry Evolution
Challenges with Large Language Models
RAG Stack and Its Limitations
Multi-Agent Systems in AI
Domain-Specific AI vs. Generative AI
AI Tooling and Frameworks
Data Labeling and Synthetic Data
AI's Impact on Computing Systems
Future of AI Model Architectures
Cognitive Science in AI Models
Innovative Uses of LLMs
When Not to Use Generative AI
Interpretability and Control in AI