AI Knowhow Episode 63 Summary
- LLMs convert text into tokens (numbers) through a process called tokenization, then use the learned relationships between those tokens to predict what will come next
- While LLMs have undoubtedly revolutionized what’s possible in the workplace, with even more change on the horizon, understanding their limitations is also vital
- LLMs are trained through unsupervised learning, meaning they essentially teach themselves from vast amounts of raw, unlabeled text rather than relying on hand-labeled training data
Large Language Models such as ChatGPT, Claude, and Gemini have taken the world by storm since ChatGPT was unveiled in November 2022. These models offer revolutionary ways to interact with data, perform complex tasks, and enhance business intelligence. But what goes on behind the scenes of these technological marvels, and why is it crucial for professionals across various fields to understand them?
Unpacking the Mechanics of LLMs
At their core, LLMs are next-word prediction systems. As Lead Data Scientist Ramsri Golla and Chief Science Officer John Fowler explain, what LLMs do is akin to predicting the subsequent numbers in a sequence, transforming words into numbers through a process called tokenization. This conversion enables LLMs to predict upcoming tokens with remarkable accuracy.
What truly differentiates LLMs from simple pattern recognition tools is their sophistication in processing vast amounts of data, from internet text to academic publications. This extensive training enables these models to learn complex associations between words and concepts, simulating reasoning even though it truly comes down to statistical probabilities.
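To make the words-to-numbers idea concrete, here is a minimal sketch using OpenAI’s open-source tiktoken tokenizer (an illustrative choice, not the tokenizer every model uses; each model family ships its own, but the principle is the same):

```python
# Minimal tokenization sketch (pip install tiktoken). Other models use
# different tokenizers, but the words-to-numbers idea is identical.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # encoding used by GPT-3.5/GPT-4-era OpenAI models

text = "Large language models predict the next token."
token_ids = enc.encode(text)                 # words/subwords -> integers
print(token_ids)                             # a list of integer token ids
print([enc.decode([t]) for t in token_ids])  # each id maps back to a subword
print(enc.decode(token_ids))                 # round-trips to the original text
```

The model never sees the words themselves, only these integer ids and the statistical relationships between them.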
Why Should Business Leaders Care About How LLMs Work?
For executives and businesses, understanding LLMs is not merely about recognizing their capabilities and how AI can change their entire business model, but also about acknowledging their limitations, such as biases and possible “hallucinations”: instances where the model confidently generates incorrect information. Organizations need to critically assess the outcomes these models deliver and integrate human oversight to ensure accuracy and relevance.
CEO David DeWolf points out a fundamental distinction: LLMs don’t “understand” in a human sense but can aid in tasks traditionally thought to require cognitive ability. This distinction highlights the importance of discerning which tasks are best suited for AI and which require the inherent understanding only humans provide.
Navigating the Challenges and Leveraging the Opportunities
Professional service firms must realize that comprehension of LLMs helps anticipate potential pitfalls and properly leverage these tools in compliance-heavy sectors. As Chief Product and Technology Officer Mohan Rao notes, simply accepting LLM results without understanding their workings could lead to regulatory risks.
The conversation extends to the trend of Retrieval-Augmented Generation (RAG), which sharpens model outputs by providing specific, contextual data at the time of a query. This strategy reduces biases and hallucinations and produces more precise results, highlighting a shift from traditional model fine-tuning to a context-driven approach, as detailed by Ramsri.
Understanding LLMs in the Context of AI Transformation
As the digital age continues to transform industries, understanding AI’s advances, like LLMs, is imperative. The insights shared by our expert panel equip business leaders with the knowledge to navigate AI’s complexities, harness its capabilities effectively, and future-proof their organizations against technological disruptions.
Watch the Episode
Watch the full episode below, and be sure to subscribe to our YouTube channel.
Listen to the Episode
You can tune in to the full episode via the Spotify embed below, and you can find AI Knowhow on Apple Podcasts and anywhere else you get your podcasts.
Show Notes & Related Links
- Watch a guided Knownwell demo
- Read “Here’s what’s really going on inside an LLM’s neural network” in Ars Technica
- Read the “Attention Is All You Need” research paper from researchers at Google that’s referenced in the episode
- Connect with Ramsri Goutham Golla on LinkedIn
- Connect with John Fowler on LinkedIn
- Connect with David DeWolf on LinkedIn
- Connect with Mohan Rao on LinkedIn
- Connect with Courtney Baker on LinkedIn
- Connect with Pete Buer on LinkedIn
- Follow Knownwell on LinkedIn
Have you ever wondered what’s really going on behind the scenes of LLMs like ChatGPT, Claude and Gemini?
And why should it even matter in the first place?
You don’t need to know how JavaScript works to surf the web, obviously.
You certainly don’t need to write Swift code to use iPhone apps.
So stay tuned, friends.
By the end of this episode, you will have a much deeper understanding of how LLMs actually work and why it matters.
Hi, I’m Courtney Baker, and this is AI Knowhow from Knownwell, helping you reimagine your business in the AI era.
Mohan Rao recently posted an article in our company Slack titled, Here’s What’s Really Going On Inside an LLM’s Neural Network that really got everybody talking.
It generated so much buzz that we knew we had to bring it to the larger audience.
And so here we are.
Today, Ramsri Golla and John Fowler are joining Mohan, David and me to dive into how LLMs actually work.
We’ll break down not only how they work, but why it matters to every professional, even if you’re not in the trenches building or fine tuning these models.
This is a conversation that brings LLMs to life, making them accessible and relevant for all of us.
Hope you enjoy.
Okay, two big questions for this episode.
The first being, how do LLMs actually work and why does it matter?
Maybe not for the general public, but at least to those of us in business, as executives, just working in our roles.
What do we need to know?
Why does it matter for us to understand LLMs?
So before we get into this, I’m going to put on my glasses since we have the super smart people here today.
Let’s start with the first question.
How do LLMs actually work?
10,000 foot overview.
You can think of LLMs like a next-word or next-token predictor.
When I say token, think of it like a subword, a part of a word, or something like that.
So in short, LLMs are next-token or next-word predictors.
And the thing is, if you ask any question or if you give a paragraph and ask it to rephrase or ask it to expand, just like how a human would, it will start predicting token after token, word after word.
And you can think of it like recursively where the next predicted token is input.
And now you have a partially filled sentence; another word or token is generated, that is appended and given as input, and you generate the next word.
So kind of like fill in the blank generator where the blank is at the very end.
That’s the 10,000 foot overview, yeah.
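To picture that loop in code, here is a toy sketch: the “model” below is just a hard-coded table of next-word probabilities rather than a real neural network, but the recursive predict-append-repeat cycle is the same one an LLM runs.

```python
# Toy autoregressive loop: predict a token, append it to the input, repeat.
# The "model" is a made-up probability table, not a trained LLM.
import random

NEXT_WORD_PROBS = {
    "the":  {"cat": 0.6, "dog": 0.4},
    "cat":  {"sat": 0.7, "ran": 0.3},
    "dog":  {"sat": 0.2, "ran": 0.8},
    "sat":  {"down": 1.0},
    "ran":  {"away": 1.0},
    "down": {"<end>": 1.0},
    "away": {"<end>": 1.0},
}

def predict_next(tokens):
    """Pick the next word using the probabilities for the last word so far."""
    probs = NEXT_WORD_PROBS[tokens[-1]]
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights)[0]

tokens = ["the"]                            # the partially filled sentence
while tokens[-1] != "<end>":
    tokens.append(predict_next(tokens))     # feed the prediction back in as input

print(" ".join(tokens[:-1]))                # e.g. "the cat sat down"
```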
Okay, so what I’m hearing you say is, it’s basically just like a mad lib.
That’s what we’re going with here.
The most transformational technology of our lifetimes is… it’s a mad lib.
Almost, you know, I’ll make it even simpler.
When we talk about predicting tokens, we’re really just talking about predicting numbers.
Like if I were to ask you, given the sequence 1, 2, 3, 4, what are the next numbers in the sequence?
You would know it’s 5, 6, 7, 8.
In truth, that’s really all that an LLM is doing.
It’s taking the words that you put in, it’s tokenizing them, which really means taking those words and turning them into numbers.
And then it has a very complex process of predicting based on the numbers that are there, what are the next numbers?
What are the next tokens that ought to be?
And then based on that prediction, it then fills out whatever ought to come next.
And it takes those numbers and converts them back into text and that’s the output that we see.
So really it’s more than anything else.
So LLM is just an excellent predictor of what are the next tokens or the next numbers that need to come in the sequence.
Where does that ability to predict come from John, right?
When I think about your example, it seems so simple.
One, two, three, four.
I know five, six, seven, eight because my mom and dad taught me, and then my math class in second grade reinforced it, right?
And all those types of things.
But this LLM is predicting things that are 10 times, no, 10 billion times more complex.
Yeah, that’s absolutely right.
Like, blow our minds now.
What is it that allows it to do that?
Great question.
So really, if you want to think about the lifetime or lifespan of an LLM, when an LLM starts pre-training, it really has no understanding of anything.
And the idea is that we take 10,000 billion, 10 billion documents of information, of text, and we’ll just talk about text LLMs for now, and we just feed it to the LLM, essentially asking it or telling it, go ahead and start interpreting these and see if you can predict what ought to come next.
And then what happens is, through trial and error, feeding it training data, we are able to make the LLM work better and better over time.
So really, it’s more that it has processed tremendously large sets of data, tremendously large sets of text from all across the internet, from tons of books, news articles, whatever have you.
It’s just processed a significant amount of text and that text has trained it to understand what are the right ways in which these tokens that it sees, these numbers that it sees, what are the right ways that those go together?
Got it.
Now that’s a gross simplification.
You know, it has this thing, like it takes the tokens and embeds them, and it understands a little bit better, the concept of what’s going on and how the tokens relate to each other.
But all in all, what it’s just doing is by being fed a tremendous amount of data, it’s mathematically learning how to relate everything together and therefore how to predict what ought to come next.
It all goes back to my mom.
It’s kind of like those flashcards.
She put them in front of me so often that I got used to saying five, six, seven, eight after one, two, three, four.
That’s exactly it.
That’s exactly what it’s doing.
So, Ramsri and John, let’s take an example with David’s question and see what are the gears that turn in an LLM.
So if he asks the question, what is the capital of the state where Tom Brady won his last Super Bowl?
So for example, in that question, here too, for our audience who are not football fans, I’ll just explain and then please explain all the gears that need to turn to come up with the answer.
You need to know that Tom Brady won his last Super Bowl with the Tampa Bay Buccaneers.
That’s an important piece of information.
Then you need to know that Tampa Bay is in Florida.
Then you need to know that the capital of Florida is Tallahassee.
That’s the answer.
What are all the gears?
How many gears?
Is this like three gears or is it like 100 gears?
What is turning in the LLM?
At the end of the day, we have a series of, let’s say, neural networks where parameters are stored.
That’s what an LLM is, where data passes from one layer to another.
When I mean data, again, numbers are flowing with some calculations, multiplications on to the next layer and to the next layer.
On the final layer, we have some probabilities.
What I mean by that is, take “my name is ___”: if you just ask it to fill in the blank, we would get probabilities, with the most popular words at the highest probability and then less probable ones at the bottom.
Similarly, that’s what you get.
The question that you asked might be a little complex for LLM at times, because in the training data, if it has seen all those elements, and if it has knowledge of that, it will try to synthesize the answer.
But it’s a multi-layered question.
It’s more than like a fill in the blank.
So more like a RAG, retrieval-augmented generation, architecture.
We would have a first layer that takes in the question and simplifies it into sub-questions, then synthesizes answers for each sub-question, and then generates the final answer.
But if you just take a raw LLM, it might not be sophisticated enough because all it is trying to do is next token prediction.
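As a rough illustration of that final layer, here is a small sketch: the scores (logits) below are invented numbers, but they show how a softmax turns per-token scores into the ranked probabilities Ramsri describes for a blank like “My name is ___”.

```python
# The network's last layer produces a raw score (logit) for every token in
# the vocabulary; a softmax turns those scores into probabilities.
# These candidate words and scores are made up purely for illustration.
import numpy as np

candidates = ["John", "Sarah", "Bond", "a", "the"]
logits = np.array([3.1, 2.8, 2.5, 0.4, 0.1])    # hypothetical scores for "My name is ___"

probs = np.exp(logits) / np.exp(logits).sum()   # softmax: scores -> probabilities

for word, p in sorted(zip(candidates, probs), key=lambda x: -x[1]):
    print(f"{word:>6}: {p:.2f}")                # most probable words at the top
```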
So we should think of these as, like, floors, right?
So there’s a first question asked, you get the answer, then it drops down to the next floor or level, and then to the next level.
Is that a good visualization to think of it?
So, if you’re talking about just an LLM, a Large Language Model, probably not, because you can think of it more like: first, it is trying to analyze the structure at the word level, then at the word-combination level, then at a higher level, and it is trying to synthesize what’s the next token, the most probabilistic token to predict. That’s what it is doing.
So, simplistically speaking, not so much of splitting the question into sub levels, but more like just grammatically synthesizing more information.
And there is something called attention, which simply means that in the words that you have asked, whether it’s a question or a sentence, it is trying to calculate how each word is influenced by the other words.
For example, there is an adjective or noun and things like that.
So, it is trying to assess what combination or what averaging of other words does an individual word or token constitute.
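For readers who want to see the attention calculation itself, here is a bare-bones sketch of scaled dot-product attention from the “Attention Is All You Need” paper linked in the show notes; real models add learned projection matrices and many attention heads, but the weighted-averaging idea is this one.

```python
# Each token's output becomes a weighted average of every token's value
# vector; the weights say how strongly each word attends to the others.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                           # pairwise relatedness of tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V, weights                               # weighted average of values

rng = np.random.default_rng(0)
n_tokens, dim = 4, 8                           # e.g. a 4-word sentence, 8-dimensional vectors
Q = K = V = rng.normal(size=(n_tokens, dim))   # self-attention: same vectors for Q, K, V

output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.round(2))   # each row sums to 1: how much each token looks at the others
```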
Does this mean that all those grammar lessons that I took are really helpful for using an LLM, if it is trying to figure out what I mean first, before it can answer what I’m trying to find out, true or false?
It learns internally without any explicit rules.
If you just give billions of billions of text without any explicit knowledge of grammar or sentence structure, none of those, it will automatically learn.
So even if, let’s say, by random chance, you found a lot of data, like written text from 10,000 or 20,000 years ago, and you just feed it to the LLM, it will learn what the text means without even knowing what any of those individual components are.
Yeah.
Just from the volume.
When I think about that, when you say that, that’s a great example of how the LLM learns it, meaning it can predict the next thing, not that it has an innate understanding of it.
So that’s a mind-blowing thing to think about, is that that’s why it doesn’t need the context of parents showing you, when I say lamp, this is a lamp, right?
Because what it’s doing is processing text and just predicting, here’s what’s next, here’s what’s next, and there’s, that’s kind of one of the fundamental differences between a human and these LLMs, right?
Is that innate understanding?
Is that a fair assessment that comes from that?
Good. John, do you want to take this?
Yeah, no, I absolutely, I think it is.
However, the attention which Ramsri was talking about really was the game changer in creating LLMs, and that was really about how to associate things together.
So when we talk about, does it reason?
Not really, but it has this amazing number of levels of the ability to relate things to each other.
So if we think of a particular concept, it can have a phenomenal number, millions of relationships of things to that.
So that to us, it almost seems as though it is actually thinking.
Like it does actually have a reason.
Like we ask it a particular question, it uses all the relations that it has inside of its scope to be able to provide answers that seem like they came out due to reasoning and then produce text.
But in truth, it’s all just doing predictive determination of what are the best words to come.
And that, you know, part of that, as Ramsri talked about, is you can change that.
If you can tell it, hey, I want a different answer, and it will give you a different answer.
Because it’s all looking at, you know, every next word has like a percentage likelihood as to whether or not that ought to be the next word.
And so it can pick a different word with a different percentage and produce some completely other text.
But yeah, it’s more that it’s able to associate concepts, high-level concepts.
It’s able to create these high-level concepts inside it, associate them together, and then produce something that comes out of that that seems correct.
Those associations that you were just talking about, one of the words I wanted to go back to that Ramsri said originally was neural networks.
We hear about neural networks all the time.
Is that a core piece of it?
It sounds similar as you’re describing it to me, that that’s what these networks are that we’re talking about, is the relationship of these things to each other?
In short, the ability of neural network is to introduce nonlinearity.
That means, simply put, if everything was linear calculation, you could collapse it into one single equation.
Like no matter whether you have one plus A plus B plus C and all these, if you just know the values of A, B, and C, at the end you just have a single sum, right?
So neural networks have that element of nonlinearity, and because of that, we can combine several neural networks together, stack them up, and add things like more memory, cross knowledge between neural networks, let’s say cross attention, and other parameters, etc.
So in the end, it’s a stack of neural networks and neural networks combine in a specific pattern to retain more information, have more cross knowledge transfer across tokens or words.
But at the end of the day, it’s a stacked layer of neural networks heavily till the end.
Yeah.
And the more layers of neural networks that are in the LLM, the deeper or more cross associations that it can make.
So when they’re building LLMs, they’re really determining how many different layers of neural networks do we have, and what are the types of inputs, and how does it get output?
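A tiny numerical sketch of the nonlinearity point Ramsri made: stacking purely linear layers collapses into a single matrix, so depth adds nothing until you put a nonlinearity (here a ReLU) between the layers.

```python
# Two stacked *linear* layers are equivalent to one matrix, so depth buys
# nothing; adding a nonlinearity between them is what makes stacking useful.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 4))            # first layer's weights
W2 = rng.normal(size=(4, 4))            # second layer's weights
x = rng.normal(size=4)                  # an input vector

linear_stack = W2 @ (W1 @ x)            # layer 2 applied after layer 1
collapsed = (W2 @ W1) @ x               # the same thing as one combined matrix
print(np.allclose(linear_stack, collapsed))     # True: the depth collapsed away

relu = lambda v: np.maximum(v, 0)       # a simple nonlinearity
nonlinear_stack = W2 @ relu(W1 @ x)     # now the stack cannot be collapsed
print(np.allclose(nonlinear_stack, collapsed))  # False (almost surely)
```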
So when we hear about the cost of training, then obviously a lot of it is the data, the amount of data that it digests.
Is that the other piece?
Is the number of neural networks that are stacked on top of each other?
Absolutely.
It’s the number of neural networks, but also the context window, how large of an input they can have.
But and then also the number of how big the neural networks are, the number of…
Oh, so not just the number of them, but how big they themselves are.
Fascinating.
Correct.
Correct.
You know, what’s fascinating about this is the most elegant technology of our times is sort of brute forcing itself through, right?
So, it’s reading all these millions, billions of documents, putting it in a certain way, finding patterns, but essentially it’s kind of brute forcing itself into finding patterns, right?
Is that an accurate and a simple way to say it?
I think the beauty is that one can come up with all kinds of, let’s say, learning paradigms of teaching networks, etc.
But only when you don’t need to prepare the training data can you hit scale, which is what deep learning large language models have done, which is simply that you don’t need to massage the data at all.
You just give the raw text as it is, chop it anywhere and feed it in, because you know the next word or next token, and you just train with billions of data points.
Basically, it’s called unsupervised learning, where you don’t need to label the data; you just give it raw data, unsupervised, and it learns.
And that’s the magic of LLMs, because only once that was unlocked, you don’t need people, but rather raw unstructured data just thrown at it, billions and billions of texts, and automatically it learns.
Do recognize, though, something we ought to get into is junk in versus junk out, that the training data is very important for an LLM.
But going back to your original question of what’s the capital of the state in which Tom Brady won the Super Bowl, if an LLM has never been trained on that, then there’s no possibility, or a very small possibility, that it can provide the right outcome.
But nevertheless, it’s going to try.
So if you ask it the question, it’s going to predict words that sound like they could possibly be the answer, but obviously not necessarily get it right.
It’s one of the both great things and terrible things about LLMs.
They always try to be helpful.
They don’t realize sometimes when they’re not being helpful.
They have no recognition of the fact that they’re not being helpful.
I know some people like that, John.
That’s true.
It’s worth, I don’t know if we want to get into it, but it’s worth talking about as well.
Where we’ve seen, and a lot of people have seen, significant increase is not just the training of the LLM, but being able to provide it context for questions.
So, for instance, here at Knownwell, when someone asks a question to one of our chat bots, we actually don’t rely purely upon how the LLM has been trained.
We actually go out and retrieve data that is related to their question, and then provide that to the chat bot so that the chat mechanism, which is using an LLM, will be able to say, given this background information, can you provide a reasonable response to this question?
So that way, it’s not purely focused on what all has been fed to the LLM.
What has been fed to the LLM provides the capability to make inferences, to reason.
But what we actually provide within the prompt is more context around the question, so that the LLM will be able to read all of these things and say, okay, well, based on this information, I can answer the question as like this.
That’s really one of the pieces that we do, and obviously, all the other large LLMs now do as well.
John, it’s interesting that you brought that up.
I was just reading actually a state of generative AI report by Menlo Ventures.
One of the things that came out in the report this year versus last year was that retrieval augmented generation has really taken off versus fine tuning models.
Originally, people were fine-tuning models, and now what they’re seeing is only 9% of production deployments have fine-tuned models, whereas I think something like 20 to 30% are using retrieval-augmented generation.
As a data scientist and engineer, when you’re building these things, how do you find that fine line?
Where do you need to train a model versus where can you just provide it context?
One of the things is that the models, when they were launched, had a very limited token limit, in the sense that your question could expand to a maximum of 4,000 tokens or something like that.
And the response would also be limited in that range.
But once people have cracked how LLMs work, they started expanding the context length.
And that’s where the magic happened, where now you can ask a question as well as provide some context from which it can answer the question.
Now, as LLMs grew bigger and bigger, the ability to give larger context also increased.
So you could give 10 pages, 100 pages of content and then just ask a question.
So the need to fine tune became lesser and lesser in the sense that, for example, if we had only 4,000 token models, what we had to do was take an open-source model, train on all the data that Knownwell has, and then ask a question, get an answer.
Maybe you can have a little paragraph or something, but that’s it.
You could not have 10 or 15 pages of content.
Now, as models became larger, you can dynamically fetch the 10 or 15 pages of content from a database, let’s say vector database, dynamically fetch and place it there and answer the question.
That’s why the world is slowly moving from fine-tuning the model to retrieval-augmented generation, because now models like Gemini can take up to 1 million tokens of context as input, or even 2 million tokens as input.
So that simply means you can just bring in a 300-page or 400-page book or content of that volume and just ask a question which it will rephrase and answer from that context.
So, the thing is that fine-tuning is complicated in the sense that, one, you need to have a lot of data, and you need to train, which costs you money.
You need to rent GPUs and spend thousands or tens of thousands of dollars to fine-tune.
And the other thing is that you need to host the model yourself, do MLOps, that is maintain the model if it goes down and things like that.
But what happens with RAG architecture is, you don’t need training or maintenance of the model at all.
You just fetch data dynamically, plug it in and ask the question.
So, the world is slowly moving towards more and more in-context learning or RAG architecture, because now models have also evolved to take in large context of data.
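Here is a minimal sketch of that RAG pattern. Production systems use an embedding model and a vector database for retrieval; this toy version ranks documents by simple word overlap, and the model call at the end is left as a hypothetical placeholder rather than any particular API.

```python
# Toy retrieval-augmented generation: fetch the most relevant stored text for
# a question, then place it in the prompt as context for the model.
documents = [
    "Tom Brady won his final Super Bowl with the Tampa Bay Buccaneers.",
    "Tampa Bay is located in the state of Florida.",
    "The capital of Florida is Tallahassee.",
    "The Eiffel Tower is in Paris, France.",
]

def retrieve(question, docs, top_k=3):
    """Rank documents by shared words with the question (a stand-in for vector similarity)."""
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:top_k]

question = "What is the capital of the state where Tom Brady won his last Super Bowl?"
context = "\n".join(retrieve(question, documents))

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
# answer = call_llm(prompt)   # hypothetical placeholder for whatever model API you use
```

Because the relevant facts arrive in the prompt at query time, nothing has to be fine-tuned into the model itself.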
I will say there are also two parts to LLMs, two difficulties when dealing with LLMs, that RAG really helps fix.
One is the biases that can be put into the LLM during its training.
Biases are a pretty big issue.
And then the other is possible hallucinations.
Where I said before, the LLM tries to be helpful.
So, if you ask it a question it doesn’t know the answer to, it’s still going to provide an answer.
It’s just completely wrong.
The nice thing about RAG, retrieval-augmented generation, is that you can provide to the LLM the context with which a question can be answered.
So, what that does is it helps remove a lot of the biases that may have been generated during the training of it.
And it also helps limit really what the possible answers are that the LLM can give so there’s fewer hallucinations.
Doesn’t mean there can’t be hallucinations, doesn’t mean there can’t be bias.
But by providing a significant context before having the question, the LLM more often than not will actually produce an answer that is relevant to what the question is.
If you’re like many executives, you’re knee deep in planning for 2025.
But there’s one question you can’t ignore.
How will your organization ride the AI wave to shore rather than being swept up by it?
Here’s the good news.
Knownwell is here to empower you.
As an AI-powered platform for commercial intelligence, Knownwell is designed for forward-thinking innovators who want to turn AI from an abstract idea into a tangible advantage, starting today.
If you’re interested, go to knownwell.com to learn more and sign up for our beta wait list.
So, we’ve talked about how LLMs actually work, and I certainly will be deploying some of these new terms as frequently as possible.
So, everybody at Knownwell, get ready.
But now, I want to flip the coin and talk about why it matters, especially for executives in professional service firms.
So, help us understand why do we need to know this?
Sure.
I mean, I’ll go back to understanding how the LLM works is important because you need to understand the pitfalls and what can go wrong there.
And the two biggest things are the biases that can be put into it, along with the hallucinations.
And being able to understand how an LLM works means, you know, you’re able to better utilize it to produce the outcomes that you’re actually looking for.
You know, the LLM will always try to be helpful, but that doesn’t mean it’s going to produce helpful things.
So understanding how it works, being able to provide the context it needs, and being able to utilize it correctly will help you produce outcomes that are actually useful, helpful, understandable.
The other piece of that, that I couldn’t help but think about, is when we tripped over the fact that it’s just predicting.
And if you force-feed it an old, ancient language, it’s going to learn it, but it’s not going to understand it, right?
We start to get a clear picture of what is human versus what is artificial intelligence, right?
And I really think in the adoption of these technologies and the usages, it’s not just how we optimize the use of the tool itself.
It’s also understanding what humans are great at and what is uniquely human and what we can do to support humans in their work.
And I think both of those levels can be interesting.
And I feel like I have a new grounding in that because of the education you just gave us.
Does that resonate with all of you as well?
Yeah, it totally does.
Also, in one of our previous episodes, we talked about explainable AI versus understandable AI.
Right?
So we talked about those concepts.
As we start applying these to industries in which there are compliance regulations and policies, we absolutely need to know how LLMs work.
So we can’t blindly apply the results, because there are rules frameworks in which the business needs to be done.
So it’s really important to understand how they work and what sort of human involvement is needed in the loop as we take in more and more of the products that are built on LLMs.
Yeah, I would add too, it also reminds me of when we had the discussion about native AI platforms versus adding AI on to an existing platform, and how those results would be different from each other, or how the expectation of what kind of results those produce is dramatically different.
I think this kind of starts to like round out for me, which I think hopefully for everybody listening, and I know for everybody listening, they’re like, I’m starting to really get this, this makes sense to me.
You may still run into some coworkers who are frustrated when the native AI platform they’re using is not right.
And now, you can really help explain to them, send them to this episode to get from John and Ramsri, why this is important, how LLMs work.
Why can’t I say letters today?
Like I just said LLM, but it sounded like, like almost like llama without the A’s, you know?
Imagine that.
Yeah, how LLMs work.
And I think this would be really, this has been really helpful.
Hey Courtney, the other thing I’ll add in here real quick too is just in terms of our own prompt engineering when you’re using this, like I think I have a new understanding as well.
You know, Ramsri and John went into the example in the question that Mohan pointed out and there’s multiple layered questions there.
Well, understanding the need to pre-process and break that down into three different questions that it can then answer and infer from, if you’re just going straight to the LLM.
I think that can be really helpful in terms of just creating better, crisper prompts that give better answers, right?
Or if you are more technical in terms of leveraging these LLMs, how do you actually build an architecture that routes these requests properly and breaks them down and then decomposes them to be able to process them and be more helpful in giving an answer?
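To make the decomposition idea concrete, here is a small sketch. The ask_llm function below just returns canned answers so the flow is runnable; in practice it would be a call to whatever model or routing layer you use.

```python
# Break a layered question into sub-questions and feed each answer into the next.
CANNED_ANSWERS = {
    "With which team did Tom Brady win his last Super Bowl?": "the Tampa Bay Buccaneers",
    "In which US state is the Tampa Bay Buccaneers team based?": "Florida",
    "What is the capital of Florida?": "Tallahassee",
}

def ask_llm(question: str) -> str:
    """Stand-in for a real model call; returns canned answers for this example."""
    return CANNED_ANSWERS[question]

steps = [
    lambda prev: "With which team did Tom Brady win his last Super Bowl?",
    lambda prev: f"In which US state is {prev} team based?",
    lambda prev: f"What is the capital of {prev}?",
]

answer = ""
for build_question in steps:
    answer = ask_llm(build_question(answer))    # each step builds on the previous answer

print(answer)                                   # "Tallahassee"
```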
I think, Ramsri and John, do different LLMs work differently or at a high level, or are they all pretty similar?
Great question.
So, I mean, I would say at a 100,000-foot level, they all look very similar.
That is, they all have these layered neural networks.
They all use attention to be able to relate things together.
At a much deeper level, much closer level, they will look significantly different in just how they are optimized, the ways in which the LLMs have been put together, but most importantly as well, how they’ve been trained, the data that’s been put into them in order to produce a particular output.
And so, I’d say, again, 100,000-foot level, they all look the same.
10-foot level, they look very different.
That’s cool.
Ramsri, does it mean different LLMs have different capabilities?
Somewhat, yes, in the sense of the languages that they’re trained on and even the paradigms that they’re trained on. Like, for example, some models are good at code generation, some at just text generation, in a specific language or multiple languages, and also capabilities, because just building LLMs was the thing like a year and a half or two years ago.
But later on, how do LLMs empower more?
Can they generate complete artifacts?
By that, I mean, can you generate like a separate code which generates HTML and you can view that?
So, can it synthesize multiple artifacts, or, if programmers are using it, can it generate?
Can it know how to call a function and also the parameters to fill it, etc.?
So, kind of like once you have the base structure, people started understanding what real-world usage of LLMs means.
And each one started augmenting in one direction or the other.
So, the capabilities, it’s a bit like an amoeba, where it’s changing dynamically based on the popular usage.
If programmers are using it, they’re appending more programmatic output features to the LLM, things like that, and all these paradigms keep evolving.
Yeah.
John, Ramsri, I would like to thank you for helping me know more about LLMs than say my microwave downstairs.
So I appreciate this lesson, and I hope for everybody listening today that, again, you’ve got some really great terms to make you sound really smart in your next meeting, and hopefully this helps you break this down for the people that you’re working alongside, so that this knowledge, your understanding of it, gets disseminated within your organizations as well.
David, Mohan, John, Ramsri, thank you as always.
Thanks, Courtney.
That was fun.
Our pleasure.
Thanks as always for listening or watching.
Don’t forget to give us a five-star rating on your podcast player of choice.
And we’d really appreciate it if you can leave us a review and or share this episode on social media.
At the end of every episode, we like to ask one of our AI friends to weigh in on the topic at hand.
So, hey Claude, what’s happening?
This episode, we’re talking about how LLMs work and why it matters.
So, what do you think?
LLMs are basically super smart language prediction machines that learn by devouring tons of text and figuring out how words typically go together.
They’re a big deal because they can now help humans do all sorts of crazy cool stuff from writing stories to solving tricky problems.
Now, you’re in the know.
Thanks as always for watching or listening.
We’ll see you next week with more AI applications, discussions and experts.