Description
In Part 1 of the Agentic Maturity Model: From Readiness to Delivery series, we explore how to assess your position on the Agentic Maturity scale and why strong DevOps foundations are critical to progressing agentic use cases safely into production.
Transcript
It's a pleasure to be with you all here this afternoon, this evening, this morning, wherever you are in the world. My name is Jack McCurdy. I'm a DevOps advocate at Gearset. Today, it's my distinct privilege to be joined by my colleague, Dave Rant, who is Agentforce engineering lead here at Gearset, and Yeuk Yuan, Senior Director of Enterprise IT Strategy at Salesforce. And to kick us off, I am gonna throw over to Yeuk. Yeuk, let's talk a little bit about agent maturity.
Great. So agentic maturity has been a conversation we've been having with our customers for a while. Let's go to the next slide.
So I've been with Salesforce for over twenty years, and I've had the privilege of talking to a lot of CIOs, a lot of enterprise architects, and so on. And we've seen changes come and go. We've seen social. We've seen mobile.
We've seen a few iterations of, you know, data, and AI has been really interesting. In some ways, it follows the patterns. Like, we have this wonderful technology. What are we gonna do with it?
I think what's really a lot more acute than previous ones is that the chasm is much broader. And what I mean by that is on the left-hand side, if you look at innovate, protect, optimize, that's what our CIOs are faced with. And I think because of AI, maybe because of the hype, the expectations are much higher than they have been with previous changes. And what's happening on the right-hand side, what we see on the technology and products and solutions and features, is this endless stream of buzzwords, these technology nuggets.
And what's happening is companies are doing DIY. They're doing POCs on top of these nuggets. And sometimes you're gonna get into a situation where some customers are still waiting to see what other folks are doing. Some customers have done some DIYs, but now they're sitting there.
It's like, alright, I proved what one of the vendors told me, but how does that affect my business? And if you look at the MIT quote at the bottom, I think that's a nice summary. That came from a piece from their NANDA project last year.
And, really, you can see the tagline there: ninety-five percent of organizations are getting zero return. So they're putting in all this money, they're hiring people, and they're getting the foundations ready. And arguably, you know, that's the right thing to do. But how do I get to business value?
Right? And from the pattern that they've seen, the ones who have been successful have looked at process-specific outcomes, and they've done it. And those little bold words at the bottom: existing processes improve over time. So that makes me think of a phased approach. And if anyone has been around architecture for a while, that's really the approach that we're taking here. So let's go to the next slide.
And this is why we built out the agentic maturity model. And what we've done is we've taken a framing around: where are we today with chatbots? What does that mean to the business? What types of actions and automations, what levels of automation, and what domain and span do those agents have?
And we made this very simple model where you can start from what may be a rigid, simplistic, not very generative AI approach, all the way to level four, which is gonna be multi-agent, full generative AI, compliance, security, etcetera. Okay? But, really, the left to right here is mainly about the business. This is less about implementing technology one or technology two; it's that I am making business decisions to increase the level of scope and autonomy that my agents have.
And why is that meaningful to the business? Because this is when you can start measuring the value, and you can also start thinking beyond the technology. What other business processes need to change? How do I engage this new form of digital labor?
What are the process and business accountability rules that I need to change in my business? When you have a fully autonomous agent running a transaction for you, who gets the commission? How does that work? Who's accountable when the solution is not right?
So there's a lot more than just technology, and this is a framework that we're using to bring it more to the business and not just the people in the lab coats doing the POCs. Alright. So let's step into some examples.
Okay. So level zero is, I think, honestly, stuff that we've been using for years. Just for giggles here, we're using the airline example, and what you see here on the screen, folks, is actually a pile of screenshots that I took late one night while I was putting a slide together.
I just went to the website, and, literally, this is what I got. You got multiple choice. If you don't pick one of the options, you don't get an answer. You try to type in some words.
It doesn't work, and it ends up being a very frustrating experience.
That's kinda before the Gen AI world. Now I think a lot of our customers, a lot of our businesses that we're in still may have elements of that. Let's go to level one.
Level one through level four is really when we start bringing in, quote, agents, especially from Salesforce. You know, we're talking about Agentforce. Our position here is that whether you're at level one or level four, put in the technology, put in the platform, so that you can scale your business left and right as you need. And it doesn't have to be one of these enterprise-wide solutions.
It could be departmental. It could be use case. Alright. So level one is information retrieval.
This is where the agent is providing simple, maybe complex information, but the agent is giving the data to the user. And that user can then do the business transaction. Might be booking something. It might be, oh, okay.
I just needed to know that. Alright? So that's the level of autonomy that we're talking about. Let's go to level two.
Level two is when we're starting to let our agents actually make transactions for us. So in this case, the person said, you know what? Let's make that flight. The agent logic behind the scenes is going to actually create that transaction.
Right? There's no human intervening that says, okay, great, Dave wants that ticket, let me alt-tab through all of my windows to book that ticket.
The agent has all the instrumentation to do that. Now behind the scenes, there is Gen AI logic to do that. So this is where, you know, whether or not that agent is tuned to do the transaction, you know, that is part of going from level one to level two. Alright.
Let's go to level three.
Level three now goes into potentially complex orchestrations and/or multiple domains. So if you think about a lot of our organizations, we have departmental silos. Right? And whereas before maybe I had one simple transaction in one domain, now it's multi-domain.
Now I'm gonna span multiple departments. I'm gonna go to another area of interest. In this case, not only did we book the ticket, we also did an upgrade. We did collections.
Right? So now one agent is spanning those departmental walls.
Did it just randomly grab some APIs? Did it just talk to other agents? Or did it also have to figure out the interactions between the business processes and those other departments' rules? So that's level three. Let's go to level four.
Level four, and for those who are really keen, level three has a little bit of level four in it. Arguably, yes. So what we're thinking about here, and the way to think about it, is multi-agents, because I think we're all kind of at the point of expecting everything to be multi-agent. Right?
I think the one big agent that does everything, we're kinda past that. It's gonna be orchestration. It's gonna be controls. It's gonna be handoffs.
Multi-agent, I think another way to think about it for your models, for your businesses, is when your agents within your corporate walls need to start working with agents outside your corporate walls.
Right? So in this particular case, not only have we taken care of the airline ticket, we're gonna coordinate that rideshare.
That's a different business. That's a different organization with potentially a completely different agent technology. Now I think everybody's, you know, watching MCP and A2A, and let's see where that goes. But that's the level of maturity and complexity that we've laid out as a way for you to start modeling where you want to go with all this technology.
And I think with that, we're ready to go see an actual example.
Thanks.
Yep.
Thanks very much.
At Gearset, we think the agentic maturity model is exactly the right framing you need for your agentic journey.
And the reason is because the question isn't just where are we today; it's how do we actually move up that curve.
And so let me show you what this looks like a little bit in practice.
Secret Escapes is a luxury travel company with sixty million members, and Performa, who are a Gearset customer, implemented their Agentforce journey.
They originally started at level zero in twenty twenty two with some Einstein bots, and they were resolving about ten percent of customer queries automatically. At level one, they had the first ever live Agentforce agent in EMEA, which was resolving about thirty percent of customer requests. And at level two, Agentforce now handles tasks that would normally fall to their specialists, achieving a forty-five percent deflection rate and a hundred and fifty k in annual savings.
Their road map takes them to level three with Agentforce becoming a versatile travel assistant across the entire customer journey, and the foundation they're building paves the way for level four and becoming an agentic enterprise.
What took six months with Einstein bots took only two weeks with Agentforce, and we're gonna hear more about this journey in part two of this webinar series with Ben Coleman, who's Performa's CEO.
Amazing. Thanks, Dave. We're gonna go back to Polly for a second here. So the ice cream thing wasn't just for a little bit of fun and giggles, albeit it was fun and giggles.
If you open Polly now, you should be able to vote on this poll. Based on everything that we've heard from Yeuk and now from Dave, it'd be really interesting to understand, for the people on this webinar, what agentic maturity level are you currently at?
Alright. We got a couple of level two, level one.
Some still at the start of their journey.
You do have a little bit more time to vote if you're still looking to do that.
You know, that's interesting. When we did this at Dreamforce, and when I've done some of these in the office with groups, it was really just zeros and ones. I wonder if, you know, at the end of January, people are actually moving forward now.
Yeah.
Well, what would also be interesting is to understand what agentic maturity level do y'all need to be at. So that's where you are right now. What about where you need to be?
Yeah. I'd say the framing around this question is think about business advantage, but maybe add an element of practicality. You know, this is not one of those exercises where the answer is the upper right.
Right? Because depending on the group or the use case, it might be, hey, you know, level three is completely sufficient.
Yeah. Yeah. Every business is unique, and therefore, your agent journey will be unique too.
I do love that some have threes, and we're not just a whole pile of fours.
Alright. Well, Dave, can you tell us a little bit more about how we might help folks get to that level four?
Yeah. Of course. Thanks, Jack. So it's great seeing where everyone's at and where you'd all like to be, and the gap between those two is really what we're here to talk about. And here's the thing: agentic maturity isn't just about the agents themselves, as we've heard from Yeuk. It starts with the foundations underneath them.
At Gearset, we've got more than three hundred and fifty customers deploying Agentforce and Data Cloud today, and sixty percent of those are already pushing agents into production, which is a high number. What we're seeing again and again with those customers is that teams with strong DevOps foundations are adopting Agentforce faster and with more confidence. They have the visibility, the automation, and the guardrails they need. And when DevOps is done right, their innovation becomes sustainable and not risky.
And that matters more now than ever. AI really does feel different. This isn't just another feature. It's a leap.
These systems are dynamic. They're data hungry, and they change fast.
If we want to keep up, we need to build and ship our changes with confidence.
So if you're looking at that gap between where you are and where you need to be, DevOps will give you the technical underpinnings to close it.
Now we've seen plenty of Salesforce releases over the years. So what is it about Agentforce DevOps that's more challenging?
Well, to start with, it's not predictable. We're used to traditional development where you write the code, you run the code, and you get the results you expect. You run it a thousand times, and you'll get the same result. Well, agentic systems, they don't behave that way. The same input can give you different outputs each time, and that's powerful, but it's also kind of unsettling. How do you test and trust and verify that at scale?
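To make that testing question concrete, here's a minimal Python sketch of one common approach: run the agent several times and assert invariant properties of its output rather than exact strings. The `call_agent` function is a hypothetical stand-in for whatever interface your agent exposes; it isn't a Gearset or Salesforce API.

```python
# Minimal sketch: testing a non-deterministic agent by asserting properties
# of its output rather than exact strings. `call_agent` is a hypothetical
# placeholder, not a Gearset or Salesforce API.
import re

def call_agent(prompt: str) -> str:
    # Placeholder: in a real test this would invoke your agent endpoint.
    return "Your flight BA123 to Berlin is confirmed for 2025-03-14."

def test_booking_response_is_well_formed():
    for _ in range(10):  # run several times; outputs may differ each run
        reply = call_agent("Book me the 9am flight to Berlin")
        # Invariants we expect regardless of exact wording:
        assert re.search(r"\b[A-Z]{2}\d{2,4}\b", reply), "should mention a flight number"
        assert "confirmed" in reply.lower() or "booked" in reply.lower()
        assert "credit card" not in reply.lower(), "must not leak payment details"

if __name__ == "__main__":
    test_booking_response_is_well_formed()
    print("All invariant checks passed.")
```

The point is that with non-deterministic systems, the contract you test is the shape and safety of the response, not its exact wording.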
And these systems run on data and prompts that keep shifting, and even the models themselves get updated. Agent behaviour can change in ways you just don't see coming, and that's why observability becomes essential. It's how you debug. It's how you stay compliant, and it's how you keep trust with your users.
Then there's data. It's always been important, but with Agentforce, it's dialed up to eleven. Agents don't just rely on good data. They're powered by it. Inconsistent data will break your agents, whether that's in development or in production.
And testing has to change. It can't be bolted on at the end. It has to shift left.
And what feels like a realistic dataset today won't be anywhere near enough for agents. They need big, representative data for us to build for reliability in production.
This isn't just about technology. It's about people too.
Agentforce changes what it means to be an admin, what your team is responsible for, and how you work together.
Now our industry only goes through these big shifts once a decade. Google changed search, Apple changed phones, and AI is changing everything.
And that's why Agentforce is more than just another Salesforce release. It's a step change. And the bigger the leap, the more urgent it becomes to get the foundations right.
And that brings us nicely back to the agentic maturity model, which Jack has already touched on. And it's actually a really helpful way of thinking about how teams will adopt Agentforce over time.
And as we've seen, most teams are still at level one or level two, and that's okay. It's what we'd expect with something this new.
But there's another reason that teams aren't climbing higher yet. You want agents to take the right actions, but you don't quite trust them to do it on their own. The data is not always clean, deployment pipelines aren't in place, and there's a lack of confidence in testing.
DevOps is what clears those challenges. It gives you the repeatable process, the clean data, the observability, and the confidence to let the agent take action and climb to the next level.
To achieve that, we need to look across the whole software development life cycle, and we visualize this with the familiar infinity loop.
If we want to deliver Agentforce changes repeatedly and reliably, we need to work across the whole life cycle: understand the state of our orgs, build the right changes, catch issues early, release frequently, and respond quickly when things go wrong. And this is Gearset's wheelhouse. We've been doing this for years with all sorts of metadata. So I'd like to break it down and show you how Gearset can help at each stage of the life cycle.
The DevOps process begins when you're planning your work. And before you build an agent, you need to understand what already exists in your org. What data have you got, what's missing, and what metadata does your agent need to touch?
Gearset's org intelligence can scan everything and give you the blueprint of your complete org architecture, and you can simply ask Gearset's agent about the business processes within your org.
Without this, you'd spend hours digging around with different tools. Org intelligence removes that guesswork even in an org that you don't know or one that's been through years of changes.
Next slide, please, Jack.
And once we understand our org, we need to get the right data to build and test an agent.
Most teams are working in developer sandboxes, which come with metadata but no data. And agents are data hungry. If you're testing against a handful of records, you won't catch the edge cases that will break it in production.
Gearset sandbox seeding makes it easy to populate your lower environments with production like data while masking anything that's sensitive. You just pick the records that you want, and Gearset will handle the relationships and the dependencies. You'll get realistic data in your sandbox in minutes so that you can test properly and catch issues before they hit production.
And if you're using a full copy or a partial copy sandbox, you've already got production data, but it's in a lower environment. Real customer data, real PII, potentially in the hands of developers or contractors.
And when you're building agents, that's a risk you don't want to take because these systems can interact with data in unpredictable ways.
Gearset gives you two options. You can mask the data as you seed it so sensitive records never reach your sandbox, or you can use in place masking to obfuscate the data that's already there with realistic values that still work for testing.
It means you keep compliant, and it means you can build agents with confidence knowing that your customer's data is protected.
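As an illustration of the masking idea, here's a minimal Python sketch that replaces sensitive fields with deterministic fake values while leaving IDs and lookups intact. It's a sketch of the concept only, using assumed field names; it isn't how Gearset implements seeding or masking.

```python
# Minimal sketch of in-place masking: replace sensitive fields with realistic
# but fake values while leaving record IDs and relationships untouched, so
# lookups between objects still resolve. Illustration only, not Gearset's
# implementation.
import hashlib

def mask_email(email: str) -> str:
    # Deterministic masking: the same input always maps to the same fake
    # value, so duplicates and joins still line up after masking.
    digest = hashlib.sha256(email.encode()).hexdigest()[:10]
    return f"user_{digest}@example.invalid"

contacts = [
    {"Id": "003A01", "AccountId": "001A01", "Email": "jane@example.com"},
    {"Id": "003A02", "AccountId": "001A01", "Email": "omar@example.com"},
]

for record in contacts:
    record["Email"] = mask_email(record["Email"])  # PII replaced
    # Id and AccountId are untouched, so the Account relationship survives.

print(contacts)
```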
Now we come to validation, catching issues before they hit production.
With agents, this becomes even more important. These systems interact with your data in new ways, opening up new attack surfaces and security risks.
Gearset's code review scans every change, Apex, flows, and Lightning Web Components, all of it, against best practices like the Well-Architected framework and security standards like OWASP. It'll catch vulnerabilities, flag bottlenecks, and enforce consistent standards across your team, which means issues get fixed during development and not during a production incident.
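For a sense of the kind of rule a static scanner applies, here's a toy Python sketch that flags SOQL queries inside loops in Apex source, a classic governor-limit bottleneck. It's deliberately simplistic and isn't Gearset's review engine.

```python
# Toy illustration of a static-analysis rule: flag SOQL queries inside loops
# in Apex source. Simplified sketch, not Gearset's review engine.
import re

APEX_SOURCE = """
for (Account acc : accounts) {
    List<Contact> cs = [SELECT Id FROM Contact WHERE AccountId = :acc.Id];
}
"""

def find_soql_in_loops(source: str) -> list[int]:
    findings, depth = [], 0
    for lineno, line in enumerate(source.splitlines(), start=1):
        if re.search(r"\b(for|while)\s*\(", line):
            depth += 1  # entered a loop body
        if depth > 0 and re.search(r"\[\s*SELECT\b", line, re.IGNORECASE):
            findings.append(lineno)  # query issued inside a loop
        depth = max(depth - line.count("}"), 0)
    return findings

print("SOQL inside loop at lines:", find_soql_in_loops(APEX_SOURCE))
```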
And once you've built your changes, you'll need to know they actually work before they hit production.
Gearset's automated testing is an accessible no code way to start your test automation journey.
You don't need to be a developer to build and run these tests.
Traditional robotic testing tools follow rigid scripts that break the moment a button moves, but Gearset's automated testing uses AI to understand the semantics of what it's interacting with. So your tests will stay resilient as your UI evolves.
And we use that same intelligence to make test creation available to everyone, whether you're an admin, a QA, or a developer. That means less time testing and more time building.
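To illustrate what understanding the semantics can mean in practice, here's a minimal Python sketch that targets an element by its role and label instead of a brittle positional selector, so a moved button still resolves. The page data and matching logic are illustrative assumptions, not how Gearset's automated testing works internally.

```python
# Minimal sketch of semantic test targeting: find an element by its
# accessible label rather than a positional selector, so the test survives
# layout changes. Concept illustration only.
from difflib import SequenceMatcher

page_elements = [  # pretend these were scraped from the rendered page
    {"role": "button", "label": "Book this flight", "selector": "div[3]/span/button[2]"},
    {"role": "button", "label": "Cancel", "selector": "div[3]/span/button[1]"},
    {"role": "link", "label": "Manage booking", "selector": "nav/a[4]"},
]

def find_by_intent(elements, role, intent, threshold=0.6):
    """Return the element whose label best matches the tester's intent."""
    def score(el):
        return SequenceMatcher(None, el["label"].lower(), intent.lower()).ratio()
    candidates = [el for el in elements if el["role"] == role]
    best = max(candidates, key=score)
    return best if score(best) >= threshold else None

target = find_by_intent(page_elements, "button", "book flight")
print(target["selector"] if target else "no confident match")
```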
Then when we want to operate our agents, archiving becomes critical.
These systems are data hungry. And if they're wading through years of legacy records, you're gonna get slow responses, weird edge cases, and possibly inconsistent behavior. So with archiving, we can create policies that clear out old or unused records, keeping our orgs clean so our agents are only working with relevant, high-quality data. And if we ever need to bring that data back for compliance or auditing, we can restore it in just a few clicks.
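Here's a minimal Python sketch of what an archiving policy boils down to: select closed records older than a cutoff, copy them to an archive store, and remove them from the live dataset. The field names and the two-year cutoff are illustrative assumptions, not Gearset's policy model.

```python
# Minimal sketch of an archiving policy. Field names and cutoff are
# illustrative assumptions, not Gearset's actual policy model.
from datetime import datetime, timedelta

CUTOFF = datetime.now() - timedelta(days=365 * 2)

live_cases = [
    {"Id": "500A01", "Status": "Closed", "LastModifiedDate": datetime(2021, 5, 4)},
    {"Id": "500A02", "Status": "Open", "LastModifiedDate": datetime(2025, 1, 20)},
]
archive_store = []

def should_archive(record):
    # Archive closed records untouched for longer than the cutoff.
    return record["Status"] == "Closed" and record["LastModifiedDate"] < CUTOFF

archive_store.extend(r for r in live_cases if should_archive(r))
live_cases = [r for r in live_cases if not should_archive(r)]

print(f"Archived {len(archive_store)} record(s); {len(live_cases)} remain live.")
```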
And finally, in the observe phase, once your agent is live, you need visibility into how it's behaving.
Gearset's observability surfaces every flow and Apex failure across your orgs. It'll track your org limits, and it'll pull it all into one place.
Instead of trawling through logs or waiting for end users to tell you something's broken, you get to see the problems first and the patterns behind them.
With agents, where behavior shifts as data and prompts change, that kind of insight isn't just nice to have. It's essential.
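As a rough illustration of that observe step, here's a minimal Python sketch that aggregates failure events by org and source and flags anything above a baseline, so you spot the pattern before users report it. The event shape and the threshold are assumptions for the example, not Gearset's data model.

```python
# Minimal sketch of the observability idea: aggregate failure events and
# flag sources trending above a baseline. Event shape and threshold are
# illustrative assumptions only.
from collections import Counter

failure_events = [
    {"org": "prod", "source": "Flow:Refund_Request", "type": "FLOW_FAULT"},
    {"org": "prod", "source": "Flow:Refund_Request", "type": "FLOW_FAULT"},
    {"org": "prod", "source": "Apex:BookingService", "type": "UNHANDLED_EXCEPTION"},
    {"org": "uat", "source": "Flow:Refund_Request", "type": "FLOW_FAULT"},
]

BASELINE_PER_SOURCE = 1  # alert when a source fails more often than this

counts = Counter((event["org"], event["source"]) for event in failure_events)
for (org, source), count in counts.items():
    if count > BASELINE_PER_SOURCE:
        print(f"ALERT [{org}] {source}: {count} failures (baseline {BASELINE_PER_SOURCE})")
```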
And then we iterate, because agents learn and evolve, and so should your team and your processes. What's working? What isn't? What needs to change?
Some of this playbook will feel familiar, but some of it will be completely new.
And, honestly, a lot of what's possible with Agentforce is still unknown. We don't yet know everything we're gonna achieve over the next few years, but that's exactly why these foundations matter. When you've got the visibility, the automation, and the guardrails in place, you can experiment and learn and move forward with confidence.
Cool. Thank you, Dave. You know, it's funny. I was thinking about the maturity model more deeply when I was listening to your observability piece, actually. I definitely am a big fan of observability. I think that is probably the most unexpected thing around the corner with AI that we maybe haven't even gotten to the conversation about. Because when you're going left to right with maturity, you know, the release process may feel, like, comfortable at that point.
But because you're reaching domains and combinations of orchestrations and processes that you haven't thought about before, I think observability is gonna be the only place where you're gonna see, like, oh, I did not expect that. So, anyway. Alright, folks. So we have covered most of the first part. The first part is to outline the what.
And we have a part two. We'll tell you more about it. But part one is thinking about your use case. So you'll see some guidelines; take a quick look, because we're gonna give you some homework.
And what we want you to think about is how would you have this conversation with the business leaders in your organization? So think about your first use case. Think about commercial importance. Think about risk.
Think about whether this should be an internal audience or an external audience. Think about who I need to engage with to make this happen. And the way we want you to capture the result of that thinking is this. Next slide.
We're providing a template. Okay? And you can get the QR code, download this, fill in the boxes. We've kinda laid it out already for you. So first row, think about the use case. Just simple business description.
Second part, business needs. Do I need data? What type of participation? Will there be enablement?
Right? Will there be interorganizational communications? God, that sounded so bad. Well, how broadly do I need to communicate this across the organization?
And then DevOps challenges. Alright? And it's funny. This template, as simple as it is, we went through this with a pharmaceutical company a while ago.
And one of the VPs said, you know, this process made it so simple for me, because I'm not a tech person, and I understand how this is gonna impact my business now. So take a crack at this. I don't know what the profile of everyone here is, but use this, go talk to your business. And by all means, avoid the jargon and build this phased approach going left to right, incremental.
Try to keep within the swim lane of one job to be done, but think about how it changes more and more as you go from left to right. Okay? So that's our template, and we're going to pick that up next in part two. And, Jack here, before I cut this over to you, today was about the what, and the next session is gonna be the how.
We're gonna take that template. We're gonna have a customer. We're gonna go deeper into that. And what we'd love for you to do is be able to follow along with the how thinking with your what in front of you so you can get even more out of this webinar.
Alright. Jack, over to you.
Yeah. Absolutely. That's why homework's important. You know, the teachers don't give you homework for no reason. Your homework is gonna be super important on Wednesday, the eleventh of February. So as Yeuk and Dave have explained to us, we've got the what.
Now we need to make the most of this second session, and this outline here is gonna tell you all about it. And Ben Coleman from Performa, the customer Dave was talking about earlier, will be joining us as well. So I'm really excited for that, and be sure to join us as we learn more about how DevOps underpins successful agentic adoption and what we can all achieve together when we are doing the right things.
That being said, we have a short amount of time for Q&A. In Zoom, there is a Q&A functionality which you can use to drop in some questions for Dave and Yeuk. But one question that I have, based on everything, to get us started here as well. Yeuk, you mentioned observability as one of the key things from Dave that really kinda lit a fire for you and really resonated. Dave, I wanna throw it over to your side: what part of what Yeuk was telling us really resonated with you when you think about the DevOps journey that you go along with Gearset customers?
So I think it's really poignant to point out that observability is kind of one of the biggest curveballs that maybe we weren't anticipating becoming so dominant in the DevOps life cycle for agents. It kinda makes sense when you think about it. So we've got these systems.
They're not gonna produce identical responses or behave in identical ways all the time.
And the best thing we can do is just observe them. Observe, respond, and this is why DevOps is so important. So when we need to respond in a timely manner, we're in a place where we can make changes confidently and rapidly to get things back to a known good state in our production environments just to make sure our customers are having the best experiences that they can.
Yeah. Absolutely. Yeuk, where would you say the ambition is for a lot of folks? We've seen the responses to the short survey that we did during this presentation.
Is there a true hunger for folks to get to that level four? Is that where most people are sitting as they look at AI right now? And do you expect the shift to come back down to that two or three, like you mentioned when you were encouraging people to think, is two the right level or is three the right level? Are people still focused on four and expecting to be at four next year, or the year after?
It's kinda all over the map, to be honest. I mean, we have internal surveys where we poll some of our CIOs. I have the privilege of going to SICs and talking to customers. It feels like there's still a lot of people, oops, hang on, my camera, there's still a lot of people who are really thinking about those first steps. And the types of questions they're asking about those first steps are around which one, what's the business impact, how do we even measure it, kind of very fundamental things.
But I think the thing that's happening is, as all the technology vendors are in this space, you know, my boss tells a story: the billboards in Silicon Valley are so far ahead of the rest of the planet, it's unreal. But this is what you see in trade shows. You know, you go to Dreamforce, and we're all talking about, you know, MCP version thirty five.
Right? So what's happening with our audience is they're still struggling with getting to maybe level one, but they're a little bit hindered by, oh, but what happens when I do multi-agent? Right? So I think it's a combination.
I mean, obviously, there are some outliers that are already out there, but I'd say the vast majority are still kinda like, how do I get into that first step? You know, do I even trust it? Do I trust gen AI enough, specifically?
Because that's a pretty big step function there, because you're kinda moving away from programmatic. You're going to this space of potential hallucinations, but they do have this nagging multi-agent thing out there, when I think most of them aren't even at that point yet, but they're thinking ahead. And as far as Salesforce is concerned, you know, I think it's a good conversation, because what we want to provide the customer in those cases is: here are the near-term needs that you have, but the longer-term needs are also on our platform, and it'll be there when you've made the business decision that you need it.
Yeah. Absolutely. Like Dave was mentioning, that iterative development and that iterative thinking can be really critical to the success of any Salesforce implementation, never mind Agentforce: iterating on what you have, with that ambition to take you places.
Dave, is there anything interesting that you've observed working with the Agentforce platform and the DevOps capabilities? What would be your number one piece of advice for folks that are implementing Agentforce right now when it comes to tackling the DevOps challenges that they're having? You walked us through all the stages of that life cycle. What's the first thing that you would recommend people focus on improving or fixing to get things right?
Sure. So I think making sure that you've got a stable pathway to production is gonna be absolutely key to reliably delivering agents that deliver value to your business. It's really interesting to see how Agentforce, the platform, is rapidly evolving, one of the fastest evolving bits of Salesforce that I think we've ever seen, and how the whole ecosystem is moving so quickly. The agentic maturity model is kind of mapping out this pathway for customers to go on as they become an agentic enterprise, where they're gonna need to knock down silos within their business, data silos and conversational silos.
And actually, the real enterprise challenge ahead of us is that these silos are the things that are limiting the ability of agents to solve more complex, real-world problems. And as Jack's been alluding to, it's not just technology. Technology is an implementation detail. It's how you arrange yourselves as an organization, how you think about coordinating policy, and how you're liaising with your colleagues to kind of take these steps forward.
Amazing. A very poignant comment to leave on, as we have run five minutes over the scheduled webinar timing. So thank you, folks, for sticking around with us to hear part one of agentic readiness. And we will be back on the eleventh of February, again with Ben from Performa IT.
Yeuk will be here as well as Dave and myself, so we look forward to seeing you there. If you have questions for any of us, then please reach out to us. All of us have LinkedIn profiles. All of us have emails.
I think we will drop those into the chat here now as well. But that being said, we look forward to seeing you all on the eleventh of February. And keep going with your journeys, and have a great day, everybody.