As AI agents move toward true autonomy, law firms and other industries must adopt the principles of transparency, autonomy, reliability, and visibility to ensure safe and trustworthy deployment.
Key takeaways:
- There is no true agentic AI… yet — We don’t have true agents yet, but the release of GPT-5 and the speed of improvements signal that agents will quickly become ever more capable.
- There are 4 core principles of deployment — Deploying true AI agents in law and other high-stakes fields demands adherence to four core principles: transparency, autonomy, reliability, and visibility.
- Deliberate design and balance needed — The future of AI agents depends on deliberate design choices that balance machine autonomy with human oversight, ensuring trustworthy and effective collaboration.
Welcome back for the third edition of my column. Last month we took a 30,000-foot view of AI evolution and its five stages of development. This month, I’d like to take a closer look at AI agents and some principles we should be applying to their use. Let’s start by talking about what true AI agents are and what they mean for the practice of law.
Imagine this — a major law firm discovers that its AI agent has been conducting legal research for three months with a critical flaw: It was systematically ignoring case law from certain jurisdictions because of a visibility parameter no one knew existed. The AI agent had drafted hundreds of briefs, all technically accurate within its limited scope, yet all potentially catastrophic if filed. The firm caught it by accident, when a junior associate noticed a glaring omission the AI had consistently made.
This near-miss isn’t an isolated incident. Across industries, we’re beginning to deploy AI agents to autonomously act in high-stakes environments, such as reviewing contracts, making medical recommendations, managing financial portfolios, even driving cars. We celebrate their efficiency and scale while harboring a gnawing uncertainty: Do we really understand what these systems are doing? Can we trust them when we can’t fully see how they see the world?
What is an AI agent?
Before diving into principles, let’s clarify what we mean by AI agent. The term gets thrown around loosely, often confused with agentic workflows, but there’s a crucial distinction.
An agentic workflow is a semi-automated process in which AI assists with specific tasks but requires human oversight (a human in the loop) at key decision points. Think of it as a chain of AI-powered assistants that hand off work, like a baton, to each other with your approval. The system might draft emails, analyze data, or suggest actions, but a human must review and approve each step.
A true AI agent, by contrast, operates with genuine autonomy. It perceives its environment, makes decisions, and takes actions independently to achieve specified goals. The key difference? An AI agent doesn’t just assist; it acts. It can plan and execute multiple steps, adapt to unexpected situations, and complete complex tasks without constant human intervention.
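For the technically inclined, the distinction can be sketched in a few lines of Python. This is only an illustration under assumed interfaces, not any real framework’s API: the draft, approve, plan, execute, and done callables are placeholders. The workflow stops for human approval at every hand-off, while the agent loops on its own until it judges the goal met.

```python
from typing import Callable, List

def agentic_workflow(steps: List[str],
                     draft: Callable[[str], str],
                     execute: Callable[[str], str],
                     approve: Callable[[str], bool]) -> List[str]:
    """Semi-automated: a human approves each AI-drafted step before it runs."""
    results: List[str] = []
    for step in steps:
        proposal = draft(step)            # the AI proposes work for this step
        if not approve(proposal):         # human in the loop at every hand-off
            break
        results.append(execute(proposal))
    return results

def true_agent(goal: str,
               plan: Callable[[str, List[str]], str],
               execute: Callable[[str], str],
               done: Callable[[str, List[str]], bool],
               max_steps: int = 50) -> List[str]:
    """Autonomous: the agent plans, acts, and adapts with no approval gate."""
    history: List[str] = []
    for _ in range(max_steps):
        if done(goal, history):           # the agent judges its own progress
            break
        action = plan(goal, history)      # decides its next step itself
        history.append(execute(action))   # and acts on that decision
    return history
```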
We don’t have true agents yet. Yes, I’ve experimented with ChatGPT Operator, Agent, and Manus, but they are not fully autonomous, and it would be reckless to assign them any serious work. However, the release of GPT-5 and the pace of improvement signal that agents will become far more capable very quickly.
The 4 core principles
There are four core principles — transparency, autonomy, reliability, and visibility — that must be adhered to when deploying true AI agents in law and other high-stakes fields. Let’s look at each principle in turn.
Transparency
Transparency means being able to observe what an AI agent does at every step. This isn’t just about logging actions; rather, it’s about understanding the agent’s decision-making process in real time.
Consider an AI agent assisting with legal research and case preparation. True transparency would mean the user could see which case law databases it consulted and understand why it chose certain precedents over others. In addition, the user would be able to track how the agent weighted different factors, such as jurisdiction, recency, and similarity. And the user also could observe the agent’s reasoning for distinguishing or applying specific cases.
Without transparency, we’re operating on faith — we might see outcomes but miss critical context about how those outcomes were achieved, which becomes especially problematic when agents make mistakes. Without transparency, we can’t diagnose what went wrong or prevent future errors.
For implementation, developers need to build comprehensive logging systems that capture and display not just actions but the agent’s reasoning as well. They should create dashboards that visualize decision trees in real time, and design interrupt mechanisms that allow human inspection at any point.
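Here is a rough sketch of what such logging might look like, assuming a simple in-process design; the DecisionRecord fields and TransparencyLog class are hypothetical, not drawn from any product. Each step records the action and the reasoning behind it, and a human can interrupt the agent for inspection at any point.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class DecisionRecord:
    """One logged step: what the agent did and, crucially, why."""
    action: str                    # e.g. "queried the firm's case law database"
    reasoning: str                 # the agent's stated rationale for the step
    sources_consulted: List[str]   # databases or documents it actually looked at
    factors_weighted: dict         # e.g. {"jurisdiction": 0.5, "recency": 0.3}
    timestamp: float = field(default_factory=time.time)

class TransparencyLog:
    """Append-only record a dashboard or reviewer can inspect at any point."""
    def __init__(self) -> None:
        self.records: List[DecisionRecord] = []
        self.paused = False        # the interrupt mechanism: humans can halt here

    def record(self, rec: DecisionRecord) -> None:
        if self.paused:
            raise RuntimeError("Agent interrupted pending human inspection")
        self.records.append(rec)

    def interrupt(self) -> None:
        self.paused = True         # stop the agent until a reviewer resumes it

    def export(self) -> str:
        return json.dumps([asdict(r) for r in self.records], indent=2)
```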
Autonomy
Autonomy, the agent’s ability to act independently, is both the greatest promise and challenge of AI agents. True autonomy means the agent can initiate actions without explicit commands, adapt strategies based on changing conditions, make judgment calls in ambiguous situations, and recover from errors without human intervention.
The key is matching the AI’s autonomy levels to the risk profile of the work being undertaken. High-stakes decisions will likely require human-in-the-loop constraints, while less risky or routine operations can run fully autonomously. This calibration is an ongoing process, not a one-time setting. The rules of legal ethics will also help set the limits of an agent’s autonomy.
To design autonomy into the system, developers should establish clear boundaries and escalation protocols. They should define which decisions require human approval and which can proceed independently, while also building in periodic autonomy reviews to adjust boundaries based on performance.
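As a sketch of what such boundaries could look like in practice, a simple policy table can route each proposed decision either to autonomous execution or to a human review queue. The risk tiers and policy values below are invented for illustration, not a prescribed standard.

```python
from enum import Enum

class Risk(Enum):
    LOW = 1      # routine and reversible: scheduling, formatting, internal notes
    MEDIUM = 2   # consequential but recoverable: client status summaries
    HIGH = 3     # high stakes: filings, legal advice, anything client-facing

# Hypothetical escalation policy: which risk tiers may proceed autonomously.
AUTONOMY_POLICY = {
    Risk.LOW: "autonomous",
    Risk.MEDIUM: "autonomous_with_logging",
    Risk.HIGH: "human_approval_required",
}

def escalate_if_needed(decision: str, risk: Risk) -> str:
    """Route a proposed decision according to the autonomy policy above."""
    rule = AUTONOMY_POLICY[risk]
    if rule == "human_approval_required":
        return f"HOLD: '{decision}' queued for human review"
    return f"PROCEED ({rule}): {decision}"

# Drafting a court filing is high stakes, so it escalates to a human.
print(escalate_if_needed("file draft brief in state court", Risk.HIGH))
```

Periodic autonomy reviews would then adjust the policy table itself, moving decision types between tiers as the agent earns, or loses, trust.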
Reliability
Reliability in AI agents goes beyond simple accuracy. It encompasses the answers to questions such as: Is the information the agent acts upon accurate and current? Do the agent’s actions consistently comport with ethical requirements? Does the agent perform consistently across different contexts? And when things do go wrong, does the agent fail gracefully?
A dangerous misconception is equating autonomy with reliability. Just because an agent operates independently doesn’t mean its outputs are trustworthy. In fact, autonomous operation can mask reliability issues until it’s too late and they have cascaded into significant failures.
To ensure reliability, developers need to implement robust testing frameworks that go beyond best-case scenarios. They should create adversarial testing environments, monitor for drift in performance over time, and establish clear reliability metrics tied to real-world outcomes.
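One piece of that monitoring could be as simple as comparing a reliability metric against its baseline over time. A minimal sketch follows, assuming citation accuracy as the metric and an arbitrary tolerance; both are illustrative choices, not established standards.

```python
from statistics import mean
from typing import Sequence

def detect_drift(baseline_scores: Sequence[float],
                 recent_scores: Sequence[float],
                 tolerance: float = 0.05) -> bool:
    """Flag drift when recent performance falls below the baseline by more
    than the allowed tolerance. The scores can be any real-world reliability
    metric, such as the share of citations a reviewer confirmed as accurate."""
    return (mean(baseline_scores) - mean(recent_scores)) > tolerance

# Citation accuracy slipping from roughly 0.97 to 0.89 should trip the alarm.
baseline = [0.97, 0.96, 0.98, 0.97]
recent = [0.91, 0.88, 0.89, 0.90]
if detect_drift(baseline, recent):
    print("Reliability drift detected: route agent output to human review")
```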
Visibility
Visibility, often overlooked, might be the most critical principle. It refers to the scope of information available to an agent when it makes decisions.
When humans research a problem, they can cast a wide net, which leads them to follow unexpected leads and discover information they didn’t know they needed. AI agents, on the other hand, operate within defined parameters — they can only see what they’re programmed to look for.
This creates a fundamental limitation: AI agents make choices about what information to seek and process, potentially missing crucial context. These filtering decisions happen opaquely, creating blind spots a user might not even know exist.
To implement visibility, developers should map the full information landscape available to the AI agent, documenting what data sources are included and, crucially, what’s excluded. They should also build mechanisms for agents to signal when they’re operating at the edges of their visibility boundaries.
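Here is a minimal sketch of what that mapping might look like, assuming a simple manifest approach; the class, fields, and source names are hypothetical. Documenting coverage explicitly turns an invisible blind spot, like the one in the opening anecdote, into a warning a human can act on.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VisibilityManifest:
    """Documents what the agent can and cannot see, so blind spots are explicit."""
    included_sources: List[str] = field(default_factory=list)
    excluded_sources: List[str] = field(default_factory=list)  # documented, not hidden

    def covers(self, jurisdiction: str) -> bool:
        return any(jurisdiction in source for source in self.included_sources)

    def coverage_warning(self, jurisdiction: str) -> str:
        """Signal when a request falls outside the agent's visibility boundary."""
        if not self.covers(jurisdiction):
            return (f"WARNING: no indexed source covers '{jurisdiction}'; "
                    f"results will silently omit that jurisdiction")
        return "OK"

# The flaw in the opening anecdote would surface as an explicit warning here.
manifest = VisibilityManifest(
    included_sources=["Federal appellate opinions", "New York state courts"],
    excluded_sources=["Delaware Chancery Court opinions"],
)
print(manifest.coverage_warning("Delaware"))
```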
Overlapping interactions
Critically, these four principles don’t exist in isolation; rather, they interact in complex ways:
- Transparency without visibility shows us what an agent did but not what it missed. We might see every step of the agent’s process while remaining blind to alternative paths not taken.
- Autonomy without reliability creates unpredictable systems that act independently but inconsistently. This combination is particularly dangerous in high-stakes environments.
- Reliability without transparency gives us consistent outcomes but no insight into the process behind them, which undermines their credibility. The agent might work perfectly until it doesn’t, with no prior warning signs.
- Visibility without autonomy creates systems that can see everything but act on nothing, becoming sophisticated analysis tools that still require human execution for every step.
The path forward with AI agents
Granted, true AI agents will live in a world we don’t inhabit yet, but they are coming along quickly. That means the future of AI agents isn’t about choosing between human control and machine autonomy. It’s about creating systems in which both can work together effectively, with clear principles guiding their interaction.
As we move forward, we must remember that every AI agent embodies a theory about how decisions should be made. The principles we embed in them will shape not just their behavior but our own expectations about reasoning, responsibility, and trust. In our rush to create agents that can act in the world, are we thinking deeply enough about the kind of world we want them to create?
In my next column, we’ll put GPT-5 under the microscope to see what makes it tick and what makes it useful.