As AI agents have become a hot topic over the last year, I’ve spoken at conferences and workshops worldwide about their use. I usually tell audiences it’s OK if they don’t know what agents are, because most people don’t. There’s often a visible sense of relief when they hear that. Then I explain that agents aren’t very complicated. They’re what we used to call computer programs back in the “before times.” But there is one significant difference: these new digital actors have a level of autonomy and non-determinism that previous software never possessed. That autonomy makes an agent more than a mere program. It is a dynamic orchestration system, adapting its prompts across language models, tools, memory, and even other agents. That’s why agents promise to change how businesses operate and transform our personal lives in ways we’re only beginning to comprehend.

You may already be an avid user of chatbots like ChatGPT, Google Gemini, Claude, Llama, and so on. You probably also rely, unknowingly, on agents embedded in everyday devices and apps, such as intelligent assistants (Siri and Alexa) and navigation systems (Waze and Google Maps). But those represent only the tip of the iceberg of their potential for automating and optimizing processes at scale.

My hope is to demystify agents and help you understand what they are, what they do, and where we may be heading with what’s becoming known as agentic transformation. I’ve prepared a four-part series as an agent primer. We’ll explore:
- What agents are and what autonomy means
- The levels of agent autonomy
- The tools agents use
- How to manage AI correctly and safely (AI governance)

Consider this series your primer on AI agents. Let’s begin with part one.
What AI Agents Really Are
For decades, software has operated along rules-based paths set by human-written code. The foundation has always been if-then-else statements: if a condition is met, the software does one thing; if not, it does something else. Programmers had to write endless lines of code to anticipate as many conditions as possible ahead of time.

While an AI agent is also software, it’s not passive. Agents can take action. They use digital tools to gain access to systems and data sources. They can understand context, learn, and remember by using their own “brains”: embedded AI large language models. They accomplish goals based on their instructions and have guardrails to ensure they act as expected. Rather than waiting for explicit instructions, they act with varying degrees of independence. Agents take an assigned objective and figure out how to accomplish it. They have the flexibility to find new digital pathways, sorting near-instantaneously through amounts of data far too large for humans to comprehend. That makes them effective collaborators for workers improving complex business processes in manufacturing, customer service, supply chain management, retail, healthcare, and much more. An example I like to cite is the order-to-cash process: agents can break what is usually a complicated process at most businesses into manageable chunks, resulting in less manual work, fewer errors, and increased speed.

Another way to think about agents: natural language is code. Congratulations! You are a coder. You prompt the agent to do what you want, and it sets out to accomplish the goal. It returns results, creates lists, and even writes code if that’s what you request. That simplicity explains why agents will soon be found everywhere in our businesses. Essentially, the applications, processes, and services of the future will be limited only by your imagination. But there’s a catch.
It’s important to understand that imagination, while great for providing the motivation for agents, introduces new problems that we will discuss in a future post.
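To make the contrast concrete, here is a minimal sketch in Python. The rules-based function hard-codes every path in advance, while the toy `Agent` class works through a goal step by step using tools and memory. All of these names (`Agent`, the sample tools, the canned plan) are hypothetical illustrations, not the API of any real agent framework; in a production agent, a language model would generate the plan from the goal rather than receive it ready-made.

```python
def rules_based(order_total):
    # Classic software: every branch anticipated by a programmer.
    if order_total > 1000:
        return "route to manager approval"
    else:
        return "auto-approve"


class Agent:
    """A toy goal-driven loop: pick a tool, act, record, repeat."""

    def __init__(self, tools):
        self.tools = tools    # name -> callable
        self.memory = []      # running record of (step, observation)

    def run(self, goal, plan):
        # Hypothetical simplification: in a real agent an LLM would
        # produce `plan` from `goal`; here it is supplied directly
        # to keep the sketch self-contained.
        for step in plan:
            result = self.tools[step](goal)
            self.memory.append((step, result))
        return self.memory


# A toy slice of an order-to-cash flow, with stubbed-out tools.
tools = {
    "fetch_order": lambda g: f"order data for {g}",
    "check_credit": lambda g: "credit OK",
    "issue_invoice": lambda g: "invoice sent",
}
agent = Agent(tools)
trace = agent.run("order #42", ["fetch_order", "check_credit", "issue_invoice"])
```

The point of the sketch is the shape of the loop, not the stubs: the agent carries a goal and a memory through a sequence of tool calls, instead of threading every possibility through nested conditionals.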
Five Controls That Shape an Agent’s Behavior
When we talk about agency, think of it as the behavior of agents. Agency isn’t a single characteristic. Like a mixing board in a recording studio, there are plenty of dials, faders, and switches that subtly modulate an agent’s behavior. Agents can be fine-tuned in scope, personality, need for human oversight, and so on. That’s how we can design nuanced agents with the attributes we want and trust, to create beautiful process music. I see those “agent soundboard” controls breaking down into these categories:
- Autonomy: The master volume for how much an agent can act on its own judgment. The higher the autonomy, the more the agent decides what to do next, sets its own priorities, and corrects course on its own when it notices a mistake mid-task.
- Planning and reasoning: Within the authority it’s given, an agent anticipates what comes next and assembles the steps toward the goal on its own. You don’t have to spell out every instruction; the agent works out a logical sequence, such as “do A, then B, then finish with C,” and delivers the result.
- Reactivity and proactivity: Reactivity means responding quickly to changes in the environment or to new information. Proactivity means taking the initiative to act toward a goal without being directly commanded.
- Learning and adaptation: The ability to learn from relevant information and adjust its approach to the situation. Even if circumstances change midstream, the agent can course-correct to the best available method without drifting from the goal.
- Goal fulfillment: The ability to complete concrete objectives, such as scheduling a meeting or analyzing data. Sometimes the agent acts on an instruction given in chat; other times it completes tasks automatically behind the scenes.
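The five controls above can be pictured as settings on a profile object. This is a hypothetical sketch of the “soundboard” idea; the field names below are my own assumptions, and real agent frameworks expose different (and richer) configuration.

```python
from dataclasses import dataclass


@dataclass
class AgentProfile:
    """Hypothetical knobs mirroring the five categories above."""
    autonomy: float       # 0.0 = fully supervised, 1.0 = self-directed
    planning_depth: int   # how many steps ahead the agent plans
    reactive: bool        # responds to environment changes
    proactive: bool       # initiates actions without a direct command
    learns: bool          # adapts its approach from feedback
    goal: str             # the concrete objective to fulfill


# Example: a cautious scheduling agent with moderate autonomy.
scheduler = AgentProfile(
    autonomy=0.4,
    planning_depth=3,
    reactive=True,
    proactive=True,
    learns=False,
    goal="book a meeting room for Tuesday",
)
```

Framing the dials as explicit settings makes the mixing-board metaphor literal: two agents with the same tools but different profiles will behave very differently.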
The Big Changes AI Agents Will Bring
My company has a CEO-driven mandate to use AI wherever it makes sense. In our early experience, we’ve seen the most success using teams of agents to streamline complex processes, with each agent focused on a specific task within the larger workflow. Our goal at this point is to amplify human potential by making people more productive while freeing them from tedious, manual work. Now, the open question is whether agents will do that for most people or lead to widespread job displacement. Will agents be complementary or a substitute? While many have opinions on what agents ultimately mean for the workforce, the answers to that big question remain unclear, for both the short term and the long term. But one thing is obvious to me: people who incorporate agents into their daily roles are more likely to thrive in an ever-changing workplace. That’s why it’s so important to understand the role of agents. In my next post, I’ll examine the varying degrees of autonomy that agents can possess.
Ready to start using AI agents at your own company? Take a look at Boomi Agentstudio.