
SAN DIEGO, CA – MARCH 31: Qualcomm’s new CEO Cristiano Amon poses for photographs in the foyer at Qualcomm headquarters on Wednesday, March 31, 2021 in San Diego, CA. (Photo by Eduardo Contreras / The San Diego Union-Tribune via Getty Images)
Many have heard a lot about agentic AI and how it will impact our lives, our businesses, and even our relationships with computing devices. This article lays out the basic landscape for agentic AI infrastructure, spanning personal devices, edge computing, and hyperscale cloud infrastructure, and assesses how one player, Qualcomm, is hoping that agentic AI is just the opportunity it has been waiting for. (Disclosure: Qualcomm, Nvidia and many other AI semiconductor companies are clients of Cambrian-AI Research.)
What Is Agentic AI?
Agentic AI describes systems that do more than generate outputs on request; they exhibit “agency” by setting or decomposing goals, choosing strategies, and taking action (through APIs, apps, tools, or other agents) to move toward achieving those goals. A full solution may orchestrate multiple AI agents, each with its own degree of autonomy.
While traditional and generative AI are reactive, agentic AI can do work, not just predict an outcome or answer a question. For example, one can ask an AI agent to “research a vendor and draft an RFP.” The agent can break goals down into steps, sequence them, and execute those steps via tools, APIs, workflows, bespoke code, or other agents.
The agentic AI control plane runs on CPUs and requires only limited human guidance. The ultimate endgame lets the user grant the agent approval authority to take actions. Agentic AI uses generative models as a “brain” inside the broader control loop that can call tools, query data sources, update state, and iterate until a task is complete.
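The control loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's implementation: `stub_planner` is a hard-coded stand-in for the generative model, and the tool names `research_vendor` and `draft_rfp` are hypothetical, chosen only to mirror the RFP example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools; a real agent would call APIs, apps, or other agents here.
def research_vendor(arg: str) -> str:
    return f"notes on {arg}"

def draft_rfp(arg: str) -> str:
    return f"RFP draft based on {arg}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "research_vendor": research_vendor,
    "draft_rfp": draft_rfp,
}

def stub_planner(goal: str, state: List[str]) -> Tuple[str, str]:
    """Stand-in for the model 'brain': picks the next (tool, argument) or signals completion."""
    if not state:
        return ("research_vendor", goal)
    if len(state) == 1:
        return ("draft_rfp", state[-1])
    return ("done", "")

def run_agent(goal: str, max_steps: int = 10) -> List[str]:
    """Plan, act, observe, and update state until the planner says the task is complete."""
    state: List[str] = []
    for _ in range(max_steps):          # hard cap so the loop always terminates
        tool, arg = stub_planner(goal, state)
        if tool == "done":
            break
        observation = TOOLS[tool](arg)  # act: invoke the chosen tool
        state.append(observation)       # observe: record the result for the next plan
    return state

if __name__ == "__main__":
    for step in run_agent("Acme Corp"):
        print(step)
```

Note that everything outside the planner call (tool dispatch, state updates, the termination check) is ordinary control-flow code, which is the part of the workload that lands on the CPU.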
Agentic AI Impact on Computing Infrastructure
Because agents run multiple steps over a longer time period, calling on multiple AI models in a loop for parts of a solution, agentic AI will consume far more tokens and demand lower latencies than generative AI does. Consequently, the infrastructure needed to run agentic AI must become more efficient to balance the cost/value equation.
Agentic AI systems demand lower latencies because they operate in feedback loops, orchestrate many tool and model calls per task, and often act in real time; even modest per-step delays compound into unacceptable end-to-end lag and unstable behavior.
Traditional GenAI apps are typically one request → one response, so a few seconds is tolerable. Agentic systems plan, act, observe, and re-plan over multiple iterations, so a 1–2 second delay per step can easily turn into many tens of seconds or longer overall. In addition, a single user intent can trigger dozens of retrieval-augmented generation (RAG) calls, API/tool invocations, and inter-agent messages; each additional hop adds network and compute latency that accumulates linearly, or worse.
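The arithmetic behind that compounding is simple enough to sketch. The function below assumes the purely linear case, with every step and every hop executed sequentially; the step counts and per-hop latencies in the usage lines are illustrative assumptions, not measurements.

```python
def end_to_end_latency(
    steps: int,
    per_step_seconds: float,
    hops_per_step: int = 0,
    per_hop_seconds: float = 0.0,
) -> float:
    """Total latency when each step and each tool/retrieval hop runs one after another."""
    return steps * (per_step_seconds + hops_per_step * per_hop_seconds)

if __name__ == "__main__":
    # One request, one response: a couple of seconds is tolerable.
    print(end_to_end_latency(steps=1, per_step_seconds=1.5))
    # A 20-step agent at the same per-step delay.
    print(end_to_end_latency(steps=20, per_step_seconds=1.5))
    # The same agent with three extra hops per step at 0.25 s each (assumed values).
    print(end_to_end_latency(steps=20, per_step_seconds=1.5,
                             hops_per_step=3, per_hop_seconds=0.25))
```

Retries, queueing, and re-planning would push real totals above this linear estimate, which is the "or worse" case.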
In customer support, voice bots, and co-pilot interfaces, users expect near-instant turn-taking; agents that take 15–30 seconds per action are perceived as broken, regardless of accuracy. In domains like trading, logistics control, or autonomous systems, decisions must land within tight time budgets; higher latency directly translates into missed opportunities or unsafe behavior.
The industry is shifting from optimizing individual AI models to orchestrating complex, distributed AI systems, and this shift is redefining compute architectures across edge, cloud-edge, and data center.
Agentic AI Puts the CPU Back in the Spotlight
The infrastructure for AI was initially optimized for high-throughput GPU training and now for inference. The CPU has acted as the control plane, sending heavy-duty processing to the GPU or other ASIC. Inference processing was once thought of as a simple one-shot walk through the neural network. Agentic AI is completely breaking this model, and a new architecture is emerging.
Agentic AI is workflow-driven, placing significant demands on CPUs to plan, schedule, and iterate over an optimization loop to find the best answer to a problem. As such, the CPU moves from a supporting role to an orchestration and decision-making role. The orchestration occurs across multiple tool and AI model instantiations, so the accelerator workload also increases.
Agentic workloads introduce much heavier control-plane logic on the CPU side: planning, multi-step tool invocation, retrieval orchestration, memory/context management, API calls, evaluations, multi-agent coordination, and task termination. In “AI agent era” data centers, CPU core demand per GW of accelerator capacity could rise some four-fold, driving the move to near-parity ratios for large-scale agentic services.
For “classic” LLM assistants, many operators sized roughly 1 CPU socket per 4–8 accelerators. For the emerging agentic AI workloads, guidance and early deployments are moving more toward one or two CPUs per accelerator at the system level.
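To make those ratios concrete, here is a rough sizing sketch. The 1,024-accelerator cluster is an assumed figure for illustration only, and the calculation counts CPU sockets rather than cores, so it is a related but not identical measure to the per-GW core demand cited above.

```python
import math

def cpu_sockets_needed(accelerators: int, cpus_per_accelerator: float) -> int:
    """CPU sockets required for a given accelerator count and CPU-to-accelerator ratio."""
    return math.ceil(accelerators * cpus_per_accelerator)

if __name__ == "__main__":
    cluster = 1024  # assumed cluster size, for illustration

    # "Classic" LLM serving: roughly 1 socket per 4-8 accelerators.
    classic_low = cpu_sockets_needed(cluster, 1 / 8)
    classic_high = cpu_sockets_needed(cluster, 1 / 4)

    # Agentic serving: roughly 1-2 CPUs per accelerator.
    agentic_low = cpu_sockets_needed(cluster, 1)
    agentic_high = cpu_sockets_needed(cluster, 2)

    print(f"classic: {classic_low}-{classic_high} sockets")
    print(f"agentic: {agentic_low}-{agentic_high} sockets")
```

Under these assumptions, moving from one socket per four accelerators to one per accelerator is a fourfold increase in sockets for the same cluster.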
This shift to CPU reliance, coupled with an increasing focus on energy efficiency, is why Qualcomm sees a tremendous opportunity in agentic AI; but it must move fast.
Hybrid AI Infrastructure: The Scalable Model for Agentic AI Workloads
If one thinks about the continuum of compute resources available to the agentic AI user, it should become obvious that the infrastructure can improve overall efficiency if each layer contributes to the orchestrated workflow, with each layer performing its logical part. Properly implemented, a hybrid infrastructure should be able to lower costs and energy consumption per token, and thereby lower the cost of agentic actions, while providing a higher level of responsiveness, reliability, and scalability. This needs to be done with a sharp focus on power consumption to be both affordable and acceptable to society, especially in the data center. Let’s look at the roles and limitations of the three layers of infrastructure: endpoints, edge servers, and the hyperscale data center.
The mobile endpoints, or devices, provide intent classification, the front end of the workflow. What are the user’s objectives and priorities? A mobile phone can provide personal context and awareness that is key for agents to interpret a request and deliver a relevant result. Here, performance per watt is king; people won’t lug around extra batteries to run agentic AI. It needs to be built into the devices we use every day.
At the edge, perhaps an edge cloud, workstation, or automobile, the ready availability of power allows for more computation, more sensors, more memory, and more storage. This allows the edge to perform intermediate reasoning and aggregation. Qualcomm has already attained leadership status in the intelligent automotive market.
Of course, in the data center we expect nearly limitless computation for large-scale model execution. But massive data centers are becoming a political flashpoint, turning segments of the population against AI. So, it’s reasonable to conclude that designs more power-efficient than those currently available will see ready demand.
As I look across this agentic landscape, Qualcomm is clearly strong in power-efficient CPUs and AI, but has been missing out where all the action is: the data center.
How Might Qualcomm Fare in the Agentic AI Age?
Given the angst about data center power consumption and costs, there is considerable interest in Qualcomm’s expected disclosures about its data center strategy at its upcoming Investor Day, June 24. Clearly, Qualcomm’s strength in mobile and edge devices like automobiles provides a launch pad for the company’s push to become a full-scale provider of agentic AI infrastructure. Most investors already know that Qualcomm Snapdragon has excellent AI at the edge, but without a strong play in the data center, it will be impossible for the company to leverage agentic AI to the extent Nvidia can.
How will Qualcomm position its Cloud AI200, recently rebranded as Dragonfly? Will it have a strong enough power efficiency story for inference processing to make up for the fact that Qualcomm is late to the data center party?
Here’s What We Know So Far About Dragonfly
Qualcomm’s upcoming data center products (Qualcomm AI200 and Qualcomm AI250), under the new brand, Qualcomm Dragonfly, are being positioned as efficiency-first AI inference platforms, not training machines. Qualcomm says it uses an innovative near-memory computing architecture that delivers more than 10x higher effective memory bandwidth with much lower power consumption, and the company ties that to high performance per dollar per watt for data center AI inference. Anyone who has been watching AI of late knows that the battle for AI compute has shifted to a battle for new memory architectures to increase performance while reducing the energy spent on data movement.
The Dragonfly brand was launched at Computex 2026.
Qualcomm
Qualcomm’s launch material says AI250 is built for rack-scale AI inference, with a “generational leap” in memory efficiency and lower power draw. It also says AI250 will use direct liquid cooling, which suggests Qualcomm is targeting sustained efficiency at rack scale rather than peak burst performance. Qualcomm is clearly aiming at lower power consumption, better utilization, and lower total cost of ownership. Qualcomm had previously announced its intention to adopt NVLink in its data center roadmap; we don’t know if this first iteration will include the networking technology.
- Target Markets: Dragonfly encompasses three main product groups: central processing units (CPUs), custom ASICs (application-specific integrated circuits), and dedicated AI inference accelerators.
- Custom Silicon: The brand relies on interconnect intellectual property and high-speed data transfer technology (such as PCIe, CXL, and Ethernet) obtained through Qualcomm’s acquisition of Alphawave Semi.
- Hyperscaler Partnerships: Qualcomm is collaborating closely with cloud providers and enterprise customers, including rumored early production designs with companies like ByteDance (unverified by Qualcomm), with high-volume production expected to generate billions in revenue.
- Form Factors: Dragonfly delivers processing hardware across multiple configurations, ranging from standalone accelerator cards to dense servers and industrial-scale server racks.
We Will Learn More at the Qualcomm Investor Day
The next age of AI is already upon us. Agentic AI will transform jobs across every industry, allowing human workers to focus more time and attention where their creativity, values, and world knowledge are most needed. But to make agentic AI an affordable reality, it must be implemented using power-efficient yet high-performance technologies across a hybrid infrastructure. Qualcomm already has a strong CPU story on the device and at the edge. Now we’ll see if the company can complete the pivot to become a broad-scale AI infrastructure provider.
Disclosures: This article expresses the opinions of the author and is not to be taken as advice to purchase from or invest in the companies mentioned. My firm, Cambrian-AI Research, is fortunate to have many semiconductor firms as our clients, including Baya Systems, BrainChip, Cadence, Cerebras Systems, D-Matrix, Flex, Groq, IBM, Infleqtion, Intel, Micron, NVIDIA, Qualcomm, SiMa.ai, Synopsys, Taalas, Tenstorrent, Ventana Microsystems, and scores of investors. I have no investment positions in any of the companies mentioned in this article. For more information, please visit our website at https://cambrian-AI.com.

