Overview
- Microsoft and Google’s new agents can autonomously manage multi-step tasks such as drafting and sending emails, compiling research reports and placing online orders.
- These agents connect to existing tools through APIs and automation technologies to navigate applications, enter data, submit forms and execute workflows.
- Underlying multimodal models such as GPT-4 and Gemini 2.0 supply the reasoning and memory that let agents personalize their behavior to a user's habits and preferences.
- Payment networks including Visa and Mastercard are piloting agents that shop for supplies, place inventory orders and monitor transactions for anomalies.
- Stakeholders are urging the adoption of interoperability standards, transparent data policies and human-in-the-loop controls to address privacy, security and workforce displacement risks.
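
To make the tool-connection and human-in-the-loop points above concrete, here is a minimal Python sketch of one way an agent might run a multi-step plan while holding sensitive actions for human approval. It is an illustration under stated assumptions, not how Microsoft's, Google's or any vendor's agents are actually built: the names Step, run_plan, draft_email, place_order, TOOLS and PLAN are all hypothetical, and the "tools" are stand-in functions rather than real APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One action in an agent's plan (hypothetical structure)."""
    tool: str                # key into the tool registry
    args: dict               # keyword arguments passed to the tool
    sensitive: bool = False  # True means a human must approve before it runs


def run_plan(steps: List[Step],
             tools: Dict[str, Callable[..., str]],
             approve: Callable[[Step], bool]) -> List[str]:
    """Run each step in order; sensitive steps run only if approve() returns True."""
    log = []
    for step in steps:
        if step.sensitive and not approve(step):
            log.append(f"skipped {step.tool}")
            continue
        result = tools[step.tool](**step.args)
        log.append(f"{step.tool}: {result}")
    return log


# Stand-in tools; a real agent would call external APIs here (assumption).
def draft_email(to: str, subject: str) -> str:
    return f"draft to {to} about {subject}"


def place_order(item: str, qty: int) -> str:
    return f"ordered {qty} x {item}"


TOOLS = {"draft_email": draft_email, "place_order": place_order}

PLAN = [
    Step("draft_email", {"to": "ops@example.com", "subject": "restock"}),
    Step("place_order", {"item": "paper", "qty": 10}, sensitive=True),
]

if __name__ == "__main__":
    # Deny every sensitive step: the email is drafted, the order is held back.
    for line in run_plan(PLAN, TOOLS, approve=lambda step: False):
        print(line)
```

Passing the approval check in as a function keeps the policy separate from the execution loop, so the same plan can run fully supervised, partly supervised or unattended without changing the tools themselves.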