Would you choose HAL 9000 or R2-D2?

Two different types of AI Agents

If you could pick a personal assistant, which one would you choose:

  • HAL 9000 from 2001: A Space Odyssey – An advanced AI capable of deep thinking, reasoning, and operating a computer system autonomously (ok ok, let’s forget the villainous part for a minute)

  • or R2-D2 from Star Wars – A loyal automation bot handling orders and executing key operations on demand

gm innovator

This week's newsletter explores the two different types of AI agents, comparing our HAL 9000 Computer Use Agents, known for handling unstructured tasks, with R2-D2-like Automation Agents, which excel in rule-based workflows.

HAL 9000 or Computer Use Agents

Computer Use

Imagine watching a ghost operate your computer, every move carefully orchestrated. That is the magic of Computer Use Agents like OpenAI's ChatGPT Operator and Anthropic's Claude Computer Use.

Think of these AI Agents as digital students learning to use computers just like you once did. When you give them a task, they carefully follow your instructions, finding their way around websites and programs just like someone learning a new skill.

How Computers Are Learning to Use Computers

Here's how it works: When you tell this AI Agent what you want it to do, it breaks the task down into smaller steps, takes pictures of the computer screen to understand what it's looking at, then uses a virtual mouse and keyboard to click and type. It is essentially mimicking the way you would do it.
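For the tinkerers, the look-decide-act loop described above can be sketched in a few lines of Python. This is only a toy under loud assumptions: `FakeScreen`, `plan_next_action`, and `run_agent` are hypothetical names made up for illustration, the "screenshot" is a plain dictionary instead of pixels, and the planner is a hard-coded stub where a real agent would ask a multimodal model. It is not how Operator or Claude are actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    target: str = ""   # hypothetical form-field name the action applies to
    text: str = ""     # text to type, if any


@dataclass
class FakeScreen:
    """Stand-in for a real desktop: form fields plus a log of inputs."""
    fields: dict = field(default_factory=dict)
    focused: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would capture pixels; here we snapshot the state.
        return {"focused": self.focused, "fields": dict(self.fields)}

    def click(self, target: str) -> None:
        self.focused = target
        self.log.append(f"click:{target}")

    def type_text(self, text: str) -> None:
        self.fields[self.focused] = text
        self.log.append(f"type:{text}")


def plan_next_action(goal: dict, observation: dict) -> Action:
    """Stub planner: pick the next small step toward filling in the form."""
    for name, wanted in goal.items():
        if observation["fields"].get(name) != wanted:
            if observation["focused"] != name:
                return Action("click", target=name)
            return Action("type", target=name, text=wanted)
    return Action("done")


def run_agent(goal: dict, screen: FakeScreen, max_steps: int = 20) -> bool:
    for _ in range(max_steps):
        observation = screen.screenshot()              # 1. look at the screen
        action = plan_next_action(goal, observation)   # 2. decide the next step
        if action.kind == "done":
            return True
        if action.kind == "click":                     # 3. act with mouse/keyboard
            screen.click(action.target)
        elif action.kind == "type":
            screen.type_text(action.text)
    return False  # gave up after max_steps


screen = FakeScreen()
finished = run_agent({"name": "Ada", "time": "10:00"}, screen)
print(finished, screen.log)
```

The step limit is a deliberate choice in this sketch: an agent that keeps looking and clicking without making progress should stop and hand the task back to you.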

ChatGPT Operator

These computer-using AIs can already do some pretty cool things. Right now they are careful assistants who are still learning, but they can already complete tasks on their own, like:

  • Navigating Windows, Mac, and Linux systems, and of course operating browsers to navigate and search the web.

  • Ordering groceries by adding items to an Instacart or Amazon Fresh cart.

  • Booking an appointment (e.g., a doctor’s visit or a table at a restaurant) by navigating a website, selecting a time, and entering your details.

  • Searching Google for the best places for a Sunday excursion, checking the weather forecast, and finding reviews on TripAdvisor.

  • Looking up flight prices and checking different airlines for the best deal.

  • Finding crypto prices or stock market updates and summarizing them for you.

They're not as quick or reliable as humans yet, but how well do they actually work? Of course there is a special test for that, the OSWorld benchmark, which measures how good agents are at using computers. Humans score about 72% on it, while OpenAI’s Operator scored about 38% and Claude about 22%. So while these agents are clever, there is still plenty of room for improvement.
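To put those numbers in perspective, here is a tiny back-of-the-envelope calculation. It only uses the rounded scores quoted above (not the exact benchmark figures) and expresses each agent's score as a share of the human score; the variable names are made up for this sketch.

```python
# Rounded OSWorld scores as quoted in the text (percent of tasks solved).
scores = {"Human": 72, "OpenAI Operator": 38, "Claude": 22}

human = scores["Human"]

# Each score as a percentage of the human baseline, rounded to whole numbers.
relative = {name: round(100 * score / human) for name, score in scores.items()}

for name, share in relative.items():
    print(f"{name}: {share}% of human performance")
```

In other words, going by these rounded figures, Operator solves roughly half as many tasks as a human and Claude a bit less than a third.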

Can you use them already? Yes, but it's a bit tricky.

  • ChatGPT Operator: You can try this if you have the Pro subscription (the one that costs $200 per month)

  • Claude's Computer Use: This one needs some technical knowledge, as it’s only available via API and you then need to set up a local environment, such as a Docker container or virtual machine, to execute Claude's commands.

R2-D2 or AI Automation Agents

R2-D2 automating apps

Automation Agents are the reliable workers of the AI world, often built on tools like Zapier, Make.com, or n8n. They focus on precision and repetition, seamlessly integrating AI into rule-based workflows to handle tedious tasks. Whether it's sorting business leads or managing your inbox, these agents use AI to boost efficiency and save time.

However, they have a limitation: they work best with predictable, structured tasks and struggle with anything outside their predefined rules. But that's also their strength: while broader AI tools might be less precise, Automation Agents excel at delivering highly targeted results. You can easily experiment with them using Zapier’s simple interface or its Chrome extension, which lets you automate actions directly from any webpage.
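If you want to see what "predefined rules" means in practice, here is a minimal sketch of a rule-based lead router in Python. Everything in it is an assumption for illustration: the `RULES` list, the field names `employees` and `country`, and the destination names are invented, and this is not how Zapier, Make.com, or n8n are built internally. It simply shows the first-matching-rule idea and what happens to input the rules never anticipated.

```python
from typing import Callable

# A rule is a condition on the incoming lead plus the destination it routes to.
Rule = tuple[Callable[[dict], bool], str]

# Hypothetical routing rules; the first matching rule wins.
RULES: list[Rule] = [
    (lambda lead: lead.get("employees", 0) >= 500, "enterprise-sales"),
    (lambda lead: lead.get("country") == "DE", "dach-team"),
]


def route_lead(lead: dict) -> str:
    """Return the destination for a lead, or a fallback if no rule applies."""
    for condition, destination in RULES:
        if condition(lead):
            return destination
    # Anything outside the predefined rules cannot be handled automatically.
    return "manual-review"


print(route_lead({"employees": 1200, "country": "US"}))
print(route_lead({"employees": 10, "country": "DE"}))
print(route_lead({"note": "met at a conference"}))
```

The fallback branch is the limitation and the strength in one place: the workflow never improvises, so an unexpected lead lands with a human instead of being guessed at.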

Does this really count as an “AI agent”?

No. And yes.

What? Let me explain:

While people might argue that this is only automation and not really autonomous decision-making (and they are right about that), I take a slightly more nuanced view of agents.

If I gave the task to a human colleague, they would need the skills to do it, which differ from task to task and person to person, and they would need access to the relevant systems.

Same with agents.

In the end, it comes down to one question for me:

“Can I hand over this task to an assistant (aka agent) who will carry it out or solve it independently - without any further action on my part?”



So, finally, let’s stack these agents up against each other:

Criteria: Computer Use Agents (e.g. OpenAI Operator) vs. Automation Agents (e.g. Zapier Agents)

How They Work

  • Computer Use Agents: Use multimodal LLMs (e.g., GPT-4o) to interpret screenshots, plan actions dynamically, and simulate mouse/keyboard interactions via browser-based GUIs.

  • Automation Agents: Rely on event-driven, API-based automation to connect apps and orchestrate workflows rather than on generative reasoning.

Current Use Cases

  • Computer Use Agents:
    - Web-based research
    - Booking tickets/hotels
    - Form filling
    - Multi-step online tasks (e.g., invoice processing)

  • Automation Agents:
    - Business process automation (e.g., syncing data between apps)
    - Notification systems and task scheduling
    - CRM and e-commerce workflow integration
    - Routine data transfers and repetitive tasks

Limitations

  • Computer Use Agents:
    - Prone to hallucinations/errors
    - Slow processing due to visual analysis
    - Security risks (UI-based attacks)
    - Integration with non-text interfaces or legacy systems can be challenging

  • Automation Agents:
    - Rigid workflows (not really dynamic decision-making)
    - Limited to API-supported apps
    - Less adaptable to novel tasks outside of defined workflows

Outlook

  • Computer Use Agents: Progress toward more autonomous, self-correcting agents

  • Automation Agents: Greater convergence with conversational interfaces for hybrid task automation

Both paradigms promise a transformational future. Embrace this vision now by experimenting with task scheduling in ChatGPT or by weaving complex business processes together with low-code automation.

And then? What’s Next?

We are looking forward (and might be a bit afraid at the same time) to something like Jarvis.

Jarvis (Marvel’s Iron Man) – Acts as an intelligent, conversational assistant for Tony Stark, analyzing complex data, assisting in decision-making, and responding dynamically.

Fabian with Jarvis Heads up display

Would you want something like this?

This is the personal assistant we want for everything we do. We’re not quite there yet, but we’re getting close…

News & Reads on AI Agents

or “What those notorious AI Agents have been up to lately”

Galileo AI has launched the Agent Leaderboard on Hugging Face, a platform evaluating language models' proficiency in tool utilization and maintaining coherent multi-turn conversations. The leaderboard assesses models on both fundamental functionalities and challenging edge cases to determine their real-world applicability. Galileo AI on Hugging Face

Microsoft’s Azure AI Agent Service Now In Public Preview: The Azure AI Agent Service is designed to accelerate development and automation. It offers rapid development capabilities, flexibility, and enterprise assurances, managing all necessary compute, networking, and storage for running agent microservices. microsoft.com

AI's Impact on Employment: In 2025, major tech companies like Salesforce, Google, and Meta are implementing significant layoffs as AI advancements lead to the automation of tasks traditionally performed by humans. This shift underscores AI's growing role in the workforce and its potential to transform various industries. thetimes.co.uk and Forbes

Browser Use: Open-Source AI Agent for Web Automation: If you found ChatGPT Operator interesting but love to save money and tinker a bit, this open-source AI agent might be a good fit. It’s called Browser Use, and it enables autonomous navigation, interaction, and information extraction from websites, facilitating automated web-based tasks. InfoWorld and TechRadar

Pinkfish helps enterprises build AI agents through natural language processing. It combines generative AI with enterprise-grade reliability. TechCrunch

Apptronik's Humanoid Robots: Apptronik has secured $350 million in funding to advance the production of its AI-powered humanoid robots, including "Apollo," designed for tasks in warehouses and manufacturing plants. The company plans to expand Apollo's capabilities into industries such as elder care and healthcare. reuters.com

Ok, now it’s your turn:

What do YOU actually want to read?


See you next week, keep building.

Fabian
