Hello Fellow AI Friends,
The headline news at this week’s meetup was Anthropic’s foray into computer control: the ability for an AI model to operate a computer through an API.
Matt, our iOS developer, set this up properly in a container so it wouldn’t take over his entire computer. Let’s just say it didn’t quite produce the results we were hoping for.
Interestingly, a developer and friend in my network has built something very similar but much more effective. We’re eagerly awaiting a demo.
While there’s still a long way to go, it’s clear the starting pistol has been fired in the realm of computer control. My prediction? In the next 6 months, we’ll see more truly agentic behavior from AI, and within 2 years, this may become the default way we interact with our machines.
Some exciting research this month and a small but mighty turnout in Bali this week.
Harry Verity
AI Bali Meetup Host
Course Leader – AI To The World
Attendees:
- Harry Verity (Host)
- Daniel (Freelance copywriter)
- Will M (Product marketer)
- Bastion
- Matt (iOS Developer)
- Khesav (Developer)
- Several other participants joined throughout
Key Takeaways:
- Anthropic emerging as strong competitor to OpenAI
- Computer vision control represents next frontier in AI development
- Practical implementation challenges remain significant
- Watermarking effectiveness questioned
- Benchmark results should be viewed critically
Main Topics Discussed:
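1. Claude Computer Control
- Anthropic launched computer control capabilities (a public beta it calls "computer use") alongside the upgraded Claude 3.5 Sonnet
- Lets the model operate a computer on its own by viewing screenshots and issuing mouse and keyboard actions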
- Harry attempted a demo but ran into API key configuration issues, so Matt (iOS developer) set it up instead
- Matt explained the importance of setting it up in a container to avoid it taking control of your whole computer
- Claude failed to sign up for the AI To The World newsletter and never checked the site's sitemap along the way
- Mixed opinions on practical usefulness given current limitations
2. Claude Model Updates
- Claude 3.5 Sonnet upgraded
- Improved reasoning abilities noted
- Group consensus: Claude generally performing better than ChatGPT
- Haiku upgrade announced but not yet released
- Benchmark claims: Better chain of thought than GPT-4, coding abilities at 93.77%
3. Google’s AI Developments
- Reportedly developing a project called Jarvis for web task automation
- Reportedly planning a December launch alongside a Gemini language model
- Group expressed skepticism about timing and market positioning
4. AI Watermarking Technology
- Google plans to open-source AI text watermarking tech for Gemini
- Group discussed practical limitations and potential workarounds
- Consensus that watermarking may be ineffective long-term
5. Research Paper Discussion
- Paper questioning accuracy of AI vision model benchmarks
- Found models struggling with basic visual reasoning tasks
- Group agreed benchmarks may be selective for PR purposes