Description
The Gemini 2.5 Computer Use model is a specialized model built on the visual reasoning strengths of Gemini 2.5 Pro and designed to interact directly with user interfaces (UIs). It is accessed through the computer-use tool in the Gemini API, which takes as input the user's request, a screenshot of the current UI, and a log of recent actions. The model responds with function calls that represent UI actions such as clicking, typing, or selecting, and it can ask for user confirmation before tasks deemed higher risk. After each action is performed, a new screenshot and the current URL are sent back to the model, and the loop repeats until the task is completed or stopped. The model is primarily fine-tuned for web browser navigation; it also shows potential for mobile UI control but does not currently support desktop OS-level control. On web and mobile control benchmarks, it is reported to outperform leading competitors, with higher accuracy at lower latency.
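The request, screenshot, action, feedback cycle described above can be sketched as a plain loop. This is a minimal illustration under stated assumptions, not the Gemini API itself: `UIAction`, `Observation`, `run_agent_loop`, and the callables passed to it are hypothetical names made up for this sketch, and the model call is left as a function the caller supplies.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class UIAction:
    """A UI action proposed by the model (hypothetical shape, not the real API)."""
    name: str                           # e.g. "click", "type", "select", or "done"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False    # set for actions deemed higher risk


@dataclass
class Observation:
    """Feedback returned after each action: a new screenshot and the URL."""
    screenshot: bytes
    url: str


def run_agent_loop(
    request: str,
    propose_action: Callable[[str, Observation, List[UIAction]], UIAction],
    execute_action: Callable[[UIAction], Observation],
    confirm: Callable[[UIAction], bool],
    first_observation: Observation,
    max_steps: int = 10,
) -> Tuple[str, List[UIAction]]:
    """Run the loop until the task is completed or stopped.

    Returns a status ("completed" or "stopped") and the actions performed.
    """
    history: List[UIAction] = []
    observation = first_observation
    for _ in range(max_steps):
        # Inputs to the model: the request, the latest screenshot/URL, recent actions.
        action = propose_action(request, observation, history)
        if action.name == "done":
            return "completed", history
        # Higher-risk actions are only performed after the user agrees.
        if action.needs_confirmation and not confirm(action):
            return "stopped", history
        # Performing the action yields the next screenshot and URL.
        observation = execute_action(action)
        history.append(action)
    # Step budget exhausted: treat the task as stopped.
    return "stopped", history
```

In a real integration, `propose_action` would wrap the call to the model and `execute_action` would drive a browser; both are kept abstract here so the control flow stays visible. The `max_steps` cap is an added safeguard in this sketch, not something the description above specifies.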
Description
Project Mariner is a research prototype from Google DeepMind, built on the Gemini 2.0 model, that explores human-agent interaction by automating tasks directly in the user's web browser. Because it understands multiple forms of information, it can analyze and reason across browser content such as text, code snippets, images, and online forms. This lets it navigate complex websites, automate repetitive workflows, and show users visual updates as it works. It also accepts voice commands and reports task progress in real time, so users stay informed and in control. Given a complex instruction, it breaks the task into manageable steps, works out how the relevant web elements relate to one another, and presents its plan and actions to the user. The project is currently being tested with a limited number of selected users, and others can join a waitlist for future testing. Feedback from this real-world use helps refine the system.
API Access
Has API
API Access
Has API
Integrations
Gemini
Gemini Enterprise
Gemini 2.5 Pro
Gemini 3 Deep Think
Google AI Studio
Google AI Ultra
Stitch
Vertex AI
WeatherNext
Integrations
Gemini
Gemini Enterprise
Gemini 2.5 Pro
Gemini 3 Deep Think
Google AI Studio
Google AI Ultra
Stitch
Vertex AI
WeatherNext
Pricing Details
Free
Free Trial
Free Version
Pricing Details
No price information available.
Free Trial
Free Version
Deployment
Web-Based
On-Premises
iPhone App
iPad App
Android App
Windows
Mac
Linux
Chromebook
Deployment
Web-Based
On-Premises
iPhone App
iPad App
Android App
Windows
Mac
Linux
Chromebook
Customer Support
Business Hours
Live Rep (24/7)
Online Support
Customer Support
Business Hours
Live Rep (24/7)
Online Support
Types of Training
Training Docs
Webinars
Live Training (Online)
In Person
Types of Training
Training Docs
Webinars
Live Training (Online)
In Person
Vendor Details
Company Name
Google
Founded
1998
Country
United States
Website
blog.google/technology/google-deepmind/gemini-computer-use-model/
Vendor Details
Company Name
Google DeepMind
Founded
2010
Country
United Kingdom
Website
deepmind.google/technologies/project-mariner/