dmdavidkov/computerusegemini

Live API Starter

This project is a starter kit for building applications that interact with the Gemini API in real time. It supports audio and video input and provides a set of function tools for interacting with the user's system.

Installation

  1. Install the required dependencies:

    pip install -r requirements.txt
  2. Rename the .env.example file to .env.

  3. Obtain a Gemini API key from Google AI Studio.

  4. Replace your_api_key_here in .env with your actual API key.
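Before running the script, it can help to confirm that the placeholder in .env was actually replaced. The following is a minimal sketch using only the standard library, not the project's own loading code. The variable name GEMINI_API_KEY and the helpers read_env_file and has_real_key are assumptions made for illustration; check .env.example for the name the project actually uses.

```python
from pathlib import Path

# Placeholder text that the README says to replace.
PLACEHOLDER = "your_api_key_here"


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_real_key(values, name="GEMINI_API_KEY"):
    """Return True if the (hypothetical) key name holds a non-placeholder value."""
    value = values.get(name, "")
    return bool(value) and value != PLACEHOLDER
```

Calling has_real_key(read_env_file(".env")) returns False while the placeholder is still in place, which makes a convenient early check before any API call is attempted.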

Important: Use headphones when running the script to prevent audio feedback loops.

Usage

To run the script:

python main.py

The script takes a video-mode flag --mode, which can be "camera", "screen", or "none". The default is "screen". To share your screen, run:

python main.py --mode screen

You can also specify the modality to use with the --modality flag, which can be "AUDIO" or "TEXT". The default is "AUDIO".
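The two flags described above could be declared with argparse roughly as follows. This is a sketch based only on the flag names, allowed values, and defaults stated in this README; the function name build_parser and the help strings are illustrative assumptions, and the actual main.py may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser with the --mode and --modality flags from the README."""
    parser = argparse.ArgumentParser(description="Live API Starter (sketch)")
    parser.add_argument(
        "--mode",
        choices=["camera", "screen", "none"],
        default="screen",
        help="video source to stream",
    )
    parser.add_argument(
        "--modality",
        choices=["AUDIO", "TEXT"],
        default="AUDIO",
        help="response modality",
    )
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--mode", "camera", "--modality", "TEXT"])
print(args.mode, args.modality)
```

Using choices means that an unsupported value such as --mode window is rejected by the parser with a usage message rather than failing later in the program.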

Function Tools

The function_tools directory contains Python scripts that the Gemini model can call to act on the user's system, along with a hub module that manages them:

  • click_mouse.py: Performs a mouse click at the current cursor position.
  • copy_and_paste.py: Inputs text to the screen by simulating typing.
  • copy_to_clipboard.py: Copies text to the system clipboard.
  • execute_js_in_brave.py: Executes JavaScript code in the currently active Chromium-based browser window.
  • function_hub.py: Manages and executes the available function tools.
  • get_clipboard.py: Retrieves the current text from the system clipboard.
  • move_mouse.py: Moves the mouse cursor to specified coordinates.
  • output_text_to_screen.py: Displays a message on the screen using an alert box.
  • press_keys.py: Simulates pressing a single key or a combination of keys.
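To illustrate how a hub like function_hub.py might route a model's tool call to the matching function, here is a minimal dispatch sketch. It is not the project's implementation: the FunctionHub class, its register and execute methods, and the in-memory stand-ins for the clipboard tools are all hypothetical, chosen so the example runs without touching the real system.

```python
from typing import Any, Callable, Dict


class FunctionHub:
    """Hypothetical registry that maps tool names to Python callables."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self._tools[name] = func

    def execute(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        """Run a tool by name and wrap the outcome in a result dict."""
        if name not in self._tools:
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": self._tools[name](**kwargs)}
        except Exception as exc:  # report tool failures instead of crashing
            return {"ok": False, "error": str(exc)}


# In-memory stand-ins for the real clipboard tools (assumed signatures).
_clipboard = {"text": ""}


def copy_to_clipboard(text: str) -> str:
    _clipboard["text"] = text
    return "copied"


def get_clipboard() -> str:
    return _clipboard["text"]


hub = FunctionHub()
hub.register("copy_to_clipboard", copy_to_clipboard)
hub.register("get_clipboard", get_clipboard)

hub.execute("copy_to_clipboard", text="hello")
print(hub.execute("get_clipboard"))
```

Returning a result dict for both success and failure lets the caller pass an error message back to the model instead of terminating the session when a tool name or argument is wrong.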
