Autonomous Agent Playground. Multi-agent simulation where Vision-Language Models interact in a shared React Three Fiber 3D world.

Planeo

Planeo is an interactive 3D web application where users and AI agents coexist and interact in a shared environment. It showcases real-time multi-user communication, AI-driven agents with vision and speech capabilities, and a dynamic physics-based world.

Core Features

  • 3D Environment: Interactive 3D space built with React Three Fiber.
  • Real-time Multi-user Interaction: See other users (rendered as eyeballs) move in real time via Server-Sent Events (SSE).
  • AI Agents with Synchronized Actions & Speech: AI agents (configurable; "AI-1" and "AI-2" by default) perceive their surroundings, generate chat messages, and perform actions such as moving or turning. Actions are synchronized with audio playback of the agent's speech, and each agent's visual perspective is updated at ~10 FPS. (See docs/ai-agents.md, docs/ai-agent-vision.md, and docs/ai-interaction-flow.md.)
  • Chat Functionality: View messages from AI agents in a shared chat window. (See docs/chat.md.)
  • Text-to-Speech (TTS): Chat messages from AI agents can be spoken aloud. This currently plays a test audio track for development; a full Google Cloud TTS integration is prototyped. (See docs/text-to-speech.md and src/lib/audioService.ts.)
  • Keyboard Navigation: Control your camera movement and orientation using keyboard inputs.
  • Physics-based World: Interact with objects such as falling cubes in a physics-driven environment. (See docs/physics.md.)
  • Randomized Cube Art: Each falling cube displays a random artwork from a local collection on one of its faces. (See docs/cube-art-textures.md.)
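
To illustrate the SSE mechanism mentioned above, here is a minimal sketch of how a position update could be framed and read back in the standard SSE text format. The `EyeUpdate` shape, the `eyeUpdate` event name, and both helper functions are hypothetical and are not taken from the Planeo source; see docs/sse-event-handling.md for the actual event handling.

```typescript
// Hypothetical payload shape; the field names are assumptions for illustration.
interface EyeUpdate {
  id: string;
  position: [number, number, number];
}

// Serialize one event in the SSE wire format: an "event:" line, a "data:"
// line, and a blank line as terminator. JSON.stringify output contains no
// raw newlines, so a single "data:" line is enough.
function formatSseEvent(eventName: string, payload: unknown): string {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Parse a buffer of complete SSE frames back into { event, data } records.
function parseSseEvents(buffer: string): { event: string; data: string }[] {
  const records: { event: string; data: string }[] = [];
  for (const block of buffer.split("\n\n")) {
    if (block.trim() === "") continue;
    let event = "message"; // default event type when no "event:" line is present
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        // Drop the single optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    records.push({ event, data: dataLines.join("\n") });
  }
  return records;
}

const exampleUpdate: EyeUpdate = { id: "user-1", position: [0, 1.6, 4] };
const exampleFrame = formatSseEvent("eyeUpdate", exampleUpdate);
```

In a browser, the built-in `EventSource` API does this parsing for you; the hand-written parser is only here to make the frame layout visible.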

Simulation Start

Important: Browsers block audio playback until the user has interacted with the page, so you must click on the screen to start the simulation; otherwise audio such as AI agent speech will not play. An overlay prompts for this click when the page loads.
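
One way to handle this click-to-start requirement is to queue playback requests until the first user gesture. The sketch below is a hypothetical pattern, not Planeo's actual overlay code; `createStartGate` and its methods are names invented for illustration.

```typescript
type Playback = () => void;

// Collects playback requests until the user has clicked, then runs them in order.
function createStartGate() {
  let started = false;
  const pending: Playback[] = [];
  return {
    // Request playback; it is deferred until start() has been called.
    request(play: Playback): void {
      if (started) play();
      else pending.push(play);
    },
    // Intended to be called from the overlay's click handler (a user gesture).
    start(): void {
      if (started) return;
      started = true;
      for (const play of pending.splice(0)) play();
    },
    isStarted: (): boolean => started,
  };
}
```

In a React component, `start()` would typically be wired to the overlay's `onClick`, and each speech clip would go through `request()` instead of calling `audio.play()` directly.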

Getting Started

Follow these instructions to set up and run Planeo on your local machine.

Prerequisites

  • Node.js (v22 or higher recommended)
  • npm (comes with Node.js) or yarn

Setup Instructions

  1. Clone the repository:

    git clone https://github.com/rgilks/planeo.git
    cd planeo
  2. Install dependencies:

    npm install
    # or
    # yarn install
  3. Set up environment variables: Copy the example environment file to create your local configuration:

    cp .env.example .env.local

    Now, edit .env.local and provide the necessary values:

    • NEXT_PUBLIC_APP_URL: The public URL of your application (e.g., http://localhost:3000 for local development).
      • Used by: src/app/actions/generateMessage.ts for SSE event posting.
    • GOOGLE_AI_API_KEY: Your API key for Google Generative AI (e.g., Gemini).
      • Used by: src/lib/googleAI.ts for AI text and vision model interactions.
    • AI_AGENTS_CONFIG (Optional): JSON string to define custom AI agents. If not set, defaults to two agents ("AI-1", "AI-2").
      • Example: AI_AGENTS_CONFIG='[{"id":"custom-ai-1","displayName":"Custom AI Alpha"},{"id":"custom-ai-2","displayName":"Custom AI Beta"}]'
      • Used by: src/domain/aiAgent.ts, src/app/api/events/route.ts.
      • See docs/ai-agents.md for more details.
    • TOTAL_AGENTS (Optional): The maximum number of AI agents allowed in the environment. If unset on the server, it defaults to 0, which affects AI agent initialization.
      • Used by: src/lib/env.ts, potentially affecting src/app/api/events/sseStore.ts.
    • NUMBER_OF_BOXES (Optional): The number of interactive cubes to spawn in the environment. Defaults to 5 if not set.
      • Used by: src/lib/env.ts, src/app/api/events/sseStore.ts.
    • NEXT_PUBLIC_TTS_ENABLED (Optional): Set to "false" to disable Text-to-Speech functionality. Defaults to true (enabled).
      • Used by: src/components/ChatMessage.tsx, src/app/actions/tts.ts.
    • GOOGLE_APP_CREDS_JSON (Optional, for full TTS): JSON string containing Google Cloud service account credentials for Text-to-Speech API. Required if you intend to use the full Google Cloud TTS feature (currently prototyped).
      • Used by: src/app/actions/tts.ts.
      • See docs/text-to-speech.md for setup.
  4. Run the development server:

    npm run dev
    # or
    # yarn dev

    Open http://localhost:3000 in your browser.

  5. Build for production:

    npm run build
    npm run start
    # or
    # yarn build
    # yarn start
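
The optional variables from step 3 and their documented defaults can be summarized in code. This is a minimal, dependency-free sketch: the project lists Zod for schema validation, but hand-written checks are used here so the example stands alone. The `readEnv` function, the `PlaneoEnv` type, and the shape of the default agents (using "AI-1" and "AI-2" as both `id` and `displayName`) are assumptions, not the contents of src/lib/env.ts.

```typescript
interface AgentConfig {
  id: string;
  displayName: string;
}

interface PlaneoEnv {
  agents: AgentConfig[];
  totalAgents: number;
  numberOfBoxes: number;
  ttsEnabled: boolean;
}

// Assumed shape of the two default agents; only the names come from the README.
const DEFAULT_AGENTS: AgentConfig[] = [
  { id: "AI-1", displayName: "AI-1" },
  { id: "AI-2", displayName: "AI-2" },
];

// Parse AI_AGENTS_CONFIG, falling back to the defaults when it is unset.
function parseAgents(raw: string | undefined): AgentConfig[] {
  if (raw === undefined || raw.trim() === "") return DEFAULT_AGENTS;
  const parsed: unknown = JSON.parse(raw); // throws on malformed JSON
  if (!Array.isArray(parsed)) {
    throw new Error("AI_AGENTS_CONFIG must be a JSON array");
  }
  return parsed.map((entry: unknown, i: number) => {
    const e = entry as Partial<AgentConfig> | null;
    if (!e || typeof e.id !== "string" || typeof e.displayName !== "string") {
      throw new Error(`AI_AGENTS_CONFIG[${i}] needs string "id" and "displayName"`);
    }
    return { id: e.id, displayName: e.displayName };
  });
}

// Parse a non-negative integer, using the fallback when the variable is unset.
function parseCount(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 0) {
    throw new Error(`expected a non-negative integer, got "${raw}"`);
  }
  return n;
}

function readEnv(env: Record<string, string | undefined>): PlaneoEnv {
  return {
    agents: parseAgents(env["AI_AGENTS_CONFIG"]),
    totalAgents: parseCount(env["TOTAL_AGENTS"], 0), // documented default: 0
    numberOfBoxes: parseCount(env["NUMBER_OF_BOXES"], 5), // documented default: 5
    ttsEnabled: env["NEXT_PUBLIC_TTS_ENABLED"] !== "false", // enabled unless "false"
  };
}
```

Calling `readEnv(process.env)` once at startup and failing fast on a thrown error keeps malformed configuration from surfacing later as confusing runtime behavior.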

Key Technologies Used

  • Next.js (React Framework)
  • React Three Fiber (for 3D graphics)
  • Drei (helpers for React Three Fiber)
  • Rapier (physics engine via react-three-rapier)
  • Zustand (state management)
  • Google Generative AI (for AI agent logic)
  • Server-Sent Events (SSE for real-time communication)
  • TypeScript
  • Zod (schema validation)

Technical Documentation

More detailed technical documentation for various aspects of the project can be found in the docs/ folder:

  • docs/ai-agents.md: Details on AI agent behavior, configuration, and capabilities.
  • docs/ai-agent-vision.md: Describes how AI agents perceive and display their environment.
  • docs/chat.md: Overview of the chat system.
  • docs/physics.md: Explanation of the physics simulation for objects in the 3D scene.
  • docs/real-time-camera-movement.md: Covers how camera/user movements are handled and synchronized.
  • docs/sse-event-handling.md: Describes the Server-Sent Events (SSE) mechanism for real-time updates.
  • docs/text-to-speech.md: Information on the text-to-speech functionality (currently using test audio, with details on the planned full integration).
  • docs/ai-interaction-flow.md: Details the synchronized flow of AI actions, chat, and audio playback.
  • docs/cube-art-textures.md: Details on how artwork is displayed on interactive cubes.

Planned Features

The following are areas for future development:

  • Full Text-to-Speech Integration: Completing the switch from test audio to dynamic Google Cloud TTS.
  • Enhanced AI Capabilities: More complex AI behaviors, memory, and interaction models.
  • User-to-User Chat: Allowing human users to chat directly with each other.
  • Expanded World Interactions: More ways for users and AI to interact with the 3D environment and its objects.
  • Persistent User Accounts/Profiles.

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/your-feature-name).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add some feature').
  5. Push to the branch (git push origin feature/your-feature-name).
  6. Open a Pull Request.

Please ensure your code adheres to the project's linting rules (npm run lint) and all checks pass (npm run check).

License

This project is licensed under the MIT License - see the LICENSE file for details.

Environment Variables

  • REPLICATE_API_TOKEN: Your Replicate API token.
  • TOTAL_AGENTS: The maximum number of agents allowed in the environment (e.g., 10).

Environment Variables for Production
