Planeo

Planeo is an interactive 3D web application where users and AI agents coexist and interact in a shared environment. It showcases real-time multi-user communication, AI-driven agents with vision and speech capabilities, and a dynamic physics-based world.

Core Features

- 3D Environment: Interactive 3D space built with React Three Fiber.
- Real-time Multi-user Interaction: See other users' movements in real time (each user is represented as an eyeball), delivered over Server-Sent Events (SSE).
- AI Agents with Synchronized Actions & Speech: AI agents (configurable; the defaults are "AI-1" and "AI-2") perceive their surroundings, generate chat messages, and perform actions such as moving or turning. Their actions are synchronized with audio playback of their speech, and their visual perspective is updated at ~10 FPS. (See docs/ai-agents.md, docs/ai-agent-vision.md, and docs/ai-interaction-flow.md.)
- Chat Functionality: View messages from AI agents in a shared chat window. (See docs/chat.md.)
- Text-to-Speech (TTS): Chat messages from AI agents can be spoken aloud. Currently, this uses a test audio track for development; a full Google Cloud TTS integration is prototyped. (See docs/text-to-speech.md and src/lib/audioService.ts.)
- Keyboard Navigation: Control your camera movement and orientation using keyboard inputs.
- Physics-based World: Interact with objects like falling cubes in an environment governed by physics. (See docs/physics.md.)
- Randomized Cube Art: Each falling cube displays a random artwork from a local collection on one of its faces. (See docs/cube-art-textures.md.)
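
To make the SSE-based movement updates more concrete, here is a minimal sketch of how one position update could be framed on the wire and parsed back. The SSE framing itself (an `event:` line, a `data:` line, and a terminating blank line) is standard; the `EyeUpdate` shape, the `eyeUpdate` event name, and both function names are assumptions for illustration and may not match the project's actual event schema (see docs/sse-event-handling.md for that).

```typescript
// Hypothetical payload for one user's eye position; the real schema may differ.
interface EyeUpdate {
  id: string;
  position: [number, number, number];
}

// Serialize one update as an SSE frame: an "event:" line, a "data:" line,
// and a blank line that terminates the frame.
function toSseFrame(eventName: string, payload: EyeUpdate): string {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Parse a single frame back into its event name and payload.
function parseSseFrame(frame: string): { event: string; data: EyeUpdate } {
  let event = "message"; // SSE default event type when no "event:" line is present
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.indexOf("event:") === 0) {
      event = line.slice("event:".length).trim();
    } else if (line.indexOf("data:") === 0) {
      // Per the SSE format, a single leading space after the colon is dropped.
      dataLines.push(line.slice("data:".length).replace(/^ /, ""));
    }
  }
  return { event, data: JSON.parse(dataLines.join("\n")) as EyeUpdate };
}
```

In a browser, the built-in `EventSource` API handles this parsing for you; the manual parser here is only meant to show what travels over the connection.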

Simulation Start

Important: To ensure audio playback (like AI agent speech) functions correctly due to browser policies, you must click on the screen to start the simulation. An overlay will prompt this action upon loading.
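
One way to picture the click-to-start requirement is as a small gate that holds audio actions until the user has interacted with the page. The sketch below is an illustration under that assumption, not the project's implementation; `createStartGate` and its methods are hypothetical names.

```typescript
// An action deferred until the simulation starts, e.g. playing agent speech.
type Action = () => void;

function createStartGate() {
  let started = false;
  const pending: Action[] = [];
  return {
    // Run the action now if the simulation has started, otherwise hold it.
    run(action: Action): void {
      if (started) {
        action();
      } else {
        pending.push(action);
      }
    },
    // Intended to be called from the overlay's click handler;
    // flushes held actions in the order they were queued.
    start(): void {
      if (started) return;
      started = true;
      while (pending.length > 0) {
        const next = pending.shift();
        if (next) next();
      }
    },
    isStarted(): boolean {
      return started;
    },
  };
}
```

With this shape, speech requested before the first click is queued rather than rejected by the browser, and plays once `start()` is called.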

Getting Started

Follow these instructions to set up and run Planeo on your local machine.

Prerequisites

Node.js (v22 or higher recommended)
npm (comes with Node.js) or yarn

Setup Instructions

Clone the repository:

```
git clone https://github.com/rgilks/planeo.git
cd planeo
```

Install dependencies:
```
npm install
# or
# yarn install
```
Set up environment variables: Copy the example environment file to create your local configuration:
```
cp .env.example .env.local
```
Now, edit .env.local and provide the necessary values:
- NEXT_PUBLIC_APP_URL: The public URL of your application (e.g., http://localhost:3000 for local development).
  - Used by: src/app/actions/generateMessage.ts for SSE event posting.
- GOOGLE_AI_API_KEY: Your API key for Google Generative AI (e.g., Gemini).
  - Used by: src/lib/googleAI.ts for AI text and vision model interactions.
- AI_AGENTS_CONFIG (Optional): JSON string to define custom AI agents. If not set, defaults to two agents ("AI-1", "AI-2").
  - Example: AI_AGENTS_CONFIG='[{"id":"custom-ai-1","displayName":"Custom AI Alpha"},{"id":"custom-ai-2","displayName":"Custom AI Beta"}]'
  - Used by: src/domain/aiAgent.ts, src/app/api/events/route.ts.
  - See docs/ai-agents.md for more details.
- TOTAL_AGENTS (Optional): The maximum number of AI agents allowed in the environment. Defaults to 0 if not set on the server, which affects how AI agents are initialized.
  - Used by: src/lib/env.ts; may also affect src/app/api/events/sseStore.ts.
- NUMBER_OF_BOXES (Optional): The number of interactive cubes to spawn in the environment. Defaults to 5 if not set.
  - Used by: src/lib/env.ts, src/app/api/events/sseStore.ts.
- NEXT_PUBLIC_TTS_ENABLED (Optional): Set to "false" to disable Text-to-Speech functionality. Defaults to true (enabled).
  - Used by: src/components/ChatMessage.tsx, src/app/actions/tts.ts.
- GOOGLE_APP_CREDS_JSON (Optional, for full TTS): JSON string containing Google Cloud service account credentials for Text-to-Speech API. Required if you intend to use the full Google Cloud TTS feature (currently prototyped).
  - Used by: src/app/actions/tts.ts.
  - See docs/text-to-speech.md for setup.
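
The defaults described above can be summarized in code. The following is a dependency-free sketch that takes the environment as a plain object; it is not the contents of src/lib/env.ts (the project lists Zod for schema validation, so the real parsing likely differs). The function names, the default `displayName` values, and the choice to fall back to defaults on malformed input are all assumptions.

```typescript
interface AgentConfig {
  id: string;
  displayName: string;
}

type Env = Record<string, string | undefined>;

// Default agents named in this README; the displayName values are assumed.
const DEFAULT_AGENTS: AgentConfig[] = [
  { id: "AI-1", displayName: "AI-1" },
  { id: "AI-2", displayName: "AI-2" },
];

// Parse AI_AGENTS_CONFIG, falling back to the defaults when it is unset,
// not valid JSON, or contains no well-formed entries (assumed behavior).
function parseAgents(env: Env): AgentConfig[] {
  const raw = env.AI_AGENTS_CONFIG;
  if (!raw) return DEFAULT_AGENTS;
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return DEFAULT_AGENTS;
  }
  if (!Array.isArray(parsed)) return DEFAULT_AGENTS;
  const agents: AgentConfig[] = [];
  for (const item of parsed) {
    if (item && typeof item.id === "string" && typeof item.displayName === "string") {
      agents.push({ id: item.id, displayName: item.displayName });
    }
  }
  return agents.length > 0 ? agents : DEFAULT_AGENTS;
}

// Parse a non-negative integer setting, using the fallback when unset or invalid.
function parseCount(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const n = Number(value);
  return isFinite(n) && n >= 0 && Math.floor(n) === n ? n : fallback;
}

function readConfig(env: Env) {
  return {
    agents: parseAgents(env),
    totalAgents: parseCount(env.TOTAL_AGENTS, 0), // documented default: 0
    numberOfBoxes: parseCount(env.NUMBER_OF_BOXES, 5), // documented default: 5
    ttsEnabled: env.NEXT_PUBLIC_TTS_ENABLED !== "false", // enabled unless "false"
  };
}
```

Passing the environment in as an argument (for example `readConfig(process.env)` on the server) keeps the parsing easy to test in isolation.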
Run the development server:
```
npm run dev
# or
# yarn dev
```
Open http://localhost:3000 in your browser.

Build for production:

```
npm run build
npm run start
# or
# yarn build
# yarn start
```

Key Technologies Used

Next.js (React Framework)
React Three Fiber (for 3D graphics)
Drei (helpers for React Three Fiber)
Rapier (physics engine via react-three-rapier)
Zustand (state management)
Google Generative AI (for AI agent logic)
Server-Sent Events (SSE for real-time communication)
TypeScript
Zod (schema validation)

Technical Documentation

More detailed technical documentation for various aspects of the project can be found in the docs/ folder:

docs/ai-agents.md: Details on AI agent behavior, configuration, and capabilities.
docs/ai-agent-vision.md: Describes how AI agents perceive and display their environment.
docs/chat.md: Overview of the chat system.
docs/physics.md: Explanation of the physics simulation for objects in the 3D scene.
docs/real-time-camera-movement.md: Covers how camera/user movements are handled and synchronized.
docs/sse-event-handling.md: Describes the Server-Sent Events (SSE) mechanism for real-time updates.
docs/text-to-speech.md: Information on the text-to-speech functionality (currently using test audio, with details on the planned full integration).
docs/ai-interaction-flow.md: Details the synchronized flow of AI actions, chat, and audio playback.
docs/cube-art-textures.md: Details on how artwork is displayed on interactive cubes.

Planned Features

The following are areas for future development:

Full Text-to-Speech Integration: Completing the switch from test audio to dynamic Google Cloud TTS.
Enhanced AI Capabilities: More complex AI behaviors, memory, and interaction models.
User-to-User Chat: Allowing human users to chat directly with each other.
Expanded World Interactions: More ways for users and AI to interact with the 3D environment and its objects.
Persistent User Accounts/Profiles.

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository.
Create a new branch (git checkout -b feature/your-feature-name).
Make your changes.
Commit your changes (git commit -m 'Add some feature').
Push to the branch (git push origin feature/your-feature-name).
Open a Pull Request.

Please ensure your code adheres to the project's linting rules (npm run lint) and all checks pass (npm run check).

License

This project is licensed under the MIT License - see the LICENSE file for details.

Environment Variables

REPLICATE_API_TOKEN: Your Replicate API token.
TOTAL_AGENTS: The maximum number of agents allowed in the environment (e.g., 10).
