
Interactive AI Experience – 3D Guide & Custom Image Gen
- Proposals: 31
- Remote
- #4469890
- Expired




Description
The experience guides a participant through a short, ritual-like dialogue with a 3D speaking character (the full flow is described below). At the end of the interaction, the system produces:
• A symbolic, poetic spoken response
• One AI-generated image based on the participant’s clarified vision, rendered in a custom visual style trained on my artwork
This is a poetic, immersive digital art experience, not a generic chatbot or commercial tool.
Deliverable: A mini website / web module that can be integrated into an existing website (for example, as a subpage or subdirectory).
Scope Clarification
The generated images will later be shown in a separate digital “wall” project built by another team.
This job does NOT include building that wall interface.
Your responsibility is to:
✔ Generate the images
✔ Store them with structured metadata
✔ Make them exportable for future integration
Technical Constraints (Non-Negotiable)
• Open-source / open-weight AI models only (LLM, image generation, TTS, STT)
• Self-hosted deployment on my infrastructure (Hetzner servers)
• No proprietary AI APIs
Core User Experience Flow
- Short conceptual intro animation
- 3D character appears and speaks, introducing the ritual
- User selects one of five thematic prompts
- User shares a vision (text input; voice input optional bonus)
- AI-guided dialogue (2–4 turns) to clarify the scenario
- Final symbolic spoken response from the character
- One AI-generated image created from the clarified vision
- Session data saved for archive and future visual display
Technical Requirements
Frontend (Mini Website)
• Immersive but lightweight interface
• Smooth transitions between stages
• Audio playback (music + character voice)
• Responsive design (desktop + mobile)
• Built using React / Next.js or similar
3D Speaking Character
• WebGL / Three.js / A-Frame (or similar)
• Rigged character model (provided)
• Idle animation
• Speaking animation synced to audio (lip sync preferred; amplitude-based acceptable for MVP)
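The amplitude-based MVP fallback can be prepared entirely server-side. The sketch below (an assumption, not part of the brief) precomputes a normalised per-frame loudness envelope from the TTS output WAV, which the frontend can then map onto a mouth blendshape or jaw bone each animation frame; it assumes 16-bit PCM audio.

```python
# Sketch: precompute a per-frame amplitude envelope from a TTS WAV file so
# the browser can drive a mouth blendshape without doing audio analysis.
# The ~30 fps window size and 16-bit PCM assumption are illustrative choices.
import math
import struct
import wave

def amplitude_envelope(wav_path: str, window_s: float = 1 / 30) -> list:
    """Return RMS loudness per window, normalised to 0..1
    (one value per video frame at roughly 30 fps)."""
    with wave.open(wav_path, "rb") as wf:
        rate = wf.getframerate()
        n_channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # Assumes 16-bit little-endian PCM samples.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    if n_channels > 1:
        samples = samples[::n_channels]  # crude mono downmix: take one channel
    win = max(1, int(rate * window_s))
    env = []
    for i in range(0, len(samples), win):
        chunk = samples[i:i + win]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        env.append(rms)
    peak = max(env, default=1.0) or 1.0
    return [round(v / peak, 4) for v in env]
```

The resulting list can be shipped to the client as JSON alongside the audio URL, so the Three.js side only indexes into it by playback time.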
AI Dialogue System (Open-Source LLM)
• Self-hosted open-weight model
• Multi-turn conversation handling
• Structured prompting system
• Outputs:
– follow-up prompts
– final poetic response
– structured summary for image generation
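One way to keep the three outputs machine-readable is to ask the model for JSON on every turn and validate it before use. The sketch below assumes an OpenAI-compatible chat endpoint (which self-hosted servers such as vLLM or Ollama can expose); the schema and field names are illustrative assumptions, not fixed by this brief.

```python
# Sketch of the structured prompting layer: build the chat history for one
# session and validate the model's JSON reply. Field names are assumptions.
import json

SYSTEM_PROMPT = (
    "You are the ritual guide. Reply ONLY with JSON containing: "
    '"stage" ("clarify" or "final"), "speech" (what the character says), '
    'and, when stage is "final", "image_summary" (a concise scene '
    "description for the image generator)."
)

def build_messages(theme: str, transcript: list) -> list:
    """Assemble the chat history for one multi-turn session."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"Chosen theme: {theme}"}]
    messages.extend(transcript)  # prior user/assistant turns
    return messages

def parse_turn(raw: str) -> dict:
    """Validate the model's JSON reply; raise so the caller can
    retry with a repair prompt instead of showing broken output."""
    turn = json.loads(raw)
    if turn.get("stage") not in ("clarify", "final"):
        raise ValueError("missing or invalid 'stage'")
    if "speech" not in turn:
        raise ValueError("missing 'speech'")
    if turn["stage"] == "final" and "image_summary" not in turn:
        raise ValueError("final turn lacks 'image_summary'")
    return turn
```

Validating before use matters here because the final `image_summary` feeds directly into the image generation step.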
Voice System (Open-Source TTS)
• Open-source text-to-speech hosted on server
• Audio drives speaking animation
Custom Style Image Generation
The generated image must consistently match a custom artistic visual language based on my artwork.
Prompting alone is not enough.
You must implement:
Preferred: LoRA training using my artwork dataset
Alternative: Style adapter / reference conditioning
Requirements:
• One image per session
• Seed reproducibility
• Style strength control
• Save prompt + generation parameters
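Seed reproducibility and parameter logging can be handled by a thin layer around the generator. A minimal sketch, under the assumption that the seed is derived deterministically from the session id (so any session's image can be regenerated exactly); the LoRA name and default parameters below are hypothetical placeholders:

```python
# Sketch of the reproducibility layer around image generation. The seed is
# derived from the session id; the actual diffusers/LoRA call is only
# indicated in comments, and all parameter values here are assumptions.
import hashlib
import time

def derive_seed(session_id: str) -> int:
    """Stable 32-bit seed per session, for exact regeneration."""
    digest = hashlib.sha256(session_id.encode()).digest()
    return int.from_bytes(digest[:4], "big")

def generation_record(session_id: str, prompt: str,
                      lora: str = "artist-style-v1",  # hypothetical LoRA name
                      style_strength: float = 0.8,
                      steps: int = 30, cfg: float = 6.5) -> dict:
    """Everything needed to re-run one generation, saved next to the image.
    In a diffusers-based pipeline, `seed` would feed a torch Generator's
    manual_seed() and `lora_scale` would be applied when loading the LoRA."""
    return {
        "session_id": session_id,
        "prompt": prompt,
        "seed": derive_seed(session_id),
        "lora": lora,
        "lora_scale": style_strength,  # the 'style strength control'
        "steps": steps,
        "cfg_scale": cfg,
        "created_at": int(time.time()),
    }
```

Storing this record verbatim satisfies both the seed-reproducibility and save-parameters requirements at once.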
Backend & Storage
Store for each session:
• Selected prompt theme
• Dialogue transcript
• Final spoken response
• Scenario summary
• Image prompt + parameters
• Generated image file
• Timestamp
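The per-session fields above map naturally onto a single record per session. A minimal sketch, assuming one JSON file plus the image stored side by side per session directory (a small SQLite table would serve equally well):

```python
# Sketch of the per-session archive record, mirroring the field list above.
# The one-directory-per-session layout is an assumption.
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path

@dataclass
class SessionRecord:
    session_id: str
    theme: str                      # selected prompt theme
    transcript: list = field(default_factory=list)  # dialogue turns
    final_response: str = ""        # final spoken response
    summary: str = ""               # scenario summary
    image_prompt: str = ""
    image_params: dict = field(default_factory=dict)
    image_file: str = ""            # filename of the generated image
    timestamp: str = ""             # ISO 8601

def save_session(record: SessionRecord, root: Path) -> Path:
    """Write one session as <root>/<session_id>/session.json; the generated
    image is expected to be written into the same directory."""
    session_dir = root / record.session_id
    session_dir.mkdir(parents=True, exist_ok=True)
    path = session_dir / "session.json"
    path.write_text(json.dumps(asdict(record), ensure_ascii=False, indent=2))
    return path
```

Keeping each session self-contained on disk also makes the "exportable for future integration" requirement trivial: the wall team can consume the directories directly.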
Admin Panel
Simple password-protected page to:
• View sessions
• Download text and images
Deployment Requirements
• Linux deployment on Hetzner
• Docker / Docker Compose preferred
• Documentation for:
– setup
– model downloads
– environment variables
– running services
– updating style model
Project Timeline
Total duration: 2 months
Skills Required
• Web 3D (Three.js / A-Frame / WebGL)
• Experience integrating animated 3D characters in the browser
• Experience serving open-source LLMs
• Diffusion model LoRA or adapter training
• Backend/API development
• Docker + Linux deployment
How to Apply
Please include:
• 2–3 relevant projects (AI apps, WebGL/WebXR, or interactive experiences)
• Proposed tech stack (frontend, backend, model serving)
• Which open models you would use (LLM, diffusion, TTS) and why
• Recommended server setup (GPU/VRAM) for acceptable performance
Screening Questions
How would you sync speech audio to a 3D character animation in the browser?
Which open-weight LLM would you deploy and how would you serve it?
How would you train and deploy a custom style LoRA for image generation?
What server setup would you recommend and why?
Pierre G.
Clarification Board
-

Hi Pierre,
What approximate size and resolution of artwork dataset will you provide for training the custom LoRA style model, and do you already have usage rights cleared for all included images?
Please let me know.
Thanks
Naresh

Pierre G. (Thu 8:07am): Hello, all works are mine. I will share hi-res images of past projects.
-

Hi Pierre, thanks for the detailed brief. Before I send a proposal, could you confirm two points so I scope this correctly: (1) What’s the must-have MVP for the first 2–3 weeks (e.g., text-only input, amplitude-based mouth movement acceptable, 2 turns vs 4 turns, intro animation optional)? (2) What Hetzner GPU spec/VRAM will be available, and should all services (LLM + diffusion + TTS + web) run on a single server via Docker Compose?
