Flow State

macOS focus coach that watches screen context and gives voiced, character-driven feedback in real time.

Electron, JavaScript, Claude Vision, ElevenLabs, Jest, macOS

Metrics / Signals

Approx. size: 1,100 LOC
Tests: 33 Jest tests
Modes: 3 characters

Problem

What this project solves

Task lists capture intentions, but they do not notice when someone drifts off-task or help them return to the session.

Architecture

How it is put together

Flow State runs as an Electron background app with a tray menu and a draggable overlay character. It captures the screen at a configurable cadence, uses Claude Vision to detect whether the user is on task, generates character dialogue with Claude, and speaks it through ElevenLabs TTS. Settings persist through electron-store, and Jest tests cover the core modules.

Desktop agent

What It Does

Flow State is a macOS background focus coach that captures a screenshot at a configurable interval, asks Claude Vision whether the user is working on their stated task, and delivers voiced feedback through a selected character.

Characters can praise on-task behavior or call out distraction.
Session memory lets characters reference repeated behavior across the current focus session.
Idle detection reacts when the user has been inactive for 30+ seconds.
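
The session memory and idle check described above could look roughly like the sketch below. It is illustrative only, not the project's actual code: `SessionMemory`, `isIdle`, and the event fields are hypothetical names, and the 30-second threshold is the only value taken from the description. In an Electron main process the idle time could come from `powerMonitor.getSystemIdleTime()`; here it is passed in as a number so the sketch runs in plain Node.

```javascript
// Hypothetical sketch: names and structure are assumptions, not the app's real modules.
const IDLE_THRESHOLD_SECONDS = 30; // "30+ seconds" from the description

// Returns true once the user has been inactive for the threshold or longer.
function isIdle(idleSeconds, threshold = IDLE_THRESHOLD_SECONDS) {
  return idleSeconds >= threshold;
}

// Keeps a per-session log so a character can reference repeated behavior.
class SessionMemory {
  constructor() {
    this.events = [];
  }

  // Record one analysis result, e.g. { onTask: false, activity: 'social media' }.
  record(event) {
    this.events.push({ ...event, at: Date.now() });
  }

  // Count how often a given off-task activity has been seen this session.
  offTaskCount(activity) {
    return this.events.filter((e) => !e.onTask && e.activity === activity).length;
  }

  // Number of consecutive off-task results at the end of the log.
  currentOffTaskStreak() {
    let streak = 0;
    for (let i = this.events.length - 1; i >= 0 && !this.events[i].onTask; i--) {
      streak++;
    }
    return streak;
  }
}

const memory = new SessionMemory();
memory.record({ onTask: true, activity: 'editor' });
memory.record({ onTask: false, activity: 'social media' });
memory.record({ onTask: false, activity: 'social media' });
```

Counts like `memory.offTaskCount('social media')` could then be folded into the dialogue prompt so a character can say that this is the second time it has noticed the same distraction.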

Electron

Desktop Experience

A tray menu opens settings and controls sessions.
A small draggable character lives near the bottom-right of the screen.
The transparent overlay uses click-through behavior with dynamic mouse-event toggling.
electron-store persists API keys, settings, current task, and user preferences.
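
The click-through toggling mentioned above can be sketched with Electron's `BrowserWindow.setIgnoreMouseEvents(ignore, { forward: true })`. The helper name `applyClickThrough` and the `fakeWindow` stand-in are hypothetical; the stand-in exists only so the sketch runs without Electron, and the real app may wire this differently.

```javascript
// Hypothetical helper: toggles click-through for a transparent overlay window.
// `win` is expected to be an Electron BrowserWindow (or anything with the same method).
function applyClickThrough(win, pointerOverCharacter) {
  if (pointerOverCharacter) {
    // Pointer is on the character: accept clicks so it can be dragged.
    win.setIgnoreMouseEvents(false);
  } else {
    // Pointer is elsewhere: let clicks pass through, but keep forwarding
    // mouse-move events so the renderer can detect when the pointer returns.
    win.setIgnoreMouseEvents(true, { forward: true });
  }
}

// Stand-in window that records calls, so the sketch runs without Electron.
const calls = [];
const fakeWindow = {
  setIgnoreMouseEvents(ignore, options) {
    calls.push({ ignore, options });
  },
};

applyClickThrough(fakeWindow, false); // pointer away from the character
applyClickThrough(fakeWindow, true);  // pointer over the character
```

In a typical setup, the renderer would report hover state from `mouseenter` and `mouseleave` handlers on the character element over IPC, and the main process would call a helper like this one in response.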

Runtime

AI Loop

The app coordinates a sequential async loop across screenshot capture, visual task analysis, character dialogue generation, and ElevenLabs speech playback. The loop can skip vision calls on unchanged screens to reduce token usage.
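
One way to picture the loop is a single `tick` function that runs each stage in order and returns early when the screen has not changed. The sketch below is an assumption-laden illustration: `createFocusLoop` and the injected `captureScreen`, `analyzeScreen`, `generateLine`, and `speak` functions are hypothetical placeholders, and the exact-match SHA-256 comparison stands in for whatever change heuristic the app really uses.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical factory: every dependency is injected, so captureScreen,
// analyzeScreen, generateLine, and speak are placeholders, not real APIs.
function createFocusLoop({ captureScreen, analyzeScreen, generateLine, speak }) {
  let lastHash = null;

  // Runs one sequential pass: capture, (maybe) analyze, generate, speak.
  return async function tick(task) {
    const image = await captureScreen(); // Buffer of image bytes
    const hash = createHash('sha256').update(image).digest('hex');

    // Unchanged screen: skip the vision call and everything after it.
    if (hash === lastHash) {
      return { skipped: true };
    }
    lastHash = hash;

    const analysis = await analyzeScreen(image, task);
    const line = await generateLine(analysis);
    await speak(line);
    return { skipped: false, analysis, line };
  };
}

// Stub dependencies so the sketch runs without any API keys.
let visionCalls = 0;
const frames = [Buffer.from('frame-a'), Buffer.from('frame-a'), Buffer.from('frame-b')];
const spoken = [];
const tick = createFocusLoop({
  captureScreen: async () => frames.shift(),
  analyzeScreen: async () => { visionCalls++; return { onTask: true }; },
  generateLine: async (a) => (a.onTask ? 'Keep going.' : 'Back to work.'),
  speak: async (line) => { spoken.push(line); },
});
```

Because each stage is awaited before the next begins, a slow speech playback naturally delays the next capture. A byte-exact hash is a deliberately simple stand-in: real screenshots often differ by a few pixels (a clock, a cursor), so a practical heuristic would likely compare downscaled or perceptually hashed frames instead.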

Product flavor

Character System

Drill Sergeant: direct, military-style tough love.
Disappointed Mom: affectionate but deeply let down.
Anime Rival: competitive, dramatic, and motivational.
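
The three characters above lend themselves to a small data table plus a prompt builder. This is a sketch under assumptions: the ids, field names, `buildDialoguePrompt`, and the prompt wording are all hypothetical, and only the character names and tones come from the list above.

```javascript
// Hypothetical character table: ids, field names, and prompt wording are
// illustrative; only the three names and tones come from the description.
const CHARACTERS = {
  drillSergeant: { name: 'Drill Sergeant', tone: 'direct, military-style tough love' },
  disappointedMom: { name: 'Disappointed Mom', tone: 'affectionate but deeply let down' },
  animeRival: { name: 'Anime Rival', tone: 'competitive, dramatic, and motivational' },
};

// Builds the text prompt for one reaction, given the analysis outcome.
function buildDialoguePrompt(characterId, { task, onTask }) {
  const character = CHARACTERS[characterId];
  if (!character) {
    throw new Error(`Unknown character: ${characterId}`);
  }
  const situation = onTask
    ? `The user is working on "${task}". Praise them briefly.`
    : `The user has drifted away from "${task}". Call it out briefly.`;
  return `You are ${character.name}. Your tone is ${character.tone}. ${situation}`;
}

const prompt = buildDialoguePrompt('animeRival', { task: 'write report', onTask: false });
```

Keeping the characters as plain data means a new personality only needs a new table entry, and the prompt builder can be unit-tested without calling any API.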

Decisions

Technical choices

Used periodic screenshots rather than manual check-ins so the app can react to actual behavior.
Separated character definitions, memory, screen capture, analysis, and speech into testable modules.
Made the coach live as a small desktop overlay so feedback is ambient rather than buried in a dashboard.

Tradeoffs

Constraints and next choices

Screen Recording permission adds setup friction, but it is necessary for accurate, context-aware feedback.
Voiced reactions make the experience vivid, but require careful settings and API-key handling.
Skipping vision calls on unchanged screens reduces token usage, but requires screen-change and idle heuristics.