The source post shows an interactive visual demo in which hand tracking (MediaPipe) controls a WebGPU + TSL panel scene built on a React Three Fiber starter. The quoted parent post frames it as "Stage dives into WebGPU Render Targets + TSL," and the follow-up adds hand detection on top.
| Layer | Role | Why it matters |
|---|---|---|
| WebGPU | Modern GPU backend | Supports high-frequency render-target experimentation with less CPU overhead |
| TSL / Node Material | Shader graph authoring | Lets iteration happen in JS/TS nodes instead of raw WGSL/GLSL for faster prototyping |
| MediaPipe hand landmarks | Gesture signal | Converts camera input into normalized hand keypoints for controls |
| R3F | React orchestration | Keeps render loop, components, and controls composable |
Replicate the same architecture path: demo inspection -> runtime signal check -> implementation scaffold with hand landmarks driving panel transforms.
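For the runtime signal check step, a small guard can confirm the landmark stream has the expected shape before it drives the scene. This is a hypothetical helper (not from the creator's build), assuming MediaPipe's convention of 21 hand landmarks with normalized x/y roughly in [0, 1]:

```js
// Hypothetical sanity check for one MediaPipe hand-landmark frame:
// 21 points, each with finite x/y in normalized screen space.
// A small margin is allowed because landmarks can overshoot frame edges.
function isUsableHandFrame(landmarks) {
  if (!Array.isArray(landmarks) || landmarks.length !== 21) return false;
  return landmarks.every(
    (p) =>
      Number.isFinite(p.x) &&
      Number.isFinite(p.y) &&
      p.x >= -0.1 && p.x <= 1.1 &&
      p.y >= -0.1 && p.y <= 1.1
  );
}
```

Gating the mapping step on this check keeps partial detections (hand leaving frame, warm-up frames) from injecting NaNs into the panel transforms.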
- Demo: mediapipe-panels.vercel.app
- Starter: r3f-webgpu-starter
- Stack: TSL / WebGPU / Webcam / MediaPipe
```js
// Maps MediaPipe hand landmarks (21 normalized points) -> panel transform targets
export function mapHandToPanel(landmarks) {
  // MediaPipe hands emit 21 landmarks; bail out on partial frames
  // (indices up to 12 are read below, so the old `< 9` guard was too loose)
  if (!landmarks || landmarks.length < 21) return null;
  const wrist = landmarks[0];      // palm anchor
  const indexMcp = landmarks[5];   // index-finger knuckle
  const middleTip = landmarks[12]; // cursor source
  const dx = indexMcp.x - wrist.x;
  const dy = indexMcp.y - wrist.y;
  // Thumb tip (4) to index tip (8) distance = pinch scalar
  const pinch = Math.hypot(
    landmarks[4].x - landmarks[8].x,
    landmarks[4].y - landmarks[8].y
  );
  return {
    rotY: (dx - 0.08) * 2.6,               // horizontal hand shift -> panel yaw
    rotX: (dy - 0.10) * -2.0,              // vertical hand shift -> panel pitch
    z: Math.max(-1.8, -0.9 - pinch * 2.4), // pinch -> push/pull depth
    glow: Math.max(0, 1.0 - pinch * 3.0),  // pinch closes glow
    cursorX: middleTip.x,
    cursorY: middleTip.y
  };
}
```
This mirrors the likely interaction grammar in the artifact: stable anchor points, a pinch-distance scalar, then low-pass smoothing in the render loop.
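A minimal version of that smoothing step, assuming a per-frame exponential moving average over the numeric control fields (the `alpha` default here is a starting guess to tune, not the creator's constant):

```js
// Exponential moving average over each numeric control channel.
// alpha near 0 = heavy smoothing, laggy; alpha near 1 = responsive, jittery.
function smoothControls(prev, target, alpha = 0.25) {
  if (!target) return prev;        // hand lost: hold the last pose
  if (!prev) return { ...target }; // first frame: snap to the target
  const next = {};
  for (const key of Object.keys(target)) {
    next[key] = prev[key] + (target[key] - prev[key]) * alpha;
  }
  return next;
}
```

In an R3F `useFrame` callback this would run once per frame, feeding the smoothed `rotY`/`rotX`/`z` into the panel's transform and `glow` into the material.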
| Blocker | Type | Status | Resolution |
|---|---|---|---|
| No live webcam stream in this execution context | Access gap | Unresolved here | Run locally on Mac/Chrome with camera permissions and calibrate landmarks in-browser |
| Unknown exact smoothing constants from creator build | Taste/feel gap | Partially resolved | Tune EMA/spring constants while testing real hand motion |
| TSL graph specifics not directly visible from post | Tool gap | Resolved by pattern | Start with node-based color/depth modulation, then iterate visually |
| Question | Assessment |
|---|---|
| Is this durable skill or one-off novelty? | Durable. The pattern (vision landmarks -> normalized control bus -> GPU scene modulation) generalizes to creative tools and UI prototyping. |
| Does it require rare hardware? | No. Modern Chrome + camera + WebGPU-capable machine is sufficient; scaling quality is mostly software tuning. |
| Biggest false assumption risk | Thinking shader complexity is the hard part. In practice, motion filtering and gesture semantics dominate perceived quality. |
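The hardware claim above is easy to verify at runtime. A hedged sketch of a capability check for the demo's two requirements, written with an injected `nav` object so it can run outside a browser (pass the real `navigator` in the page itself):

```js
// Reports which of the demo's two requirements are available.
// `nav` is injected for testability; in production call checkCapabilities(navigator).
function checkCapabilities(nav) {
  return {
    // WebGPU is exposed as navigator.gpu in supporting browsers
    webgpu: !!(nav && nav.gpu),
    // Webcam access goes through navigator.mediaDevices.getUserMedia
    camera: !!(
      nav &&
      nav.mediaDevices &&
      typeof nav.mediaDevices.getUserMedia === "function"
    ),
  };
}
```

Running this before requesting the webcam stream lets the page show a targeted fallback message instead of failing mid-initialization.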
Donna owns the architecture and implementation scaffolding; Eric owns live calibration. This splits cleanly: Donna writes and iterates the pipeline quickly, while Eric spends short hands-on sessions tuning real camera behavior for production feel.
If the goal is shipping a polished clone, the next action is to run the scaffold locally with webcam permission, tune 3-5 control constants, and record a before/after interaction clip.