The Events Arena is a first-of-its-kind product in VR: a stadium-scale venue where up to 1,000 people attend live events together with stereoscopic 3D video, spatial audio, and immersive environmental effects, an experience that doesn't exist anywhere else in the industry. It launched as one of three hero worlds in Horizon Social Reality, debuting at Meta Connect 2025 (Week 39) and followed by a 180° stereo layout for Winter Moment 2025. The Arena supports rectilinear and 180° stereo video layouts and can be rebranded for any event (Connect, Sabrina Carpenter, Charli XCX) the same way a real-world stadium reconfigures for different shows.
The community response validated the vision. Users described it as “the first time I felt the metaverse”, praising the live 3D video quality, the sense of shared presence with hundreds of avatars, and how confetti and crowd reactions made it feel like a real event. One user wrote: “It's definitely the beginning of the future.” One such reaction drew 350+ upvotes and dozens of comments asking for more events in this format.
Real-Time Lighting Pipeline
The Arena works like a theatre: a large screen faces the audience and users move freely around the space in front of it. To make this feel real rather than flat, the VR environment needs to react to whatever is playing on screen. If the video cuts to a warm orange stage light, the floor, walls, and avatars in the Arena should pick up that same warm glow. I built the end-to-end graphics pipeline that makes this happen in real time.
CPU Path: Java PixelCopy
The initial implementation ran entirely on the CPU. Every 16ms, we used Java's PixelCopy API to grab the current video frame, downsampled the 4K source to a 128x128 image, and computed the average color across all pixels. This color was then pushed to the GPU as a uniform for the surface shaders to consume.
Simple averaging worked for straightforward content but broke down in practice. Videos with large black borders (letterboxing, non-standard aspect ratios) would drag the average color toward dark grey, washing out the actual lighting from the content area. For example, the NCT Dream concert had significant black regions on the edges, which meant the computed color was far dimmer than what the viewer was actually seeing on screen. Creators had to manually adjust lighting presets per event to compensate, which didn't scale.
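The letterboxing bias is easy to reproduce in isolation. The sketch below is a hypothetical stand-in for the averaging step, not the production code: it averages a packed-ARGB frame the way the first implementation is described. The class and method names (AverageColorSketch, averageRgb, letterboxed) are assumptions made for illustration.

```java
final class AverageColorSketch {
    /** Mean of each 8-bit channel over a packed ARGB frame; returns {r, g, b} in 0..255. */
    static float[] averageRgb(int[] argb) {
        long r = 0, g = 0, b = 0;
        for (int p : argb) {
            r += (p >> 16) & 0xFF;
            g += (p >> 8) & 0xFF;
            b += p & 0xFF;
        }
        float n = argb.length;
        return new float[] {r / n, g / n, b / n};
    }

    /** Builds a size x size frame whose top and bottom `bar` rows are black. */
    static int[] letterboxed(int size, int bar, int contentArgb) {
        int[] frame = new int[size * size];
        for (int y = 0; y < size; y++) {
            boolean inBar = y < bar || y >= size - bar;
            for (int x = 0; x < size; x++) {
                frame[y * size + x] = inBar ? 0xFF000000 : contentArgb;
            }
        }
        return frame;
    }
}
```

With 32-row bars on a 128x128 frame, half the pixels are black, so an orange content color of (255, 128, 0) averages to (127.5, 64, 0): half the brightness the viewer sees in the content area.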
I rebuilt the color extraction to use luminance-weighted dominant color detection instead of simple averaging. The pipeline converts sRGB pixels to linear space, computes per-pixel luminance (BT.709 coefficients), and weights each pixel's color contribution by its luminance so bright content dominates over dark borders. A 100-bin luminance histogram classifies whether the scene is intentionally dark (P99 luminance < 0.02). If so, intensity drops to 0.02; otherwise, a dominant color whose luminance falls below 0.35 is boosted to that floor, so the lighting output stays stable across different video content without manual presets.
// Core of onVideoFrameCopy, called every 33ms on a 128x128 downsampled frame
for (int i = 0; i < pixels.length; i++) {
    float r = sRGBToLinear(Color.red(pixels[i]));
    float g = sRGBToLinear(Color.green(pixels[i]));
    float b = sRGBToLinear(Color.blue(pixels[i]));
    float luminance = 0.2126f * r + 0.7152f * g + 0.0722f * b;
    luminanceHist[Math.min((int)(luminance * 100), 99)] += 1;
    // Luminance-weighted accumulation: bright pixels dominate
    weightedR += r * luminance;
    weightedG += g * luminance;
    weightedB += b * luminance;
    totalLuminance += luminance;
}
// Dark scene detection via P99 histogram
if (finalLuminance == 0 || isDarkScene(luminanceHist)) {
    intensity = 0.02f; // near-black for intentional darks
} else if (finalLuminance < 0.35f) {
    intensity = 0.35f / finalLuminance; // boost to consistent range
}
The tradeoff was performance. The histogram-based dominant color extraction was significantly more expensive than the original averaging on Quest's mobile chipset:
| onVideoFrameCopy | Count | Min | Max | Avg | StdDev |
|---|---|---|---|---|---|
| Original (averaging) | 5,297 | 0.225 ms | 13.192 ms | 0.462 ms | 0.467 ms |
| Dominant color (histogram) | 4,127 | 0.779 ms | 62.905 ms | 4.312 ms | 1.851 ms |
A ~9x regression in average latency and a 62ms worst case (nearly 4 frames at 60Hz) made this untenable on the CPU path.
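The CPU excerpt above calls sRGBToLinear and isDarkScene without showing them. The following is a minimal, self-contained sketch of the whole extraction under stated assumptions, not the shipped code: it uses the standard sRGB decode formula, treats P99 as the first histogram bin at which the cumulative count reaches 99% of pixels, assumes a default intensity of 1.0, and unpacks ARGB with bit shifts instead of android.graphics.Color so it runs off-device. All names in DominantColorSketch are hypothetical.

```java
final class DominantColorSketch {
    static final int BINS = 100;
    static final int DARK_BIN_LIMIT = 2;       // bins 0..1 cover luminance < 0.02
    static final float DARK_INTENSITY = 0.02f;
    static final float MIN_LUMINANCE = 0.35f;

    /** Standard sRGB decode for one 8-bit channel. */
    static float srgbToLinear(int channel8) {
        float c = channel8 / 255f;
        return c <= 0.04045f ? c / 12.92f
                             : (float) Math.pow((c + 0.055f) / 1.055f, 2.4);
    }

    /** BT.709 luminance of a linear RGB color. */
    static float luminance(float r, float g, float b) {
        return 0.2126f * r + 0.7152f * g + 0.0722f * b;
    }

    /** True when the 99th-percentile pixel falls below luminance 0.02. */
    static boolean isDarkScene(int[] hist, int pixelCount) {
        long needed = (long) Math.ceil(pixelCount * 0.99);
        long seen = 0;
        for (int bin = 0; bin < hist.length; bin++) {
            seen += hist[bin];
            if (seen >= needed) return bin < DARK_BIN_LIMIT;
        }
        return false;
    }

    /** Returns {r, g, b, intensity}; r, g, b are linear and luminance-weighted. */
    static float[] extract(int[] argb) {
        int[] hist = new int[BINS];
        float wr = 0f, wg = 0f, wb = 0f, total = 0f;
        for (int p : argb) {
            float r = srgbToLinear((p >> 16) & 0xFF);
            float g = srgbToLinear((p >> 8) & 0xFF);
            float b = srgbToLinear(p & 0xFF);
            float lum = luminance(r, g, b);
            hist[Math.min((int) (lum * BINS), BINS - 1)]++;
            wr += r * lum;
            wg += g * lum;
            wb += b * lum;
            total += lum;
        }
        // A fully black frame has no dominant color; treat it as a dark scene.
        if (total == 0f) return new float[] {0f, 0f, 0f, DARK_INTENSITY};
        float r = wr / total, g = wg / total, b = wb / total;
        float finalLum = luminance(r, g, b);
        float intensity = 1f; // assumed default when no adjustment applies
        if (isDarkScene(hist, argb.length)) {
            intensity = DARK_INTENSITY;
        } else if (finalLum < MIN_LUMINANCE) {
            intensity = MIN_LUMINANCE / finalLum;
        }
        return new float[] {r, g, b, intensity};
    }
}
```

On a frame that is half black bars and half orange (255, 128, 0), the black pixels carry zero weight, so the result is the orange itself at intensity 1.0 rather than a half-brightness average. The per-channel Math.pow call is a plausible contributor to the cost measured above; a 256-entry lookup table would be one way to avoid it.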
GPU Path: Compute Kernel
The fix is moving the entire color sampling pipeline to a GPU compute kernel. The video frame is already a GPU texture after hardware decoding, so the CPU path was doing a pointless round-trip: GPU texture → CPU readback (pipeline stall) → Java processing → push result back to GPU as a uniform. A compute shader eliminates this entirely by dispatching a parallel reduction directly on the video texture, computing the luminance histogram and dominant color on-GPU, and writing the result to a small buffer that the surface shaders consume in the same frame. No readback, no stall, and the histogram computation that was expensive on CPU becomes trivial when parallelized across GPU threads.
The equivalent GPU compute kernel runs the same algorithm, parallelized across GPU threads with no CPU readback:
#version 310 es
layout(local_size_x = 16, local_size_y = 16) in; // 256 threads per workgroup
uniform sampler2D videoFrame; // video texture, already on GPU
layout(std430, binding = 0) buffer ResultBuffer {
    vec4 dominantColor; // consumed by surface shaders same frame
};
shared vec3 weightedColorAccum[256];
shared float totalLumAccum[256];
shared uint lumHistogram[100];
void main() {
    uint localIdx = gl_LocalInvocationIndex;
    // Clear the shared histogram before any thread counts into it
    if (localIdx < 100u) lumHistogram[localIdx] = 0u;
    barrier();
    vec3 tileWeightedColor = vec3(0.0);
    float tileLum = 0.0;
    // Each thread samples a tile directly from the video texture; per texel (srgb):
    vec3 linear = vec3(sRGBToLinear(srgb.r), sRGBToLinear(srgb.g), sRGBToLinear(srgb.b));
    float lum = dot(linear, vec3(0.2126, 0.7152, 0.0722));
    atomicAdd(lumHistogram[min(uint(lum * 100.0), 99u)], 1u);
    tileWeightedColor += linear * lum; // same luminance-weighted accumulation
    tileLum += lum;
    // Publish this thread's partial sums before reducing
    weightedColorAccum[localIdx] = tileWeightedColor;
    totalLumAccum[localIdx] = tileLum;
    barrier();
    // Parallel reduction across 256 threads
    for (uint stride = 128u; stride > 0u; stride >>= 1u) {
        if (localIdx < stride) {
            weightedColorAccum[localIdx] += weightedColorAccum[localIdx + stride];
            totalLumAccum[localIdx] += totalLumAccum[localIdx + stride];
        }
        barrier();
    }
    // Thread 0: same dark scene detection + intensity boost as Java path
    if (localIdx == 0u) {
        // max() avoids 0/0 on a fully black frame
        vec3 avgColor = weightedColorAccum[0] / max(totalLumAccum[0], 1e-6);
        float finalLum = dot(avgColor, vec3(0.2126, 0.7152, 0.0722));
        // ... P99 histogram check, intensity = 0.35 / finalLum
        dominantColor = vec4(avgColor * intensity, intensity);
    }
}
On Quest 3 (Adreno 740, Snapdragon XR2 Gen 2), the GPU path eliminates the CPU readback stall (1-3ms on Quest) and parallelizes the 16,384-pixel loop across 256 threads. A 128x128 reduction with a 100-bin histogram is a trivially small workload for the Adreno 740's ~2.5 TFLOPS, bringing the per-frame cost from ~4.3ms average (62ms worst case) down to under 0.1ms. The worst-case spikes disappear entirely since they were caused by CPU scheduling contention and Java GC pauses, neither of which exists on the GPU path.
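The stride-halving loop in the kernel can be checked on the CPU. This is a small Java simulation of that reduction pattern only, assuming a power-of-two number of partial sums; TreeReductionSketch and reduce are hypothetical names, and the inner loop stands in for threads that run concurrently on the GPU.

```java
final class TreeReductionSketch {
    /**
     * Sums the values using the kernel's stride-halving pattern.
     * The input length must be a power of two; the input array is not modified.
     */
    static float reduce(float[] values) {
        float[] shared = values.clone(); // stands in for workgroup shared memory
        for (int stride = shared.length / 2; stride > 0; stride >>= 1) {
            // On the GPU each localIdx is a separate thread and barrier() ends the pass.
            // Running them in sequence is equivalent: a pass writes only indices below
            // stride and reads only indices at or above it.
            for (int localIdx = 0; localIdx < stride; localIdx++) {
                shared[localIdx] += shared[localIdx + stride];
            }
        }
        return shared[0];
    }
}
```

For 256 partial sums this takes 8 passes (128, 64, ..., 1) instead of 255 sequential additions on one thread, which is why the per-workgroup reduction is cheap once each thread has accumulated its own tile.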
Surface Shader
The sampled color (whether from the CPU or GPU path) is made available to shaders via an external luminance uniform. I wrote a custom surface shader that runs in the GPU forward rendering pass and is applied to every static mesh in the Arena. The shader computes each fragment's distance and orientation relative to the screen position, then blends the sampled video color into the surface lightmap. Fragments facing the screen pick up the full color; fragments facing away fall off via a smoothstep on the dot product between the surface normal and the direction to the screen. A distance-based feather factor ensures the effect fades naturally past the screen's radius so surfaces far from the stage aren't tinted.
The core of the lightmap blending in the surface shader:
// Video lighting, runs per-fragment on every mesh in the Arena
float distance = length(matParams.screenPosition.xyz - d.worldSpacePosition.xyz);
vec3 normalDir = normalize(d.worldSpaceNormal);
// Direction from the fragment to the screen
vec3 lightDir = normalize(matParams.screenPosition.xyz - d.worldSpacePosition.xyz);
float normalDot = dot(normalDir, lightDir);
float featherFactor = smoothstep(
    matParams.screenRadius - 2.0, matParams.screenRadius, distance);
// 1 when the surface faces the screen, 0 when it faces away
float transitionFactor = smoothstep(0.0, 1.0, normalDot);
vec3 targetColor = mix(
    vec3(getExternalAverageLuminance().r), // luminance only
    getExternalAverageLuminance().rgb,     // full color
    transitionFactor);                     // blend by facing angle
// Fade to luminance-only past screen radius
s.lightmap *= mix(targetColor, vec3(getExternalAverageLuminance().r), featherFactor);
For dynamic objects like avatars, the same sampled color drives the directional light's color and intensity each frame, so players walking through the space are lit consistently with the static environment. The combined effect is that every surface in the Arena, static and dynamic, responds to the video content in real time. The constraint throughout was Quest hardware: two views (one per eye) at 72-90Hz with a hard latency budget, so the shader had to be cheap enough to run on every mesh in the scene without missing frames.
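The blend described in this section (full color on surfaces facing the screen, falling off to an untinted luminance-only value on surfaces facing away or beyond the screen radius) can be expressed as a CPU reference function for sanity-checking values. This is a sketch under assumptions, not the shader itself: VideoLightBlendSketch and its parameters are hypothetical, smoothstep and mix are reimplemented with their standard GLSL definitions, and the luminance-only value is passed in rather than read from a uniform.

```java
final class VideoLightBlendSketch {
    /** GLSL-style smoothstep: 0 below e0, 1 above e1, Hermite curve between. */
    static float smoothstep(float e0, float e1, float x) {
        float t = Math.max(0f, Math.min(1f, (x - e0) / (e1 - e0)));
        return t * t * (3f - 2f * t);
    }

    /** GLSL-style mix: linear interpolation from a to b. */
    static float mix(float a, float b, float t) {
        return a + (b - a) * t;
    }

    /**
     * Tint multiplied into the lightmap for one fragment.
     * color: sampled video color {r, g, b}; lum: its luminance-only value;
     * facing: dot(surface normal, direction to screen); distance: to the screen.
     */
    static float[] tint(float[] color, float lum, float facing,
                        float distance, float radius, float featherWidth) {
        float transition = smoothstep(0f, 1f, facing);
        float feather = smoothstep(radius - featherWidth, radius, distance);
        float[] out = new float[3];
        for (int i = 0; i < 3; i++) {
            float target = mix(lum, color[i], transition); // blend by facing angle
            out[i] = mix(target, lum, feather);            // fade out past the radius
        }
        return out;
    }
}
```

A fragment facing the screen at close range returns the video color unchanged, while one facing away, or one at or beyond the radius, returns the luminance-only value in all three channels.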
Immersive Mode
One of the core features I built was Immersive Mode, the passthrough rendering pipeline that lets users switch into a full 3D stereoscopic view during 180° events. When a user sits down and activates 3D mode, the Arena transitions from the normal environment into a passthrough/immersive view where only the video content, UI, and cursor are rendered; everything else (avatars, VFX, meshes) is hidden to create an uninterrupted cinematic experience.
The main technical challenge was handling VFX correctly across rendering modes. Particle effects and meshes that were added after the initial implementation would continue rendering during immersive mode, causing visual artifacts. I solved this by adding a check in the PopcornFX rendering pipeline that inspects the camera's filter behavior: when the camera is in IncludeList mode (passthrough/immersive), VFX rendering is skipped entirely; when it's in ExcludeList mode (normal rendering with specific exclusions like the mirror avatar), VFX renders normally. This approach is more robust than checking for specific filter list entries because it keys off the rendering mode itself rather than individual object names.
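The gating rule can be illustrated with a small model. The engine-side code is not shown in this document, so everything below is a hypothetical sketch written in Java to match the other examples here: FilterBehavior, CameraFilter, and shouldRenderVfx are invented names that mirror the IncludeList and ExcludeList modes described above.

```java
import java.util.Set;

final class VfxGateSketch {
    enum FilterBehavior { INCLUDE_LIST, EXCLUDE_LIST }

    /** Hypothetical stand-in for the view camera's filter state. */
    static final class CameraFilter {
        final FilterBehavior behavior;
        final Set<String> entries;

        CameraFilter(FilterBehavior behavior, Set<String> entries) {
            this.behavior = behavior;
            this.entries = entries;
        }
    }

    /**
     * VFX renders only when the camera is not in include-list (immersive) mode.
     * The decision ignores the list entries on purpose, so effects added later
     * are hidden without anyone having to register them by name.
     */
    static boolean shouldRenderVfx(CameraFilter filter) {
        return filter.behavior != FilterBehavior.INCLUDE_LIST;
    }
}
```

In this model, an include-list camera hides VFX regardless of what the list holds, and an exclude-list camera that excludes only the mirror avatar still renders VFX normally.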
The changes touched the view camera manager (adding an includeListContains API), the PopcornFX scene module (injecting camera manager awareness), and the BUCK build graph. The fix had to preserve existing behavior for the mirror avatar exclusion system while cleanly gating all VFX during immersive transitions.
Post-launch analysis across 58 events with 111,925 distinct users showed a statistically significant positive correlation (+0.237 to +0.245) between immersive time spent and overall event time spent. Users who activated immersive mode averaged 9.08 minutes per session compared to 3.24 minutes for those who did not. The entertainment persona (Viola) showed the highest immersive engagement, at 44% of event time among those who used the feature, but their seating utilization was lower. Since immersive mode is activated from a seat, this suggests the feature was gated behind a discoverability barrier rather than a lack of interest.