
CS2-10k is a new open dataset from Reka AI that may be the most practically useful release for world model researchers this year. It contains over 600,000 first-person video clips from professional Counter-Strike 2 matches, totaling more than 10,000 hours of footage. Each clip is paired with exact per-frame annotations: which keys were pressed, how the mouse moved, and where the player was in 3D space. All of it is free, available now on Hugging Face, and comes with an open-source pipeline to generate more.
The data problem nobody talks about enough
World models are systems that learn to predict how a visual environment changes in response to actions. Think of them as neural simulators: given a frame and an input ("press W, move mouse right"), they generate the next frame. Training them well requires something very specific: egocentric video tightly synchronized with the exact actions that caused each visual change.
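To make the training setup concrete, here is a minimal Python sketch of how a clip with per-frame annotations could be turned into the (current frame, action, next frame) examples a world model learns from. The `FrameAnnotation` fields and the `transitions` helper are hypothetical names chosen for illustration; they are not the dataset's actual schema, and frames are stood in for by strings rather than decoded images.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class FrameAnnotation:
    """Hypothetical per-frame record; field names are illustrative only."""
    keys: Tuple[str, ...]                  # keys held during this frame, e.g. ("W",)
    mouse_dx: float                        # horizontal mouse movement
    mouse_dy: float                        # vertical mouse movement
    position: Tuple[float, float, float]   # player position in 3D space


def transitions(
    frames: List[str], annotations: List[FrameAnnotation]
) -> Iterator[Tuple[str, FrameAnnotation, str]]:
    """Yield (frame_t, action_t, frame_t+1) triples for next-frame prediction."""
    if len(frames) != len(annotations):
        raise ValueError("each frame needs exactly one annotation")
    for t in range(len(frames) - 1):
        yield frames[t], annotations[t], frames[t + 1]


# Placeholder clip: three frames, each with its synchronized annotation.
clip_frames = ["frame_0", "frame_1", "frame_2"]
clip_annotations = [
    FrameAnnotation(("W",), 0.0, 0.0, (0.0, 0.0, 64.0)),
    FrameAnnotation(("W", "D"), 12.5, -1.0, (4.0, 1.0, 64.0)),
    FrameAnnotation((), 0.0, 0.0, (8.0, 2.0, 64.0)),
]
triples = list(transitions(clip_frames, clip_annotations))
```

The last frame has no successor, so a clip of N frames yields N - 1 training examples. The tight synchronization matters because a misaligned annotation would pair an action with a visual change it did not cause.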
The shift from video generation to interactive world modeling places new demands on data. Beyond captioned videos, world models require temporally aligned video-action-language trajectories grounded in the actions, camera motion, and states that drive future scene changes. Such data is difficult to obtain at scale: web video datasets offer broad visual coverage but lack executable actions; robotic datasets provide action supervision but are costly and limited in scene diversity.
This is precisely the gap CS2-10k fills. And it does so without a single human labeler.
Why Counter-Strike, specifically
Training interactive world models requires data that is notoriously hard to find: egocentric video sequences with densely aligned action signals synchronized to the visual stream. CS2 sidesteps the collection problem entirely through a quirk of how the game works.
Counter-Strike 2 demos offer a compelling middle ground: because matches are recorded as deterministic replays, Reka can reconstruct clean first-person video at any point in a match, extracting the precise control inputs that drove each visual change. No estimation, no labeling, no approximation. The ground truth is baked into the replay file.
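The reason determinism removes the need for labeling can be shown with a toy simulator. The sketch below is not CS2's engine or Reka's pipeline; the `Input`, `State`, `step`, and `replay` names are invented for illustration. It only demonstrates the general property: when the update rule is deterministic, replaying the recorded inputs reproduces every state exactly, so each state comes with the input that produced the next one.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Input:
    forward: bool     # whether the forward key is held
    mouse_dx: float   # yaw change in degrees for this tick


@dataclass(frozen=True)
class State:
    x: float
    y: float
    yaw: float        # view direction in degrees


def step(state: State, inp: Input) -> State:
    """Deterministic update: same state and input always give the same result."""
    yaw = (state.yaw + inp.mouse_dx) % 360.0
    speed = 1.0 if inp.forward else 0.0
    return State(
        state.x + speed * math.cos(math.radians(yaw)),
        state.y + speed * math.sin(math.radians(yaw)),
        yaw,
    )


def replay(initial: State, inputs: List[Input]) -> List[State]:
    """Re-run the recorded inputs from the initial state."""
    states = [initial]
    for inp in inputs:
        states.append(step(states[-1], inp))
    return states


recorded_inputs = [Input(True, 0.0), Input(True, 90.0), Input(False, -45.0)]
first_pass = replay(State(0.0, 0.0, 0.0), recorded_inputs)
second_pass = replay(State(0.0, 0.0, 0.0), recorded_inputs)

# Each state is paired with the exact input applied to it: labels for free.
labels = list(zip(first_pass[:-1], recorded_inputs))
```

Both passes produce identical state sequences, which is what lets a replay be re-rendered at any point while the recorded inputs serve as ground-truth action labels.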
