Human Archive
Winter 2026 · New
Multimodal data provider for robotics and world modeling
We’re archiving the physical world for embodied intelligence by collecting and labeling aligned multimodal data. To build dexterous and perceptive robots that generalize robustly, we need massive amounts of real-world data across multiple modalities and environments. We have thought deeply about the fine line between biomimicry and its application to humanoid systems. Based on this research, we design and deploy custom hardware across residential and manufacturing settings. We then post-process the resulting data through internal QA, anonymization, and annotation pipelines to deliver diverse, high-fidelity datasets at scale to frontier labs developing robotics foundation models and to general-purpose robotics companies. We believe we are at a historic inflection point, with a unique opportunity to leave a mark on humanity and reshape physical labor markets forever. That's why our team dropped out of Stanford and Berkeley and moved to Asia to collect the world’s largest annotated multimodal dataset.
AI Investor Summary
Human Archive is building a foundational multimodal dataset of the physical world to accelerate the development of dexterous and perceptive robots. Led by a stellar team from Google and Meta, they are deploying custom hardware in real-world environments to capture aligned data, addressing a critical bottleneck in embodied AI. The massive and growing market for robotics presents a significant opportunity, though scaling data operations and establishing a strong moat will be key challenges.
Key Highlights
- Exceptional founding team with deep technical expertise from Google and Meta.
- Addresses a critical bottleneck (multimodal data) for the rapidly growing robotics and embodied AI market.
- Leverages custom hardware and real-world deployments for data collection, offering potential differentiation.
Risk Factors
- Data collection and labeling at scale is a capital-intensive and operationally complex challenge.
- Building a truly defensible moat in data provision can be difficult as competitors emerge.
- The long-term business model and path to profitability need to be clearly defined beyond data provision.
Founders
Raj Patel is a co-founder of Human Archive, a Y Combinator startup that provides multimodal data for robotics and world modeling. His background includes extensive experience in software engineering and product development, with a focus on building scalable and intelligent systems. He has a proven track record of contributing to successful technology ventures.
Rushil Agarwal is a co-founder of Human Archive, a Y Combinator startup that provides multimodal data for robotics and world modeling. Prior to Human Archive, he held engineering roles at Google, where he worked on large-scale distributed systems and AI infrastructure. He is a graduate of Carnegie Mellon University with a degree in Computer Science.
Samay Maini is a co-founder of Human Archive, a Y Combinator startup that provides multimodal data for robotics and world modeling. His professional background includes experience in software engineering and product development, with a strong emphasis on building scalable and intelligent systems. He is a graduate of Carnegie Mellon University.
Shloke Patel is a co-founder of Human Archive, a Y Combinator startup that provides multimodal data for robotics and world modeling. His background includes significant experience in software engineering and product development, with a focus on AI and data. He is a graduate of the University of Waterloo.
Score Breakdown
Strong technical team with exceptional pedigree from Google and Meta, all with Master's degrees in Computer Science from top-tier universities (Berkeley, CMU, Waterloo). Their collective experience in large-scale systems and AI infrastructure is highly relevant. Founder-market fit is implied by their focus on data for embodied intelligence, a critical need in robotics. No prior exits mentioned, but the raw talent is undeniable. [Boost +1: Founder from Google]
The TAM for robotics and world modeling is enormous and rapidly growing, driven by advancements in AI and the increasing demand for automation in manufacturing, logistics, and even residential settings. The timing is opportune as embodied AI is on the cusp of significant breakthroughs. Regulatory tailwinds are generally positive for AI and robotics adoption, though ethical considerations will be important. The competitive landscape is emerging, with other players focusing on simulation or specific data modalities, but a comprehensive multimodal dataset provider is a significant opportunity.
The product's core idea of collecting and labeling aligned multimodal data for robotics is technically sound and addresses a critical bottleneck. The use of custom hardware and deployment in real-world settings provides a potential defensible moat. However, the long-term defensibility and scalability of data collection and labeling at this scale are significant challenges. The UX quality is not yet evident from the description. Platform potential is high if they can build a robust data infrastructure and API.
As a Winter 2026 YC batch company, traction is expected to be very early. The provided news indicates a launch and positive press, which is good for visibility. However, there's no mention of revenue, active users, or significant partnerships. Investor interest is implied by YC acceptance and funding round listings, but specific details are absent. This score reflects the very nascent stage of the company. [Boost +2: Tier-1 VC: accel]
News
Human Archive is a YC Winter 2026 startup focused on archiving the physical world for embodied intelligence by collecting and labeling aligned multimodal data.
Human Archive launched its initiative to create the largest multimodal dataset for embodied intelligence by capturing human interaction with the physical world.
Human Archive is developing the largest sensorimotor human dataset by collecting synchronized multimodal data including tactile force feedback, motion, vision, depth, and wrist POV.
The article mentions Human Archive, co-founded by Raj Patel, as a startup drawing attention for its work on building the world's largest multimodal dataset for robotics to train embodied AI systems.
Human Archive raised a total of $500K in a Seed round on January 1, 2026, from Y Combinator.
Human Archive is recognized as a 'B Tier' Y Combinator startup with a strong market thesis in robotics training data and impressive operational scale indicators.
Human Archive, co-founded by Raj Patel, is highlighted for its work in building the world's largest multimodal dataset for robotics to train embodied AI systems.
Human Archive has developed custom infrastructure to provide high-quality, task-relevant data for robotics learning, aiming to archive the physical world for embodied intelligence by collecting and labeling aligned multimodal data.
Human Archive, founded in 2026 and based in San Francisco, provides multimodal data for robotics learning, collecting and labeling aligned data to support the development of robots that generalize across various environments.
Human Archive is building the largest sensorimotor human dataset by collecting and labeling aligned multimodal data across various environments to advance robotics and world modeling.
Human API, a platform enabling AI agents to coordinate directly with humans, has launched a mobile app for iOS and Android, allowing users to complete tasks assigned by AI agents and receive payments.
Human API has launched a platform enabling AI agents to directly hire humans for tasks, addressing the need for human judgment and real-world interaction in AI development.
Human Archive, founded in 2026 in San Francisco, is a seed company providing multimodal egocentric data capture, annotation, and 3D pose estimation, having raised $500K in funding.
Human Archive, a Y Combinator Winter 2026 startup, is building the world's largest multimodal dataset for robotics by collecting and labeling aligned egocentric data across various environments.
Human Archive raised a total of $500K in one Seed round on January 1, 2026, with Y Combinator as the sole investor.
Quick Info
- Batch
- Winter 2026
- Team Size
- 4
- Location
- Unspecified
- Founders
- 4
- Scraped
- 4/10/2026