If AI is going to work its way out of the chat box and into our living rooms, it will need to understand spaces and objects better. To further that work, the Allen Institute for AI has created a gigantic and diverse database of 3D models of everyday objects, so simulations for AI models can be that much closer to reality.

Simulators are basically 3D environments meant to represent real places that a robot or AI might have to navigate or understand. But unlike, say, a modern console game, training simulators are far from photorealistic and often lack detail, variation, or interactivity.

Objaverse, as it is awkwardly yet somehow pleasingly named, aims to improve this with its collection of over 800,000 (and growing) 3D models with all kinds of metadata. The things represented range from types of food to tables and chairs to appliances and gadgets. Any relatively ordinary object you might expect to see in a home, office, or restaurant is represented here.

It’s meant to replace aging object libraries like ShapeNet, an old standby with about 50,000 models that are far less detailed. If the only “lamp” your AI has ever seen is a generic one with no pattern or color, how can you expect it to recognize a funky cut-glass one, or one with a totally different shape? Objaverse includes variations on common objects so the model can learn what defines them despite their differences.

Sure, it probably won’t be necessary for your AI assistant to identify a bookcase as “medieval” or not, but it should definitely know the difference between a peeled and an unpeeled banana. Then again, you never know what might matter.

Using photorealistic imagery (clearly captured via photogrammetry) also brings a level of diversity and realism whose value is obvious in retrospect. Sure, all beds look roughly the same, but what about unmade beds? All different!

It also helps that some objects animate to do their “main thing,” if you will. Knowing what a refrigerator, cabinet, book, laptop, or garage door looks like closed is one thing, and knowing what it looks like open is another, but how does it get from A to B? It sounds simplistic, but if AI models aren’t provided this information, they aren’t likely to invent or intuit it.

You can read more about the characteristics and details of this huge dataset in the AI2 paper describing it. And if you’re a researcher, you can start using it now for free via Hugging Face.
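For a rough sense of what working with that kind of metadata could look like, here is a minimal Python sketch that groups objects by tag so the variations on one kind of object can be pulled up together. It does not download or touch the real dataset: the records are made up, and the field names (`uid`, `name`, `tags`) and both helper functions are assumptions for illustration, not the dataset's confirmed schema or API.

```python
from collections import defaultdict

# Hypothetical metadata records. The field names are illustrative
# assumptions, not the dataset's confirmed schema.
records = [
    {"uid": "a1", "name": "Cut-glass lamp", "tags": ["lamp", "glass"]},
    {"uid": "b2", "name": "Desk lamp", "tags": ["lamp", "office"]},
    {"uid": "c3", "name": "Peeled banana", "tags": ["banana", "food"]},
    {"uid": "d4", "name": "Unmade bed", "tags": ["bed", "bedroom"]},
]


def group_by_tag(items):
    """Map each tag to the list of object uids that carry it."""
    groups = defaultdict(list)
    for item in items:
        for tag in item["tags"]:
            groups[tag].append(item["uid"])
    return dict(groups)


def variations(items, tag):
    """Return the names of all objects that share the given tag."""
    return [item["name"] for item in items if tag in item["tags"]]


# List every object tagged "lamp", however different the lamps look.
print(variations(records, "lamp"))
```

A real pipeline would read these records from the dataset's own annotations rather than a hand-written list, but the grouping step would look much the same.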

