The easiest way to
  • explore
  • curate
  • evaluate
  • fix
  • query
your
  • images
  • video
  • audio
  • point clouds
  • sensor data

Blazing-fast, modern data store for computer vision, built on open standards.

How it works

Lance is an open-source tool for blazing-fast exploration and analysis of computer vision data, backed by open standards.

Ingest data from any source easily

Start from our dataset zoo, or read your existing CSV, Parquet, or TFRecord data, or files on S3. We have auto-conversion helpers for common dataset types.

Query, aggregate, slice, and get insights fast

Use the power of OLAP to check aggregate statistics, search through vector space, and find anomalies. Get rich insights fast with our out-of-the-box visualization widgets.
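To make "search through vector space" concrete, here is a minimal brute-force sketch in plain Python. It does not use the Lance or Eto APIs. The embeddings, image ids, and function names are all hypothetical, and a real system would use an index instead of scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, embeddings, k=2):
    """Return the k image ids whose embeddings are most similar to the query."""
    scored = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in scored[:k]]


# Hypothetical 3-dimensional embeddings keyed by image id (illustration only).
EMBEDDINGS = {
    "cat_001.jpg": [0.9, 0.1, 0.0],
    "cat_002.jpg": [0.8, 0.2, 0.1],
    "truck_001.jpg": [0.0, 0.1, 0.9],
}

# Find the two images closest to a query embedding.
result = nearest([1.0, 0.0, 0.0], EMBEDDINGS, k=2)
print(result)
```

The same pattern extends to anomaly hunting: images whose nearest neighbors carry a different label are good candidates for review.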

Manage all your vision data with a single source of truth

With our dataset management features, you can track and compare versions over time, without having to worry about schema evolution. Easy backfills and more.

Features

Improve your data and models effortlessly

Eto lets you run blazing-fast SQL and Python to find misclassified or incorrect labels, track model regressions by comparing metrics over time, know when a model needs retraining, and much, much more. Backed by open standards and fully open-source, Eto is extensible to meet your needs.
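As a rough sketch of the kind of SQL this refers to, the example below uses Python's built-in sqlite3 module as a stand-in query engine. It is not Eto's API. The `predictions` table, its columns, and the 0.9 confidence threshold are assumptions made for illustration.

```python
import sqlite3

# In-memory database standing in for a real dataset (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE predictions ("
    "image_id TEXT, model_version TEXT, label TEXT, predicted TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO predictions VALUES (?, ?, ?, ?, ?)",
    [
        ("img_1", "v1", "cat", "cat", 0.97),
        ("img_2", "v1", "dog", "cat", 0.97),
        ("img_3", "v1", "dog", "dog", 0.88),
        ("img_1", "v2", "cat", "cat", 0.99),
        ("img_2", "v2", "dog", "cat", 0.98),
        ("img_3", "v2", "dog", "cat", 0.55),
    ],
)

# Confident disagreements between label and prediction are candidates
# for incorrect labels.
suspects = conn.execute(
    "SELECT DISTINCT image_id FROM predictions "
    "WHERE label != predicted AND score >= 0.9 ORDER BY image_id"
).fetchall()

# Accuracy per model version; a drop between versions signals a regression.
accuracy = conn.execute(
    "SELECT model_version, AVG(label = predicted) FROM predictions "
    "GROUP BY model_version ORDER BY model_version"
).fetchall()

print(suspects)
print(accuracy)
```

Here `img_2` is flagged because both model versions confidently disagree with its label, and the per-version accuracy makes the drop from `v1` to `v2` easy to spot.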

Blazing fast performance

Lance is Eto’s open-source columnar data format for vision datasets. It supports fast point lookups and has built-in vector and search indexing. Optimized for cloud storage, nested data, and large blobs, Lance offers blazing-fast performance at a fraction of the cost.

Analyze, visualize, and version in one place

Eto provides a complete ML data management solution with extensible visualizations to inspect, contextualize, and get insights fast. Versioning and lineage come out of the box for datasets and other artifacts, such as models and reports.

Use your existing tools

Eto is already integrated into your existing toolchain. Use Eto to work with your data alongside DuckDB, NumPy, Pillow, Arrow, PyTorch, and Ray. For production, it’s easy to add dbt models for Eto and run Mode dashboards on top of Eto as well.

Team

Eto is developed by a team that built data and ML infrastructure for self-driving cars and whose members are former core contributors to Pandas and Hadoop.

We are a Y Combinator company.

Built on open standards

Lance is fully open source. Visit us on GitHub to get started.

Visit us on GitHub