Lance is an open-source tool for blazing-fast exploration and analysis of computer vision data, backed by open standards.
Start from our dataset zoo, or read your existing CSV, Parquet, or TFRecord files, including those on S3. We have auto-conversion helpers for common dataset types.
Use the power of OLAP to check aggregate statistics, search through vector space, and find anomalies. Get rich insights fast with our out-of-the-box visualization widgets.
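To make "search through vector space" concrete, here is a minimal sketch of a brute-force nearest-neighbor lookup over image embeddings. It uses only the Python standard library and is not the Lance API or its index; the function names (`cosine_similarity`, `nearest`) and the toy embeddings are assumptions made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, embeddings, k=2):
    """Return the names of the k embeddings most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical 3-dimensional embeddings, invented for this example.
embeddings = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
    "img_d": [0.0, 0.0, 1.0],
}

result = nearest([1.0, 0.05, 0.0], embeddings, k=2)
print(result)
```

A linear scan like this compares the query against every vector, so it only suits small datasets; a vector index exists to avoid that full scan.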
With our dataset management features, you can track and compare versions over time, without having to worry about schema evolution. Easy backfills and more.
Eto enables you to run blazing-fast SQL and Python to find misclassified or incorrect labels, track model regressions by comparing metrics over time, know when you need to retrain a model, and much more. Backed by open standards and fully open-source, Eto is extensible to meet your needs.
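As a rough illustration of the kind of query this describes, the sketch below uses Python's built-in `sqlite3` module as a stand-in SQL engine (it is not Eto or Lance). The `predictions` table, its columns, and the sample rows are hypothetical. The first query surfaces confident disagreements between label and prediction, and the second computes per-class accuracy, a metric that could be compared across model versions.

```python
import sqlite3

# Hypothetical rows: (image, human label, model prediction, model confidence).
rows = [
    ("img_001.jpg", "cat", "cat", 0.98),
    ("img_002.jpg", "dog", "cat", 0.91),
    ("img_003.jpg", "dog", "dog", 0.87),
    ("img_004.jpg", "cat", "dog", 0.55),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE predictions (image TEXT, label TEXT, predicted TEXT, score REAL)"
)
conn.executemany("INSERT INTO predictions VALUES (?, ?, ?, ?)", rows)

# High-confidence disagreements are candidates for a wrong label
# (or for a genuine model error worth inspecting).
suspects = conn.execute(
    """
    SELECT image, label, predicted, score
    FROM predictions
    WHERE label <> predicted AND score >= 0.9
    ORDER BY score DESC
    """
).fetchall()

# Per-class accuracy: the comparison yields 0 or 1, so AVG gives the hit rate.
accuracy = dict(
    conn.execute(
        """
        SELECT label, AVG(label = predicted)
        FROM predictions
        GROUP BY label
        """
    ).fetchall()
)

print(suspects)
print(accuracy)
```

The 0.9 confidence threshold is an arbitrary choice for the example; in practice it would be tuned to how many candidates you can review.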
Lance is Eto’s open-source columnar data format for vision datasets, with support for fast point lookups and built-in vector and search indexing. Optimized for cloud storage, nested data, and large blobs, Lance offers blazing-fast performance at a fraction of the cost.
Eto provides a complete ML data management solution with extensible visualizations to inspect, contextualize, and get insights fast. Versioning and lineage come out of the box for datasets and other artifacts, such as models and reports.
Eto is already integrated into your existing toolchain. Use Eto to work with your data alongside DuckDB, NumPy, Pillow, Arrow, PyTorch, and Ray. For production, it’s easy to add dbt models for Eto and run Mode dashboards on top of Eto as well.
Eto is developed by a team of former core contributors to Pandas and Hadoop who built data and ML infrastructure for self-driving cars.
We are a Y Combinator company.
Lance is fully open source. Visit us on GitHub to get started.