The latest Stemma vs. Amundsen comparison: how Stemma provides a friendlier experience for analysts and business users, improves productivity through lineage, and simplifies catalog setup and administration.
In part 2, Stemma's Head of Design, Dani Sandoval, continues his deep dive into the changes made to the data catalog interface to decrease frustration when debugging data issues, help data engineers establish best practices within data teams, and make discovering existing data easier for data scientists and analysts.
Stemma's Head of Design, Dani Sandoval, goes deep into the changes made to the data catalog interface to decrease frustration when debugging data issues, help data engineers establish best practices within data teams, and make discovering existing data easier for data scientists and analysts.
Stemma releases the Data Developer Update to further improve the way dbt users manage their data development lifecycle, including the Lineage Graph for complex environments, dbt Cloud integration, and column Autodescribe to pull documentation based on column-level lineage.
A data team’s success hinges on its leader’s soft skills, but this is easy to lose sight of given the backlog of technical work. These two tips, earned from experience, can help new managers keep a focus on the fuzzy but important work of managing across teams.
In part 2 of Mark's conversation with Chad Sanderson about the Semantic Data Warehouse, they discuss the importance of semantically defining entities and new roles for contracts and the data catalog.
In this first part of Mark's discussion with Chad Sanderson, they focus on "semantics," how it should be defined, and how the Semantic Data Warehouse fits with the modern data stack.
Our SOC 2 Type II certification is a publicly visible milestone in the journey towards securing the privacy of your data. In this post, we discuss what this step in Stemma's journey means for our existing and future customers.
The data catalog is becoming a ubiquitous part of the data ecosystem, so much so that the new data stack is now known by its three pillars: the warehouse, the BI tool and the data catalog. However, many organizations still struggle to wrangle their data in a way that allows users to answer the most basic questions: what data exists and what does it mean?
One of the early questions that data engineering teams pose when implementing a catalog is: should we make the catalog responsible for gathering metadata from data systems ("pull"), or task data systems with reporting metadata to the catalog ("push")? And what are the consequences of using one approach over the other? Learn how to ingest metadata into your catalog and which method to choose.
Why is data discovery important? What is the role for data discovery in data mesh? Who's responsible for making data discoverable? Learn the answers to these questions (and more!) — summarized from a recent panel discussion on Data Discovery in Data Mesh.
It’s time for the data catalog to evolve. Catalogs already have access to rich, cross-sectional views of your data ecosystem. The next frontier is to repurpose this information for operational use by integrating your catalog into your CI/CD pipeline.
Today, we are excited to announce the launch of Stemma - a fully managed data catalog, powered by Amundsen, the leading open-source data catalog with the largest community and broadest adoption. We raised $4.8M in seed funding led by Sequoia to bring the power of the leading open-source data catalog to every organization.
Almost any organization using Amundsen will need to make custom changes to its install. Unfortunately, this has been a long-standing issue for the community. This post is the first in a step-by-step guide to getting a fully customized enterprise deployment of Amundsen¹, based on how Stemma deploys Amundsen for its customers.