April 21, 2022
February 16, 2022

Migrating from Amundsen to Stemma

by Mark Grover, Co-founder and CEO of Stemma

Stemma is based on Amundsen, which makes transitioning from Amundsen to Stemma simple. In fact, other companies (such as iRobot) have already successfully migrated from Amundsen to Stemma.

Migrating from Amundsen involves three main steps:

  1. Migrating existing metadata from Amundsen to Stemma
  2. Ongoing metadata ingestion
  3. Cut over

1 - Migrating existing metadata from Amundsen to Stemma

In this step, you load all the existing metadata from Amundsen into Stemma. The imported metadata includes, but is not limited to:

  • Table and column descriptions
  • Tags
  • Ownership information

To import this metadata, upload a dump of your Amundsen Neo4j metadata into Stemma. Complete documentation for this is available on the Stemma docs site. After the import, your Stemma instance will look very similar to your existing Amundsen instance, as shown below. It will not, however, be updated periodically. For periodic updates, see the next step.

Stemma after importing metadata from Amundsen

2 - Ongoing metadata ingestion

Now that you have loaded your existing Amundsen metadata into Stemma, it’s time to configure Stemma to ingest metadata updates on an ongoing basis. Once that's configured, you’ll no longer have to write Python databuilder jobs to ingest metadata.

Information ingested in this step includes:

  • Table and column names
  • Linked issues (JIRA tickets)
  • Frequent users
  • Lineage, if it exists

In Stemma's Admin interface, you provide credentials for your data sources so that Stemma can extract metadata from them on an ongoing basis. Stemma then “upserts” the incoming metadata updates into the data pre-loaded from Amundsen.

Admin interface in Stemma to configure metadata ingestion; no more databuilder jobs!
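
To make the “upsert” idea concrete, here is a minimal sketch in Python. It is not Stemma's implementation. The record layout, the field names, and the rule that curated fields are preserved are all assumptions made for illustration.

```python
from copy import deepcopy

# ASSUMPTION: these fields are human-curated (imported from Amundsen) and
# should not be overwritten by automated ingestion. Names are hypothetical.
CURATED_FIELDS = {"description", "tags", "owners"}


def upsert_tables(catalog, extracted):
    """Merge freshly extracted table metadata into an existing catalog.

    Both arguments map a table key (for example "warehouse.orders") to a
    dict of metadata. Returns a new dict; the inputs are not modified.
    """
    merged = deepcopy(catalog)
    for key, incoming in extracted.items():
        if key not in merged:
            # Insert: a table the catalog has not seen before.
            merged[key] = deepcopy(incoming)
            continue
        # Update: refresh machine-derived fields, keep non-empty curated ones.
        for field, value in incoming.items():
            if field in CURATED_FIELDS and merged[key].get(field):
                continue
            merged[key][field] = deepcopy(value)
    return merged


# Metadata pre-loaded from Amundsen (hypothetical example data).
preloaded = {
    "warehouse.orders": {
        "description": "One row per customer order.",
        "tags": ["finance"],
        "columns": ["id", "total"],
    }
}
# Metadata extracted later from the data source (hypothetical example data).
extracted = {
    "warehouse.orders": {"description": "", "columns": ["id", "total", "currency"]},
    "warehouse.refunds": {"columns": ["id", "order_id"]},
}

result = upsert_tables(preloaded, extracted)
```

In this sketch the existing description and tags on `warehouse.orders` survive, its column list is refreshed, and `warehouse.refunds` is added as a new entry.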

3 - Cut over

At this point, you have both Amundsen and Stemma running side by side. During this stage, we recommend sharing access to Stemma with your power users and getting their feedback to ensure all of their use cases are supported. Usually, one week of overlap to obtain feedback and an additional week to incorporate it, if applicable, is sufficient.

When all looks good, simply redirect the URL of your internal Amundsen to the Stemma URL.
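
How you set up the redirect depends on your own infrastructure (DNS, load balancer, or reverse proxy). As a rough sketch of the URL mapping only, the Python helper below rewrites an Amundsen URL to point at Stemma while keeping the path and query string. Both hostnames are hypothetical, and the sketch assumes that paths are laid out the same way in both systems, which you should verify for your instance.

```python
from urllib.parse import urlsplit, urlunsplit

# ASSUMPTION: hypothetical hostname; replace with your own Stemma URL.
STEMMA_HOST = "yourcompany.stemma.example"


def redirect_target(amundsen_url, stemma_host=STEMMA_HOST):
    """Return the Stemma URL that an Amundsen URL should redirect to.

    Keeps the path, query string, and fragment, and swaps in the Stemma
    host over HTTPS. ASSUMPTION: paths are identical in both systems.
    """
    parts = urlsplit(amundsen_url)
    return urlunsplit(("https", stemma_host, parts.path, parts.query, parts.fragment))


# Hypothetical internal Amundsen link, for example from an old bookmark.
old_link = "http://amundsen.internal.example/table_detail/gold/hive/core/orders?source=search"
new_link = redirect_target(old_link)
print(new_link)
```

Preserving the path means that existing bookmarks and links shared in chat or wikis continue to land on the corresponding page, provided the path assumption holds.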

Summary

To summarize, transitioning from Amundsen to Stemma is straightforward and common. Companies like iRobot have already migrated successfully, and many others are in the process.

“We chose Amundsen as our data catalog because of its focus on automation and its supportive community. Since then we have moved to Stemma as our data catalog. I am thrilled to see Stemma make automated data catalogs widely available.” - Michelle Gulen, iRobot
