Best Practices When Migrating To and From Boomi DataHub (Part Two)

by Boomi
Published Jan 28, 2021

I’m an integration specialist at Solita Oy, an IT service management company, and a Boomi implementation partner. Our team was given the task of migrating master data from one Boomi DataHub cloud to another Boomi DataHub cloud location.

In the first blog of this series, I talked about the background of the project, and things to consider before migration. In this post, I’ll dive into configuration.

Configure your MDM

Your new MDM solution needs to be configured properly before the migration. If you are migrating from one Boomi DataHub repository to a new one, you might think that a similar model and configuration is enough. However, this is not always the case.

Additional fields

Your new MDM model might benefit from a couple of new fields.

One would be a new timestamp field. The Updated Date field, which has served you well for identifying when changes occurred, becomes largely useless in the new MDM for changes that predate the migration. This is because you cannot set the value of that field when you migrate: it is set to the moment the migration process transfers the record and its history. A separate timestamp field lets you see when record updates actually happened before the migration.

Another useful field would be the original golden record ID. During the migration process you are likely to find a record with odd updates in its history and struggle to understand why. Carrying the original golden record ID in its own field lets you check whether the problem was caused by a collision of records. Even if these cases end up in quarantine, it might not be easy to find the source golden records that caused the collision.
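As a rough illustration of the two fields above, the sketch below copies the old repository's update timestamp and golden record ID into dedicated fields before a record is sent to the new repository. The field names `original_updated_date` and `original_golden_record_id` and the plain-dictionary record shape are assumptions made for this example; they are not part of any DataHub model or API.

```python
from copy import deepcopy

# Hypothetical field names; use whatever your new model actually defines.
ORIGINAL_UPDATED_FIELD = "original_updated_date"
ORIGINAL_ID_FIELD = "original_golden_record_id"


def enrich_for_migration(fields, source_record_id, source_updated_date):
    """Return a copy of the record fields with the two migration fields set."""
    enriched = deepcopy(fields)
    # Preserve when the update really happened in the old repository.
    enriched[ORIGINAL_UPDATED_FIELD] = source_updated_date
    # Preserve which old golden record this data came from.
    enriched[ORIGINAL_ID_FIELD] = source_record_id
    return enriched


record = enrich_for_migration(
    {"name": "Acme Oy", "city": "Helsinki"},
    source_record_id="old-gr-0001",
    source_updated_date="2019-03-14T09:30:00Z",
)
```

If two old golden records collide in the new repository, the history of the resulting record would then show two different values in the original ID field, which points directly at the records involved.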

Additional sources

If your golden records have been modified through the data steward’s view in Boomi DataHub, the updates will be shown to originate from a source called “MDM.” This is a special kind of source that is handled a bit differently by DataHub. You cannot create a source called “MDM” as it is reserved for manual updates. Neither can you send updates to DataHub with the API and use MDM as your source. So you are left with the question of how to migrate such modifications.

You could ignore the modifications and attribute any changes to the next source that updates the record. The problem arises when the last update to the golden record was a manual one: no later update exists to take responsibility for those changes, so they would not be migrated.

You could replace the MDM source with one of the other existing sources and attribute the manual changes to it. This way all the updates would be migrated. The problem is that you might end up with a field being updated by a source that should have nothing to do with that field. This gets really complicated if you are using ranked fields in your MDM, as it can cause deadlocks in which the field can no longer be updated by the regular operation of the integration processes. The cause of such deadlocks is discussed further in the next section.

Both of these solutions also corrupt the record history, because you can no longer fully trust what the golden record history tells you. Troubleshooting errors becomes harder.

A good solution we found was to create an additional source that takes responsibility for any manual modification of the golden records. This also lets you guide your data stewards towards solutions other than modifying golden records manually. Offering an alternative way to modify records, for example a Boomi Flow application, could give you more control over the changes. It would also give you a way to prevent future deadlocks when handling ranked fields.
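The remapping described above can be sketched as a small transformation over exported history entries. This is only an illustration: the source name `MANUAL_EDITS` and the shape of a history entry (a dictionary with a `source` key) are assumptions, and the additional source would still need to be created in the new repository before any records are sent.

```python
RESERVED_SOURCE = "MDM"          # name shown for manual edits in the old repository
STEWARD_SOURCE = "MANUAL_EDITS"  # hypothetical dedicated source in the new repository


def remap_manual_sources(history):
    """Return history entries with manual edits attributed to the dedicated source."""
    remapped = []
    for entry in history:
        entry = dict(entry)  # shallow copy so the exported data stays untouched
        if entry["source"] == RESERVED_SOURCE:
            entry["source"] = STEWARD_SOURCE
        remapped.append(entry)
    return remapped


history = [
    {"source": "CRM", "fields": {"phone": "+358 1"}},
    {"source": "MDM", "fields": {"phone": "+358 2"}},
]
migrated = remap_manual_sources(history)
```

Because every manual edit keeps its own entry and its own source, the migrated history still shows which changes were made by data stewards rather than by an integration.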

Ranked sources

Ranked sources are a handy feature that lets you control how competing systems update the fields they share. The feature has many good sides.

There are, however, also dangers involved with the feature. One of these is that when a data steward changes a field manually through the DataHub interface, the field gets locked for that golden record. Only the topmost source can break this lock.

Normally this is not an issue with a well-configured MDM. But there are cases where a record is normally handled by sources other than the top-ranked one, and so might never receive an update to that field from the top-ranked source. Resolving this deadlock requires manual intervention and innovative use of API calls.
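To make the deadlock condition concrete, here is a toy model of the rule as described above. It is not how DataHub implements ranking; it only encodes the two statements from this post: a manually edited field is locked, and only the topmost source can break the lock. The source names and function names are hypothetical.

```python
def can_break_lock(source, ranking):
    """Per the rule above, only the topmost-ranked source can break a lock."""
    return bool(ranking) and source == ranking[0]


def is_deadlocked(field_locked, contributing_sources, ranking):
    """A locked field is stuck if none of the record's sources can break the lock."""
    if not field_locked:
        return False
    return not any(can_break_lock(s, ranking) for s in contributing_sources)


ranking = ["CRM", "ERP", "WEBSHOP"]  # highest-ranked source first

# The record is only ever updated by ERP and WEBSHOP, so the lock stays in place.
stuck = is_deadlocked(True, {"ERP", "WEBSHOP"}, ranking)
```

Running a check like this over exported records before the migration could help you list the fields that would need manual attention, instead of discovering them one by one afterwards.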

Another danger inherent in ranked fields is that the ranking can be modified when integration processes change. This is of course handy, as other systems might be added and the rankings need adjusting. However, it makes things difficult when you migrate the history of a record one step at a time: a field update may be left out because the ranking for that field was different in the past, and the update cannot be applied with the current setup.

Such cases usually require you either to forgo the accuracy of the golden record's history and let a later update fix the field value, or to send the record manually one step at a time with a tool like Postman, modifying the ranking of the source between the steps. This is painfully slow and error-prone.
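Before replaying history, it may help to estimate which steps will run into the current ranking. The sketch below does this under a deliberately simplified assumption that is mine, not a statement of DataHub's exact behavior: a source may overwrite a field only if it ranks at least as high as the source that last set it. The history entry shape and all names are hypothetical.

```python
def steps_needing_attention(history, ranking):
    """Return indexes of history steps the current ranking would not apply in full."""
    # Lower position means higher rank.
    rank = {source: position for position, source in enumerate(ranking)}
    last_setter = {}  # field name -> source that last set it during the replay
    flagged = []
    for index, step in enumerate(history):
        blocked = False
        for field in step["fields"]:
            owner = last_setter.get(field)
            if owner is not None and rank[step["source"]] > rank[owner]:
                # A lower-ranked source tries to overwrite a higher-ranked one.
                blocked = True
            else:
                last_setter[field] = step["source"]
        if blocked:
            flagged.append(index)
    return flagged


ranking = ["CRM", "ERP"]  # current ranking, highest first
history = [
    {"source": "ERP", "fields": ["phone"]},
    {"source": "CRM", "fields": ["phone"]},
    {"source": "ERP", "fields": ["phone"]},  # applied in the past under an older ranking
]
flagged = steps_needing_attention(history, ranking)
```

Each flagged step is a candidate for one of the two options above: accept a gap in the history, or replay that step by hand with a temporarily adjusted ranking.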

In the final blog of this series, we cover lessons learned and best practices.

Boomi DataHub is a cloud-native master data management (MDM) solution that sits at the center of the various data silos within your business, including your existing MDM solution, to provide an easy-to-implement, scalable, flexible, and secure master data management hub as a service. For more information, go here or contact a Boomi expert.
