I’m an integration specialist at Solita Oy, an IT service management company, and a Boomi implementation partner. Our team was given the task of migrating master data from one Boomi Data Hub cloud to another Boomi Data Hub cloud location.
In the first post of this series, I covered the background of the project and the things to consider before migration. In this post, I’ll dive into configuration.
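Configuring the MDM
The new MDM (master data management, the practice of consolidating and managing an organization’s core business data) solution needs to be configured properly before you carry out the migration.
When you migrate from an existing Boomi Data Hub repository (the area where master data is managed) to a new one, it is tempting to assume that setting up the same data model and configuration will be enough. In practice, that is not always sufficient.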
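Consider Additional Fields
Adding a few new fields to the data model of the new MDM can bring operational benefits.
One is a new timestamp field, which records the date and time a record was created or updated. The “Updated Date” field you have relied on to see when changes occurred becomes almost useless in the new MDM for changes made before the migration. This is because you cannot set that field’s value yourself during migration; it is automatically set to the date and time at which the records and their history were transferred by the migration process. A separate timestamp field lets you see when record updates actually happened before the migration.
Another useful addition is a field that holds the original golden record ID (the identifier of the representative record created when duplicate or scattered data is consolidated). During migration, you may see odd-looking updates in the history whose cause is hard to pin down.
Keeping the original golden record ID in a dedicated field in the data makes it easier to judge whether a problem was caused by records colliding. Importantly, even if such a case ends up in quarantine (a state in which a record is temporarily excluded from processing, for example because of data quality issues), it is easier to identify the original golden record that caused the collision.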
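As a minimal sketch of the idea, assuming records are handled as plain Python dictionaries on their way to the new repository, the two extra fields could be filled in as shown here. The field names `original_updated_date` and `original_golden_record_id`, the helper name, and the sample record are hypothetical; they are not part of the Boomi Data Hub API, and you would use whatever names you add to your own data model.

```python
from datetime import datetime, timezone


def add_migration_fields(record: dict, golden_record_id: str, updated_at: datetime) -> dict:
    """Return a copy of `record` with the two extra migration fields set."""
    enriched = dict(record)  # shallow copy, so the caller's record is left untouched
    # Hypothetical field names: preserve when the record was really updated
    # in the old repository, since "Updated Date" will reflect migration time.
    enriched["original_updated_date"] = updated_at.astimezone(timezone.utc).isoformat()
    # Hypothetical field name: keep the old golden record ID for tracing collisions.
    enriched["original_golden_record_id"] = golden_record_id
    return enriched


# Example with made-up data
old_record = {"name": "Example Customer", "country": "FI"}
new_record = add_migration_fields(
    old_record,
    golden_record_id="old-gr-123",
    updated_at=datetime(2021, 3, 4, 12, 30, tzinfo=timezone.utc),
)
print(new_record)
```

Storing the timestamp in UTC as an ISO 8601 string is just one choice; what matters is that the value comes from the old repository and not from the migration run.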
Consider Additional Sources
If your golden records have been modified through the data steward’s view in Boomi Data Hub, the updates will be shown to originate from a source called “MDM.” This is a special kind of source that is handled a bit differently by Data Hub. You cannot create a source called “MDM” as it is reserved for manual updates. Neither can you send updates to Data Hub with the API and use MDM as your source. So you are left with the question of how to migrate such modifications.
You could ignore the modifications and attribute any changes to the next source that updates the record. The problem arises when the last update to the golden record was a manual one: there would be no later update to take responsibility for those changes, and they would not be migrated.
You could replace the MDM source with one of the other existing sources and attribute the changes to it. This way all the updates would be migrated. The problem is that you might end up with a field being updated by a source that should have nothing to do with that field. This gets really complicated if you use ranked fields in your MDM, as it can cause deadlocks where the field can no longer be updated by the regular operation of the integration processes. The cause of such deadlocks is discussed in the next section.
Both of these solutions also corrupt the record history: you can no longer fully trust your golden record history, and troubleshooting errors becomes harder.
A good solution we found was to create an additional source that takes responsibility for any manual modification of the golden records. This also lets you guide your data stewards towards alternatives to modifying golden records manually. Offering another way to modify records, for example with a Boomi Flow application, could give you more control over the changes. It would also give you a way to prevent future deadlocks when handling ranked fields.
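The remapping can be sketched roughly as follows, assuming the exported history of a golden record is available as a list of dictionaries with a `source` key. That shape, the function name, and the replacement source name `STEWARD_MANUAL` are assumptions made for illustration only; the one thing taken from Data Hub itself is the reserved source name “MDM.”

```python
from __future__ import annotations

RESERVED_MANUAL_SOURCE = "MDM"  # source name shown for data steward edits
MANUAL_SOURCE_REPLACEMENT = "STEWARD_MANUAL"  # hypothetical dedicated source


def remap_manual_source(history: list[dict]) -> list[dict]:
    """Return a copy of the history with manual edits attributed to the dedicated source."""
    remapped = []
    for step in history:
        step = dict(step)  # copy, so the exported history is not modified
        if step.get("source") == RESERVED_MANUAL_SOURCE:
            step["source"] = MANUAL_SOURCE_REPLACEMENT
        remapped.append(step)
    return remapped


# Example with made-up history steps
history = [
    {"source": "CRM", "fields": {"name": "Example Customer"}},
    {"source": "MDM", "fields": {"name": "Example Customer Oy"}},
]
for step in remap_manual_source(history):
    print(step["source"], step["fields"])
```

Because only the source name changes, the order and content of the history steps stay intact, and every manual edit still has a source that can be migrated.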
Beware of Ranked Sources
Ranked sources are a handy feature, with many upsides, that lets you control which of several competing systems may update a shared field.
There are, however, also dangers involved with the feature. One of these is that when a field is changed manually through the Data Hub interface, where data stewards can access and modify the data as part of their work, the field gets locked for the golden record. Only the topmost source can break this lock.
Normally this is not an issue with a well-configured MDM. But there are cases where a record is normally handled by sources other than the top-ranked one, and so might never receive an update to that field from the top-ranked source. Resolving this deadlock requires manual intervention and innovative use of API calls.
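To make the deadlock easier to picture, here is a toy model of the locking behavior described above. It is not how Data Hub is implemented and it deliberately ignores the rest of the ranking rules; the class, its methods, and the source names are all hypothetical. It only encodes two statements from the text: a manual edit locks the field, and only the topmost source can break the lock.

```python
class RankedField:
    """Toy model of one ranked field on one golden record."""

    def __init__(self, ranking):
        self.ranking = list(ranking)  # highest-ranked source first
        self.value = None
        self.locked = False

    def manual_update(self, value):
        # A data steward edit sets the value and locks the field.
        self.value = value
        self.locked = True

    def source_update(self, source, value):
        # While locked, only the topmost source gets through.
        if self.locked and source != self.ranking[0]:
            return False
        self.value = value
        self.locked = False
        return True


field = RankedField(["CRM", "ERP"])  # made-up source names
field.manual_update("edited by steward")

# A record that only ever receives ERP updates stays stuck.
print(field.source_update("ERP", "from ERP"), field.value)

# Only an update from the top-ranked source releases the lock.
print(field.source_update("CRM", "from CRM"), field.value)
```

In this model, the deadlock is simply the case where the second call never happens because the top-ranked source does not handle that record.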
Another danger inherent in ranked fields is that the ranking can be modified when integration processes change. This is of course handy, as other systems might be added and the rankings need adjusting. However, it makes things difficult when you are migrating the history of a record one step at a time and realize that a field has been left out: the ranking for that field was different in the past, and the field cannot be updated with the current setup.
Such cases usually force you to choose. Either you give up the accuracy of the golden record history and let a later update fix the field value, or you manually send the record one step at a time with a tool like Postman and modify the ranking of the source between the steps. This is painfully slow and error-prone.
In the final post of this series, we cover lessons learned and best practices.
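Boomi Data Hub is a cloud-native master data management solution that sits at the center of the various data silos (data that is fragmented by department or system and hard to share) within an enterprise. It connects your entire data landscape, including existing MDM environments, to deliver an easy-to-adopt, scalable, flexible, and secure master data management hub as a service.
For details, see the related information or contact your Boomi representative.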