How to handle your Canonical Data Model (CDM) in a SOA environment?
As decribed by my colleague Emiel Paasschens, it can be highly beneficial to use a Canonical Data Model (CDM) when you're going for a SOA approach. To not repeat what he already, I refer you to his article if you want to know what a CDM is and how it can be helpful:
However, what that article doesn't describe is how to setup your CDM in a flexible, yet reusable way that's allowing for maximal business flexibility. This article is intended to fill that gap.
Problems occuring with a poorly setup CDM
Messing up your CDM is a great foundation for complete and utter failure of your SOA/integration project, so you might want to avoid this. Having done many different SOA projects over the years, I've experienced different ways of (not) dealing with a CDM with different results. Not dealing with a CDM is out of scope for this article, as I think it belongs more to the microservices domain, so let's talk about different ways to manage the CDM.
1. One CDM to rule them all
If you want to go extremely hardcore on reusability, this is the one for you. You'll have exactly one version of your CDM, which is used by all your services. Obviously, this will lead to a lot of problems, because any breaking change will force you to also change many services and your interfaces will generally not be very precise. For example, I've worked at a governmental department, where they have a "Person" type. This Person type contained all the elements that a Person could possibly have from all kinds of different perspectives, making it rather epic, hard to transform and generally too broad for any interface definition. Of course, you can create different subtypes, but you'll still have the breaking changes issue and once you let a type become too epic, it will be almost impossible to break it down later.
2. Versioned CDM
So, having read point-1, we're going to add some versioning to the CDM. This means that services can use different versions of it to avoid breaking changes having too much impact. However, those services will need to be upgraded to the latest CDM version sooner or later and you might also end up having to do a lot of transformations between services, because their CDM versions are not matching. So, while this is significantly more manageable than the CDM in point-1, you'll still be perfectly able to run into a mess.
3. Design Time CDM
Having learned the harsh lessons from point-1 and point-2, I've been in two projects where they decided that using a runtime CDM is too troublesome and CDM was only used in design time. This was done by copying the necessary CDM types to service specific XSDs, each service having its own namespace and then allowing some minor tweaks. While this makes your services resistant to CDM changes, it requires a lot of transformations, so adding an optional element can easily lead to quite some work, bringing down productivity dramatically. I think it's a too harsh way of countering the problems occuring with a runtime enterprise CDM.
Solution: domain driven CDM
Since neither of the three options mentioned above appealed to me, during a project at a financial company, I decided to take a different approach and break down the enterprise-wide CDM. Instead, I've created a CDM exclusively for the domain that I was working on. In this case, reusing the Person example, a Person can have different definitions in different domains, because a person from an HR point of view will have different attributes than a person from a pension point of view.
So, let each domain have its own definitions and namespace and you'll lose dependencies between domains. You still have dependencies within the domain, but I don't see this as a bad thing. When dealing with an Employment domain, it's extremely unlikely that you make a change to the Person type that's applicable for some service operations (f.e. UpdateEmployment), but not for others (f.e. GetEmployment). So, things that change for the same reasons should be able to reuse the same types. On top of that, you can also version the Domain Driven CDM to make it more robust. You'll still end up having some transformations between different domains, but when your domain exposes APIs or services in a smart way, this effort should be limited and manageable. It will also be easy to add an optional element in this case, because within the domain you can work with Assigns (BPEL) or Data Associations (BPM) on a higher object level, instead of having to rely on transformations all the time.
General design consideration
Apart from the choices you make in how you handle your CDM, it is strongly recommended to avoid breaking changes as much as possible, since they're always going to be painful. Therefore, think hard about your design standards before you start and consider for each element within your types if it might become an array in the future or not. These things should help you to reap the benefits of the CDM without feeling too much of the pain that can come with it.