Building a Modern Line-Of-Business Application - Part 3
Last issue, we talked about logging and unique record IDs. There's still more to consider on that topic.
When planning data structures, we should always start with the record IDs (primary keys), and the framework we are building on should inform those decisions. This step is often overlooked until it is too late. Even in relational databases, it is important to have consistent logic for determining which fields serve as the primary and secondary keys.
One-Database/One-Machine to Rule them All!
Aside from the "Lord of the Rings" reference, this is a common assumption made by many developers. We can get stuck in a mindset of "this will all run on one machine, in one location." While that is likely to be true of the development environment, it may not be true in production.
There's a tendency to start throwing together an application without considering the implications of what this mono-model represents. It is the simplest and fastest way to get the job done, and that's all that matters to the programmer facing a killer deadline.
It does make record ID management much easier — in development. Ignoring the hard parts of the spec usually does make the job simpler. However, we'll find that the simple and fast approach has a lot of holes in it when we try to release it for production.
Some of the questions to consider before creating our Record ID management systems are:
- Will we ever be selling the application to others? If so, how do we want to manage multi-tenancy?
- Will the database ever be residing in more than one physical location? If so, do we need to create a federated database?
- Whether or not the database is split, what if the business has multiple locations? Do we need to consider multi-company/multi-branch records and, if so, how do we share information across companies or branches?
- How do we handle file consolidation for central/corporate offices compared to branch offices or stores?
These questions bring into focus the drawbacks to the "one-database/one-server to rule them all" approach. Let's look at each of these questions individually.
Multi-tenancy is an architecture that allows a single application to service multiple customers and/or sites. This often comes up when building SaaS (Software-as-a-Service) applications, but the decisions we have to make for SaaS are the same ones we have to make when dealing with multi-company/multi-branch environments. Do we want to store every tenant (customer) in the same database, or do we want to create a separate database for each tenant?
A federated database is a structure that allows us to treat multiple autonomous databases as a single large database. This approach is often used for branch offices or brick-and-mortar store locations, where each site can handle most of its information independently of the others.
Most developers don't consider the advantages of federated databases because of the complexity they add to an application. "I'll just use a VPN to address the problem" is not a reasonable alternative. While a VPN makes sense for individual users, it's a backwards answer here. If the bulk of the users are in one location, say a branch of a bank, putting the database local to those users makes more sense.
The information still needs to be consolidated into a much larger corporate database for main office handling. Keeping IDs unique in the consolidation phase can be very complex. Much of that downside can be removed by choosing the right record ID management approach.
Federated databases also provide process isolation, and they give us mirroring and replication without implementing high-priced replication software or hardware. When we run a report on all sales for the Colorado branch, it runs on Colorado's database and server, so it does not affect the London or Eastern Australia branches.
The same goes for running large processes on the centralized corporate database. Consolidated reporting does not affect any of the individual branches. This arrangement also provides fail-over backups.
Even if we don't want to get into a multi-tenant or federated database environment, we still need to consider the multi-company/branch question.
If we plan to grow our company, there will always be a need for the option to handle multi-branch and multi-company information. The multi-branch scenario is the more common situation for a line-of-business application.
How do we isolate each company's or branch's information from the others when we build a data-store? For example, if we have more than one physical store location, how do we handle each store's inventory information?
More than likely, we would want to share the general inventory information with all the stores, but each individual location would need to handle inventory running totals independently. The purchasing, receiving, bookkeeping, and P&L reports would need to be handled by-store as well as for the overall corporate information.
The easy answer is to place everything into one table, using a store or company prefix. The problem is that the table could quickly grow to millions of records, potentially causing scaling issues.
The alternative is to create a distributed table, where each store holds its own information in a separate table. This allows better scaling when stores are added or closed. Distributed tables work well with both federated databases and single databases. They do, however, add an extra level of complexity when extracting and reporting consolidated information.
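The two layouts can be compared with a small sketch. Python is used here only for illustration, and every name in it (the "*" separator, the INVENTORY table, store numbers 97 and 42, the on_hand field) is a hypothetical stand-in rather than something taken from a real system. Plain dictionaries play the role of tables.

```python
from collections import defaultdict

SEP = "*"  # assumed separator between the store prefix and the item ID


def single_table_key(store_id, item_id):
    """One shared table: the store prefix becomes part of every record key."""
    return f"{store_id}{SEP}{item_id}"


def distributed_table_name(base_table, store_id):
    """Distributed tables: one table per store, record keys stay unprefixed."""
    return f"{base_table}.{store_id}"


# Layout 1: everything in one table, keyed by store prefix + item ID.
single_table = {
    single_table_key("97", "WIDGET"): {"on_hand": 12},
    single_table_key("42", "WIDGET"): {"on_hand": 3},
}

# Layout 2: one table per store, each holding its own running totals.
distributed = defaultdict(dict)
distributed[distributed_table_name("INVENTORY", "97")]["WIDGET"] = {"on_hand": 12}
distributed[distributed_table_name("INVENTORY", "42")]["WIDGET"] = {"on_hand": 3}


def total_on_hand_single(table, item_id):
    """Consolidated total: scan the one big table and match on the item part."""
    return sum(
        rec["on_hand"]
        for key, rec in table.items()
        if key.partition(SEP)[2] == item_id
    )


def total_on_hand_distributed(tables, item_id):
    """Consolidated total: the extra step of visiting every store's table."""
    return sum(t[item_id]["on_hand"] for t in tables.values() if item_id in t)
```

Both functions return the same corporate total. The difference is where the work lands: the single table grows with every store, while the distributed layout keeps each store's table small and pushes the cost into the consolidation step.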
Software-Defined Record Key Management
Record-key management is best handled as a software-defined process rather than the hard-coded process most developers tend to use. It fits the current trend toward software-defined hardware, networking, and operating systems. There are advantages, but it requires a lot of up-front planning. The main advantage of software-defined record-key management is that we can start with one key type and change it at a later date.
To implement software-defined record-key management, we need to make some decisions:
- Single or Distributed Table - The configuration needs to define how to handle one-table-for-everything vs. distributed tables. This is about defining how item and table identities are handled.
- Machine Independent Keys - If we plan to create federated databases, then all our keys *MUST* be machine independent. One way to do that is UUID v4. It easily provides machine independence, but at 36 characters in its standard text form, a UUID is hard to handle for data entry. The alternative is to create a machine, branch, or store prefix.
- Sequential Keys - A sequential key will always be unique across distributed tables as long as every table draws from the same seed. A common example is a control record called NextOrderNo that is used, and updated, by all stores. This only works in a single-database environment. The keys will not be unique if we move to a federated database, because each database would keep its own copy of the counter.
- Structured Keys - Structured keys, also referred to as derived keys, are always built around something unique. This is similar to using a machine/branch/store prefix, but it may not be as obvious. For example, all U.S. stores start with 5, the next two digits are the state code, and the next two are the tie-breaker for states with multiple stores. So, while the Wilmington, Delaware store might be designated #97, its structured keys might all start with 50104 (5 for the U.S., 01 for the state, 04 for the store within the state).
- Key Reformatting - There should be some way to reformat the keys when moving from a single table to a distributed table and back again. This way the developer, or user, can change this information as needed. An example would be allowing every store to have an invoice #1, but prefixing the records as they are brought into the corporate system, and removing the prefixes when sending records back to the individual systems.
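The key types and the reformatting step in the list above can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not a definitive implementation: the function names, the "-" separator, and the in-memory counter standing in for a NextOrderNo control record are all hypothetical.

```python
import itertools
import uuid


def uuid_key():
    """Machine-independent key: a random UUID v4 as a 36-character string."""
    return str(uuid.uuid4())


class SequentialKeys:
    """In-memory stand-in for a NextOrderNo-style control record.

    Keys are unique only while every table draws from this one counter,
    which is why the approach breaks down in a federated database.
    """

    def __init__(self, seed=1):
        self._counter = itertools.count(seed)

    def next_key(self):
        return str(next(self._counter))


def structured_prefix(country, state, tie_breaker):
    """Structured prefix: 1 country digit, 2 state digits, 2 tie-breaker digits."""
    return f"{country:d}{state:02d}{tie_breaker:02d}"


def to_corporate(store_prefix, local_key, sep="-"):
    """Key reformatting, inbound: add the store prefix during consolidation."""
    return f"{store_prefix}{sep}{local_key}"


def to_store(corporate_key, sep="-"):
    """Key reformatting, outbound: split a corporate key into (prefix, local key).

    Assumes the prefix itself never contains the separator.
    """
    prefix, _, local_key = corporate_key.partition(sep)
    return prefix, local_key
```

With these helpers, two stores can each issue invoice 1 locally. The corporate system would store them as, for example, `to_corporate(structured_prefix(5, 1, 4), "1")` and the equivalent key for the other store's prefix, and `to_store` recovers the local invoice number when records are sent back.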
An example of a key management record would be:
Key: Table Name
<1> - Description of Table
<2> - File Structure: 1 - Single, 2 - Distributed, 3 - Federated
<3> - Key Structure: 1 - Machine Independent (UUID v4), 2 - Sequential, 3 - Prefixed Sequential, 4 - Structured (program driven)
<4> - Seed - Mainly used for Sequential keys.
<5> - Prefix Configuration
<6> - API/Program to call to generate Structured Key
Call GET.KEY.MANAGEMENT("CUSTOMER", KEY.MANAGEMENT, CUSTOMER.FILEVAR)
Call CREATE.KEY(KEY.MANAGEMENT, CUSTOMER.ID)
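To show how a key management record could drive key creation, here is a rough Python analogue of the record layout and the two calls above. It is a sketch under assumptions: the field names, the in-memory KEY_MANAGEMENT_FILE dictionary, the sample CUSTOMER settings, and the behavior of each function are guesses made for illustration, not the article's actual subroutines.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class KeyManagement:
    table_name: str        # record key
    description: str       # <1>
    file_structure: int    # <2> 1 = single, 2 = distributed, 3 = federated
    key_structure: int     # <3> 1 = UUID v4, 2 = sequential,
                           #     3 = prefixed sequential, 4 = structured
    seed: int = 1          # <4> next number for sequential keys
    prefix: str = ""       # <5> prefix configuration
    generator: Optional[Callable[..., str]] = None  # <6> structured-key program


# Hypothetical control file holding one record per table.
KEY_MANAGEMENT_FILE = {
    "CUSTOMER": KeyManagement("CUSTOMER", "Customer master", 2, 3, prefix="50104-"),
}


def get_key_management(table_name):
    """Look up the key management record for a table."""
    return KEY_MANAGEMENT_FILE[table_name]


def create_key(km):
    """Generate the next key according to the table's configured key structure."""
    if km.file_structure == 3 and km.key_structure == 2:
        # Plain sequential keys collide once each database has its own counter.
        raise ValueError("plain sequential keys are not unique when federated")
    if km.key_structure == 1:
        return str(uuid.uuid4())
    if km.key_structure in (2, 3):
        number = str(km.seed)
        km.seed += 1  # a real system would write this back under a record lock
        return km.prefix + number if km.key_structure == 3 else number
    if km.key_structure == 4:
        if km.generator is None:
            raise ValueError("structured keys need a generator program")
        return km.generator(km)
    raise ValueError(f"unknown key structure: {km.key_structure}")
```

Calling `create_key(get_key_management("CUSTOMER"))` twice with the sample record yields "50104-1" and then "50104-2". Switching the table to UUID v4 keys later means changing `key_structure` in the record, not the calling programs, which is the point of keeping the choice in configuration.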
Software-defined key management may seem like overkill, but if you plan for this type of structure ahead of time, then many things become easier in the long run.
Stay tuned for part four.