By Kore Nordmann, first published at Tue, 22 Mar 2016 10:23:00 +0100
In our experience the system architectures of our customers grow more and more complex. This is either because of scaling requirements or because developers like to try out new technologies, such as implementing Microservices in languages other than PHP (Node.js, Go, …) or using different storage technologies (MongoDB, CouchDB, Redis, …). Depending on the application it is also common to introduce dedicated search services (Elasticsearch, Solr, …), queueing systems (Beanstalk, RabbitMQ, ZeroMQ, …) or cache systems (Redis, Memcache, …).
Often there are very valid reasons to do this, but there is also an important problem: you are creating a distributed system, and distributed systems are really hard to get right and to operate. Every system spread across multiple nodes in a network is a distributed system. A system consisting of a MySQL server and a PHP application server is already distributed, but this setup is well understood by most teams. Architecture decisions start to get critical once the data is distributed across multiple systems. Why is this the case?
One of the things that is hardest to repair in existing systems is inconsistent data. Repairing it often requires manual checks and sanitization which, depending on the amount of data, can take up really large amounts of time.
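To illustrate how such inconsistencies creep in, here is a minimal sketch in PHP of a dual write to two systems: the relational database and a dedicated search index. All interfaces and names are assumptions made up for this example, not a specific library.

```php
<?php
// Hypothetical sketch -- the interfaces below are assumptions used only to
// illustrate the problem, not a concrete client API.

interface ProductGateway
{
    public function insert(array $product);
}

interface SearchIndex
{
    public function index(array $product);
}

class ProductService
{
    private $database;
    private $searchIndex;

    public function __construct(ProductGateway $database, SearchIndex $searchIndex)
    {
        $this->database = $database;
        $this->searchIndex = $searchIndex;
    }

    public function store(array $product)
    {
        // First write: the relational database (e.g. MySQL)
        $this->database->insert($product);

        // Second write: the dedicated search service (e.g. Elasticsearch).
        // If this call fails -- network partition, node down, deployment in
        // progress -- the product exists in the database but not in the index,
        // and nothing here detects or repairs the divergence. This is exactly
        // the kind of inconsistency that later requires manual cleanup.
        $this->searchIndex->index($product);
    }
}
```

Whether you address this with retries, a message queue or regular index rebuilds is exactly the kind of decision that should be made, and documented, consciously.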
There are even studies [1] pointing out the costs of bugs in your architecture: if they are discovered late, when the system is already in production, fixing them can cost a hundred times as much. This is why we suggest investigating and analyzing your architecture before distributing your data, and being careful when doing so.
What are the main points you should check when designing a system architecture for a new project, when scaling an existing project or when introducing new technologies (search, storage, cache, …)? There are a couple of questions we can ask ourselves:
How can the consistency of data be ensured across multiple systems?
How do we verify that the chosen systems fulfill their requirements?
What are the technical and operational risks of newly introduced systems?
How will the system handle latencies and failures of nodes?
Is the overall application resilient against single node failures, and if not, how can this be accomplished? (One possible pattern is sketched below.)
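To make the last question a bit more concrete, here is a minimal sketch in PHP of one resilience pattern: query the dedicated search service with a tight timeout and fall back to the primary database when it is unreachable. Again, all interfaces and class names are assumptions for this example, not a specific client library.

```php
<?php
// Hypothetical sketch -- the interfaces below are assumptions, not a real
// client library. It shows one way to stay resilient against a single
// failing node: fall back to the primary database when search is down.

interface SearchClient
{
    /** @throws SearchUnavailableException when the search cluster is unreachable */
    public function query($term, $timeoutInMilliseconds);
}

interface ProductRepository
{
    public function findByTitleLike($term);
}

class SearchUnavailableException extends \RuntimeException
{
}

class ProductSearch
{
    private $search;
    private $repository;

    public function __construct(SearchClient $search, ProductRepository $repository)
    {
        $this->search = $search;
        $this->repository = $repository;
    }

    public function find($term)
    {
        try {
            // Fast path: the dedicated search service, with a tight timeout so
            // that a slow node does not block the whole request
            return $this->search->query($term, 200);
        } catch (SearchUnavailableException $e) {
            // Degraded path: a slower, less precise query against the primary
            // database, so a single failing search node does not take the
            // whole feature down with it
            return $this->repository->findByTitleLike($term);
        }
    }
}
```

The fallback is usually slower and less precise, but a single failing search node no longer takes the whole feature down with it, and the degraded behavior becomes a deliberate decision instead of an accident.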
On top of that, those decisions should be documented and evaluated against certain criteria. There are even frameworks for documenting system architecture decisions and risks, such as ATAM, which you might want to follow. Important assessment points are:
Consistency and security of data
Performance (latency, transaction throughput)
Modifiability (applicability to new products, future change costs)
Availability (hardware failures, software bugs)
Plan an architecture workshop with Qafoo to ensure your system architecture solves your problem.
When introducing new systems you should be careful, especially when you plan to distribute your data across multiple nodes. Technology and architecture decisions should not be made because some topic is hot right now (like Microservices); instead you should verify that the chosen system architecture actually fulfills your requirements and will be beneficial. Since there will be no perfect architecture for your use case, you should always document the respective benefits, drawbacks and the reasoning why a particular architecture was chosen.