reliability - Chimplie

#📚reference

As the [[security]] article already noted, with great mathematical precision, no interesting computer program can be absolutely reliable. Nevertheless, we still want to engineer reasonably secure and reliable systems. How do we achieve that?

Buzzword aside, a software product's reliability measures our reasonable trust in its ability to perform within defined boundaries. In other words, to achieve a certain level of reliability, we need to understand these boundaries and invest our efforts in enforcing them. The former is fulfilled by a proper [[product specification]], and the latter is an outcome of the established [[development process]]. Simply put: understand, commit, and endure.

From a business perspective, reliability can be measured in terms of downtime and cost of failure. Downtime measures the expected and tolerable share of time during which a particular piece of functionality is unavailable, while the cost of failure estimates the average and total loss in profit caused by the known and expected defects in our software.

We can't account for every way our program may fail, but we can extrapolate from our understanding of the software's current state. This is why practices such as [[tech debt]] management and [[security]] audits are so important. By reviewing our engineering solutions, we can put a bound on the expected cost of incidents and their frequency. From this perspective, the quest for reliability can be perceived as [[defensive programming]] at a higher organizational level.

Another business dimension of reliability is our capacity to seal the development of one part of the system and focus our limited resources on another. When a product is developed with reliability in mind, it is possible to achieve goals gradually by closing one milestone after another. This is exactly why [[quality assurance]] is essential for software products in the main development phase.
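To make the two business measures mentioned above (downtime and cost of failure) more concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of any established method: the `Defect` record, its fields, the function names, and the sample figures are hypothetical, and a real estimate would depend on how a team tracks incidents and attributes losses.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Defect:
    """A known or expected defect. All fields are hypothetical estimates."""
    name: str
    incidents_per_year: float  # how often we expect the defect to trigger
    loss_per_incident: float   # estimated loss in profit per incident


def downtime_percent(unavailable_hours: float, period_hours: float) -> float:
    """Percentage of the period during which the functionality was absent."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    return 100.0 * unavailable_hours / period_hours


def total_expected_loss(defects: Sequence[Defect]) -> float:
    """Total yearly loss projected from the known and expected defects."""
    return sum(d.incidents_per_year * d.loss_per_incident for d in defects)


def average_loss_per_incident(defects: Sequence[Defect]) -> float:
    """Average loss per incident, weighted by expected incident frequency."""
    incidents = sum(d.incidents_per_year for d in defects)
    if incidents == 0:
        return 0.0
    return total_expected_loss(defects) / incidents


# Made-up sample figures, for illustration only.
known_defects = [
    Defect("checkout timeout", incidents_per_year=4, loss_per_incident=500.0),
    Defect("report export crash", incidents_per_year=10, loss_per_incident=50.0),
]

print(downtime_percent(unavailable_hours=8.76, period_hours=8760))
print(total_expected_loss(known_defects))
print(average_loss_per_incident(known_defects))
```

The sketch only projects from defects we already know about or expect. It puts a bound on the anticipated cost under those assumptions and says nothing about failures nobody has foreseen.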
Following the same milestone logic, it is clear that products in the [[POC]] or [[MVP]] stages of their development should not be guided by reliability. In some scenarios, a reliable POC is simply one we can quickly restart during its presentation. When it comes to MVPs, investing in their [[maintainability]] (the ability to evolve and adapt) is usually more prudent than pursuing pure reliability. For these stages, the [[fail fast]] approach should be taken instead.

---

<font style="color: #F86759">Contributors:</font> *[[Mykhailo]]*

<font style="color: #F86759">Last edited:</font> *2024-02-27*