Soft Coding and the Enterprise Rules Engine – Millions of Dollars’ Worth of Waste

Imagine the following scenario: a basic business rule (read: logical) change in an enterprise (read: complex) application that could and should be tested in under a second instead takes an hour to test. Now imagine repeating this process countless times over the years with hundreds to thousands of business rules – at vendor software development rates, that is millions of dollars wasted.

How on earth could one end up with such a scenario? Surely it’s madness to waste ridiculous amounts of money on highly inefficient processes. But it is not madness – it happens time and time again. The harshness of these words, however, has no place in the field of software engineering and software architecture – we must look at the engineering tradeoffs involved when including a rules engine and perform a quality attribute analysis. While we will not look at the enterprise rules engine in terms of a quality attribute analysis here (that belongs in the Software Architecture Document (SAD) of a given enterprise system, which will have its own enterprise context), it’s worth looking at opinion pieces that in effect point out certain quality attributes of enterprise rules engines. One such piece is the article Soft Coding by Alex Papadimoulis. Soft coding is broadly defined by Mr. Papadimoulis as

the practice of removing “things that should be in source code” from source code and placing them in some external resource

Mr. Papadimoulis goes on to say:

The reason we find ourselves Soft Coding is because we fear change. Not the normal Fear of Change, but the fear that the code we write will have to be changed as a result of a business rule change. It’s a pretty silly fear to have. The whole point of software (hence, the “soft”) is that it can change that it will change. The only way to insulate your software from business rule changes is to build a completely generic program that’s devoid of all business rules yet can implement any rule. Oh, and they’ve already built that tool. It’s called C++. And Java. And C#. And Basic.
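
To make the idea concrete, here is a purely hypothetical sketch of what a soft-coded rule can end up looking like: a validation rule that could have been an ordinary, unit-testable method is pushed out into an external XML resource, which still has to be versioned, deployed and regression tested alongside the application. All element and attribute names below are illustrative and not taken from any particular product.

<!-- Hypothetical external "rules" resource. The rule that used to be a
     one-line, easily unit-tested method is now configuration that a rules
     engine must load, parse and evaluate at runtime. -->
<businessRules>
  <rule name="attachment-required-for-large-ledger">
    <when field="ledgerAmount" operator="greaterThan" value="500000"/>
    <then action="requireAttachment" document="SUPPORTING_FORM"/>
  </rule>
</businessRules>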

In the author’s experience, some of the reasons why an enterprise rules engine (which may manifest itself in many forms, including Excel or even a vendor GUI of some sort) may be used rather than, say, a regular programming language, include the following:

  1. Insufficient time spent on design and the financial implications of design. Particularly relevant quality attributes that get neglected include modularity, testability and maintainability.
  2. The desire to include non-technical staff (non-programmers) in logic specification.
  3. No experience with continuous integration or continuous delivery. It may be that just about everyone concerned has been burned by non-delivery, for whatever reason.
  4. Insufficient time spent considering the implications and effects of product vendor lock-in.
  5. Consulting software vendor interests not sufficiently aligned with the client’s interests. For example, including an enterprise rules engine may lead to more business (hours sold).

It may be that it’s possible to include an enterprise rules engine in an efficient, streamlined delivery cycle. Regrettably, the author has not seen this happen.

One Year To Get Productive On a Codebase = You’ve Got Architectural Problems

I recently met with an individual for a chat in my home town of Wellington, New Zealand, who stated that it takes his engineers a year to get productive on their huge, monolithic Ruby on Rails codebase.

This in itself is astounding, as it’s an unusually long time period. But it does make sense when one considers that no matter what framework or language you may be using, if you have not spent sufficient time on your architecture up front, you will pay for it at some point in the future.

Having dabbled with Rails myself, I am not surprised that a Rails application can morph, over the years, into a large-scale enterprise application that cannot easily be architecturally redesigned to enhance its modularity, testability and other quality attributes. But perhaps this does not matter so much if you are making tons of money.

MonolithLater and MonolithFirst Arguments

With microservices architectures being topical, it’s interesting to note that there are both “Don’t start with a monolith” and “MonolithFirst” arguments out there, with Martin Fowler falling into the latter camp. Mr. Fowler essentially argues that having the architectural problems of a successful system to solve is a better place to be than the inverse, with one exception to this rule of thumb being a system rewrite.

It may be hard to scale a poorly designed but successful software system, but that’s still a better place to be than its inverse.

Java World to Ruby World Conversion – Taking a Peek

As a professional software engineer, with the Java World as your primary domain, you may consider taking the plunge and either dabbling with the Ruby World or making it your primary bread and butter domain. In this post we compare the two worlds with a simple table listing.

It’s worth noting that the term Java developer in itself does not mean a lot: most developers have a preferred peripheral stack from the operating system upwards, and in addition some may specialise in particular domains (front end, back end, integration, or another domain) and hopefully have a set of treasured reference books. So naturally this listing comes loaded with the bias and interests of the author.

You will notice that in some cases there are no equivalent frameworks and specialist tools when it comes to the Ruby World – in these cases, it may be best to look at your requirements and let a combination of a Quality Attribute Workshop and the Architecture Tradeoff Analysis Method guide your software architecture. It may be that you end up with both Ruby and JVM based modules in a large scale distributed system.

One important thing to note when comparing the two worlds is that with appropriate tooling (e.g. JRebel) and a supportive architecture it is possible to develop rapidly in the Java World. There are those that still make the assumption that enterprise Java development is overpriced and slow.

| Java World | Ruby World |
| --- | --- |
| Apache Maven | No direct equivalent. Combination of distinct tools: Bundler, Rake, RubyGems and more. |
| Eclipse | RubyMine by JetBrains |
| Sonatype Nexus | No direct equivalent. gem server, Gem in a Box. |
| Apache Camel | No equivalent. Can use STOMP to connect to a Camel component. |
| Ruby Language (JRuby) | Ruby Language |
| Google Guava | No similar public Google-supplied library. |
| JRebel | Not applicable since Ruby is a dynamic programming language. |
| Java 8 Reactive Language Features | Reactive Extensions for Ruby |
| Spring declarative transaction management with isolation levels | Similar power available using Active Record – documentation not nearly as rich and mature. |
| Workflow and BPMN, e.g. Activiti BPMN 2 Process Engine | No direct native equivalent (i.e. unit-testable process flows). Integration options exist, e.g. for Process Maker. |
| Spring Batch – batch application framework | No equivalent. Some basic batch processing constructs in Active Record. |
| Vaadin | No equivalent. |

Taking Things Further – Inspiring Reads

  1. Enterprise Architecture with Ruby (and Rails) – without a doubt worth a look if you have an interest in software architecture
  2. RailsTutorial – a well presented, clear and concise book if you are looking to get your hands dirty

Rails in Wellington, New Zealand

If you make a living as a software engineer, the local market for a given stack naturally matters. Here are some organisations where Ruby on Rails is used in the author’s home town, the awesome city of Wellington, New Zealand.

  1. Youdo
  2. Powershop
  3. Datacom
  4. Aura Information Security
  5. Abletech
  6. Southgate Labs
  7. Loyalty.co.nz

The AWS Professional Services Bandwagon – Beware

In Wellington, New Zealand, where I live, and no doubt in many other cities in the world, professional services consultancies are predictably jumping on the AWS (Amazon Web Services) bandwagon. Everyone wants to be seen as THE experts in AWS.

In my mind, in terms of a business model, this is all well and good – but it’s more important and sustainable to be the cloud software engineering experts backed by the best engineering talent money can buy. Being an AWS expert is simply not enough.

So how do you become a cloud software engineering expert? In my mind, by building and launching hugely ambitious, useful, global services that delight their users – even if you give them away for free. Yes, build software for free. Sounds crazy, right? No, it’s not crazy: passionate engineers do it all the time, and this attitude is the foundation of some of the most successful companies in the world.

But why aim to scale globally, even with a simple service? Because this is the only way you’ll get onto difficult engineering challenges. It’s the only way you’ll get pressed to look at each and every part of functionality in detail and determine how you’ll make sure every single user has a delightful experience. Also, if you are burning your own money, rather than someone else’s (a client’s money), you’ll be very, very careful with your design and possibly even think about it night and day.

Microsoft Azure-Aclypse – Yet Another Quality Attribute Reminder

With Azure’s global outages impacting millions of end users today, here is a simple reminder for the software architect: either your architecture has a quality attribute requirement for resilience in the face of an apocalypse-like cloud vendor outage, or it does not.

Chances are that no individual vendor will ever entertain the notion that they will suffer a global outage, and SLA credits are unlikely to win lost business back. It is the architect’s job to guide their client through the scenarios that may impact their business, including an Azure-Aclypse.

Maven Cobertura and JaCoCo Plugins – What Are Your Rock Bottom, Minimum Standards?

When it comes to test-driven development, the best line I’ve heard is the following.

Your clients are not a distributed test environment

These wise words were uttered by my Spring Core lecturer while covering the difference between unit and integration tests in Spring parlance. On the note of unit and integration tests, after working on a vast array of projects, it has dawned on me, with some sadness, that not a single project, organization or team I’ve been on has had non-negotiable standards when it comes to code coverage. Of course, most projects have had unit testing as part of a checklist, but not a single one has made a lasting effort to both measure and enforce a minimum goal in terms of code coverage.

In this post we take a look at potential rock-bottom configurations for the free cobertura-maven-plugin in particular, and also visit the jacoco-maven-plugin. Finally, we run into missing JDK 8 support and start considering paying for a commercial plugin.

Before delving into code coverage tooling, it’s worth asking why it matters, to whom, and what it means. So, what does code coverage mean? Why should software engineers care? Why should project managers care? Why should project sponsors care? If a consultancy (vendor) is doing the development, why should that organisation care? These are not questions we’ll delve into in depth here, other than to note that coverage reports help expose code that has not been adequately tested by automated test suites.

No Standards? Introduce a Lower Quality Gate

So what to do in a world of little to no standards? In my mind the answer is to set one’s own personal standards, starting with defining what rock bottom is. This is a personal, professional line in the sand. It’s also a great question to ask of a prospective employer when considering joining a project or organization: what unit, integration and system test code coverage standards does the organization have, and, crucially, how are they enforced and made visible to all concerned?

In terms of motivating the need for minimum standards, the term quality gate seems apt. On a given project, even a personal project, one would have two gates. The lower gate would be enabled by default: builds would fail on developer machines if this minimum standard is not met, and a CI server would independently verify against the same minimum standard. If this lower quality gate has not been met, the project manager or development manager should know about it.

The Plugins

Let’s move on to the plugins. The cobertura-maven-plugin is used to report on and check your unit test code coverage using the Cobertura code coverage utility. So we first check that all tests are passing and then check that our standards have been met. Once we move on to the integration test phase, where our beans and infrastructure are tested in concert, the jacoco-maven-plugin will report on and check our integration test code coverage.

The importance of performing both unit testing (individual classes) and integration testing (incorporating a container such as the Spring context) cannot be overstated. Both plugins, and therefore both types of testing, should be used in a given project, and this stands to reason: we want coverage standards for individual classes as well as for cooperating runtime services, and ordinarily we only proceed to the latter once the former has succeeded, as per the Maven Build Lifecycle (a minimal sketch of this phase split follows below).
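
For reference, the sketch below shows the conventional Maven phase split under a standard setup: the maven-surefire-plugin runs unit tests in the test phase, while the maven-failsafe-plugin runs integration tests in the integration-test phase and fails the build during verify. The plugin versions shown are illustrative.

<!-- Unit tests (Surefire, "test" phase) run before the package is built;
     integration tests (Failsafe, "integration-test" phase) run afterwards,
     with failures breaking the build in the "verify" phase. -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>2.17</version>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-failsafe-plugin</artifactId>
    <version>2.17</version>
    <executions>
        <execution>
            <goals>
                <goal>integration-test</goal>
                <goal>verify</goal>
            </goals>
        </execution>
    </executions>
</plugin>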

Rock Bottom – Our Lower Quality Gate

It stands to reason that there should be some correlation between the application domain and the amount of effort one invests in unit and integration testing. When it comes to rock bottom, however, the application domain is irrelevant, since rock bottom represents a bare minimum, domain-agnostic standard.

In terms of the merits of a rock-bottom configuration for Cobertura and JaCoCo, the following statement, sourced from IBM developerWorks, supports such an approach.

The main thing to understand about coverage reports is that they’re best used to expose code that hasn’t been adequately tested.

Cobertura

Defining a minimum standard when it comes to Cobertura, as it turns out, takes some effort given the array of options involved. For example, the configuration below is the usage example provided on the official plugin page.

      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>cobertura-maven-plugin</artifactId>
        <version>2.6</version>
        <configuration>
          <check>
            <!-- Min branch coverage rate per class. 0 to 100. -->
            <branchRate>85</branchRate>
            <!-- Min line coverage rate per class. 0 to 100. -->
            <lineRate>85</lineRate>
            <haltOnFailure>true</haltOnFailure>
            <!-- Min branch coverage rate for project as a whole. -->
            <totalBranchRate>85</totalBranchRate>
            <!-- Min line coverage rate for project as a whole. -->
            <totalLineRate>85</totalLineRate>
            <!-- Min line coverage rate per package. -->
            <packageLineRate>85</packageLineRate>
            <!-- Min branch coverage rate per package. -->
            <packageBranchRate>85</packageBranchRate>
            <regexes>
              <!-- Package specific settings. -->
              <regex>
                <pattern>com.example.reallyimportant.*</pattern>
                <branchRate>90</branchRate>
                <lineRate>80</lineRate>
              </regex>
              <regex>
                <pattern>com.example.boringcode.*</pattern>
                <branchRate>40</branchRate>
                <lineRate>30</lineRate>
              </regex>
            </regexes>
          </check>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>clean</goal>
              <goal>check</goal>
            </goals>
          </execution>
        </executions>
      </plugin>

The first question that comes to mind is what the above configuration actually means. The main concept we need is the difference between the line rate and the branch rate, which has been neatly explained here: in short, the line rate is the proportion of executable lines exercised by the tests, while the branch rate is the proportion of decision outcomes (for example, both the true and false paths of an if statement) exercised. So, a potential starting point for a rock-bottom configuration would be a 50% line coverage rate on a project-wide basis, with branch coverage excluded. Naturally we halt on failure as a rule, since this is our bare minimum standard and not necessarily what we aspire to achieve.

      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>cobertura-maven-plugin</artifactId>
        <version>2.5.2</version>
        <configuration>
          <instrumentedDirectory>target/cobertura/instrumented-classes</instrumentedDirectory>
          <outputDirectory>target/cobertura/report</outputDirectory>
          <check>
            <haltOnFailure>true</haltOnFailure>
            <totalLineRate>50</totalLineRate>
          </check>
        </configuration>
        <executions>
          <execution>
            <id>cobertura-clean</id>
            <phase>clean</phase>
            <goals>
              <goal>clean</goal>
            </goals>
          </execution>
          <execution>
            <id>cobertura-instrument</id>
            <phase>process-classes</phase>
            <goals>
              <goal>instrument</goal>
            </goals>
          </execution>
          <execution>
            <id>cobertura-verify</id>
            <phase>verify</phase>
            <goals>
              <goal>check</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
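
With this configuration in place, a plain mvn clean verify will fail the build whenever the total line coverage falls below 50% – exactly the behaviour we want from a lower quality gate.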

JaCoCo

When using JaCoCo to generate code coverage reports, both the jacoco-maven-plugin and the maven-failsafe-plugin must be configured, as per this excellent resource. The jacoco-maven-plugin configuration is shown below, with a sketch of the corresponding maven-failsafe-plugin configuration following it.

<plugin>
    <groupId>org.jacoco</groupId>
    <artifactId>jacoco-maven-plugin</artifactId>
    <version>0.6.3.201306030806</version>
    <executions>
        <!-- The Executions required by unit tests are omitted. -->
        <!--
            Prepares the property pointing to the JaCoCo runtime agent,
            which is passed as a VM argument when the Maven Failsafe plugin
            is executed.
        -->
        <execution>
            <id>pre-integration-test</id>
            <phase>pre-integration-test</phase>
            <goals>
                <goal>prepare-agent</goal>
            </goals>
            <configuration>
                <!-- Sets the path to the file which contains the execution data. -->
                <destFile>${project.build.directory}/coverage-reports/jacoco-it.exec</destFile>
                <!--
                    Sets the name of the property containing the settings
                    for JaCoCo runtime agent.
                -->
                <propertyName>failsafeArgLine</propertyName>
            </configuration>
        </execution>
        <!--
            Ensures that the code coverage report for the integration tests
            is created after the integration tests have been run.
        -->
        <execution>
            <id>post-integration-test</id>
            <phase>post-integration-test</phase>
            <goals>
                <goal>report</goal>
            </goals>
            <configuration>
                <!-- Sets the path to the file which contains the execution data. -->
                <dataFile>${project.build.directory}/coverage-reports/jacoco-it.exec</dataFile>
                <!-- Sets the output directory for the code coverage report. -->
                <outputDirectory>${project.reporting.outputDirectory}/jacoco-it</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>
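
For completeness, here is a minimal sketch of the matching maven-failsafe-plugin configuration, following the pattern described in the cited resource: the failsafeArgLine property prepared by the prepare-agent goal above is passed to the forked integration test JVM so that the JaCoCo agent can record integration test execution data. The plugin version is illustrative.

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-failsafe-plugin</artifactId>
    <version>2.17</version>
    <configuration>
        <!-- Passes the JaCoCo agent settings (prepared by the prepare-agent
             goal above) to the JVM that runs the integration tests. -->
        <argLine>${failsafeArgLine}</argLine>
    </configuration>
    <executions>
        <execution>
            <goals>
                <!-- Runs the integration tests and verifies their results. -->
                <goal>integration-test</goal>
                <goal>verify</goal>
            </goals>
        </execution>
    </executions>
</plugin>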

JDK 8 Support Lacking – Time To Look At Atlassian Clover

While producing this post I had to abandon both cited plugins and start looking at Atlassian Clover, since the two free plugins do not support JDK 8 at present but Atlassian Clover does. The latter comes with a $300 price tag, and that should be fine: it is worth spending money on good development tools.

cobertura-maven-plugin issue log

Issue 1: cobertura-maven-plugin 2.6 gave an incessant error; downgrading to 2.5.2 made the error go away. I did not have the time to analyse the reasons for the failure.

Issue 2: Tests would not run with the mvn clean verify command; I got incessant exceptions and bytecode dumps on the console, with the reason being “Expected stackmap frame at this location.” As it turns out, this was due to JDK 8 not being supported. Downgrading to JDK 7 was not an option for me, and neither was spending time on understanding the subtle behavioural differences of JDK 7.

AWS S3 Glacier Billing Example – The Peak-Restore-Bytes-Delta Burn

There’s lots to read about S3 online, but nothing helps more with overarching service design than the burn of an actual bill.

As expected, on the billing front there is nothing Simple about using the Amazon Simple Storage Service (S3). In fairness, however, 5 of the 8 line items can be attributed to using Glacier alongside S3.

The costs below represent a single month of testing using the Asia Pacific (Sydney) Region (APS2) during October 2014.

| Item | Cost per unit | Usage | Cost |
| --- | --- | --- | --- |
| EarlyDelete-ByteHrs | $0.036 per GB – Glacier Early Delete | 1.440 GB-Mo | $0.02 |
| Peak-Restore-Bytes-Delta | $0.012 per GB – Glacier Restore Fee | 16.277 GB | $0.20 |
| Requests-Tier1 | $0.00 per request – PUT, COPY, POST, or LIST requests under the monthly global free tier | 168 Requests | $0.00 |
| Requests-Tier2 | $0.00 per request – GET and all other requests under the monthly global free tier | 52 Requests | $0.00 |
| Requests-Tier3 | $0.06 per 1,000 Glacier Requests | 16 Requests | $0.01 |
| TimedStorage-ByteHrs | $0.000 per GB – storage under the monthly global free tier | 0.000022 GB-Mo | $0.00 |
| TimedStorage-GlacierByteHrs | $0.0120 per GB / month of storage used – Amazon Glacier | 0.181 GB-Mo | $0.01 |
| TimedStorage-RRS-ByteHrs | $0.0264 per GB – first 1 TB / month of storage used – Reduced Redundancy Storage | 0.003 GB-Mo | $0.01 |

The red-hot, glaring line item is Peak-Restore-Bytes-Delta. Note that in the given month only 181 MB was archived, costing a paltry $0.01 (this cost provides some insight into AWS’s rounding). So why the comparatively massive $0.20 for the retrievals? Peak-Restore-Bytes-Delta is effectively a usage penalty, and as you can see the fines can be massive and can render using Glacier financially infeasible. With Glacier, only 5% of the total amount of storage used can be retrieved for free in a given month. The $0.20 is for the peak billable retrieval rate, as per the Glacier FAQs (look for the question entitled “Q: How will I be charged when retrieving large amounts of data from Amazon Glacier?”), and it means a peak rate of 16.277 GB/hr worth of transfers took place – in essence, during one particular hour, 16.277 GB worth of retrievals were scheduled.
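
As a rough sanity check, using only the figures from the bill above, the Glacier retrieval and storage line items reconcile simply as the quoted rate multiplied by the usage, rounded up to the nearest cent:

\[ 16.277\ \text{GB} \times \$0.012/\text{GB} \approx \$0.195 \approx \$0.20 \quad \text{(Peak-Restore-Bytes-Delta)} \]
\[ 0.181\ \text{GB-Mo} \times \$0.012/\text{GB-Mo} \approx \$0.002\text{, rounded up to } \$0.01 \quad \text{(TimedStorage-GlacierByteHrs)} \]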

It’s also worth looking at EarlyDelete-ByteHrs. A total of 1.440 GB-Mo was deleted early (that is, within less than 90 days of being archived). First off, what is a GB-Mo? Presumably it means gigabyte-months, i.e. gigabytes multiplied by months, and 90 days (3 months) is the period that defines an early delete. In essence, given that the rate is 3 times the TimedStorage-GlacierByteHrs rate ($0.036 = 3 × $0.012), EarlyDelete-ByteHrs just means they will bill you for 3 months of Glacier storage no matter what.