Dependency Injection and Lifecycle management with Spring and Scala

2015-07-01 Leave a comment

spring

One of my earliest programming mentors introduced me to the concept of dependency injection and inversion of control, commonly shortened to DI/IOC, in my first year of my first real job as a professional programmer. This individual was known in the group for “pushing the envelope” of technologies and processes that we used to write software. They were also known for championing Ruby (right around the time of the original release of Ruby on Rails), promoting test driven development, and taking hours off to go shopping for shoes during the workday at a rather large global financial juggernaut (see my resume on LinkedIn if you’re curious about who this employer might have been). Dependency injection, at the time, was described to me as something which would “bring my code out of the dark ages” by eliminating reams of poorly written (and often “wrong”) singleton management and lifecycle management code. (I use the term “wrong” in the sense that Java Concurrency in Practice uses when it talks about how, for example, using the double-checked locking idiom for lazy initialization in Java without the volatile keyword is “wrong”.) As a wide eyed newbie programmer listening to a relative “god”, I was enamored with the power of DI and became a life-long convert.

Why is DI powerful?

There are literally hundreds, likely thousands, of articles, StackOverflow questions, blog posts, forum threads and other resources which speak about the merits of dependency injection and some of the frameworks which implement it. My goal with this post is not to provide a comprehensive treatment of why one should use DI, but rather to provide samples from my own personal experience to add to the conversation.

That being said, some online resources that seem good include:

DI frameworks – Scala with Spring

In joining a new group at my current job, I stepped into a role as a software engineering lead and have recently found myself explaining the merits (and pitfalls!) of DI frameworks to un-initiates / skeptics. I should note that those who view DI frameworks with skepticism are not bad programmers by any means; on the contrary, they often write high quality, concise code. If anything, I myself suffer from the curse of knowledge regarding DI frameworks – Spring in particular. But much like my experience (and many other’s experiences) with technologies and methodologies that vary in “power”, technologies and methodologies “less powerful than [x] are obviously less powerful, because they’re missing some feature [they are] used to.” (I’m quoting from Paul Graham’s post on the power of languages, but I think the same argument works here for software design frameworks.) Again, I’ll quote (with some pronoun adjustments): “when our hypothetical [x] programmer looks in the other direction, up the power continuum, they don’t realize they are looking up. What they see are merely weird [technologies]. They probably consider them about equivalent in power to [x], but with all this other hairy stuff thrown in as well. [x] is good enough for them, because they think in [x].” In short, DI frameworks often looks as powerful as managing your own object graph and lifecycle but filled with lots of “hairy stuff thrown in as well” which seems superfluous – why should one go the trouble?

Regarding Spring as a DI framework – my opinion is that Spring is one of the best DI frameworks for the Java-based ecosystem. If you’re raising your eyebrows, I’m not surprised – over the many years that I’ve used DI frameworks, I’ve often gotten strange looks because I promote the use of Spring’s DI container. Most of the conversations that I have with Spring skeptics revolve around some basic mis-understandings of the framework itself. Let me try to set the record straight.

Spring is not based on XML anymore. Yes, you can technically use bloated XML to configure your apps, but it is by no means the preferred way nor do savvy programmers do this anymore. Using standard Java DI annotations or Guice-style code wiring is now the preferred DI wiring mechanism in Spring.
Using Spring DI does not force you to tie your program code to Spring. As you’ll see below, usage of Spring DI requires either one or two Spring import statements in one isolated section in code, completely separate from your application logic. This is on par with Guice. Spring does NOT marry you to the framework – ripping it out and replacing it with another DI framework will takes on the order of seconds, and requires zero refactoring.
Using Spring does not force you to import many libraries. Only one library is required (the spring-context library).

OK, I hope that’s all out of the way. Now, I’ll take a brief moment to highlight some of Spring’s benefits:

Spring DI is an incredibly well-tested, widely-used, and robust framework that has been around for over a decade. It is rock solid and I’d guess that most of it’s bugs have been excised long ago.
Spring DI supports standard Java specification annotations for DI (JSR 330 – the javax.inject package), if you wish to use them. This enables removal of all code-based DI (which is required with frameworks like Guice). This isn’t manadatory, but it is possible.
Spring DI supports standard Java specification annotations for lifecycle management (JSR 250 – the javax.annotations package), if you wish to use them. This enables the DI container to manage proper startup and shutdown of your classes (in case they are stateful). Think about thread pools, database connections, in-flight computations, and other long-lived entities. I have not seen any other Java DI framework implement lifecycle management “correctly”. For example, Guice’s maintainers have decided that lifecycle management is not important.
Other Spring framework components – JDBC helpers, thread pool management, AOP, Web / REST, and others – are very easily integrated into the Spring DI. They are by no means required, but benefits of using Spring DI and other Spring components together are greater then using them alone.

I am fully aware that there are numerous other DI frameworks on the market, both for Java and for Scala. I think some of them (notably Guice and Subcut) have very good points about them. However, I feel that they are not as robust as Spring’s DI offering, and this is why I am not highlighting their usage. I will leave a separate discussion of Spring vs other DI frameworks for another times.

I’ll continue this discussion based on sample code (https://github.com/dinoboy197/sailing-cruise-watcher) that I wrote using Scala and Spring together. The code was originally written to monitor a 2000s-era website for reservations availability for sailing classes, but that’s a different story entirely.

DI-enabling classes

Creating your own classes

It is easy to DI-enable a class so that Spring will create an instance of it for injection into other classes and for lifecycle management. Simply add the javax.inject.Named annotation to the class:

import javax.inject.Named

@Named
class SampleProcessor {

Now, Spring will create and manage a single instance of the SampleProcessor class in your program; it can be injected anywhere.

To inject an instance of a class that you’ve annotated with @Named into a different class, use the @Inject annotation:

import javax.inject.Inject
import javax.inject.Named

// mark a class with @Named to create an instance of it in the object graph
@Named
// use @Inject() followed by constructor arguments to have Spring wire in instances of these classes
class SailingCruiseChecker @Inject() (val http: Http, val sampleProcessor: SampleProcessor) {

See that for the class SailingCruiseChecker, two other class instances are being injected: an instance of a SampleProcessor class and an instance of an Http class.

Managing class instance lifecycle

Some class instances are stateful; they may even require special handling during startup or shutdown. A startup example: a class may need to pre-load data from a dependent class before it can service its public methods; however, its dependent classes must be wired up before this happens. A shutdown example: a class may need to wait before exiting to properly close a thread pool, close JDBC connections, or save in-flight computations for proper continuation upon restart.

To specify a method which is called once automatically after the object graph is created, annotate it with javax.annotation.PostConstruct. To specify a method which is called once automatically before the object graph is torn down (either due to JVM shutdown or DI container closing), annotate it with javax.annotation.PreDestroy.

  // if your singleton has some state which must be initialized only *after* the object graph is constructed
  // (ie, it calls other objects in the object graph which might not yet be fully constructed)
  // use this method
  @PostConstruct
  def start() {
    // some initialization that is guaranteed to only happen once
  }

  @PreDestroy
  def stop() {
    // if your singleton has some state which must be shut down to cleanly stop your app
    // (ex: database connections, background threads)
    // use this method
  }

Managing third-party library class instances with Spring

Using libraries is common, and it’s easy to instantiate third party classes which are not DI-enabled within any code that you write. Take, for example, a class in a third party library called NonDIEnabledClass which has an initialization method called init() and a method to be called for cleanup before JVM shutdown called close():

// this is a fictitious example of such an external class which must be started with init() and stopped with close()
class NonDIEnabledClass {
  def init {}
  def doSomething{}
  def close {}
}

Using this class in code might look like the following:

@Named
class Http {
  private val httpHelper = new NonDIEnabledClass()

  @PostConstruct
  def start() {
    httpHelper.start()
  }

  def get(url: String): String = {}

  @PreDestroy
  def init() {
    httpHelper.stop()
  }
}

The Http class is tightly coupled to the NonDIEnabledClass and is quite non-DI-like.

During testing (class, unit, or even end-to-end integration tests), it can valuable to stub behaviors at your program code boundaries – for instance, stubbing out the behavior of the NonDIEnabledClass above. Mocking frameworks can use fancy JVM bytecode re-writing techniques to intercept calls to new and swap in stubs at test time, but we can easily avoid JVM bytecode re-writing by managing third party library classes with Spring.

First, instruct Spring to create and manage an instance of this non-DI-enabled class. Declare a new configuration class in which you’ll add a method annotated with org.springframework.context.annotation.Bean (here the configuration class is named Bootstrap, though the name is irrelevant):

// configuration class
// used for advanced configuration
// such as to create DI-enabled instances of classes which do not have DI (JSR 330) annotations
class Bootstrap {
  // use @Bean to annotate a method which returns an instance of the class that you want to inject and of which
  // Spring should manage the lifecycle
  @Bean(initMethod = "init", destroyMethod = "close")
  def externalNonDIEnabledObject() = new NonDIEnabledClass()
}

Note how the optional initialization and teardown methods can be specified as parameter values to the Bean annotation as “initMethod” and “destroyMethod“.

Now, wire the instance of this class in where it is desired, for instance, in our Http class from above:

@Named
// see the Bootstrap class for how non-DI annotated (JSR 330) objects make their way into the object graph
class Http @Inject() (val externalNonDIEnabledObject: NonDIEnabledClass) {
  def get(url: String): String = {}
}

Note the differences between the former and latter examples of the Http class. In the latter example:

The Http class does not use new to instantiate the NonDIEnabledClass.
No @PostConstruct nor @PreDestroy methods are necessary in the Http class to manage the lifecycle of the NonDIEnabledClass instance.

Activating Spring DI

Your code is now wired up and ready for execution. Spring DI now needs to be activated.

Having now seen concrete examples of how to DI-enable your classes, let’s return to a Spring app’s lifecycle:

Instantiate any classes annotated with @Named
Inject these instances into @Inject points on your classes
Execute all @PostConstruct methods on these instances
Wait for the DI container shutdown (which can happen automatically at JVM shutdown via shutdown hook).
Execute all @PreDestroy methods on these instances

To activate Spring, create an instance of an org.springframework.context.annotation.AnnotationConfigApplicationContext, scan the package containing your DI-enabled classes, refresh the context, start the context, then register a shutdown hook. In this example, this code is located in an object called Bootstrap, which extends App for direct execution when Scala starts up (see more on this in a moment).

// main entry point for command line operation
object Bootstrap extends App {
  // start up the Spring DI/IOC context with all beans in the info.raack.sailingcruisechecker namespace
  val context = new AnnotationConfigApplicationContext()
  // include all DI annotated classes in this project's namespace
  context.scan("info.raack.sailingcruisechecker")
  context.refresh()

  // start up the app - run all JSR250 @PostConstruct annotated methods
  context.start()

  // ensure that all JSR250 @PreDestroy annotated methods are called when the process is sent SIGTERM
  context.registerShutdownHook()
}

If you’ve included a class definition for the Bootstrap class to manage the lifecycle of third party class instances, you’ll also need to register these instances. Add a register(classOf[Bootstrap]) call to the definition above:

  // start up the Spring DI/IOC context with all beans in the info.raack.sailingcruisechecker namespace
  val context = new AnnotationConfigApplicationContext()
  // include all custom class instances which are not DI enabled
  context.register(classOf[Bootstrap])
  // include all DI annotated classes in this project's namespace
  context.scan("info.raack.sailingcruisechecker")

Starting your app

We’re ready to start our app! I’ll assume use of the sbt build system for this Scala program.

First, include the Spring libraries that you’ll need for DI in the build.sbt file. This includes spring-beans and spring-context. I also like to use slf4j as a logging facade and logback as a logging backend, and since Spring still uses commons-logging as a backend, I redirect commons-logging into slf4j with jcl-over-slf4j and then include logback as the final backend.

// libraries
libraryDependencies ++= Seq(
  "org.springframework" % "spring-context" % "4.1.6.RELEASE" exclude ("commons-logging", "commons-logging"),
  // spring uses commons-logging, so redirect these logs to slf4j
  "org.slf4j" % "jcl-over-slf4j" % "1.7.12",
  "org.springframework" % "spring-beans" % "4.1.6.RELEASE",
  "javax.inject" % "javax.inject" % "[1]",
  // logging: log4s -> slf4j -> logback
  "org.log4s" %% "log4s" % "1.1.5",
  "ch.qos.logback" % "logback-classic" % "1.1.3"
)

Next, I’ll indicate the main class for starting up the program. You may want to do this if you choose to bundle all of your code and third party libraries together with the assembly plugin or something similar.

// main class
mainClass in (Compile, packageBin) := Some("info.raack.sailingcruisechecker.Bootstrap")

Finally, start up the app:

sbt run

If all is well, Spring should be started and your app will run!

Approaching Scala

2015-04-29 Leave a comment

Several months back, I was presented with an opportunity to join a new R&D team at my employer. The individuals in the team all had different skill sets and hailed from different backgrounds. What brought them together was a challenge to extract the essence of business offerings from unstructured human-written (often poorly written) reviews using modern NLP techniques, refreshed and updated daily to the tune of hundreds of millions of reviews. This project had been in research mode for some months before my joining the team, but after some internal organizational restructuring, it had piqued the interest of key business and technology leaders and was bestowed a formal team and dedicated (though slim) engineering resources. My own interest in this team came from the opportunity to design and build a scalable NLP computation engine for the task at hand, with a very small set of engineers at my disposal – only one of which had worked on true production systems in the past. Perhaps a daunting task, but I was excited to tackle it – and to learn as many new and unique technologies as I could while doing so.

Upon discussing the current state of the research and implementation with team members, I was surprised to find that the team had chosen Scala as the language choice with Apache Spark as the runtime engine for text processing. Not surprised because I thought this was a bad decision, but rather that my organization was (and still is) currently in some ambiguous stage between “wild-west coyboy driven startup” and “heel-dragging corporate behemoth” which tends to eschew trendy technology choices that don’t have a sizable production legacy (as Java does). Having spent many years with the JVM on Java alone, I became interested in Scala years back – just not enough to do much more than dip my toes in whatever it is one dips ones toes in when investigating a new programming language. Ready to take my JVM-based programming ego down a few notches, I dove into the team head first and was pleasantly surprised by what I found.

Where should we put Scala?

Programming-Tools — http://cubiclebot.com/pictures/programming-languages-as-real-tools/

Almost immediately upon quizzing my new teammates for details about their current software, I was bombarded by some highly charged discussions regarding previous technology choices. Comments like “yeah, we’re using Scala because Alice and Bob think that it’s cool”, “nobody supports this here; now that you’re helping us, Taylor, can you re-evaluate Scala’s use?”, and “I hate Scala” were frequent refrains. They were almost as popular as “we really want to run on Spark, and Scala supports many of the computation primitives that I’d rather not write in Python” and “Christine and Dave just need to practice more and they’ll see why Java is for dinosaurs” (names have been changed to protect the innocent). In fact, I myself was at first slightly miffed at being called a “dinosaur” – but I bottled up my own verbal defenses and tried not to be offended. Everyone’s opinion came from a different viewpoint; I wanted to figure out why the reactions had been so polar.

https://rmcjury.wordpress.com/blog-posts/m8-communication-is-powerlanguage-is-power/

Perhaps the most eager of the Scala-defenders led me to a Paul Graham blog entry from 2001 entitled “Beating the Averages”. (I had actually been led donkey-style to this article in the past, but admittedly didn’t parse past the first few sentences.) In a nutshell, Paul Graham’s argument is that using efficient, non-mainstream technologies (programming languages in particular) that competitors ignore provides a competitive edge. He goes on to talk about “The Blub Paradox” regarding the “power” of computer languages, and asserts that “the only programmers in a position to see all the differences in power between the various languages are those who understand the most powerful one”. He wrote that by choosing a language which he considered at the time to be very powerful, his “resulting software did things our competitors’ software couldn’t do”. (For those interested, a cursory search for “Blub Paradox” provided this page as a counterexample to its merits.) This was the argument presented to me by my co-worker – that by choosing a language more powerful than Java, our team would be able to approach and solve our business problems in a better way that would be harder for others to replicate. While I didn’t take Graham’s comments about his language of choice specifically to heart (Graham speaks very candidly about his love for Lisp), the article and my co-worker’s argument did strike me and I set off to dive into the world of Scala.

One more note from the Scala-defender from above – one of the more compelling statements they made to me was that “Scala raises the level of abstraction from Java by managing language complexity that Java cannot get rid of”. As I travelled along my Scala learning path, I quickly found this to be the case.

The first stop along my foray into Scala came in the form of Scala for the Impatient by Horstmann. Admittedly, I found that I was a bit impatient – I didn’t make it past the first two chapters. Upon reaching chapter three, entitled “Working with Arrays”, I skipped directly to reading my co-workers source code. My own brain learns by looking and manipulating concrete examples, and for me the best way to do this has always been to start with the most familiar pieces of a new domain, such as an existing business process encoded into a computer program.

Having a background in Ruby, Python (with some ancient Scheme / Lisp knowledge) as well as Java, much of what I read seemed to make sense, with a few exceptions. Going to the web and humans for help, I quickly realized what many before me have said – there are many, perhaps too many, mechanisms for achieving very simple logical operations such as method calls, transforms, variable name references, etc. So many that it can be a bit confusing for the relative newcomer, as there are a variety of “canonical” style guides which seem to be produced by different camps in the Scala community. As it was described to me by one co-worker (who is regularly in conversations with Scala and Spark advocates in the San Francisco Bay Area), there are two main groups supporting Scala today: a group which is very interested in using the language as a research tool for language design itself, and a group which is very interested in language feature stability, ease of use and comprehensibility, and wide adoption for community support. Being an engineer who has spent many years in the halls of production hot-fixes, support, junior engineer mentorship, and consensus-building through standards and convention, I quickly realized that I aligned much more closely with the latter camp.

Fast-forward two months. I think I’ve gotten a good grip on how to approach programming from a Scala-standpoint, mostly through writing patches and new features for the previously mentioned big-data processing program running on Spark. In addition to understanding how one uses Scala pragmatically for actually getting real work done, diving into the runtime details has also opened my eyes to how one writes effective programs for Spark (more on that to come later, perhaps). I have a pretty good handle on the build process, how many concepts in Maven builds work with SBT, and what to do when things go wrong. With my newfound knowledge, I coded up a very basic app to prove my skills to myself – see it on Github if you’re interested. I’ll likely go into a little more detail about that project in the future as a very small case study.

Some resources that I found along the way that others might find helpful:

Books – I don’t really keep language books on hand, as there are typically enough internet resources available with cursory Google searches.
- Scala for the Impatient. Perhaps not the first place I would go personally, but it might be a good suggestion for others.
- Learning Concurrent Programming in Scala. Having an extremely high opinion of a similar sounding book for Java, Java Concurrency in Practice (perhaps the best Java guide to writing correct concurrent programs), I found this book to be lacking. Perhaps that’s because JCIP is so complete, that there isn’t much else to say when thinking about how concurrent programs operate on the JVM. My gut feeling tells me that it’s more likely that there are many Scala concurrency gotchas that just aren’t well-known enough to make this volume really stand on its own.
Style guides – These are the things that really help me. As far as programming goes, I’m much more interested in convention over flexibility at this point in my life – so having a style that the community generally follows just makes things easier for everyone – myself, my code reviews, and my maintainers.
- Databricks Style Guide. After consulting with a few Scala folks, this seemed to be a very sensible best-practice document for writing Scala code.
- Twitter Style Guide. Probably just as high quality (and obviously written by smart Scala coders), but seems to give the coder a little more rope to hang themselves with when using the language.
- The “Official” Scala Style guide. This was much less useful for me, given that I’m looking for suggestions of “when you need to do x, here is how you should code it”. In my opinion, it suggests too many esoteric Scala constructs that compete with each other, making things more confusing for a Scala newcomer than they need to be.
Code helpers – Tools for non-omniscient beings, like myself, who do not have said best-practices / conventions memorized and cannot yet write perfect Scala code into a freeform text editor due to our lack of language mastery.
- Scala IDE for Eclipse. The code completion and syntax checking tools seem decent. Certainly worth the three minute install.
- ScalaStyle. Style checker, with a configurable rule set that I tweaked a bit to follow the DataBricks style guide a little more closely.
- Scalariform. Used for Scala code formatting. I’d rather have the auto-formatting in my IDE, but I’ll take what I can get. Actually, I’ve only used this once – so I can’t say I can recommend it (due to lack of experience).

With luck, I’ll continue to learn and grow more proficient with Scala. I think I’m at the point that I’d consider writing my own programs in Scala when starting from the ground up – we’ll see how that goes in the next few months.