Inspired by Peter’s comment, I am following up on my last post on “The Myth about Software Reuse”. I received quite a lot of feedback on the topic, and I felt I should share some of my thoughts and visions in order to answer some of the questions and dispel the concerns I may have provoked.
In my last post I concluded that OSGi has all the features necessary to create great and reusable artifacts – or bundles, if you prefer. As Richard pointed out, it is THE technology if you’re trying to build modular applications and, if done right, even reusable modules. The only flaw here is the “if done right” part. People make mistakes; I make them all the time and I bet you do too. It’s in our nature, we can’t help it. Unfortunately, when talking about mistakes in the context of reusable artifacts, the implications can be disastrous. If an artifact is versioned incorrectly and you rely on its correctness, it will most certainly break some functionality or even the whole application. OSGi by itself neither enforces any constraints on versioning your bundles nor gives you a detailed guideline on doing so (neither does Java). This is the reason why the folks at LinkedIn chose to rely on exactly one distinct version in their dependencies – the safest call for sure. If there are no rules, you can neither check for them nor predict the behavior of future bundles. Everything is custom made. This vacuum of control renders bundles unpredictable in their versioning behavior and almost impossible to use in the forward-compatible way usually required when talking about Software Product Lines, for instance.
If you will, one can say that the lack of control is the root of our problem – enforce control and everything falls into place, right? Unfortunately, control is a double-edged sword. On the one hand you have a controlled environment where you know exactly what is going on and what to expect; on the other hand it limits your possibilities and hinders exploring new ways of thinking. Especially if you feel like you don’t have enough information or knowledge about the problem domain, this is the ultimate killer criterion for progress – not exactly what we desire. Picking up Peter’s comment from my last blog post, this holds pretty much true for versioning policies in OSGi so far – we just don’t know enough yet. However, I gave it some thought and I think there is a way around this problem. Hold on a bit and I’ll explain what I am thinking.
The core of the problem, from my point of view, is the way we receive our dependencies. Being a good OSGi citizen, one should of course use packages to express dependencies, but that’s just one part of the story. The other part is the bundles contributing these dependencies. As mentioned in my previous post, there are multiple repositories for 3rd party bundles one can use (like [spring-repo], [orbit-repo] or [osgi-repo]). The problem here, however, is that you have no guarantees about what you’re getting from there. Of course you get the source you require – hopefully – but not necessarily the correct meta data you are looking for or, even worse, are requiring (see the [bundlor bug report] for instance). The core problem here is specifying versions, and version ranges in particular. There are no fixed rules, and as Peter stated in my previous post, it is a field that needs more exploration – I couldn’t agree more. However, I think there is a way to satisfy the need for room for further exploration as well as accomplish the need for more control – the issue with the double-edged sword I was talking about earlier. Let me elaborate on this a little bit more…
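To make the version-range problem concrete, this is what such meta data looks like in a bundle manifest (the package names here are made up for illustration). An exported package carries a version, and an importer states the range it is willing to accept:

```
Export-Package: com.example.parser;version="1.2.0"
Import-Package: com.example.parser;version="[1.0,2.0)"
```

The range `[1.0,2.0)` accepts any 1.x release but excludes 2.0 – which only works out if every exporter really does signal breaking changes with a major version increase. And exactly that convention is what no one currently enforces.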
In my opinion, all we actually need is a repository we can trust. Trust in the sense that we know for certain that the artifacts provided follow certain rules. The rules, however, shouldn’t be set in a hard-coded/wired way, so that they can evolve and provide extra information as our understanding of the topic evolves. Another important feature (for me at least) is the “no lock-in” option. I don’t want to lock myself into some vendor-specific rules if I don’t have to or don’t agree with them. It would be nice if certain vendors provided me with some of their artifacts, but ultimately I want to be in control of what goes into my application and how.
Now, I think all this (and even more) can be accomplished with the right repository design. The OSGi Alliance is currently working on its [RFP 122] for repositories, and as far as I can tell this would be a great opportunity to consider the following additions.
Imagine that while uploading artifacts to the repository, one can also provide additional meta data and go through a verification process where certain features are tested. For instance, assuming a base version is already present, the provider can check what actually changed between the last and the current version. Assuming there are certain rules deployed to check for API changes, the one uploading/providing the artifact can be guided through a process where he can assign the correct version information. This goes so far that not only the exported packages can be checked but also the version ranges of the imports, because all artifacts known to the repository go through the same process (assuming proper base-lining, of course). So what could these checks be?
- check for the minimal version to apply to an exported package (ensuring, for instance, that API breaks are assigned to a major version increase). Of course semantic changes can’t be picked up automatically, but here human interaction comes into play.
- check for the smallest possible matching version for a package import known to the repository to ensure maximal compatibility. Again, human interaction or API test cases can assist with semantic incompatibilities.
- multiple exporters of the same package can be identified and, if appropriate, an optional property like the provider, purpose, etc. can be added to make a provider selection possible.
- even errors, like missing import statements, can be detected here.
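To illustrate, the first check could be sketched roughly like this in Java. The class and method names are my own invention, and a real implementation would of course diff class files with a byte-code library rather than compare plain signature strings – this is just the decision logic:

```java
import java.util.*;

// Sketch of the first check above: given the sets of public API signatures
// exported by the previous and the current version of a package, suggest the
// minimal version increment. The rules are assumptions for illustration,
// not taken from any OSGi specification.
public class VersionBumpAdvisor {

    /** Suggests "major", "minor" or "micro" based on a naive API diff. */
    public static String suggestBump(Set<String> oldApi, Set<String> newApi) {
        if (!newApi.containsAll(oldApi)) {
            return "major"; // something was removed or changed -> breaking change
        }
        if (!oldApi.containsAll(newApi)) {
            return "minor"; // only additions -> backwards-compatible extension
        }
        return "micro";     // identical API -> at most a bug-fix level change
    }

    public static void main(String[] args) {
        Set<String> v1 = new HashSet<>(Arrays.asList("Foo.bar()", "Foo.baz()"));
        Set<String> v2 = new HashSet<>(Arrays.asList("Foo.bar()", "Foo.baz()", "Foo.qux()"));
        System.out.println(suggestBump(v1, v2)); // an addition only -> "minor"
    }
}
```

This is exactly the spot where the human interaction mentioned above hooks in: the tool can only propose the bump, a semantic break with an unchanged signature still needs a person (or an API test suite) to veto the suggestion.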
Now, after checking for these and potentially other things, the bundle can be altered to contain the defined meta data. It can even be signed to express its validity by complying with these “rules”. The resulting bundle can then be downloaded or stored on the server for further use.
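As a small sketch of what “altering the bundle” could mean technically, the standard `java.util.jar` API is already enough to rewrite a manifest header with the version information the verification step determined (the package name and version below are made up):

```java
import java.io.*;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Minimal sketch: rewrite the Export-Package header of a bundle manifest
// with verified version information, leaving the input manifest untouched.
public class ManifestRewriter {

    public static Manifest withExportVersion(Manifest in, String pkg, String version) {
        try {
            // Round-trip through a byte stream so the caller's manifest is not mutated.
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            in.write(buf);
            Manifest out = new Manifest(new ByteArrayInputStream(buf.toByteArray()));
            out.getMainAttributes().putValue("Export-Package",
                    pkg + ";version=\"" + version + "\"");
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Manifest m = new Manifest();
        m.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        m.getMainAttributes().putValue("Export-Package", "com.example.api");
        Manifest fixed = withExportVersion(m, "com.example.api", "1.1.0");
        System.out.println(fixed.getMainAttributes().getValue("Export-Package"));
    }
}
```

A real repository would of course rewrite the manifest inside the jar and re-sign the result, but the principle is the same.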
Of course, this brings some more problems. First of all, not everyone wants to have their components uploaded to some server, so the information on how to alter the bundle can be used as transformation guidelines while the actual artifacts remain on another server (to protect IP, for instance). The repository is, so to speak, just a proxy. On a request, it takes the bundle, alters it and provides it to the requester (if he has the correct access rights). Now, of course, not every “jar” is allowed to be altered. We need some sort of proof that the uploader/provider is the author or has the rights to do so. I can think of many ways to do this, like verifying domain ownership or manual approval processes, but this will not be the topic of this post.
Another very important problem is hosting. One might not be able to use an open, freely available repository because the bundles in question are commercial with protected IP. In that case an instance of this very repository must be available for local installation, so it can be used inside companies as well. Of course, chaining of those repositories must be possible too. This brings me to the next point.
Rules valid for the whole world might not hold true for a certain company. Even more importantly, as the knowledge about how to handle these reusable artifacts evolves and finer, more advanced checks become necessary, or other languages need to be supported as well, the verification process must be pluggable and updatable to one’s (evolving) needs. With this we don’t have to buy into a solution that has to be correct forever. We can evolve as we go. Because the rules on how to alter the original bundle are stored, they can be changed, enhanced or removed at any later time if necessary. Of course, this can potentially cause other problems, but at least it would be possible.
Having this flexibility, one of course needs to know for certain what one will receive when requesting an artifact. In fact, one might even want to have only certain rules applied, or a special set of rules available only in “beta” mode. This should also be possible with a distinct request API.
With the ability to change bundles on the fly, it is also possible to reuse existing infrastructures like maven repositories, OBR or p2. A maven repository, for instance, could theoretically provide the meta data necessary to create the correct bundles by providing a rule-set in a distinct file as meta data. With something like this, a maven repo can be used as a data source for the bundle repository. Pretty much the same holds true for any other repository I can think of.
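Purely as an illustration of the idea – no such file format exists today, everything below is invented – such a rule-set for a maven-backed source could be as simple as a properties file:

```
# Hypothetical rule-set attached to a maven artifact as meta data.
source           = maven:com.example:parser:1.2.0
export.packages  = com.example.parser;version=${pom.version}
import.policy    = range-up-to-next-major
sign             = true
```

The repository would read this file, pull the plain jar from the maven repo, apply the rules and hand out a properly versioned bundle.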
The beauty of such a repository is that no one is forced to go with the mainstream. In their own repository instances, everyone can override the default behavior for their own bundles and, for instance, limit the versions chosen to exactly one instead of a range. The central repository, however, enforces certain rules, so everyone can trust the output and alter it as needed. Even the decision whether a bundle may be altered or only re-wrapped in a new bundle can be defined by a rule the bundle provider sets. You basically get all the freedom to do what you want locally while relying on common rules from the central repo.
There is even plenty of room for service providers to make money by offering their own repositories with enterprise support. Porting the latest OSS libraries to the repo, ensuring test coverage of the released bundles, or advanced checks to detect semantic changes are just a few possible enterprise features.
However, I have just scratched the surface here; there are so many more things I could add, but I think you get the basic idea. The remaining question now is: are we ready for something like this? Is there anyone interested in such a repository? Speaking for myself, I have been looking for something like this for quite a while, and whenever I talked to someone about it, they agreed that it even has a business case worth spending money on. Don’t get me wrong. I don’t think this is the silver bullet – there is no such thing – but I believe it can be the basis to propel real software reuse and form a coalition between vendors and open source: a common standard with a tool-set capable of pushing us further.
Currently I am thinking about proposing a talk for the upcoming [OSGi DevCon] in Zurich and was wondering whether anyone would be interested in this topic as a talk, a BOF or even just a bunch of people getting together over a beer. My company and I are currently at a point where we need something like this, and I would very much like to share my ideas and get some other views and experiences on this one. Let me know what you’re thinking!
References (in chronological order):
[last post]: http://osgi.mjahn.net/2009/04/02/the-myth-of-software-reuse/
[bundlor bug report]: https://issuetracker.springsource.com/browse/BNDLR-196
[RFP 122]: http://www.tensegrity.hellblazer.com/2009/03/osgi-rfp-122—the-osgi-bundle-repository.html
[OSGi DevCon]: http://www.osgi.org/DevConEurope2009/HomePage