Monthly Archives: April 2009

The Quest for Software Reuse

Inspired by Peter’s comment I am following up on my last post on “The Myth about Software Reuse”. I received quiet a lot feedback concerning the topic and I felt I should share some of my thoughts and visions in order to answer some of the questions and destroy the concerns I may have provoked.

In my last post I concluded that OSGi has all the features necessary to create great and reusable artifacts/ bundles if you prefer. As Richard pointed out it is THE technology, if you’re trying to build modular applications and if done right even reusable modules. The only flaw here is the “if done right” part. People do mistakes, I do it all the time and I bet you are doing it too. It’s in our nature, we can’t help it. Unfortunately when talking about mistakes in the context of reusable artifacts the implications can be disastrous.  Having a faulty versioned artifact and you rely on its correctness it will most certainly break some functionality or even the whole application. OSGi by itself doesn’t either enforce any constraints on versioning your bundles nor does it give you a detailed guideline on doing so (neither does Java). This is the reason, why the the folks from LinkedIn choose to only rely on one distinct version in their dependencies – the safest call for sure. If there are no rules, you check for them or predict the behavior of future bundles. Everything is custom made. This vacuum of  control renders bundles unpredictable in their versioning behavior and almost impossible to use in a forward compatible way usually applied when talking about Software Product Lines for instance.

If you will, one can say that the lack of control is the root of our problem – enforce control end everything falls into part, right? Unfortunately control is a two bladed sword. On the one hand you have a controlled environment where you exactly know what is going on and what to expect, on the other hand it limits your possibilities and hinders exploring new ways of thinking. Especially if you feel like not having enough information/knowledge about  the problem domain this is the ultimate killer criteria for progress – no exactly what we desire. Picking up Peters comment from my last blog post, this holds pretty much true for versioning policies in OSGi so far – we just don’t know enough yet. However, I gave it some thought and I think there is a way around this problem. Hold on a bit and I’ll explain what I am thinking.

The core of the problem from my point of view is the way we receive our dependencies. Being a good OSGi citizen one should use packages to express dependencies of course, but that’s just one part of the story. The other one are the bundles contributing these dependencies. As mentioned in my previous post, there are multiple repositories for 3rd party bundles one can use (like  [spring-repo], [orbit-repo] or [osgi-repo]). The problem here however is that one has no guaranties on what you’re getting from there. Of course you get the source you require – hopefully, but not necessarily the correct meta data you are looking for or even worse are requiring (see bundlor bug report for instance). The core problem here is specifying versions and version ranges in particular. There are no fixed rules and as Peter stated in my previous post it is a field that needs more exploration, which I can’t agree more. However, I think there is a way to satisfy the need for room of further exploration as well as accomplish the need for more control – the issue with the two bladed sword, I was talking about earlier. Let me elaborate on this a little bit more…

In my opinion, all we need is actually a repository we can trust. Trust in that sense, that we know for certain that the artifacts provided are following certain rules. The rules however shouldn’t be set in a hard coded/wired way, so that the rules can evolve and provide extra information while we evolve in understanding the topic. Another important feature (for me at least) is the “not lock in” option. I don’t want to lock myself into some vendor specific rules, if I don’t have to or don’t agree on them. It would be nice, if certain vendors provide me with some of their artifacts, but ultimately I want to be in control of what is going into my application and how.

Now, I think all this (and even more) can be accomplished with the right repository design. The OSGi is currently working on their RFP 122 for repositories and as far as I can tell this would be a great opportunity to consider the following additions.

Imagine while uploading artifacts to the repository one can also provide additional meta data and go through a verification process where certain features are tested. For instance, assuming a base version is already at present, the provider can check what actually changed between the last and the current version. Assuming there are certain rules deployed to check for API changes, the one uploading/providing the artifact can be guided through a process where he can assign the correct version information. This goes so far that not only the exported packages can be checked but also the version ranges of the imports, because all artifacts known to the repository are going through the same process (assuming a proper base-lining of course). So what could these checks be?

  • check for the minimal version to apply for an exported package (ensuring API breaks are assigned to a major version increase f.i.). Of course semantic changes can’t be picked up, but here the human interaction comes into play.
  • check the smallest possible matching version for a package import known to the repository to ensure maximal compatibility. Again, human interaction or API test cases can assist for semantic incompatibilities.
  • multiple exporters of the same package can be identified and if appropriate an optional property like the provider, purpose, etc. can be added to make a provider selection possible.
  • even errors, like missing import statements can be detected here.

Now, after having checked for these and potentially other things, the bundle can be altered to contain the defined meta data. It can even be signed and express its validity by complying to these “rules”. The resulting bundle can now be downloaded or stored on the server for further use.

Of course, this brings some more problems. First of all, not everyone wants to have its components uploaded to some server, so these information on how to alter the bundle can be used as transformation guidelines and the actual artifacts remain on another server (to protect IP for instance). The repository is so to speak just a proxy. On a request, it takes the bundle, alters it and provides it to the requester (if he has the correct access rights). Now, of course not every “jar” is allowed to be altered. We need to have some sort of proof that the uploader/provider is the author or has the rights to do so. I can think of many ways to do so, like verifying domain ownership or manual approval processes, but this will not be the topic of this post.

Another, very important problem is the hosting. One might not have the ability to use a open, freely available repository, because the bundles in question are commercial with protected IP. In that case an instance of this very repository must be available for local installation, so it can be used in companies as well. Of course chaining of those repositories must be possible as well. This brings me to the next point.

Rules valid for the whole world might not hold true for a certain company or even more important, while the knowledge about how to handle these reusable artifacts evolves and finer, more advanced checks become necessary or other languages should be supported as well, the verification process must be pluggable, updatable to ones (evolving) needs. With this we don’t have to buy into a solution that has to be correct forever. We can evolve while we’re going. Because the rules on how to alter the original bundle are stored, they can be changed, enhanced or removed at any time later if necessary. Of course, this can potentially cause other problems, but at least it would be possible.

Having this flexibility, of course one needs to know for certain, what one will receive when requesting an artifact. In fact one might even won’t to have only certain rules applied or a special set of rules only in “beta” mode available. This should also be possible with a distinct request API.

With the ability to change bundles on the fly, it is also possible to reuse existing infrastructures like maven repositories, obr or p2 for instance. A maven repository for instance can theoretically provide the meta data necessary to create the correct bundles by providing a rule-set in a distinct file as meta data. With something like this a maven repo can be used as a data source for the bundle repository. Pretty much the same hold true for any other repository I can think of.

The beauty of such a repository is that no one is forced to go with the main stream. Everyone can for their own bundles overwrite the default behavior in their own instances of repositories and f.i. limit the versions chosen to exactly one instead of a range. The central repository however enforces certain rules, so everyone can trust the output and alter it as needed. Even the decision if a bundle should be altered or if it can only be re-wrapped in a new bundle can be defined by a rule the bundle provider can define. You basically get all the freedom to do what you want locally and rely on common rules from the central repo.

There is even plenty of space for service providers making money by providing their own repositories with enterprise support. Porting the latest OSS libraries to the repo or ensuring test coverage of the released bundles, advanced checks to detect semantical changes are just a few possible enterprise features.

However, this is just the surface I scratched, there are so many more things I could add here, but I think you got the basic idea. The remaining question now is: Are we ready for something like this? Is there anyone interested in such a repository? Talking for me, I was looking for something similar quite a while and whenever I talked to someone about this, they agreed that it even has a business case worth spending money on. Don’t get me wrong. I don’t think this is the silver bullet – there is no such thing, but I believe it can be the basis to propel real software reuse and form a coalition between vendors and open source – a common standard with a tool-set capable of pushing us further.

Currently I am thinking about proposing a talk for the upcoming OSGi DevCon in Zurich and was wondering if someone would be interested in this topic as a talk, BOF or even just a bunch of people getting together while grabbing a beer. Me and my company are currently at a point where we are needing something like this and I would very much like to share my ideas and get some other views and experiences on this one. Let me know what you’re thinking!

Cheers,
Mirko

References (in chronological order):

[last post]: http://osgi.mjahn.net/2009/04/02/the-myth-of-software-reuse/
[LinkedIn]: http://blog.linkedin.com/2009/02/17/osgi-at-linkedin-bundle-repositories/
[spring-repo]: http://www.springsource.com/repository/app/
[orbit-repo]: http://www.eclipse.org/orbit/
[osgi-repo]: http://www.osgi.org/Repository/HomePage/
[bundlor bug report]: https://issuetracker.springsource.com/browse/BNDLR-196
[RFP 122]: http://www.tensegrity.hellblazer.com/2009/03/osgi-rfp-122—the-osgi-bundle-repository.html
[OSGi DevCon]: http://www.osgi.org/DevConEurope2009/HomePage

Share

The myth of software reuse

Are we fooling ourselves? Or is it real?

The ones of you who know me, know that I believe in software reuse and I am trying to evangelize about it for quiet some time. Methods, Classes, Aspects, Components, Modules, Software Product Lines, you name it – everything is focused on reuse to some extend. With OSGi for the first time we are able to create true reusable software artifacts and theoretically mix and match them on an as needed basis. Great new opportunities are coming and with the recently held OSGi Tool Summit chances are good that we’ll soon see more and better integrated tooling support in our daily work with OSGi. So, we all should be happy bees right?

Well, what we are producing right now is a massive amount of bundles, no question about this. Virtually everyone is now shipping with at least some basic OSGi headers. Several projects are trying to provide bundle repositories for the not (yet) converted projects (like [spring-repo], [orbit-repo], [osgi-repo]) and some go even further and try to wrap incompatible APIs in a compatibility layer (see [DynamicJava] for instance). Great work and very much appreciated, but this is not the point. What we got so far are “just” explicit software artifacts. We only know what is behind bundle X in version Y, a subjectively taken SNAPSHOT somewhere in the infinity of time. It doesn’t answer the question on how this code can and will evolve or how future versions will fit into my universe.

The problem domain

Too abstract? Let’s examine an arbitrary example. Assuming you are trying to create a web application similar to flickr but on some sort of an embedded device to use at home. Taking the state of the art tools, you might choose to develop your app with Spring DM or any other frameworks you like, take your pick. For now, we stick to this one. As a starter we use the following assets: Spring DM 1.1.3, Spring 2.5.6 and Maven 2 as a built system. So far so good. Once you start developing you realize that you also need a file upload feature for your pictures. Of course someone already developed something to upload files and we are trying to reuse that stuff of course, so you look around. Soon, you’ll find commons-fileupload. The spring guys are putting a lot of work into building up a repository of reusable osgi bundles in their bundle repository and we can just use the OSGi-ified version of it from here: [spring comFileUpload], which is really great! But then you soon realize that you can’t use that version. Take a look at it’s (simplified) header:

Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: Apache Commons File Upload
Bundle-Version: 1.2.0
Bundle-SymbolicName: com.springsource.org.apache.commons.fileupload
Bundle-Vendor: SpringSource
Export-Package: org.apache.commons.fileupload;version="1.2.0";
  uses:="javax.servlet.http",
 org.apache.commons.fileupload.disk;version="1.2.0";
  uses:="org.apache.commons.fileupload",
 org.apache.commons.fileupload.portlet;version="1.2.0";
  uses:="javax.portlet,org.apache.commons.fileupload",
 org.apache.commons.fileupload.servlet;version="1.2.0";
  uses:="javax.servlet,javax.servlet.http,org.apache.commons.fileupload",
 org.apache.commons.fileupload.util;version="1.2.0"
Import-Package: javax.servlet;version="[2.5.0, 3.0.0)",
 javax.servlet.http;version="[2.5.0, 3.0.0)",
 javax.portlet;version="[1.0.0, 2.0.0)";resolution:=optional,
 org.apache.commons.io;version="[1.4.0, 2.0.0)",
 org.apache.commons.io.output;version="[1.4.0, 2.0.0)"

As you can see, it requires to have the Servlet API in version 2.5, which is not available in our case on the embedded device. What now? Well, if you know the bundle, you also know, that in fact it doesn’t require 2.5! You can re-bundle it and give the bundle another version and here is the problem. The Spring guys did a great work bundling this artifact for us, but they created it based on THEIR environment and not on the minimal requirements of that specific bundle. Even worse, there is not even a real specification stating what version to take and how to version changes correctly. Forward compatibility is just a beautiful dream rather than an actual fact. If you look at well known companies adapting OSGi like LinkedIn, you’ll see what I mean. Where are we heading to when one is forced to define dependencies like this:

Import-Package:
 com.linkedin.colorado.helloworld.api;version=”[1.0.0,1.0.1)”,
 com.linkedin.colorado.helloworld.client;version=”[1.0.0,1.0.1)”,
 org.osgi.framework;version=”[1.4.0,1.4.1)”

Doesn’t this feel totally odd and just plain wrong? Shouldn’t we be able to trust the authors or “something else” to take care of compatibility issues? Someone who knows the code, its changes and its backwards compatibility better than we do? I don’t want to change all of my 200 bundles just because we have a new OSGi framework version that now provides a new final static property for instance, I am not using anyway! I want to express that I don’t care what changed in the provided bundle as long as my code still works as expected and I want to be sure that the provider has the exact same concept about “compatible code changes”.  The Eclipse community already realized this and created an API for baselining components and a version schema to apply versions. Along with several detailed instructions on what to consider a change and how this change should be treated in terms of its version (eclipse-vers1, eclipse-vers2, eclipse-vers3), they have created the first usable guide on defining version numbers for bundles. However, this is just a small step towards reusable artifacts. We also need to agree on such rules and be able to enforce them. Unfortunately a lot of questions remain unanswered yet. For instance, what about semantical changes? How to treat them and more importantly, how to identify those? Code analysis won’t always help here.

So what do we learn from this? Basically that we can’t trust any bundle provided by a third party just yet. It doesn’t matter how good the providers intentions are on providing the perfect bundle. Right now, we are basically left alone. Every time we update to a new version it is like a game. You never know what you’ll be facing.

Ok, we are screwed. Who to blame?

Now you might ask yourself who’s fault is it? It is certainly not the Spring guys fault who provide us in this example with the software, but who to blame instead? The tools? Most people either use BND, Eclipse or the newly created bundlor from Spring. All of these are pretty dumb. They can’t possibly know, which version to take (although they are trying hard to guess). There is no baseline, no knowledge about the domain or infrastructure). Too many questions are unanswered and the tool authors are left alone, so I think the tools are the last ones to blame. OK, so what about the OSGi specification? It is so vaguely written, when it comes to versioning your bundles – you can’t possibly draw any universal conclusion what version to apply. Everyone can have their own interpretation of “compatible” changes, which is not compatible among each other. All true, but I don’t think that a simple specification will be enough nor is the OSGi suitable for that. The issue is too big to be solved by only one company or organization all by themselves. Sun might be the only one fitting, but after all the problems with JSR 277 and project JigSaw, I have no convidence in their ability and willingness anymore. To be fair, one have to admit that the Java Language Specification does provide a chapter about binary compatibility, but it is not much of a help, because not all cases are covered and there is of course no notion of a bundle (I would love to rant about not treating packages as 1st class citizen in Java, but that’s an entirely different post). Sun also has a sigtest tool to check for API breaks, but with the given functionality it is pretty much worthless for what we need.

What next?

Is it the job of an external organization to define rules everyone has to apply while developing reusable artifacts for a specific language? I don’t think so. I think this should be the job of all of us. Maybe as a joint project, maybe umbrella’d by the JCP, I don’t no, but definitely as a open and community driven effort. I don’t wonna lock myself to any vendor or proprietary standard I might get stuck with. I dream of a central repository (maybe based on RFP 122, maybe something completely different), where I have a one stop shop for all the 3rd party artifacts I need. At the same time being able to do in-house development with the same reliable system not having to expose anything to the outside world. Open, reliable and trustworthy software with a healthy community of open source artifacts – does it have to be a dream or can we make it real? I already have ideas how this can come true, but I would be very much interested if you’re feeling the same or even having concrete projects concerning something similar I haven’t reference here? Is there a potential for collaboration? What do you think?

My 2 cents,
Mirko

References (in chronological order):

[OSGi]: http://www.osgi.org/
[OSGi Tool Summit]: http://www.osgi.org/Event/20090327/
[spring-repo]: http://www.springsource.com/repository/app/
[orbit-repo]: http://www.eclipse.org/orbit/
[osgi-repo]: http://www.osgi.org/Repository/HomePage/
[DynamicJava]: http://www.dynamicjava.org/projects/jsr-api/
[Flickr]: http://www.flickr.com/
[Spring DM 1.1.3]: http://www.springsource.org/osgi/
[Spring 2.5.6]: http://www.springsource.org/download/
[Maven 2]: http://maven.apache.org/download.html
[commons-fileupload]: http://commons.apache.org/fileupload/
[spring comFileUpload]: http://www.springsource.com/repository/app/bundle/version/detail?name=com.springsource.org.apache.commons.fileupload&version=1.2.0
[Servlet API]: http://java.sun.com/products/servlet/
[LinkedIn]: http://blog.linkedin.com/2009/02/17/osgi-at-linkedin-bundle-repositories/
[baselining]: http://www.ibm.com/developerworks/opensource/library/os-eclipse-api-tools/index.html
[version schema]: http://wiki.eclipse.org/Version_Numbering
[eclipse-vers1]: http://wiki.eclipse.org/Evolving_Java-based_APIs
[eclipse-vers2]: http://wiki.eclipse.org/Evolving_Java-based_APIs_2
[eclipse-vers3]: http://wiki.eclipse.org/Evolving_Java-based_APIs_3
[BND]: http://www.aqute.biz/Code/Bnd/
[eclipse]: http://www.eclipse.org/
[bundlor]: http://www.springsource.org/bundlor/
[JSR 277]: http://jcp.org/en/jsr/detail?id=277
[jigsaw]: http://osgi.mjahn.net/2008/12/04/componentization-wars-part-ii-guerrilla-tactics/
[Java Language Specification]: http://java.sun.com/docs/books/jls/third_edition/html/j3TOC.html
[binary compatibility]: http://java.sun.com/docs/books/jls/third_edition/html/binaryComp.html
[sigtest tool]: https://sigtest.dev.java.net/
[RFP 122]: http://www.tensegrity.hellblazer.com/2009/03/osgi-rfp-122—the-osgi-bundle-repository.html

Share