How many ways can you link a node?

Got a call last week from a friend who wanted to chat about some work we did a few years back to migrate data from one Product Data Management (PDM) system to another. Not for the first (or last) time we have come full circle and they are now looking to prise the data out and pop it back somewhere else.

PDM systems typically store mechanical design data – parts of things, if you like – along with metadata relating to the part (part number, material, weight, colour, edibility…). Perhaps the most fun thing they tend to hold is relational, positional and configuration data – what connects to what and where it is in a particular variant of whatever is being designed. Actually, edibility is quite fun too when it comes to designing motor vehicles or nuclear submarines.

Strangely, earlier in the year I had been thinking of the approach we had used to transform the data to get it in a useable form between these systems. Back in 2005 we developed a concept that Andreas Tsiotsias had dreamed up as part of some thesis or other.

Andreas had created a number of Java classes around something called PLMStructuredData in a collective called DataAdapter. Essentially there were three elements to PLMStructuredData – nodes, nodeLinks and nodeLinkInstances.

Nodes represent “something”. Which can be just about anything. A part, a tree, a picture of a part in a tree – that kind of thing. You might also hear nodes called “entities” by lesser mortals.

NodeLinks are the basic relationships between two nodes. They don’t convey very much other than that two nodes might be linked in some circumstances. There is also a hierarchy – one is a parent and one is a child. So you might have a bolt and a nut that have a nodeLink between them where the bolt is the parent and the nut is the child – and they both may also have a nodeLink to a “crazed robot mechanical arm” assembly.

The amazing trick with nodeLinks is that they are very fast to navigate to uncover relationships between stuff. In fact, they are not just amazing – they are amazingly amazing. At connecting stuff that is. I’m not so sure they would be much good to defend yourself against a bear or to borrow a fiver for your lunch from.

NodeLinkInstances define the conditions where a particular nodeLink applies. This might be “always” or if a particular attribute is true or if the root parent node in an assembly is a particular flavour. So – you might only include a link between “dashboard” and “Sennboseheiser Omni Megga Thump 2000 Sub-woofter” if the parent root node at the top of all the parents is “Ridiculously stoopid jacked up neon green drugdealermobile”. But it might also only apply if the build date was between 1986 and 1992. NodeLinkInstances transform buckets of loosely related nodes into particular things that have meaning.
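For anyone who prefers code to waffle, here is a minimal sketch of how the three elements might hang together. To be clear, this is not the DataAdapter code (which is no longer mine to show). Every class, field and method name below is made up for illustration, the part names are shortened, and treating "between 1986 and 1992" as inclusive is my assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch only: names are hypothetical, not the original PLMStructuredData API.
final class StructuredDataSketch {

    /** A node is "something": a part, an assembly, a picture of a part in a tree. */
    static final class Node {
        final String name;
        Node(String name) { this.name = name; }
    }

    /** A nodeLink only says that a parent and a child might be connected. */
    static final class NodeLink {
        final Node parent;
        final Node child;
        NodeLink(Node parent, Node child) { this.parent = parent; this.child = child; }
    }

    /** Hypothetical context for one configuration: the root node plus a build year. */
    static final class Context {
        final Node root;
        final int buildYear;
        Context(Node root, int buildYear) { this.root = root; this.buildYear = buildYear; }
    }

    /** A nodeLinkInstance pairs a link with the condition under which it applies. */
    static final class NodeLinkInstance {
        final NodeLink link;
        final Predicate<Context> condition;
        NodeLinkInstance(NodeLink link, Predicate<Context> condition) {
            this.link = link;
            this.condition = condition;
        }
    }

    /** Names of the children linked to a parent in the given context. */
    static List<String> childrenOf(Node parent, List<NodeLinkInstance> instances, Context ctx) {
        List<String> names = new ArrayList<>();
        for (NodeLinkInstance instance : instances) {
            if (instance.link.parent == parent && instance.condition.test(ctx)) {
                names.add(instance.link.child.name);
            }
        }
        return names;
    }

    /** Builds the dashboard example and lists the dashboard's children for one context. */
    static List<String> dashboardChildren(String rootName, int buildYear) {
        Node dashboard = new Node("dashboard");
        Node clock = new Node("clock");
        Node subwoofer = new Node("sub-woofer");
        String sillyCar = "Neon green drugdealermobile";

        List<NodeLinkInstance> instances = new ArrayList<>();
        // The clock is always fitted.
        instances.add(new NodeLinkInstance(new NodeLink(dashboard, clock), ctx -> true));
        // The sub-woofer applies only to one root node and one range of build years
        // (inclusive bounds are an assumption).
        instances.add(new NodeLinkInstance(new NodeLink(dashboard, subwoofer),
                ctx -> ctx.root.name.equals(sillyCar)
                        && ctx.buildYear >= 1986 && ctx.buildYear <= 1992));

        return childrenOf(dashboard, instances, new Context(new Node(rootName), buildYear));
    }

    public static void main(String[] args) {
        System.out.println(dashboardChildren("Neon green drugdealermobile", 1990));
        System.out.println(dashboardChildren("Sensible hatchback", 1990));
    }
}
```

Run as is, the first call lists both the clock and the sub-woofer, and the second lists only the clock. The linear scan over instances is purely for brevity. A real implementation would presumably index links by parent, which is where the fast navigation comes from.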

Armed with these three things, we were able to model just about anything in the universe – including the universe itself. As it was, we used it to model data in PDM systems – mostly of motor cars. At least, we were able to once I had put in place a documented and clearly defined class structure (UML with forward and reverse updating between the diagrams and the method stubs), built all the code with JUnit and DbUnit test cases and written education presentations so that the folks in India could understand what the hell we were on about.

Oh, and commented the code too. Although Dick Stephen might point out that my comments reflected the same tone as this blog entry and so some of the time were very entertaining, if a little confusing.

I also added a set of transformation plugins, a new language for defining how they operated on data (although interestingly it is the only language I know of that doesn’t actually have a name!) and invented what we called flavoured data.

When you want to transform data from one system to another – you are really just changing its flavour – and instead of “transformation” you are “flavouring”. It is remarkably easy to change the flavour of data once you know what you want to change it to.

There are, I know, many variants around in one form or another of what I have described above. I know this because I have seen and read about them – not always quite as obvious – but the same none the less. I also know they are the same so I won’t get sued for publishing this.

I did love the beautiful simplicity of our approach – and the innocent arrogance we had in delivering in a few months what many vendors have spent years developing.

But with the passage of time a few thoughts have been forming in my head. Yes, it doesn’t happen very often – and even when it does, it usually involves sheep or rockets or sheep powered rockets. But thoughts have formed none the less.

First of all, a confession. We were all out and out Java zealots at the time. Well, the ones who were important and knew what was what were. I’d have specified Java if I was being asked to paint a house or light a fire. Java was awesome for doing PLMStructuredData – and once we had twigged the memory stuff and dropped a lot of the unwanted overheads like multi-threading when you don’t want or need multi-threading, we were able to make it really cook. But I have at last come to the realisation that maybe, just maybe, there are circumstances when Java might not be the bee’s knees. Not many circumstances mind – I’m painting my burning house in Java next week.

You see, one of the problems we had was that we ran all this in memory. Which is fine for smaller, lighter structures – but once the data gets hefty we had to resort to techniques such as the BOM skeleton – where you hold the structural information in memory and then make calls out to a relational database for the heftier bits of attribute data only when you need it.
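The BOM skeleton idea is easier to show than to describe, so here is a rough sketch of it. Again, this is not the original code. The class names are invented, and the loader function is just a stand-in for whatever call would really go out to the relational database (here it reads from a map pretending to be a table).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: names are hypothetical and the "database" is a HashMap.
final class BomSkeletonSketch {

    /** Skeleton node: only the part number and child references live in memory. */
    static final class SkeletonNode {
        final String partNumber;
        final List<SkeletonNode> children = new ArrayList<>();
        SkeletonNode(String partNumber) { this.partNumber = partNumber; }
    }

    /** Fetches the hefty attributes on first request and remembers them afterwards. */
    static final class AttributeStore {
        private final Function<String, Map<String, String>> loader; // stand-in for a database query
        private final Map<String, Map<String, String>> cache = new HashMap<>();
        int loadCount = 0; // how many times we actually went to the "database"

        AttributeStore(Function<String, Map<String, String>> loader) { this.loader = loader; }

        Map<String, String> attributesOf(SkeletonNode node) {
            if (!cache.containsKey(node.partNumber)) {
                loadCount++;
                cache.put(node.partNumber, loader.apply(node.partNumber));
            }
            return cache.get(node.partNumber);
        }
    }

    /** Walks the skeleton depth first without touching any attribute data. */
    static int countNodes(SkeletonNode root) {
        int total = 1;
        for (SkeletonNode child : root.children) {
            total += countNodes(child);
        }
        return total;
    }

    /** Builds a three-part structure, walks it, then looks up one part's attributes twice. */
    static String demo() {
        Map<String, Map<String, String>> fakeTable = new HashMap<>();
        Map<String, String> boltRow = new HashMap<>();
        boltRow.put("material", "steel");
        fakeTable.put("BOLT-1", boltRow);

        SkeletonNode arm = new SkeletonNode("ARM-1");
        SkeletonNode bolt = new SkeletonNode("BOLT-1");
        SkeletonNode nut = new SkeletonNode("NUT-1");
        arm.children.add(bolt);
        bolt.children.add(nut);

        AttributeStore store = new AttributeStore(fakeTable::get);
        int nodes = countNodes(arm);           // structure only, no loads
        int loadsAfterWalk = store.loadCount;
        store.attributesOf(bolt);              // first lookup hits the loader
        String material = store.attributesOf(bolt).get("material"); // second is cached
        return "nodes=" + nodes + " loadsAfterWalk=" + loadsAfterWalk
                + " loadsAfterTwoLookups=" + store.loadCount + " material=" + material;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the split is that walking the structure costs nothing in database calls, and attribute data is only pulled in for the parts you actually ask about. The cache here never evicts anything, which is fine for a sketch but is exactly the sort of thing that bites once the data gets hefty.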

But in working recently with loads and loads of less complex data (vastly less complex – here I am, brain the size of a planet…) another thought has occurred to me – why not just hold the whole of PLMStructuredData in a relational database in the first place and use database caching and indexing to achieve what we did with our in-memory techniques?

I really hope that Mike Forest isn’t reading this as this is something along the lines that he maintained we should have been doing all along (having not been quite as beguiled by Java as normal people were).

Some may remember the tree walk race that Mike and I had – and if not, may remember the constant bitching from me with a variety of excuses I dreamed up for having lost the race. But only by a few seconds. And in any real world scenario I would have won. Or if applied across multiple vehicle variants he would have been toast. And anyway – stored procedures are so 90s and not platform or database independent. So there.

One of the standard out of the box input and output plugins for DataAdapter was a means to store and retrieve data from a relational database. But this really was just to dump the data rather than to operate on it. It would be much more interesting to enhance these and the transformation plugins for operations in the database itself.

So I have been tinkering with a few Java classes (alas – without DataAdapter as it is no longer mine). It’s quite good fun to crack open Eclipse again and give it a whirl. To be honest, I haven’t the foggiest if what I come up with will be useful – but it will be fun and if it doesn’t come off – it just gives me another reason why I was right all along.