ORM’s Dreaming Big: Pt 3 (Big Pappa ORM)

Here are Part 1 and Part 2, if you are interested.

Two ORMs to date have been very interesting to me.

  • RethinkDB – Pushes events to any listeners. This inherently supports clustering: if a socket is trying to stay synchronized, it needs to know about changes, and luckily the changes are pushed to all threads.
  • Waterline – What I really like about Waterline is that instead of being a storage system, it is just the interface to one. This lets you use specific stores for what they do best without having to write in different query languages.

Databases have gone in and out of style quickly over the past few years: MongoDB, CouchDB, Postgres, Hadoop, MySQL. All of them are competing for market share as the “database of choice”. That being said, all of them have distinct advantages and disadvantages. Anything SQL also gives you the ability to run a PHP stack without much trouble. Anything JSON allows you to store documents in the format you’re coding in. Additionally, Redis has shown that moving global state such as sessions into a single shared store is very important for clusters. As a result, the number of storage systems increases and, unfortunately, so does the number of interfaces.

Queries, Adapters, The Bazaar and the Cathedral

If you haven’t read the essay “The Cathedral and the Bazaar”, I think you should. The gist of it is that there is a clear difference between software assembled from independently written parts and software created in one huge gulp. I’m in the parts boat because, as time goes on, there may be new databases. There may be new fancy things. You don’t want your API to change with each database you move to. In a bazaar, you have your fruit farmers, butchers, jewelers, etc. Each is specialized, with a solid respect for every other person and what they do. In a cathedral, you are given your daily bread, and wine on special occasions. This is simple and it works; however, some cathedrals’ bread is better than others’, and sometimes you get olive oil with yours. I believe that allowing as many databases as possible to interact with your ORM is superior to forcing the API to adhere to one database’s design. That way, you can make the query language the best possible without breaking compatibility with any database. As a result, there are a few simple laws I believe the ORM should adhere to:

  • One query syntax for all the databases it interfaces with
  • “Adapters” translate the query into what the database actually uses and send it
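
To make those two rules concrete, here is a minimal sketch. The query shape (`from` plus `where`) and both adapter objects are made up for illustration; they are not Waterline's API or any real library's.

```javascript
// Hypothetical query shape: the one syntax the ORM exposes to model code.
const query = { from: "users", where: { age: 30, name: "Ada" } };

// Hypothetical SQL adapter: turns the query into a parameterized statement.
// A real adapter would also have to validate or quote table and column names.
const sqlAdapter = {
  translate(q) {
    const keys = Object.keys(q.where);
    const clause = keys.map((k) => k + " = ?").join(" AND ");
    return {
      text: "SELECT * FROM " + q.from + (clause ? " WHERE " + clause : ""),
      values: keys.map((k) => q.where[k]),
    };
  },
};

// Hypothetical document-store adapter: the same query becomes a JSON filter.
const documentAdapter = {
  translate(q) {
    return { collection: q.from, filter: Object.assign({}, q.where) };
  },
};

const asSql = sqlAdapter.translate(query);
const asDocument = documentAdapter.translate(query);
```

The model code only ever builds `query`. Which of the two translations gets sent is decided by the adapter behind the model, so swapping databases should not touch the calling code.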

As a result, there is a dependency: Adapters -> Models. However, adapters are more abstract; one adapter should generally be reused for multiple databases of the same type. That adds a further level to the dependency chain:

Adapters -> Connection to a Database -> Model/Table
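
As a rough sketch of that chain, assuming a hypothetical in-memory registry (the names `adapters`, `connections`, `models`, `addConnection` and `addModel` here are all illustrative), one adapter can back several connections and each model points at exactly one connection:

```javascript
// Hypothetical registries for the three layers of the chain.
const adapters = { sql: { name: "sql" } };
const connections = {};
const models = {};

// A connection is an adapter plus the settings for one concrete database.
function addConnection(name, adapterName, settings) {
  const adapter = adapters[adapterName];
  if (!adapter) throw new Error("Unknown adapter: " + adapterName);
  connections[name] = { name: name, adapter: adapter, settings: settings };
}

// A model (table) hangs off exactly one connection.
function addModel(name, connectionName) {
  const connection = connections[connectionName];
  if (!connection) throw new Error("Unknown connection: " + connectionName);
  models[name] = { name: name, connection: connection };
}

// Two databases of the same type share the single "sql" adapter.
addConnection("main", "sql", { host: "localhost" });
addConnection("reporting", "sql", { host: "reports.internal" });
addModel("User", "main");
addModel("Invoice", "reporting");
```

This strict version simply throws when a dependency is missing, which keeps the direction of the arrows easy to see.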

Plugging In, Plugging Out and Letting Everything Just Figure Itself Out

Content management systems now let you design databases in the browser. WordPress has custom post types. Drupal has schemas. Treeline is making “machines”. However, the most important concept here is that when you edit a database model, the whole server shouldn’t have to shut down along the way. PHP has the advantage of not staying in memory between requests; as a result, each call to the server is essentially a “fresh” instance. Node.js doesn’t have that kindness. As a result, making sure your ORM is designed so that dependency changes don’t require destroying and recreating everything is of the utmost importance. Something as simple as

orm.addModel("ModelName", modelConfig)
orm.removeModel("ModelName");

can really make or break what an ORM is capable of. A simplistic algorithm would be something like this:

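var util = require("util");
var EventEmitter = require("events").EventEmitter;

// ORM is assumed to be a constructor defined elsewhere that creates
// this.model and this.connection as empty objects.
util.inherits(ORM,EventEmitter);

ORM.prototype.addModel = function(name, config){
  // Stored under "model" so that orm[key][name] in the helpers below finds it.
  this.model[name] = new Model(config);
  // getConnectionDeps and getModelDeps are assumed to return arrays of names.
  var connDeps = getConnectionDeps(config);
  var modelDeps = getModelDeps(config);
  // A model needs all of its model dependencies, but any one connection will do.
  allListeners(name,modelDeps,this,"model");
  anyListeners(name,connDeps,this,"connection");
};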



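// Emits "add-<key>[<name>]" once every dependency in deps is present and
// "rem-<key>[<name>]" when a dependency is removed while all were present.
function allListeners(name,deps,orm,key){
  // depnum is minus the number of missing dependencies; 0 means all present.
  var depnum = 0;
  var addListener = function(){
    depnum++;
    if(depnum === 0){
      orm.emit(
        "add-"+key+"["+name+"]",
        orm[key][name]
      );
    }
  };
  var remListener = function(){
    if(depnum === 0){
      orm.emit(
        "rem-"+key+"["+name+"]",
        orm[key][name]
      );
    }
    depnum--;
  };

  deps.forEach(function(depname){
    if(!orm[key][depname]){
      depnum--;
    }
    // Listen in both directions so a dependency can come and go repeatedly.
    orm.on("add-"+key+"["+depname+"]", addListener);
    orm.on("rem-"+key+"["+depname+"]", remListener);
  });
  // Stop tracking the dependencies once this entry itself is destroyed.
  orm.on("destroy-"+key+"["+name+"]",function(){
    deps.forEach(function(depname){
      orm.off("add-"+key+"["+depname+"]", addListener);
      orm.off("rem-"+key+"["+depname+"]", remListener);
    });
  });
}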

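// Emits "add-<key>[<name>]" when the first dependency in deps becomes
// available and "rem-<key>[<name>]" when the last one goes away.
function anyListeners(name,deps,orm,key){
  // depnum is the number of dependencies currently present.
  var depnum = 0;
  var addListener = function(){
    if(depnum === 0){
      orm.emit(
        "add-"+key+"["+name+"]",
        orm[key][name]
      );
    }
    depnum++;
  };
  var remListener = function(){
    depnum--;
    if(depnum === 0){
      orm.emit(
        "rem-"+key+"["+name+"]",
        orm[key][name]
      );
    }
  };

  deps.forEach(function(depname){
    if(orm[key][depname]){
      depnum++;
    }
    // Listen in both directions so a dependency can come and go repeatedly.
    orm.on("add-"+key+"["+depname+"]", addListener);
    orm.on("rem-"+key+"["+depname+"]", remListener);
  });
  // Stop tracking the dependencies once this entry itself is destroyed.
  orm.on("destroy-"+key+"["+name+"]",function(){
    deps.forEach(function(depname){
      orm.off("add-"+key+"["+depname+"]", addListener);
      orm.off("rem-"+key+"["+depname+"]", remListener);
    });
  });
}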
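
To watch the toggling happen, here is a compact, self-contained variant of the “all dependencies” idea. It is only a sketch: the `whenAll` helper and the bare `add`/`rem` event names are my own simplifications, and it assumes the names in `deps` are unique.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper: calls onUp once every name in deps has been added,
// and onDown when a dependency is removed while all of them were present.
function whenAll(bus, deps, onUp, onDown) {
  const present = new Set();
  bus.on("add", function (dep) {
    if (deps.indexOf(dep) === -1 || present.has(dep)) return;
    present.add(dep);
    if (present.size === deps.length) onUp();
  });
  bus.on("rem", function (dep) {
    if (!present.has(dep)) return;
    const wasComplete = present.size === deps.length;
    present.delete(dep);
    if (wasComplete) onDown();
  });
}

const bus = new EventEmitter();
const log = [];
whenAll(
  bus,
  ["User", "Account"],
  function () { log.push("up"); },
  function () { log.push("down"); }
);

bus.emit("add", "User");    // still waiting on Account
bus.emit("add", "Account"); // everything present, onUp runs
bus.emit("rem", "User");    // one dependency gone, onDown runs
bus.emit("add", "User");    // complete again, onUp runs
```

Nothing is destroyed or recreated along the way; the dependent just flips between usable and not usable as its dependencies come and go.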

With event emitters, you can toggle back and forth with minimal issues. There are other parts that are important, such as:

  • Binding a model to the ORM instead of making the call itself
  • Being able to queue requests from a model
  • Throwing errors when things fail
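
Here is one way the binding, queueing and error points could fit together. This is a sketch under assumptions: `QueuedModel`, its `bind` method and the `find(model, criteria, callback)` connection interface are all hypothetical names.

```javascript
// Hypothetical model wrapper that queues calls until a connection is bound.
function QueuedModel(name) {
  this.name = name;
  this.connection = null;
  this.queue = [];
}

// Run the request now if a connection is bound, otherwise keep it for later.
QueuedModel.prototype.find = function (criteria, callback) {
  if (!this.connection) {
    this.queue.push([criteria, callback]);
    return;
  }
  this.connection.find(this.name, criteria, callback);
};

// Bind a connection and flush everything that was waiting.
// Fails loudly if the connection cannot actually serve requests.
QueuedModel.prototype.bind = function (connection) {
  if (!connection || typeof connection.find !== "function") {
    throw new TypeError("Connection for " + this.name + " must implement find()");
  }
  this.connection = connection;
  const pending = this.queue;
  this.queue = [];
  pending.forEach(function (args) {
    this.find(args[0], args[1]);
  }, this);
};

const user = new QueuedModel("User");
const results = [];
user.find({ id: 1 }, function (err, row) { results.push(row); });
// Nothing has run yet; the request is waiting in user.queue.
user.bind({
  find: function (model, criteria, callback) {
    callback(null, { model: model, id: criteria.id });
  },
});
```

Because the model is bound to the ORM rather than talking to the database itself, callers can start issuing requests before the connection exists, and a bad connection is rejected at bind time instead of at the first query.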

Cluster Support

Cluster support is one of the most important parts of any modern JavaScript module nowadays. If it can’t be run on a cluster, it’s not fit for production. If it’s not fit for production, it’s going to end up being just for somebody’s side project. Conceptually, you can add cluster support by relaying events. This simple example is all we really need to ensure we are sending events properly. First off, we must figure out which events need to be sent globally. For our case, we’ll use a model’s delete event:

ORM.prototype.addModel = function(){
  var model = this.figureOutDeps(arguments);
  var self = this;
  model.on("delete", function(instances){
    // In a worker, tell the master so it can relay the event to the others.
    if(self.isChild){
      process.send({
        type:"orm-rebroadcast",
        event:"model["+model.name+"]-delete",
        data:instances
      });
    }
    // Always emit locally as well.
    self.emit(
      "model["+model.name+"]-delete",
      instances
    );
  });
};

As you can see, when a delete happens locally, we tell the master what happened. From there, the master tells every other worker:

ORM.prototype.asMaster = function(workers){
  var self = this;
  workers.forEach(function(worker){
    worker.on("message", function(msg){
      if(msg.type === "orm-rebroadcast"){
        self.broadcast(msg,worker);
      }
    });
  });
  this.workers = workers;
}

ORM.prototype.broadcast = function(msg,not){
  this.workers.forEach(function(worker){
    if(worker === not) return;
    worker.send(msg);
  });
}

From there we can implement the worker’s listeners

ORM.prototype.asWorker = function(){
  var self = this;
  process.on("message", function(msg){
    if(msg.type === "orm-rebroadcast"){
      self.emit(
        msg.event,
        msg.data
      );
    }
  })
}

There are things that could be done a little more nicely. For example, workers could tell the master which events they do and do not want to listen for. Additionally, we can reimplement this with Redis or any other API, because it really isn’t that complicated.
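
A minimal sketch of the subscription idea, assuming two extra message types, `orm-subscribe` and `orm-unsubscribe`, that I am inventing here alongside the existing `orm-rebroadcast`. The `Relay` object and the stub workers are also hypothetical; a stub stands in for anything with a `send(msg)` method, such as the workers returned by `cluster.fork()`, so the example runs in a single process.

```javascript
// Hypothetical master-side relay that remembers what each worker asked for.
function Relay() {
  this.subscriptions = new Map(); // worker -> Set of event names
}

// Handle one message received from a worker.
Relay.prototype.handle = function (worker, msg) {
  if (msg.type === "orm-subscribe") {
    if (!this.subscriptions.has(worker)) {
      this.subscriptions.set(worker, new Set());
    }
    this.subscriptions.get(worker).add(msg.event);
  } else if (msg.type === "orm-unsubscribe") {
    const events = this.subscriptions.get(worker);
    if (events) events.delete(msg.event);
  } else if (msg.type === "orm-rebroadcast") {
    // Relay only to other workers that subscribed to this event.
    this.subscriptions.forEach(function (events, other) {
      if (other !== worker && events.has(msg.event)) other.send(msg);
    });
  }
};

// Stub worker that records what it was sent.
function stubWorker() {
  const worker = { inbox: [] };
  worker.send = function (msg) { worker.inbox.push(msg); };
  return worker;
}

const relay = new Relay();
const a = stubWorker();
const b = stubWorker();
const c = stubWorker();
relay.handle(b, { type: "orm-subscribe", event: "model[User]-delete" });
relay.handle(a, { type: "orm-rebroadcast", event: "model[User]-delete", data: [1, 2] });
```

Only `b` receives the delete here, since `c` never subscribed and `a` is the sender. The same bookkeeping would apply if the transport were Redis pub/sub instead of process messages.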