EventModel, Callbacks, Promises, Generators, Await: Opinion about Async and Threading

Why are things Async?

This is the first thing we need to ask ourselves before continuing and there is a good reason. What we need to understand about aynschronous programming is waiting. Waiting is the worst enemy of speed. And speed is important for User Experience and the ‘ol time=money equation. In the world where nothing is asynchronous. We’ll use dummy milliseconds to figure out total time (I should really just create a performance test for this, but I’m just doing this on the fly).

  • Our Server Waits for a Connection – This is what will start the following and what we want our server to do 100% of the time (nearly impossible)
  • On Connection
    • 2 ms – Our Server processes the Domain Name/Path
    • 3 ms – Our Server digests the Query (If applicable)
    • 10 ms – Our Server digests Post Data (If applicable)
    • 10 ms – Our Server Validates Query/Post Data
    • 20 ms to 50 ms – Our Server makes a database call – It is actually unknown how long it will take, but may be one of the slower aspects of our application
    • 30 ms – Our server Turns the data into servable HTML
    • 10 ms – Our server serves the HTML
  • We start waiting for the next connection

So an application may take between each connection We are using 65 for absolutely necessary aspects and 20 to 50 for wasted time. This also means we likely cannot handle 100 connections a second (highly unlikely). But that being said, the 20 to 50 is what Asynchronous is really about. I’m most likely over estimating servers because I love them so much, but thats Lets make a clientside example. Our Client does a few ajax requests to our server for an awesome app

  • 30 ms – The Page is rendered
  • 100 ms – We render the initial Map on a canvas
  • 100 ms – We make a call to our server getting favorite locations
  • 100 ms – We make an api call to a map application to get all possible locations
  • 50 ms – We Position the map to our current location
  • 20 ms – We Render favorite locations on the map
  • Wait for Click
    • 100 ms – We make a call to our server load that specific locations information
    • 300 ms – We animate the item click to display a popup
    • 20 ms – We render the location in the popup
  • Wait for Click

This is where asynchronous becomes all the more important. In the server its more about fear, scalability and just sexy programming. In the Clientside, every time we block the user loses control. Every time the user loses control, the experience degrades immensly. In this example we have wasted about 400 ms on startup or half a second and about 420 ms (half a second again) every time they click.  These wait times are absolutely absurd from a experience example. What asynchronous programming allows us to do allow events to tell us when something should happen next. In its most basic form

  • Event Loop – while(true){ scripts.forEach(script -> script.execute() );
  • Our Server Waits for a Connection – This is what will start the following and what we want our server to do 100% of the time (nearly impossible)
  • On Connection
    • 2 ms – Our Server processes the Domain Name/Path
    • 3 ms – Our Server digests the Query (If applicable)
    • 10 ms – Our Server digests Post Data (If applicable)
    • 10 ms – Our Server Validates Query/Post Data
    • 10 ms – make database call
      • on return (10 to 40 ms)
        • 30 ms – Our server Turns the data into servable HTML
        • 10 ms – Our server serves the HTML
  • We start waiting for the next connection

We now have split up a connection, 35 ms to process the query then 40 ms to send it back. The Waiting is not even considered stopped at this point.

  • 30 ms – The Page is rendered
  • Wait for all both
    • 10 ms – We make a call to our server getting favorite locations
    • 10 ms – We make an api call to a map application to get all possible locations
    • On Return (100 ms)
      • 50 ms – We Position the map to our current location
      • 20 ms – We Render favorite locations on the map
  • 100 ms – We render the initial Map on a canvas
  • Wait for Click
    • 10 ms – We make a call to our server load that specific locations information
      • On Return (100 ms)
        • 20 ms – We render the location in the popup
    • 10 ms – Start animation
      • On Finish (300 ms)
        • Thats it
  • Wait for Click

Here We get far larger speed increases, startup is now only 150 ms on initial and 70 ms when it comes back  and only 20 are used up between clicks and 20 when the ajax call comes back. The animation is essentially there just to mask the ajax call anyway. These are big differences

85 compared with 35 + 40

400 and 420 compares with 150 + 70 and 20 + 20

The breakups are really important as well since everytime Its broken up, it allows the application to do other tasks

So Async is Perfect, no problemo

Not exactly… Similar to Functional Programming (which is likely going to be a different topic), this will make things fast (and in functional programming arguably more reliable and predictable). However, as it stands the way you have to design your application becomes a little bit stranger. Right off the bat its important we talk about threading.

Threading – The Building block of Async

In the Async programming model generally what happens is there is a seperate worker thread that recieves work, does it, then sends the result back (basically a function). So if we are to do this raw dog, this is what we are looking at

var worker = new Worker("./Path/to/a/script");

worker.onMessage = doTheRest;

worker.sendMessage(input)

This is a basic Model for Workers.  We will create a thread, listen for when it returns us data and provide it an input to do. However, multiple scripts/modules/whatever you want to call them will likely be using this one worker. Something like an Ajax call is common for a ton of applications to use and we don’t know who will get what.

setTimeout(function(){
  worker.onMessage = doTheRestOne;
  worker.sendMessage(inputOne);
}, Math.random()*1000)

setTimeout(function(){
  worker.onMessage = doTheRestTwo;
  worker.sendMessage(inputTwo);
}, Math.random()*1000);

Which happens first? Will doTheRest be registered before inputOne is finished? As a result we need to consider how to keep it relatively resusable.

Object Oriented Events – The XMLHTTPRequest standard

First I will create a rather unoptimized class

function OurWorkerClass(){
  this.worker = new Worker("path/to/a/script");
  this.worker.onMessage = function(packet){
    if(packet.error){
      return this.errorFn(packet.error); 
    }
    this.finishFn(packet.output);
    this.worker.destroy();
  }.bind(this);
}

OurWorkerClass.prototype.onFinish = function(fn){
  this.finishFn = fn;
}

OurWorkerClass.prototype.onError = function(fn){
  this.errorFn = fn;
}

OurWorkerClass.prototype.doWork = function(input){
  this.worker.sendMessage(input);
};

This is unoptimized since we are creating a worker and a closure for every instance. But this is to show how this thing works

var worker = new OurWorkerClass();
worker.onError(handleError);
worker.onFinish(function(output1){
  var outherWorker = new AnotherClass();
  otherWorker.onError(handleError);
  otherWorker.onFinish(function(output2){
    var thirdWorker = new thirdClass();
    thirdWorker.onError(handleError);
    thirdWorker.onFinish(finished);
    thirdWorker.doWork(output2);
  });
  worker.doWork(output1);
});
worker.doWork(input);

I would say the frame work muddles the code. Much more initialization then logic step by step progress. Ugly stuff.

Enter Callbacks

var worker = new Worker("./Path/to/a/script");

var pendingWork = {};

function doWork(input, callback){
  var id = Date.now() + Math.random().toString();
  pendingWork[id] = callback;
  worker.sendMessage({id: id, input: input});
}

worker.onMessage = function(packet){
  var id = packet.id;
  var error = packet.error;
  var output = packet.output
  pendingWork[id](error, output);
  delete pendingWork[id];
}

This is a basic Callback Model for Workers.  To use it we would call the doWork function with an input and a callback and it will correctly notify us which work did what. However, when in practice, this is what it turns into.

doWork(input1, function(err1, ouptut1){
  if(err1) return finished(err1)
  doOtherWork(output1, function(err2, output2){
    if(err2) return finished(err2);
    thirdWork(output2, function(err3, output3){
      if(err3) return finished(err3);
      fourth(output3, function(err4, output4){
        if(err4) return finished(err4);
        finished(void 0, output4);
      });
    });
  })
});

Theres the argument that Ryan Dall spoke about in terms of creating multiple functions to avoid it. This actually isn’t a bad idea in general as every function you create in another function would literally be created instead of being referenced from before. This is what it looks like though

doWork(input, callback1.bind(void 0, finished));

function callback1(finished, err, output){
  if(err) return finished(err);
  doOtherWork(output, callback2.bind(void 0, finished));
}

function callback2(finshed, err, output){
  if(err) return finished(err);
  thirdWork(output, callback3.bind(void 0, finished));
}

function callback3(finished, err, output){
  if(err) return finished(err);
  fourthWork(output, finished);
}

And this only exists because javascript exists as a two sweep scripting language and hoists functions to the top. In my humble opinion, this is fugly.

Promises – One of the many gifts jQuery popularized

Promises are one of the greatest things that has ever happened, I assure you. But they aren’t too freindly from a speed/memory perspective according to many node contributers.

var availableWorkers = [];
function getWorker(){
  if(availableWorkers.length){
    return availableWorkers.shift();
  }
  return new Worker("path/to/our/script");
}

function finishedWorker(worker){
  availableWorkers.push(worker);
}

function doWork(input){
  var worker = getWorker();
  return new Promise(function(res, rej){
    worker.onMessage = function(packet){
      finshedWorker(worker);
      if(packet.error) return rej(packet.error);
      res(packet.output);
    };
    worker.sendMessage(input);
  });
}

perhaps I’m muddling too much worker code with these examples. It just is a lot of fun. Regardless, this is what it turns into

doWork
  .then(doOtherWork)
  .then(thirdWork)
  .then(fouth)
  .catch(handleError);

Sexy, clean, beautiful. Really is georgous in my humble opinon. However, things aren’t always so clean. See Example B

doWork(input).then(function(output1){
  var p = doOtherWork(output1);
  p.catch(handleSpecialError);
  return p.then(function(output2){
    return doOneWith2Arguments(output1, output2);
  });
}).then(doThird.bind(void 0, input))
.then(function(output3){
  return doFourth(output3, input);
}).then(function(output4){
  return Promise.all([
    doFifthA(output4),
    doFifthB(input)
  ]).then(function(outputs){
    return finishFifth(outputs[0], outputs[1]);
  });
}).catch(handleError);

Once we start customizing our catches and arguments, things start getting weird. It can start to become quite difficult to figure out what the hell is going on. On line three, that catch will exit the program. doThird recieves the output of doOneWith2Arguments and also takes in input1 as its first parameter. For doFourth we need to pass in input as the second argument. The fifth is attempting to do two works side by side. So what are we supposed to do?

Generators – Not made for Async, but looks like it

Going Async With ES6 Generators

This is what above looks like

runner(function main*(){
  try{
    var output1 = yield doWork(input);
    var output2;
    try{
      output2 = yield doOtherWork(output1);
    }catch(e){
      specialErrorHandle(e);
      return;
    }
    var output3 = yield doneWith2Arguments(
      output1,
      output2
    );
    var output4 = yield doThird(input, output3);
    var output5 = yield doFourth(output4, input);
    var outputs = yield Promise.all([
      doFifthA(output5);
      doFifthB(input);
    ]);
    var finaloutput = finishFifth(
      outputs[0],
      outputs[1]
    );
    finished(finaloutput);
  }catch(e){
    handleError(e);
  }
})

This is almost the holy grail. What we’ve been waiting for. Something that looks like what is should be. Its a crazy thought right? A program being sequential and effective? Wild stuff really. Unfortunately, these still need to be wrapped in some function or be used via promises or callbacks.

Await – The True Holy Grail

https://jakearchibald.com/2014/es7-async-functions/

Basically above only await can be used anywhere and likely on anything that returns a promise. It will be glorious.

So, this post is over right?

Well, yes and no. Lets go back for a second. So Async allows us to handle work in a seperate thread and continue execution without blocking the event loop. This is the important thing though. Blocking. If it weren’t for the blocking of the main thread, there would be no problem. But as GPU processing becomes easier to use and CPUs go from 64 bit single threaded to 64 bit 4 core we start seeing the opportunity to maximize what we have.

What if events spawned a new thread?

Lets look at our server example

  • Wait for Connection
    • Create a new thread (or retreive one from the pool)
    • Provide the thread the connection
  • Wait for Conection

Problem here is that Database and http calls would require witing in threads probably causing 20 to 30 threads running at one time slowing down everything

Lets look at a client Example

  • On Click
    • CSS Animations (GPU Bound)
    • Ajax Call -> dom manipulations
      • Next Animation Frame Write dom to GPU

CSS animations is fine, dom manipulations are global. This means that there would need to be a global thread that is mutable by all others.

So the issues would be

  • Mutability of Shared Resources (Dom specifically)
  • There may be a situation where there are more threads than necessary running at once causing everything to slow down.

Lazy Everything: Dirty checking, Caching results, single iteration

One thing that is somewhat popular now adays is Lazy Evaluation.  Lodash implemented a form after a competitor (Lazy.js) where showing big promise for takeover. However, javascript isn’t the source of all lazy evaluation. Haskell, Scala and other Functional Languages have been taking advantage of it in full effect for a while. Things Such as the streaming api in node and lazy getters will likely cause a vast speed increase in your application.

Functional Programming Model

This is where I can go down another path. But heres the basics of it. Every time you set an output of a function to another you create a memory pointer or clone the variable. Additionally Whenever you provide it as an argument, the same thing occurs. Hitting memory 1 time instead of 2 times may increase your speed greatly. Referencing a property directly instead of through a pointer may also greatly increase the speed of your application. But beware, this may be what your application looks like

lastFunction(
  firstFunctionCalled(),
  FifthfunctionCalled(
    ThirdFunctionCalled(
     SecondFunctionCalled()
    ),
    FourthFunctionCalled()
  )
)

This may start becoming intuitive but for me I usually think in step by step instead of what to do last. That being said, generally your application will not have mutations to the global scope or mutate the arguments so this form of programming should be fine

Closing thoughts

Async solves the waiting problem which is a very important problem indeed.