Decentralization : Internet, Identity and Peers

Decentralization is a hard problem to solve. Here I’m going to talk about a few forms of decentralization: when you are forced to trust, when trustless systems start to become burdensome, and how finding a deliverable involves a different form of trust than the deliverable itself. I’m going to be as factual as I can be since I am not an expert on the subject, though it’s something I’ve thought a lot about. Consider this more of an intro to decentralization.

Decentralization of the Internet

In order to understand the decentralization of the internet we need to assume a few things:

  1. Electricity is free (whether personally generated or taken from a source)
  2. Every person has a wifi node (or every ‘zone’ does)
  3. Each wifi extends just enough to contact other nodes

And there you have it. Now, this isn’t a ‘decentralized internet’ just yet; we also need domain names. This is done by a trustless ledger allowing the reservation of such names, probably associated with an economy of some sort. However, there exists a new problem: how do you get an IP address?

Before, IP addresses were designed with location in mind, each part of the IP homing in on a part of the world. However, now that we have become decentralized, how do we organize IP addresses?

  1. 8 Parts of the Earth – Right off the bat, splitting the earth into 8 triangles makes things very very simple
  2. Triangulation – Each of the parts can then be split into smaller and smaller triangles. Ideally 4 equilateral triangles until we find our local node
  3. Local List – When the purchase has been made, the node knows which connection / wallet made the purchase. In knowing this, it can very simply deliver the information to the correct computer.

But this implies we have all money and all technology available to us, which we don’t. So instead we would likely have to use a much more disorganized method (a rough sketch in code follows this list).

  1. Tell all connected wifi nodes about my request
  2. Recursively – nodes tell all their peers (connected wifis)
  3. Event – Correct Peer is found
  4. A record of the pathway from current node to originator is noted.
  5. A response is then sent back
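
A minimal sketch of that flooding search, assuming in-memory node objects with id and peers fields (hypothetical names, just to show the recursion and the path recording):

// Flood a request out to peers, recording the path back to the originator.
// `visited` prevents loops; the returned array is the route a response follows back.
function findPeer(node, targetId, visited = new Set()) {
  if (visited.has(node.id)) return null;
  visited.add(node.id);

  if (node.id === targetId) return [node.id]; // event: correct peer found

  for (const peer of node.peers) {
    const path = findPeer(peer, targetId, visited);
    if (path) return [node.id, ...path];      // record the pathway back to the originator
  }
  return null; // not reachable from here
}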

But how do we know who the correct peer is and who isn’t? By issuing a public key, a situation can be made where only the one with the private key knows how to decrypt it. However, there needs to be an additional location based or peer based signature associated with it, otherwise it can be brute forced.
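
A minimal sketch of that idea using Node’s crypto module (RSA here is just an example choice): anyone can encrypt a challenge with the published public key, but only the holder of the private key can read and answer it.

const crypto = require("crypto");

// The peer being looked for publishes the public key; only it holds the private key.
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});

const challenge = crypto.publicEncrypt(publicKey, Buffer.from("are you the owner of this name?"));
const answer = crypto.privateDecrypt(privateKey, challenge).toString(); // only the true peer can do this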

How do you create a node that is peer based?

  • How many of your connected peers are connected to each other?
  • How many peers does each of your connected peers have?
  • What is the number your peers gave you when you entered?
  • How many of your peers have reached maximum registration?
  • How far until no more unique peers are found?
  • What is the biggest loop that can be made?
  • What signature did your peers give you?

If a node goes offline then online

  • When you created a peer, what was your signature?
  • What were the signatures of your creators?
    • What was the signature they gave to you?

This involves a lot of trust, which may or may not be possible in certain situations. Ideally these identities will be unique. They likely will be, so long as everything is counted and direct connections update their identities often. This has been a fun thought experiment, but there is much more to discuss and diagrams to make.

Free Write: Compression Algorithms

So, I still think about compression algorithms from time to time for giggles and I’ve come to more conclusions. The combination of statistics is quite inadequate to be used as a means of compression. For a bit of background, the way it works is it gets statistics on the whole, then splits it up by groups. With the groups it gets statistics on each and consolidates them as much as possible. The problem with this is ordering and redistributing resources amongst the groups. With redistribution, I have attempted to find ways to create strong distinctions between them. Cases such as ‘sum of unique pattern lengths’, ‘number of unique pattern lengths’, ‘number of unique prime patterns’ and ‘total factors in lengths’ allow us to figure out some of the patterns simply. But in some cases it’s not that simple.

Given a unique pattern sum of 21, 3 unique patterns, 3 primes and 3 total factors, find all possible combinations

  • 21 = 11 + 7 + 3
  • 21 = 13 + 5 + 3

As you can see, we have a problem. There are two possibilities for what we want to be only 1. If we specify the largest value, finished right? Nope

  • 42 = 19 + 13 + 7 + 3
  • 42 = 19 + 11 + 7 + 5

Arguably this is fine because we can make this enumerated. However, then we are generating all possible values that meet this requirement. How do you even begin with that?

  • With 4 values, we know that 2, 3, and 5 are the lowest possible. That means the number that is fluid is actually 42 – 10 = 32.
  • 32’s closest prime is 31, and we can only add the remainder (1) to the greatest low prime (5). 1 + 5 = 6
  • However, 6 is not a prime. As a result we need to find the next highest prime (29) and distribute the remaining 3 among the low primes until all are prime
  • This, however, is not possible, so we do it again with 23 (as that is the next available prime), etc. (a brute-force sketch of this enumeration follows)
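
To make the problem concrete, here is a brute-force sketch (a hypothetical helper, not from any library) that lists every way to write a target as a sum of some count of distinct primes:

function primesUpTo(n) {
  // simple sieve of Eratosthenes
  const isPrime = new Array(n + 1).fill(true);
  isPrime[0] = isPrime[1] = false;
  for (let p = 2; p * p <= n; p++) {
    if (isPrime[p]) for (let m = p * p; m <= n; m += p) isPrime[m] = false;
  }
  return isPrime.reduce((acc, ok, i) => (ok ? acc.concat(i) : acc), []);
}

function primePartitions(target, count, primes = primesUpTo(target), start = 0) {
  // every way to write `target` as `count` distinct primes, in increasing order
  if (count === 0) return target === 0 ? [[]] : [];
  const results = [];
  for (let i = start; i < primes.length && primes[i] <= target; i++) {
    for (const rest of primePartitions(target - primes[i], count - 1, primes, i + 1)) {
      results.push([primes[i], ...rest]);
    }
  }
  return results;
}

primePartitions(21, 3); // [[3, 5, 13], [3, 7, 11]] – the two possibilities above
primePartitions(42, 4); // the handful of 4-prime possibilities for 42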

Perhaps there are fewer than 2^32 values, perhaps not. But let’s say there are. Now we know the segment lengths. How are the repetitions distributed? We know there are X total repetitions distributed amongst Y of the possible unique segment lengths that total up to Z of the complete resource. Let’s enumerate that! Why not? We’re getting sloppy already, why not another enumeration? Ok, so we know what percentage of all repetitions belongs to each unique segment length. Now, how many times does each unique segment length happen? How are the repetitions split within each segment length’s domain?

Let’s make sure we know all the questions. Our method is divide and conquer: we will split the data into parts based on how it’s repeating (111000111000, etc.).

  • How many unique segment lengths amongst the patterns?
  • How many patterns are there?
  • How many of the patterns use each of the unique segment lengths?
  • What are the Unique Segment Lengths?
  • How many repetitions are there?
  • How many of the repetitions belong to each unique segment length?
  • How does each unique segment length distribute its repetitions?

Then we have ordering. I had tried to split this up into different types of ordering. Here’s what we know.

  1. Each Pattern Starts with a different bit than the last one ends – that means if the previous ended with 1 the current starts with 0
    1. Example : 111100001010, 00011001100
    2. We can treat all patterns as 1/0, 1/1 || 0/0, 0/1
    3. All similarities can be combined (1/0, 1/1+0/0, 1/0 -> 1/0)
    4. This can be done any number of times
  2. Each Pattern Length is different from the next pattern length – This means that there will be a rise or a fall between each pattern
    1. There are cases when knowing highs and lows may not tell the complete pattern – High, Low, High, Low
    2. If done recursively for each high and low, the only way the complete order cannot be found is when 2 equal lengths are compared
  3. There is a minimum start point for each segment length
    1. Example: 5 segment A, 4 unique – A must surround unique
    2. Example: 5 segment B, 20 unique – B must start before 16
  4. There are unique occasions such as
    1. Patterns next to each other have the same number of bits
    2. Patterns next to each other have the same number of 1s/0s
    3. Patterns next to each other have the same number of repetitions

These properties can filter the overall orders significantly, but there is still a problem: coming up with algorithms to create the possibilities. And in the end, enumeration becomes an issue again.

Why is Enumeration Evil?

The reason why it’s evil is simple. Data is an enumeration of itself. There are 1000 possibilities for a number up to 1000, 10 possibilities for a number up to 10 and 9999 possibilities for a number up to 9999. Really, a file or any piece of data is just a really big number that is in base 256. As a result, any time we enumerate we are possibly creating a whole new file for this specific situation. It may be big, maybe small, we don’t know.

How do we avoid Enumeration?

The way we avoid enumeration is by looking for ways to make patterns or rules within a piece. We see a series of bytes get repeated? We remove the series and add a pointer to the first occurrence. This is at least the gzip method. Another possibility is finding positions in indefinite random sequences. It’s important to note that none of these random sequences should overlap. The issue with this case is that we basically make threads for each random seed until one of them gives us the number we want. The discovery time is terrible.
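
To make the repeat-and-pointer idea concrete, here is a toy sketch (not gzip’s actual format, just the back-reference concept), assuming the output is a mix of literal bytes and { back, length } pointers:

// Toy back-reference encoder: when the next bytes already appeared earlier,
// emit a pointer to the earlier occurrence instead of repeating them.
function encode(bytes, minMatch = 4) {
  const out = [];
  let i = 0;
  while (i < bytes.length) {
    let best = { length: 0, back: 0 };
    for (let j = 0; j < i; j++) {
      let len = 0;
      while (i + len < bytes.length && bytes[j + len] === bytes[i + len]) len++;
      if (len > best.length) best = { length: len, back: i - j };
    }
    if (best.length >= minMatch) {
      out.push({ back: best.back, length: best.length }); // pointer to the first occurrence
      i += best.length;
    } else {
      out.push(bytes[i]); // literal byte
      i++;
    }
  }
  return out;
}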

One idea I had was that the data can be broken up into two equations

  • y = X%2
  • y = (X + 1)%2

From these two equations, we then figure out each start and stop:

  • X Offset
  • On, Off

We can then create 2 strings with the X offsets. However, this will bring us to a list of on/offs. Each on/off takes up 16 bytes for the offset and is only streamable when we specify the length of the on and reach at least halfway. It’s quite possible that this would in fact be larger than the data itself.

Another form of enumeration avoidance is a different divide and conquer approach. Let’s say you take a file and split it up into bytes. You then remove all the zeroes and mark where they are as a separate bit pattern (100111001100, where 1 means there is a zero and 0 means it is from the original list). If there are 100 zeroes in a 200 byte file, the file has effectively been reduced to 100 + Ceil(200/8), or 125 bytes. These are significant gains, and they only increase as more zeroes are found; when there are only 8 or 16 zeroes, though, the gains are minuscule. What if the entire file was split up this way? Well, then we have to mark how long each of the patterns is going to be (probably with a 64 bit integer) and then do it for all 256 values. For a file that is 200 bytes, the result will be roughly 4 + 256 * 200/8, or 4 + 256 * 25. Basically we are effectively increasing the overall size of the document. And if we were to do it only for the most common byte values, then in situations where all the bytes occur equally often, no compression could be done.
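
A small sketch of the zero-stripping idea, assuming Node Buffers; the bitmap carries one bit per original byte, which is where the Ceil(length/8) overhead comes from:

// Strip out the zero bytes and keep a bitmap of where they were:
// bit 1 means "a zero byte lived here", bit 0 means "take the next kept byte".
function stripZeroes(buf) {
  const bitmap = Buffer.alloc(Math.ceil(buf.length / 8));
  const kept = [];
  buf.forEach((byte, i) => {
    if (byte === 0) {
      bitmap[i >> 3] |= 1 << (7 - (i % 8)); // mark the position of the zero
    } else {
      kept.push(byte);
    }
  });
  return { bitmap, kept: Buffer.from(kept) };
}
// 200 bytes with 100 zeroes -> 100 kept bytes + a 25 byte bitmap = 125 bytes.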

Moving Forward

Ultimately, here’s where I stand on compression:

  • Describing data through properties is fine for verification, but trying to compress it with properties, while cool, is ultimately a red herring
  • Trying to force a file to fit within a constraint leads to ‘maybe compresses’ type situations.
  • Trying to match up a file to a random number requires multiple threads to be created.
  • Trying to match up a sequence with something that has happened previously requires big storage

Compression isn’t simple but it sure is a lot of fun!

Failure and Compression

The last few days I was hammering away at some interesting ideas. They were about turning any file into a 16-character hexadecimal string. The idea came about when I saw a post on hacker news about distributed systems and url standards for them. When it comes to distributed systems and sharing of files I’m of this opinion:

  • You want the requestor to be able to get chunks from multiple computers and consolidate them into one – This also means that all the chunks know that they are a part of a whole and are able to point to the whole
  • You want the requestor to know what’s coming to them – This can be metrics about the file like length, checksum, hash, etc.
  • Security is secondary to ensuring the system is working without hiccups. Secure requests should be a matter of encrypted messages with keys that are already available to each computer.

I want to take the second one a step further. I want to make the url so detailed that the computer can actually recreate the file from their tiny little hash! Impossible? Never! Here is the product of my work https://github.com/formula1/compression-failures

Technique 1: File attributes and filtering possibilities

So the first thing I went after was attempting to turn any file into a few 32 bit integers. This included obvious things like file length, to ensure I have a finite number of possibilities, and checksums, to remove extremes and limit my possibilities even more. From there it’s kind of free shooting. I figured my best next step was to separate the file out into what I will call ‘patterns’. The way patterns work is as follows:

  • A pattern has a segment length – 010 has a length of 1, 001100 length of two, 000111 length of three, 1111 4, 00000 5, etc
  • A pattern has a repetition count – 010 has 3 repetitions, 001100 also has 3, 000111 has 2, 1111 has 1 and 00000 has 1
  • A pattern must also either start with 1 or 0
  • A pattern ends when what would be the next repetition ends early or goes on for too long (a small sketch of this splitting follows).
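
As a rough sketch (a hypothetical helper, not taken from the linked repo), splitting a bit string into these patterns might look like this:

// Split a bit string into "patterns": maximal stretches of runs that share a length.
function toPatterns(bits) {
  // 1. collapse into runs: "001100" -> [{ bit: "0", len: 2 }, { bit: "1", len: 2 }, { bit: "0", len: 2 }]
  const runs = (bits.match(/0+|1+/g) || []).map(run => ({ bit: run[0], len: run.length }));

  // 2. group consecutive runs with the same length into one pattern
  const patterns = [];
  for (const run of runs) {
    const last = patterns[patterns.length - 1];
    if (last && last.segmentLength === run.len) {
      last.repetitions++;
    } else {
      patterns.push({ startsWith: run.bit, segmentLength: run.len, repetitions: 1 });
    }
  }
  return patterns;
}

toPatterns("001100");       // [{ startsWith: "0", segmentLength: 2, repetitions: 3 }]
toPatterns("000111010011"); // three patterns, with segment lengths 3, 1 and 2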

Splitting the file up into patterns sounds like a swell idea, right? Well, with the 64k byte file I used, it had around 10,000 patterns. More interestingly, the segment lengths were at 9 and the repetitions at 7. Now these two aspects filter the number of possibilities even more (which I did not mathematically calculate). Ideally, each of these attributes would get me closer and closer to finally understanding what the file actually contains. However, I realized this isn’t going to be nearly as simple as I thought

  • given
    • 00110011
    • 11001100
  • can you tell the difference between their length, checksums, unique segments, unique repetitions, total repetitions and starts with 1 counts? yes you can!
    • 8, 4, 1, 1, 4, 0
    • 8, 4, 1, 1, 4, 1
  • what about
    • 010011000111
    • 001101000111
  • You cannot
    • 12, 6, 3, 1, 6, 0
    • 12, 6, 3, 1, 6, 0

So I had this brilliant idea! What if I can figure out the order? Not the actual count, just the order from largest to least. To do this I would check the difference between each pattern: if the pattern increased in segment length, 1; if it decreased, 0. What could go wrong?

  • given
    • 010011000111
    • 001101000111
  • You can!
    • 12, 6, 3, 1, 6, 0, 11
    • 12, 6, 3, 1, 6, 0, 01
  • What about
    • 001101000111
    • 000111010011
  • You cannot
    • 12, 6, 3, 1, 6, 0, 01
    • 12, 6, 3, 1, 6, 0, 01

How about if I get the order of the highest and lowest? Technically I can do this indefinitely until I order them completely!

  • given
    • 001101000111
    • 000111010011
  • You can!
    • 12, 6, 3, 1, 6, 0, 01, 1
    • 12, 6, 3, 1, 6, 0, 01, 0
  • what about
    • 0000011111010011
    • 0000111101000111
  • You cannot
    • 16, 8, 3, 1, 6, 0, 01, 0
    • 16, 8, 3, 1, 6, 0, 01, 0

I would need to be able to point out exactly how much is in each of those highest amounts. Perhaps I can count the number of unique differences? Nope, because both are unique. This was my snag. I still don’t know how to handle it. I am also fully aware that the ordering may actually be far more data than I anticipate.

Technique 2: Prime numbers!! 😀

:C So the way I planned to do it was as so:

  • Turn the file into a Gigantic number
  • Boil the number down to prime numbers and exponents
  • If the prime number is too big, represent it as the ‘nth prime’ instead.
  • If the number of factors is too long
    • find the closest prime number to the gigantic number
    • subtract the prime from the gigantic number
    • Do it again with the leftover

What I would be left with is a sequence like this

mul(2,3,nth(7),add(nth(13), mul(5,7)))

Looks great right? I thought so too! I deliberately created the syntax so I can format anything into 4 bit operations. Unfortunately I didn’t realize how slow prime number generation could really be. I ended up creating my own because I needed to be constantly streaming them and/or hold a binary list. However, the problem with prime generation is that you always need the previous numbers to go forward. Finding a prime is actually about finding what isn’t a prime and collecting the leftovers. This sounds straightforward but it’s actually pretty ugly and leaves me waiting for minutes at a time to handle 24 bit primes.

function* primes(){
  // sorted list of upcoming composites: { value: the next multiple, jump: the prime that produced it }
  var stack = [];
  var i = 2;
  while(true){
    yield i;
    insert(stack, { value: i + i, jump: i });
    // walk past every composite, pushing each prime's next unused multiple forward
    while(stack[0].value === ++i){
      var jump = stack.shift().jump;
      var value = i;
      do {
        value += jump;
      } while(stack.some(function(entry){ return entry.value === value; }));
      insert(stack, { value: value, jump: jump });
    }
  }
}

// keep the stack sorted by value so stack[0] is always the smallest upcoming composite
function insert(stack, entry){
  var index = stack.findIndex(function(other){ return other.value > entry.value; });
  if(index === -1) stack.push(entry);
  else stack.splice(index, 0, entry);
}

This generates primes fast, don’t get me wrong. I was surprised at myself at how well it worked. Truly. Not only that, I can take full credit (with inspiration from some sieves). But if it doesn’t create primes, it finds what isn’t prime. If it can’t handle 24 bits, what’s to say it can handle 10 bytes or even 1000 bytes? When I was writing the readme I decided I would write a bit about making workers and making it threaded. This is kind of a neat idea but it’s still not perfect, considering each worker must still wait for its previous workers’ primes. As we get into huge numbers, that is less true because 2*current may often be 3 workers later. Another concept is using a less absolute prime number creator like a Mersenne prime. These are easily calculable and also interact well with logs, so it’s possible I could speed up the algorithm to a huge degree. Instead of trying to find out if a number is prime, I check how far away it is from the next power of 2. If 1, I consider it a Mersenne prime. Else, get the Mersenne prime of the power of two before it, multiply it a few times, subtract the total, and do it again with the leftovers. This seems just as good, but prime numbers are pretty special. And as good as Mersenne primes are, they will probably not always be good enough for my purposes.

What can I say about it?

What I love about working on projects like these is I tend to want to end them. Usually, I want to come to some conclusion like “it’s impossible” or “way too slow” but I always find a way to make it work. From there it’s just about implementing it or ignoring it. A project like this has huge uses, but many of the uses I don’t have direct interaction with. Only potential. I do believe growing myself is important, but I don’t want to lose it over a cute little experiment with big potential.

Decentralized Name Ledger : Free Write

What’s worse than a javascript developer? A javascript developer with an unthemed wordpress blog, who wants to decentralize javascript.

So, package management is kind of a big deal. But for me it really came into the limelight with npm. I’m truly amazed at how a tool like npm can speed up development so much, turning my little project into something I can build anywhere! However, like the great torrent, decentralization is a possibility many node developers want. Here, I went at length trying to fight the idea of a decentralized name ledger, insistent that competition is the only way for a name ledger to work. Hypothetically that may be true, but only if proven. Here I will attempt to understand the complexities involved in a blockchain style ledger.

So how do we do this?

What maintenance needs to be done? To start out, somebody needs to keep tabs on it. And not just one person, many people. Normally every individual does, but realistically a few groups do. This involves

  1. Accepting New transactions – which may happen at an absurd rate
  2. Mining (Validating) the chain – involves heavy duty processing power
  3. Having the ledger available for distribution – Hypothetically, this can be done in chunks since each block can be infohashed, etc. But, realistically, I have no idea how the algorithm works, so it may be that the entire chain changes per transaction (Highly doubt it). (EDIT) Probably not, because for this to happen, the block cannot change in order to ensure immutability. However, if it cannot change, it cannot point forward, only backward. Does blockchain try to point forward? It doesn’t need to, technically

Someone needs to do this work. Arguably, the makers / users / investors of the blockchain will, but that means that our coin is… well… valuable. And as a maker, there’s only so much I am willing to invest until I realize the money / work I put in will not be the same as I get out. I expect no one else to do the same unless they feel like wasting electricity for a place in internet history. However, many of these issues can be solved by…

  1. Using someone else’s blockchain

Enter Ethereum. This is quite literally what it was made for: for other people to tag on their arbitrary nonsense. So… technically…

  1. Our Coin is “valuable”
  2. There exists a mining community
  3. There are individuals willing to distribute the ledger

And unless someone decides to sell big it will stay that way. Alright. Great. Now let’s make a name ledger!

Ok, there exist examples. So EtherId decides to allow you to pay for time. Then you must pay for more time. Who do you pay? Because ethereum has a limit, am I really willing to dish out (now 5 dollars) up to 100 dollars for a name? Can I sell it back? Paying for time seems like it could end up in an endless battle. Let’s organize these concerns

  1. does there need to be a cost for a name? – I would say yes. The last thing anyone needs is a spammer destroying a currency.
  2. Is the cost for a name static or can it rise/fall? – It should be able to rise and fall. There is no reason why a name should always cost 1 ether, especially considering nobodies are meant to be publishing their packages happily.
  3. Where does my money go? Can I release the name and recoup it? – If the answer is no, I have sincere doubts about the viability of the system as a whole. At that point the currency is trying to keep its neck above water by consuming from everyone else, effectively causing deflation.
  4. Paying for time – Let’s assume we own the name ‘hello-world’ which resolves to a package used by newbies world wide. Let’s say I didn’t pay this month because I was broke. Someone with ill intentions swoops in and takes it. Now they never have to sell or give it back. There is no centralized authority to fight about this. There may be no public shaming if done anonymously. So long as this other individual has enough, they will always have control over it.

Let’s trial-and-error some solutions!

  1. Names will increase in value the more that are taken within a given letter range
    1. Example
      1. For 1 letter names, there exists 36 possibilities.
      2. The first name may be free
      3. the second may cost a bit
      4. third more
      5. etc
      6. Each name’s maximum is the maximum number of Permutations for given length (name.length!)
    2. Good
      1. What this does is curb spamming of names and reward lengthier unique names
      2. When done only per individual account, it doesn’t take into account a spammer making 1,000,000 accounts, each registering one name
    3. Bad
      1. Puts a cost (though small at times) on registering names. This will prevent most people from ever registering since then they have to actually have ether
  2. The price of keeping a name increases as the bid to buy it increases (a rough sketch of this rule follows the list)
    1. Example
      1. I buy the name ‘big-money-no-wammies’ name at 0.01 ether
      2. Years go by with not another dime spent
      3. Someone comes in and wants it and offers 0.02 ether
      4. I don’t want to sell but the other person impatiently waits
      5. The price of keeping the name is now 0.01 ether
      6. Another person comes in and sets the price at 0.005
      7. The price starts at 0.01 ether
      8. Another person comes in and makes a bid for 0.02 ether
      9. The price remains at 0.01 ether
      10. Another person comes and makes a bid at 100 ether
      11. The price for keeping the name is now 50 ether
      12. I cannot pay
      13. I receive their 100 ether bid, and I receive all the money I’ve had to invest back
      14. All other bids persist
    2. Good
      1. People cannot squat on a name
      2. Its possible to crowdsource the purchase
      3. Most names do not need to maintain upkeep so people can keep them for free
      4. People cannot bully others quickly without taking a hit
    3. Bad
      1. This can turn into bullying. Forcing someone to continuously pay 0.1 may still cause many people to lose their names
      2. ‘Fairness’ is not rewarded – I lose my name to someone forcing me to pay 10. I want it back so I use their 10 to make them pay. They don’t pay so we get the name. They make a bid for 10 again. And the cycle continues, where I must invest while they just force me to invest
  3. Rising and falling of the price of a name can be solved by making our own cryptocurrency! Oh wait…. Bad solution…
  4. Clients can ‘lockin’ a name to a specific user. That user can then point a new name to be considered what the old name was
    1. Example
      1. I get bullied out of ‘poop in my pants’
      2. People still trust only me despite the fact that the other person is in control. In the client that resolves the name, they ‘locked’ me to the name
      3. I create a new name that points to the old one.
      4. The next time they resolve my name, they then get redirected.
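
A rough sketch of solution 2’s keep-price rule as I read the example above (the numbers and the halving rule are assumptions pulled from that walkthrough, not a finished design):

// The cost of keeping a name is half of the highest outstanding bid,
// and it never drops below the price originally paid.
function keepPrice(bids, basePrice) {
  const highest = Math.max(basePrice, ...bids);
  return Math.max(basePrice, highest / 2);
}

keepPrice([0.02], 0.01);        // 0.01 - a 0.02 ether bid leaves the keep price alone
keepPrice([0.005, 0.02], 0.01); // 0.01 - small bids never lower it
keepPrice([100], 0.01);         // 50   - a 100 ether bid forces a 50 ether upkeep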

Ok, those solutions are fine, but I still think that competitive centralization is a better form, for a couple of reasons

  1. Bullying can be avoided by blocking IP addresses/deleting users and undoing whatever crap has happened. In our examples, we are trying to curb it but it still can exist
  2. We raise the barrier to entry – We are effectively pushing away people who want to try it and enabling those who want to find reasons to destroy it
  3. Finding out who is the true owner doesn’t need to happen through parsing possibly millions of transactions

All of these are very real and very bad problems.

Here Enters the “Free Market”

So instead of having a distributed ledger with no one source of truth, let’s try to build tools so that individuals don’t have to depend on one source of truth.

The Client Can
  • Use Multiple Ledgers to resolve a name
    • Issue – This may or may not resolve to the same outcome
      • Solution – Its in the best interest for each to resolve to the same outcome
  • Trust one ledger over others when they conflict
  • Tell multiple ledgers that they wish to take a name, get rid of a name or transfer a name
    • Transferring implies the other user exists on all ledgers and is associated with a single unique identifier
  • Do research on a ledger to identify them as someone they want to support or not
A Ledger Can
  • Be a part of a network using a distributed ledger – This would be ideal if the market becomes crowded and/or to enable competition. In this manner, all shops can run. Some will be used more than others and some will have quality control tests
    • Issue – if adding to the ledger is a free practice, it may result in issues
    • Solution – since this is a distributed network rather than attempting to ‘be’ blockchain, ledger writes may be whitelisted
  • Deny access to the resource associated to the name for any of the following reasons
    • The person who had registered it had been removed from this ledger for reason X
    • The person who registered it has not associated it to anything
    • The result of the name does not pass the quality requirements of the ledger
    • They don’t feel like it
    • A perfectly valid reason that I cannot think of and they may not be able to either
  • Send a brief history of that name along with the result of resolving the name to prove validity of themselves or others
Everything else is Survival of the Fittest


So Whats the point?

Well, to be fair, if there are a million name resolvers running around it can get pretty ugly. But additionally, a completely inhuman approach leads to the inability to police it except from a game theory perspective. I think long term the game theory approach is probably the best, but long term is not short term, and short term I believe the free market approach is far more appealing. We avoid the headaches of ‘Whoops, didn’t think of that’ and allow people who did think of it to implement it. Additionally, I believe people’s priorities will change over time. Now it’s just about getting a name. In the future, a testing suite might be necessary. In the far distant future, you may be forced only to use certain technologies. I do believe it’s in a ledger’s best interest to use a distributed ledger. But I don’t think that making it trustless is necessarily possible. Making it safe is much more possible.

EventModel, Callbacks, Promises, Generators, Await: Opinion about Async and Threading

Why are things Async?

This is the first thing we need to ask ourselves before continuing, and there is a good reason. What we need to understand about asynchronous programming is waiting. Waiting is the worst enemy of speed. And speed is important for user experience and the ‘ol time=money equation. Let’s start in a world where nothing is asynchronous. We’ll use dummy milliseconds to figure out total time (I should really just create a performance test for this, but I’m just doing this on the fly).

  • Our Server Waits for a Connection – This is what will start the following and what we want our server to do 100% of the time (nearly impossible)
  • On Connection
    • 2 ms – Our Server processes the Domain Name/Path
    • 3 ms – Our Server digests the Query (If applicable)
    • 10 ms – Our Server digests Post Data (If applicable)
    • 10 ms – Our Server Validates Query/Post Data
    • 20 ms to 50 ms – Our Server makes a database call – It is actually unknown how long it will take, but may be one of the slower aspects of our application
    • 30 ms – Our server Turns the data into servable HTML
    • 10 ms – Our server serves the HTML
  • We start waiting for the next connection

So between each connection, an application is using 65 ms for absolutely necessary aspects and 20 to 50 ms of wasted time. This also means we likely cannot handle 100 connections a second. That being said, the 20 to 50 ms is what asynchronous is really about. I’m most likely overestimating servers because I love them so much, but let’s make a clientside example. Our client does a few ajax requests to our server for an awesome app

  • 30 ms – The Page is rendered
  • 100 ms – We render the initial Map on a canvas
  • 100 ms – We make a call to our server getting favorite locations
  • 100 ms – We make an api call to a map application to get all possible locations
  • 50 ms – We Position the map to our current location
  • 20 ms – We Render favorite locations on the map
  • Wait for Click
    • 100 ms – We make a call to our server load that specific locations information
    • 300 ms – We animate the item click to display a popup
    • 20 ms – We render the location in the popup
  • Wait for Click

This is where asynchronous becomes all the more important. On the server it’s more about fear, scalability and just sexy programming. On the clientside, every time we block, the user loses control. Every time the user loses control, the experience degrades immensely. In this example we have wasted about 400 ms on startup, or half a second, and about 420 ms (half a second again) every time they click. These wait times are absolutely absurd from an experience standpoint. What asynchronous programming allows us to do is let events tell us when something should happen next. In its most basic form

  • Event Loop – while(true){ scripts.forEach(script => script.execute()); }
  • Our Server Waits for a Connection – This is what will start the following and what we want our server to do 100% of the time (nearly impossible)
  • On Connection
    • 2 ms – Our Server processes the Domain Name/Path
    • 3 ms – Our Server digests the Query (If applicable)
    • 10 ms – Our Server digests Post Data (If applicable)
    • 10 ms – Our Server Validates Query/Post Data
    • 10 ms – make database call
      • on return (10 to 40 ms)
        • 30 ms – Our server Turns the data into servable HTML
        • 10 ms – Our server serves the HTML
  • We start waiting for the next connection

We have now split up a connection: 35 ms to process the request, then 40 ms to render and send the result back. The waiting in between is no longer time the server is stopped.

  • 30 ms – The Page is rendered
  • Wait for both
    • 10 ms – We make a call to our server getting favorite locations
    • 10 ms – We make an api call to a map application to get all possible locations
    • On Return (100 ms)
      • 50 ms – We Position the map to our current location
      • 20 ms – We Render favorite locations on the map
  • 100 ms – We render the initial Map on a canvas
  • Wait for Click
    • 10 ms – We make a call to our server load that specific locations information
      • On Return (100 ms)
        • 20 ms – We render the location in the popup
    • 10 ms – Start animation
      • On Finish (300 ms)
        • Thats it
  • Wait for Click

Here we get far larger speed increases: startup is now only 150 ms initially and 70 ms when the calls come back, and only 20 ms is used up between clicks plus 20 ms when the ajax call comes back. The animation is essentially there just to mask the ajax call anyway. These are big differences

85 compared with 35 + 40

400 and 420 compared with 150 + 70 and 20 + 20

The breakups are really important as well, since every time it’s broken up, it allows the application to do other tasks

So Async is Perfect, no problemo

Not exactly… Similar to Functional Programming (which is likely going to be a different topic), this will make things fast (and in functional programming arguably more reliable and predictable). However, as it stands, the way you have to design your application becomes a little bit stranger. Right off the bat it’s important we talk about threading.

Threading – The Building block of Async

In the async programming model, generally what happens is there is a separate worker thread that receives work, does it, then sends the result back (basically a function). So if we are to do this raw dog, this is what we are looking at

var worker = new Worker("./Path/to/a/script");

worker.onMessage = doTheRest;

worker.sendMessage(input)

This is a basic model for workers. We will create a thread, listen for when it returns us data and provide it an input to work on. However, multiple scripts/modules/whatever you want to call them will likely be using this one worker. Something like an ajax call is common for a ton of applications to use, and we don’t know who will get what.

setTimeout(function(){
  worker.onMessage = doTheRestOne;
  worker.sendMessage(inputOne);
}, Math.random()*1000)

setTimeout(function(){
  worker.onMessage = doTheRestTwo;
  worker.sendMessage(inputTwo);
}, Math.random()*1000);

Which happens first? Will doTheRestOne still be registered when inputOne finishes? As a result we need to consider how to keep it relatively reusable.

Object Oriented Events – The XMLHTTPRequest standard

First I will create a rather unoptimized class

function OurWorkerClass(){
  this.worker = new Worker("path/to/a/script");
  this.worker.onMessage = function(packet){
    if(packet.error){
      return this.errorFn(packet.error); 
    }
    this.finishFn(packet.output);
    this.worker.destroy();
  }.bind(this);
}

OurWorkerClass.prototype.onFinish = function(fn){
  this.finishFn = fn;
}

OurWorkerClass.prototype.onError = function(fn){
  this.errorFn = fn;
}

OurWorkerClass.prototype.doWork = function(input){
  this.worker.sendMessage(input);
};

This is unoptimized since we are creating a worker and a closure for every instance. But this is to show how this thing works

var worker = new OurWorkerClass();
worker.onError(handleError);
worker.onFinish(function(output1){
  var otherWorker = new AnotherClass();
  otherWorker.onError(handleError);
  otherWorker.onFinish(function(output2){
    var thirdWorker = new thirdClass();
    thirdWorker.onError(handleError);
    thirdWorker.onFinish(finished);
    thirdWorker.doWork(output2);
  });
  otherWorker.doWork(output1);
});
worker.doWork(input);

I would say the framework muddles the code. Much more initialization than step-by-step logic. Ugly stuff.

Enter Callbacks

var worker = new Worker("./Path/to/a/script");

var pendingWork = {};

function doWork(input, callback){
  var id = Date.now() + Math.random().toString();
  pendingWork[id] = callback;
  worker.sendMessage({id: id, input: input});
}

worker.onMessage = function(packet){
  var id = packet.id;
  var error = packet.error;
  var output = packet.output
  pendingWork[id](error, output);
  delete pendingWork[id];
}

This is a basic callback model for workers. To use it we would call the doWork function with an input and a callback, and it will correctly notify us which work did what. However, in practice, this is what it turns into.

doWork(input1, function(err1, output1){
  if(err1) return finished(err1)
  doOtherWork(output1, function(err2, output2){
    if(err2) return finished(err2);
    thirdWork(output2, function(err3, output3){
      if(err3) return finished(err3);
      fourth(output3, function(err4, output4){
        if(err4) return finished(err4);
        finished(void 0, output4);
      });
    });
  })
});

There’s the argument that Ryan Dahl spoke about in terms of creating multiple named functions to avoid it. This actually isn’t a bad idea in general, as every function you create inside another function would literally be created each time instead of being referenced from before. This is what it looks like though

doWork(input, callback1.bind(void 0, finished));

function callback1(finished, err, output){
  if(err) return finished(err);
  doOtherWork(output, callback2.bind(void 0, finished));
}

function callback2(finished, err, output){
  if(err) return finished(err);
  thirdWork(output, callback3.bind(void 0, finished));
}

function callback3(finished, err, output){
  if(err) return finished(err);
  fourthWork(output, finished);
}

And this only works because javascript is a two-sweep scripting language and hoists function declarations to the top. In my humble opinion, this is fugly.

Promises – One of the many gifts jQuery popularized

Promises are one of the greatest things that have ever happened, I assure you. But they aren’t too friendly from a speed/memory perspective according to many node contributors.

var availableWorkers = [];
function getWorker(){
  if(availableWorkers.length){
    return availableWorkers.shift();
  }
  return new Worker("path/to/our/script");
}

function finishedWorker(worker){
  availableWorkers.push(worker);
}

function doWork(input){
  var worker = getWorker();
  return new Promise(function(res, rej){
    worker.onMessage = function(packet){
      finishedWorker(worker);
      if(packet.error) return rej(packet.error);
      res(packet.output);
    };
    worker.sendMessage(input);
  });
}

Perhaps I’m muddling too much worker code into these examples. It just is a lot of fun. Regardless, this is what it turns into

doWork(input)
  .then(doOtherWork)
  .then(thirdWork)
  .then(fourth)
  .catch(handleError);

Sexy, clean, beautiful. Really is gorgeous in my humble opinion. However, things aren’t always so clean. See Example B

doWork(input).then(function(output1){
  var p = doOtherWork(output1);
  p.catch(handleSpecialError);
  return p.then(function(output2){
    return doOneWith2Arguments(output1, output2);
  });
}).then(doThird.bind(void 0, input))
.then(function(output3){
  return doFourth(output3, input);
}).then(function(output4){
  return Promise.all([
    doFifthA(output4),
    doFifthB(input)
  ]).then(function(outputs){
    return finishFifth(outputs[0], outputs[1]);
  });
}).catch(handleError);

Once we start customizing our catches and arguments, things start getting weird. It can start to become quite difficult to figure out what the hell is going on. On line three, that catch will exit the program. doThird receives the output of doOneWith2Arguments and also takes in input as its first parameter. For doFourth we need to pass in input as the second argument. The fifth is attempting to do two works side by side. So what are we supposed to do?

Generators – Not made for Async, but looks like it

Going Async With ES6 Generators

This is what above looks like

runner(function* main(){
  try{
    var output1 = yield doWork(input);
    var output2;
    try{
      output2 = yield doOtherWork(output1);
    }catch(e){
      specialErrorHandle(e);
      return;
    }
    var output3 = yield doOneWith2Arguments(
      output1,
      output2
    );
    var output4 = yield doThird(input, output3);
    var output5 = yield doFourth(output4, input);
    var outputs = yield Promise.all([
      doFifthA(output5),
      doFifthB(input)
    ]);
    var finaloutput = finishFifth(
      outputs[0],
      outputs[1]
    );
    finished(finaloutput);
  }catch(e){
    handleError(e);
  }
})

This is almost the holy grail. What we’ve been waiting for. Something that looks like what it should be. It’s a crazy thought right? A program being sequential and effective? Wild stuff really. Unfortunately, these still need to be wrapped in some runner function or be used via promises or callbacks.

Await – The True Holy Grail

https://jakearchibald.com/2014/es7-async-functions/

Basically the above, except await can be used anywhere and likely on anything that returns a promise. It will be glorious.
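
Assuming doWork and friends return promises, the generator example above collapses into something like this (a sketch using the same hypothetical function names):

async function main(){
  try{
    var output1 = await doWork(input);
    var output2;
    try{
      output2 = await doOtherWork(output1);
    }catch(e){
      return specialErrorHandle(e);
    }
    var output3 = await doOneWith2Arguments(output1, output2);
    var output4 = await doThird(input, output3);
    var output5 = await doFourth(output4, input);
    var outputs = await Promise.all([
      doFifthA(output5),
      doFifthB(input)
    ]);
    finished(finishFifth(outputs[0], outputs[1]));
  }catch(e){
    handleError(e);
  }
}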

So, this post is over right?

Well, yes and no. Let’s go back for a second. So async allows us to handle work in a separate thread and continue execution without blocking the event loop. This is the important thing though: blocking. If it weren’t for the blocking of the main thread, there would be no problem. But as GPU processing becomes easier to use and CPUs go from 64 bit single threaded to 64 bit 4 core, we start seeing the opportunity to maximize what we have.

What if events spawned a new thread?

Lets look at our server example

  • Wait for Connection
    • Create a new thread (or retrieve one from the pool)
    • Provide the thread the connection
  • Wait for Connection

The problem here is that database and http calls would require waiting in threads, probably causing 20 to 30 threads running at one time and slowing down everything

Lets look at a client Example

  • On Click
    • CSS Animations (GPU Bound)
    • Ajax Call -> dom manipulations
      • Next Animation Frame Write dom to GPU

CSS animations are fine; dom manipulations are global. This means that there would need to be a global thread that is mutable by all others.

So the issues would be

  • Mutability of Shared Resources (Dom specifically)
  • There may be a situation where there are more threads than necessary running at once causing everything to slow down.

Lazy Everything: Dirty checking, Caching results, single iteration

One thing that is somewhat popular nowadays is Lazy Evaluation. Lodash implemented a form after a competitor (Lazy.js) was showing big promise for a takeover. However, javascript isn’t the source of all lazy evaluation. Haskell, Scala and other functional languages have been taking advantage of it in full effect for a while. Things such as the streaming api in node and lazy getters will likely cause a vast speed increase in your application.
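
As a tiny sketch of the lazy getter idea (the property name and the hugeJsonString input are hypothetical): the value is only computed on first access, then cached by redefining the property.

function lazyGetter(obj, name, compute) {
  Object.defineProperty(obj, name, {
    configurable: true,
    get: function () {
      var value = compute();
      Object.defineProperty(obj, name, { value: value }); // cache: replace the getter with the result
      return value;
    }
  });
}

var config = {};
lazyGetter(config, "parsed", function () { return JSON.parse(hugeJsonString); }); // hypothetical input
// config.parsed is only parsed the first time something actually reads it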

Functional Programming Model

This is where I can go down another path. But here’s the basics of it. Every time you set the output of a function to another variable you create a memory pointer or clone the variable. Additionally, whenever you provide it as an argument, the same thing occurs. Hitting memory 1 time instead of 2 times may increase your speed greatly. Referencing a property directly instead of through a pointer may also greatly increase the speed of your application. But beware, this may be what your application looks like

lastFunction(
  firstFunctionCalled(),
  FifthfunctionCalled(
    ThirdFunctionCalled(
     SecondFunctionCalled()
    ),
    FourthFunctionCalled()
  )
)

This may start becoming intuitive, but I usually think step by step instead of starting from what to do last. That being said, generally your application will not mutate the global scope or its arguments, so this form of programming should be fine

Closing thoughts

Async solves the waiting problem which is a very important problem indeed.

The day docs can stop explaining in english…

So right now I’m attempting to deploy to ec2. I’m doing it through elastic beanstalk because the less I have to think, the more I can move on with my life. However, if I’m going to document how things work, this is how I can best explain it

  • There are about 4 different objects
    1. Instance – CPU, Memory, Runtime Environment
    2. Volume – Basically a file system
    3. VPC – Provides a public IP and Binds Private IP Addresses to Instances
    4. Auto Scale Group – Will Scale based upon CPU usage
  • With Each Instance
    • You can choose an operating system
    • You can choose size of the default volume
    • You can choose what VPC it should belong to
    • Can save these configurations as an “AMI” for later use
  • VPC
    • Can add security groups to it
      • You can choose protocols that are allowed
      • You can choose what ports will be open
      • You can choose what IP addresses are allowed
  • Auto Scale Group
    • Can provide an AMI that it will replicate
    • Can Hide an AutoScale Group behind an VPC
    • Will load Balance between all the instances

Now an Elastic Beanstalk instance is a culmination of these aspects, only it allows you to simply push as if you were using git and separate the instances as you please. The problem for me here is simple: although they allow you to choose nodejs as the ‘language’, they don’t install git, probably not python and probably not a full package set. Basically at that point it would seem like a better idea to just create the AMIs yourself and create deploy scripts instead of using the elastic beanstalk framework.

Additionally, other issues start coming up, such as starting up instances that you need the private ip address of, or that you want to ensure the outside world has no access to. I understand these are tools that they grabbed from the open source world and organized in such a way as to make setup easy. However, tools exist for a purpose, and if you are making your own purposes more difficult, there exists an inherent problem

Now the title is what this article is really about. Amazon basically explains some of its services in big paragraphs, causing me to have to break free for a second and actually read. I then have to dissect what I just read in an organized manner and then create my own personal examples. The configuration scripts are something that is hanging me up, because I believe I’m doing everything correctly, yet git is not getting installed. From there I look up issues amongst other things and I still have no idea what the hell is wrong. This leads me to feel like I have to start working with the AMI/Autoscaling situation, because I at least can force an AMI to have and do what I want, while Elastic Beanstalk isn’t the best wrapper I’ve ever seen.

I’m not saying elastic beanstalk is bad, I just believe it can be better. Similar to javascript, we are given XMLHttpRequest, which is pretty ugly. Add a wrapper that expects a url and a callback and all of a sudden a common use case is easy to use. However, Amazon has the opportunity to make its setup cleaner and easier to use. Elastic Beanstalk is the right direction, but I’m left wondering how much time they’ve spent ensuring common use cases were handled properly. Maybe it’s just me complaining, because for sure right now, I am alone, complaining.

As Issues Increase, Morale Decreases

Currently I’m using c3 for a work project and it reminds me of how the open source world works. When something great and pretty gets created, people want to use it. When issues arise, some are easily fixable, others start involving api changes, others are downright difficult to track down. When all is said and done, some issues will take too much time to iron out. Meanwhile, as dependents increase, the demand for a “perfect” library also increases. We start seeing holes from so many different places

  • When using with a huge input
  • When attempting to optimize with a huge input
  • When using with high number of inputs
  • When attempting to optimize with high number of inputs
  • When a feature was a second thought
  • When mix and matching features
  • When it doesn’t interact cleanly with obvious use cases

There are likely more cases. Now, there’s something to be said for “why isn’t anyone helping?” And that’s really what this blog post is about.

I feel so alone

So, put yourself in the shoes of a developer of a successful open source project. Imagine that you didn’t get a job from it but it is used everywhere. Imagine that you want to keep it free. Imagine that almost everyone who comes there posts an issue but almost never helps to solve them. This can get tough. It hurts, and I believe it eventually leads to abandonment or resentful yet honest comments to innocent people. How do we avoid this? That’s easier said than done

  • Make sure the code is clean – Something like lodash is a great example because each aspect is in a small little function. However, lodash is also an exception to the rule as it does very little. lodash doesn’t visualize anything, lodash doesn’t transform much that hasn’t already existed. As a result, we should only respect how nice it is to have a very simple goal
  • Make sure the code can be conceptualized – This is something I think we are missing most. A flow chart that points people to the correct files.
  • Make sure your code is as DRY as possible – If there’s an error in a DRY aspect, then all you have to do is modify that code. The more separate parts that do the same thing, the harder it is to track down when things go haywire
  • Make sure you aren’t doing too much, and if you are, organize. – This is in my opinion the most important. Something more difficult is when you are overloading constructors or having a parser/ajax and something else. You can allow your users to do the ajax and to follow directions. In this manner you have to worry less about retrieving from a url and care more about making sure what you are doing is done well.
  • Tests are Examples – Instead of seeing “examples”, I generally like to go to the test folder to see how things work. The reason for this is usually I’m looking for a fringe piece of API that isn’t well documented.

Be Honest

The open source world is hard. But to keep it alive, we don’t need our successful projects to die out in a fire, we need their maintainers to tell the world “hey! I’m over it!”. We need to have people realize that each individual is what it takes to make a project successful. If someone posts an issue, I believe it’s ideal to lead them in the direction to fix it themselves. Then merge their request. It’s simple, but it makes your life easier and supports the open source culture as truly “by us for us”.

Yell At Companies

I’m under time constraints (I probably shouldn’t be writing this but rather working 13 hour days 7 days a week… jk ;). However, companies often use these projects and then avoid helping out later. Companies pay programmers, and programmers should give back on company time. Otherwise we go down a slippery slope where tons of people use a project and the creators get no pay nor help.

That being said, I’ve never had an open source project that has become a booming success. Nor have I followed all these rules. So I suppose you can ignore everything I said. It’s a thought

ORM’s Dreaming Big: Pt 3 (Big Pappa ORM)

Here is Part 1 and Part 2 if you are interested

Two ORMs to date have been very interesting to me.

  • RethinkDB – Pushes events to any listeners. This inherently supports clusters, since if a socket is attempting to synchronize, it needs to know the changes. Luckily, the changes will be pushed to all threads.
  • Waterline – What I really like about waterline is that instead of being a storage system, it is just the interface to it. This allows you to have specific storages and maximize their best parts without having to write in different query languages.

Databases have gone in and out of style so fast over the past years. MongoDB, CouchDB, Postgres, Hadoop, MySQL. All of them are competing for the marketshare of being the “database of choice”. That being said, all of them have distinct advantages and disadvantages. Anything SQL also gives you the ability to run a PHP stack without much trouble. Anything JSON allows you to store documents in the format you’re coding in. Additionally, Redis has shown that moving the global memory of sessions and the like to a single store is very important for clusters. As a result, the storages increase and, unfortunately, the interfaces increase as well.

Queries, Adapters, The Bazaar and the Cathedral

If you haven’t read this essay, I think you should. The gist of it is there is a clear difference between the way people write parts and the way something is created in one huge gulp at a time. Now, I’m in the parts boat because as time continues, there may be new databases. There may be new fancy things. And as a result, you don’t want your api to change with each database you move to. In a bazaar, you have your fruit farmers, butchers, jewelry, etc. Each is specialized, with a solid respect for each person and what they do. In a cathedral, you are given your daily bread and wine on special occasions. This is simple and works; however, some cathedrals’ bread is better than others. And sometimes you get olive oil with yours. I believe that allowing as many databases as possible to interact with your ORM is superior to forcing the API to adhere to your database’s design. In this way, you can make the query language the best possible without breaking compatibility with any database. As a result, there are a few simple laws I believe the ORM should adhere to

  • One Query Syntax for all databases it interfaces with
  • “Adapters” decompile the query into what the database will actually use and send it

As a result, there becomes a dependency: Adapters -> Models. However, adapters are more abstract; they generally should be reused for multiple databases of the same type. As a result there becomes a further dependency.

Adapters -> Connection to a Database -> Model/Table
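
A hedged sketch of that split (the query shape and adapter objects below are made up for illustration, not Waterline’s actual API): the model speaks one query syntax, and each adapter decompiles it for its own database.

// One generic query object...
var query = { model: "users", where: { age: { gt: 21 } }, limit: 10 };

// ...decompiled by whichever adapter backs the connection.
// (Both adapters are hard-coded to this one query shape, purely for illustration.)
var sqlAdapter = {
  decompile: function (q) {
    return "SELECT * FROM " + q.model +
           " WHERE age > " + q.where.age.gt +
           " LIMIT " + q.limit;
  }
};

var mongoAdapter = {
  decompile: function (q) {
    return { collection: q.model, filter: { age: { $gt: q.where.age.gt } }, limit: q.limit };
  }
};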

Plugging in, plugging out and letting everything just figure itself out

Content management systems are allowing you to design databases in the browser. WordPress has custom PostTypes. Drupal has Schemas. Treeline is making “machines”. However, the most important concept here is that when you make an edit to a database model, the whole server doesn’t have to shut down on the way there. PHP has the advantage of not persisting in memory; as a result, each call to the server is essentially a “fresh” instance. NodeJS doesn’t have that kindness. As a result, making sure your orm is designed in such a way that dependencies don’t require you to destroy and recreate everything is of utmost importance. Something simple such as

orm.addModel("ModelName", modelConfig)
orm.removeModel("ModelName");

Can really make or break what an ORM is capable of. A simplistic algorithm would be something like this.

util.inherits(ORM,EventEmitter);

ORM.prototype.addModel = function(name, config){
  this.models[name] = new Model(config);
  var connDeps = getConnectionDeps(config);
  var modelDeps = getModelDeps(config);
  allListeners(name,modelDeps,this,"model");
  anyListeners(name,connDeps,this,"connection");
}



function allListeners(name,deps,orm,key){
  // depnum goes negative for every missing dependency;
  // when it climbs back to 0, everything this item needs exists.
  var depnum = 0;
  var addListener = function(){
    depnum++;
    if(depnum === 0){
      orm.emit(
        "add-"+key+"["+name+"]",
        orm[key][name]
      )
    }
  }
  var remListener = function(){
    if(depnum === 0){
      orm.emit(
        "rem-"+key+"["+name+"]",
        orm[key][name]
      )
    }
    depnum--;
  }

  deps.forEach(function(depname){
    if(!orm[key][depname]){
      depnum--;
      orm.on(
        "add-"+key+"["+depname+"]",
        addListener
      )
    }else{
      orm.on(
        "rem-"+key+"["+depname+"]",
        remListener
      )
    }
  });
  orm.on("destroy-"+key+"["+name+"]",function(){
    deps.forEach(function(depname){
      orm.off(
        "add-"+key+"["+depname+"]",
        addListener
      )
      orm.off(
        "rem-"+key+"["+depname+"]",
        remListener
      )
    });
  });
}

function anyListeners(name,deps,orm,key){
  // depnum counts how many dependencies exist;
  // the item is available as soon as any single one does.
  var depnum = 0;
  var addListener = function(){
    if(depnum === 0){
      orm.emit(
        "add-"+key+"["+name+"]",
        orm[key][name]
      )
    }
    depnum++;
  }
  var remListener = function(){
    depnum--;
    if(depnum === 0){
      orm.emit(
        "rem-"+key+"["+name+"]",
        orm[key][name]
      )
    }
  }

  deps.forEach(function(depname){
    if(!orm[key][depname]){
      orm.on(
        "add-"+key+"["+depname+"]",
        addListener
      )
    }else{
      depnum++;
      orm.on(
        "rem-"+key+"["+depname+"]",
        remListener
      )
    }
  });
  orm.on("destroy-"+key+"["+name+"]",function(){
    deps.forEach(function(depname){
      orm.off(
        "add-"+key+"["+depname+"]",
        addListener
      )
      orm.off(
        "rem-"+key+"["+depname+"]",
        remListener
      )
    });
  });
}

With event emitters, you can toggle models in and out with minimal issues. There are other parts that are important, such as…

  • Binding a model to the ORM instead of having the model make the call itself
  • Being able to queue requests from a model until its dependencies are ready (a sketch follows this list)
  • Throwing errors when things fail
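
As a rough sketch of the queueing idea: the Queue helper and the adapter/model names are made up for the example, and it reuses the "add-connection[...]" events from the listener code above.

// Rough sketch: buffer model calls until the connection dependency shows up.
function Queue(){
  this.ready = false;
  this.pending = [];
}
Queue.prototype.push = function(fn){
  if(this.ready) return fn();
  this.pending.push(fn);
};
Queue.prototype.flush = function(){
  this.ready = true;
  while(this.pending.length) this.pending.shift()();
};

// usage: a model buffers finds until its connection is added
var queue = new Queue();
orm.on("add-connection[mysql]", function(){ queue.flush(); });
model.find = function(query, cb){
  queue.push(function(){ adapter.find(query, cb); });
};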

Cluster Support

Cluster support is one of the most important parts of any modern JavaScript module nowadays. If it can't be run on a cluster, it's not fit for production. If it's not fit for production, it's going to end up being just for somebody's side project. Starting from a simple concept, you can add cluster support by relaying events. This simple example is all we really need to ensure we are sending events properly. First off, we must figure out which events need to be sent globally. For our case, we'll handle the delete of a model.

ORM.prototype.addModel = function(){
  var model = this.figureOutDeps(arguments);
  var self = this;
  model.on("delete", function(instances){
    if(self.isChild){
      // tell the master so it can rebroadcast to the other workers
      process.send({
        type: "orm-rebroadcast",
        event: "model[" + model.name + "]-delete",
        data: instances
      });
    }
    self.emit(
      "model[" + model.name + "]-delete",
      instances
    );
  });
}

As you can see, when a delete has happened locally, we tell the master what happened. From there, the master tells every other worker what happened.

ORM.prototype.asMaster = function(workers){
  var self = this;
  workers.forEach(function(worker){
    worker.on("message", function(msg){
      if(msg.type === "orm-rebroadcast"){
        self.broadcast(msg,worker);
      }
    });
  });
  this.workers = workers;
}

ORM.prototype.broadcast = function(msg,not){
  this.workers.forEach(function(worker){
    if(worker === not) return;
    worker.send(msg);
  });
}

From there we can implement the worker’s listeners

ORM.prototype.asWorker = function(){
  var self = this;
  process.on("message", function(msg){
    if(msg.type === "orm-rebroadcast"){
      self.emit(
        msg.event,
        msg.data
      );
    }
  })
}

There are things that can be done a little bit nicer. For example, having workers tell the master what they want and do not want to listen for. Additionally, we can reimplement this over Redis or any other transport, because it really isn't that complicated.
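
For example, here is a rough sketch of the same rebroadcast idea over the classic node_redis pub/sub API; asRedisPeer and the channel name are made up for the example.

var redis = require("redis");

ORM.prototype.asRedisPeer = function(){
  var self = this;
  var pub = redis.createClient();
  var sub = redis.createClient();

  sub.subscribe("orm-rebroadcast");
  sub.on("message", function(channel, str){
    var msg = JSON.parse(str);
    // in practice you would tag messages with a sender id so a peer
    // can ignore the ones it published itself
    self.emit(msg.event, msg.data);
  });

  // stands in for process.send in the cluster version
  this.rebroadcast = function(msg){
    pub.publish("orm-rebroadcast", JSON.stringify(msg));
  };
};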

ORM’s Dreaming Big: pt 2 (The instance)

Previously we went over the schema, which is about validation, indexes, population and schema types. Here we'll go into what people will most likely be using: the instance. So, what is the instance?

An instance:

  • Holds values you wish to store or have retrieved
  • Can be created, requested, updated and deleted
  • Has properties which can match conditions

Basically, an instance is the actual values you want to store or retrieve. It is probably the most important part of the database because without it, well, you have nothing.

Generic yet Important Things

Callbacks

Callbacks can be implemented in one of two ways: callback(err, obj) or Promises.

Constructor(ObjectId, function(err,instance){
  if(err) throw err;
  console.log("this is our instance", instance);
});

Constructor(ObjectId).then(function(instance){
  console.log("this is our instance", instance); 
}).catch(function(err){
  throw err;
});

This is meant to support whichever style you want.
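
One possible way to support both at once (maybePromise and loadFromDatabase are made-up names for this sketch): if a callback is passed, use it; otherwise, hand back a Promise.

function maybePromise(cb, work){
  // callback style: just run the work with the callback
  if(typeof cb === "function") return work(cb);
  // promise style: wrap the same work in a Promise
  return new Promise(function(resolve, reject){
    work(function(err, value){
      if(err) return reject(err);
      resolve(value);
    });
  });
}

// usage inside the constructor
function Constructor(id, cb){
  return maybePromise(cb, function(done){
    loadFromDatabase(id, done); // hypothetical IO call
  });
}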

Handles and Values

Instances are technically either an ObjectID handle or the actual values. Both have the same interface; however, with an ObjectID handle you have not loaded all of the values and do not hold them, while with a full instance you do. This is to support as much or as little IO as you desire without having to change the interface.
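
As a simplified sketch of that idea (the get method and both constructors are illustrative, not the actual interface): a handle defers the IO until a value is asked for, while a loaded instance answers from memory.

function Handle(id){ this.id = id; }
Handle.prototype.get = function(prop, cb){
  // only now do we hit the database, and only because something asked
  Constructor.get(this.id, function(err, instance){
    if(err) return cb(err);
    cb(void(0), instance[prop]);
  });
};

function LoadedInstance(values){ this._values = values; }
LoadedInstance.prototype.get = function(prop, cb){
  cb(void(0), this._values[prop]); // no IO at all
};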

Creating

Creating an instance should be as simple as creating an object in JavaScript.

Standard – Construct and Save
var instance = new Constructor({
  property:"value"
});
instance.property2 = "value2";

instance.save(function(err){
  if(err) throw new Error("creating caused an error");
  console.log("finshed creating");
});

Now, we haven't gotten into "Constructors" or "Models"; however, hopefully this sort of syntax is familiar to you. It's simple. We want to create a new instance, so we construct it. Because values may still be added or removed, the object is not saved right after construction. Additionally, it's important that this is done asynchronously. We don't know when the record will be saved or how it will be saved, only that it will be saved.

Calling the Constructor – Less IO

When all of the values are already in a JSON object, constructing the object and then saving it separately is a waste of time and resources; you can call the constructor directly.

Constructor({
  property:"value",
  property2:"value2"
}, function(err,objectid){
  if(err) throw new Error("error when creating");
  console.log("finished creating");
});

You may notice that the standard form's callback has no objectid but this one does. This is because when the standard form has successfully saved, an ObjectID is already set on the instance and you already have an interface to interact with it, so there is no point in returning anything. When calling the constructor directly, it gives you the ObjectID handle to provide you an interface to interact with. However, the handle will not have any properties in it, so I would suggest you use instances if you want them.

Static Method – Obvious

In addition, you may also call the static method. This will return the instance.

Constructor.create({
  property:"value",
  property2:"value2"
}, function(err,instance){
  if(err) throw new Error("error when creating");
  console.log("finished creating");
});

Retrieving

Generally, retrieving will work through the Constructor's static methods. However, we are going for sugar here.

Standard – ObjectID Populating

If we have an ObjectID Handle, we can populate it into an actual Instance. It's important to note there is a difference between an ObjectID Value and an ObjectID Handle: the value is the bytes/buffer that actually gets indexed, while the handle has all the methods of a normal Instance. Generally, all ObjectID Values will be transformed into Instances when retrieving an instance. In addition, anywhere you can use an ObjectID Value you can use an ObjectID Handle.

objectidHandle.populate(function(err,instance){
  if(err) throw new Error("error in populating");
  console.log("populated the instance");
});
By ObjectID Value – Opposite
//Retrieving
Constructor(objectidValue,function(err,instance){
  if(err) throw new Error("error in retrieving");
  console.log("retrieved the instance");
});

The above is simple. We use our constructor with the ObjectID and it will return an instance. This is the exact opposite of the initial example: there we sent in some values to create an instance and received an ObjectID handle, while here we retrieve the instance based on the handle or value.

Static Method – Obvious
//Retrieving
Constructor.get(objectidValue,function(err,instance){
  if(err) throw new Error("error in retrieving");
  console.log("retrieved the instance");
});

Updating

Standard – save

With your constructed/retrieved object, it is exactly the same as it was before: simply save.

//Updating
instance.property = "new value";
instance.save(function(err){
  if(err) throw new Error("updating caused an error");
  console.log("finished");
});
Static Method – Obvious

You may also update just by calling the update method of your constructor.

Constructor.update(ObjectId, 
  {property:"new value"},
  function(err, instance){
    if(err) throw new Error("error when using update");
    console.log("ran update");
});

Deleting

Deleting is the last part of our CRUD interface here. As you might imagine, it's more of the same.

Standard – destroy
//Deleting
instance.destroy(function(err){
  if(err) throw new Error("destroying caused an error");
  console.log("finished");
});
Static Method – Obvious

You may also delete just by calling the destroy method of your constructor.

Constructor.destroy(ObjectId, function(err, instance){
  if(err) throw new Error("error when using destroy");
  console.log("ran destroy");
});

Property Setting and Getting

Digestors and Verbose

All properties on an instance are actually getter and setter functions.

Object.defineProperty(instance, "propertyname", {
  get: function(){
    // return the verbose (human friendly) form of the raw stored value
    return Schema.propertyname.verbose(
      instance._rawValues.propertyname
    );
  },
  set: function(v){
    // run the digestors in order until one of them understands the value
    var ds = Schema.propertyname.digestors;
    var l = ds.length;
    var vv = void(0);
    for(var i = 0; i < l; i++){
      vv = ds[i](v);
      if(typeof vv != "undefined") break;
    }
    if(i === l){
      throw new Error("cannot digest " + v);
    }
    instance._rawValues.propertyname = vv;
  }
});

For the getter, we are returning the verbose value. For the setter, we are digesting the value to its raw type.
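
For instance, a hypothetical date schema type (not from the original) could store dates raw as millisecond timestamps while exposing them verbose as Date objects:

var dateType = {
  verbose: function(raw){
    // raw timestamp in, friendly Date object out
    return new Date(raw);
  },
  digestors: [
    // each digestor returns undefined when it cannot handle the value
    function(v){ if(v instanceof Date) return v.getTime(); },
    function(v){ if(typeof v === "number") return v; },
    function(v){
      var t = Date.parse(v);
      if(!isNaN(t)) return t;
    }
  ]
};

// instance.createdAt = "2015-01-01" digests to a timestamp,
// and reading instance.createdAt gives back a Date object.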

Marking Dirty Properties and resetting

The first thing that can be done is to mark dirty properties. This ties in with the way digestors and setters work: when setting, we also record which properties are dirty. This is done so that only the dirty values are actually updated.

set: function(v){
  var vv = Schema.property.digest(v);
  if( vv == instance._initvalues.property ){
    // back to the original value, so it is no longer dirty
    delete instance._dirty.property;
  }else{
    instance._dirty.property = vv;
  }
  instance._values.property = vv;
}

Instance.prototype.save = function(cb){
  // only the dirty values are sent in the update
  return Instance.update(this.id, this._dirty, cb);
}

Instance.prototype.reset = function(){
  for(var i in this._dirty){
    this._values[i] = this._initvalues[i];
    delete this._dirty[i];
  }
}

 

Resync with the sender

At times you may want to ensure your instance is exactly the same as the one in the database, or wherever the instance was sent from. All that needs to be done is to resync.

Instance.prototype.resync = function(cb){
  var _this = this;
  Instance(this.id,function(err,values){
    if(err) return cb(err);
    for(var i in values){
      _this[i] = values[i];
    }
    cb(void(0), _this);
  });
}
Listening for Updates

You may also use an event emitter that sends an "update" event with a property name and value.

Instance.prototype.syncTo = function(ee){
  var _this = this
  ee.on("update",function(prop,val){
    _this[prop] = val;
  })
};

Dom and QueryString Interactions

And of course, you will need some DOM interactions. Ideally, I would use an available library such as qs and serializeObject/deserialize to make sure I don't mess up. From there I would properly set either the values in the object or the values in the query string or form. In addition, it's also possible to bind the instance to the form by using syncTo.
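
As a rough sketch assuming the qs package (fromQueryString and toQueryString are made-up method names; serializeObject is the jQuery plugin side and is not shown):

var qs = require("qs");

// read: fill the instance from a query string; the setters digest/validate as usual
Instance.prototype.fromQueryString = function(str){
  var values = qs.parse(str);
  for(var i in values){
    this[i] = values[i];
  }
};

// write: turn the current values back into a query string
Instance.prototype.toQueryString = function(){
  return qs.stringify(this._values);
};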

That is the instance

Perhaps there is too much sugar here. Perhaps not enough in the right areas. I've considered streaming as an alternative plan; however, in the end, I believe a simple API is a good API. Perhaps I should prioritize a bit. What is for sure in are all of the static methods, ObjectID.populate, instance saving, instance destroying, and using the Constructor, well, as a constructor. In addition, the DOM/querystring aspect is pretty important, since without it we're back at square one: a decent ORM that refuses to believe the DOM, or URLs that don't use JSON, exist. Everything else is a bit up in the air.