isn't quite ashamed enough to present

jr conlin's ink stained banana

:: Idiots Guide to Mozilla Sync API

i've been informed that Sync is due for a change and that what you read here may be outdated in a few months. (grumble.)
Be forewarned.

Recently, i needed a way to sync data for my Firefox add-on between clients. Fortunately, there's a nifty way to do that built into the current versions of Firefox called Sync. Unfortunately, while the documentation is great if you want to learn how sync operates, it's kinda poop if you want to quickly get started using Sync.

Well, "Some see problems, others see opportunities" so as my first go to fix that problem, i'm going to write down how i got Sync working.

The Guts

Ok, before we start delving into code, let's take a moment to talk about what's going on. Sync is a tool that securely exchanges data between two clients. That means you can use it to exchange data between anything you own that can run Firefox. (There are a couple of other browsers out there that can also play along, but for simplicity, let's stick to just Firefox for now.) It does this by stuffing your data into a well known chunk (marshalling), encrypting it, and sending it via a well known server.

Note that the server isn't doing much more than relaying your data. That's because the data is encrypted, so the server can't really do anything with it right now. (Mozilla is working on making a more "durable" storage tool, but it's not there yet.) This is why Mozilla STRONGLY RECOMMENDS you don't think of Sync as a backup service. It is one, kind of like the shelf at the ATM is somewhere you can put your wallet. It's useful while the main bit is being used, but not a really good long term idea.

Ok, so we've got records that are being exchanged. Those records have a little bit of information associated with them that's useful for syncing, but otherwise they're very simple. Those records go into a "Storage" object that just collects and manages them. A "Tracker" watches the records for changes, and an "Engine" does the actual exchange. The engine watches both for local changes and remote requests for updates. It's worth noting that the exchange isn't instant, but near enough for most usages (e.g. within a few seconds).

Got it? The Tracker notices a change, and asks the Storage element to update the associated Records, then calls the Engine to deliver them.
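
If it helps to see that flow as code, here's a rough toy model in plain JavaScript. None of these objects are the real Sync classes: toyStore, toyTracker, toyEngine, localDB and remote are all made-up stand-ins, and the "server" is just an object. There's no encryption here either, only the marshalling step.

```javascript
// Toy model of the flow described above. Every name here is a stand-in.
var localDB = { 'key:a': 'apple' };   // pretend add-on storage
var remote = {};                      // pretend Sync server

var toyStore = {
    // Wrap a local value in a plain record for sending.
    createRecord: function (id) {
        return { id: id, value: localDB[id] };
    }
};

var toyTracker = {
    changedIDs: {},
    score: 0,
    // Called by the add-on whenever a local item changes.
    track: function (id) {
        this.changedIDs[id] = true;
        this.score = 100;
    }
};

var toyEngine = {
    // Push every changed record to the "server", then reset the tracker.
    sync: function () {
        var sent = [];
        for (var id in toyTracker.changedIDs) {
            var record = toyStore.createRecord(id);
            remote[record.id] = JSON.stringify(record); // marshalling only
            sent.push(id);
        }
        toyTracker.changedIDs = {};
        toyTracker.score = 0;
        return sent;
    }
};

// A local change happens, the tracker hears about it, the engine delivers it.
localDB['key:b'] = 'banana';
toyTracker.track('key:b');
var sentIDs = toyEngine.sync();
```

Only 'key:b' gets sent, because it's the only ID the tracker was told about. The untouched 'key:a' stays home.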

There, now you understand Sync.

Spelunking

Now that you get Sync (from a tool point of view, at least), i can point out where some of the code lives. You don't need to look at this, but it might be useful if you like to play along at home.
First off, here's where the sync code lives. Most of what you want is in the sync/modules branch.

If you want to see the code i built, you can find the latest version here.

The ground up…

The addon i'm modifying is built off of the Addons SDK. The nice thing is that this means the addon is restartless, the bad thing is that it means i have to do some extra work in order to get things running. In this case, i need to call into Mozilla Core code. To do that, though, you just need to call:

const {Cu} = require('chrome'); // get the Components.utils hook
Cu.import('resource://services-sync/engines.js');
Cu.import('resource://services-sync/record.js');
Cu.import('resource://services-sync/main.js');
Cu.import('resource://services-sync/util.js');

This automagically drags a host of objects into the current namespace. Chances are, if you're wondering where an object is defined, it's within one of these files.

Now, let's work from the ground up. That means building a Record object. Like i said before, a Record is just a marshalling container. The only real restriction is that it should be JSON storable (so fancy JS pointer hacks or methods are not really a good idea).

So, something simple:
function FooRecord(moduleName, recordId) {
    CryptoWrapper.call(this, moduleName, recordId);
}
FooRecord.prototype = {
    __proto__: CryptoWrapper.prototype, // subclass from CryptoWrapper
    _logName: "Record.FNCrypto" // What to use for the log messages
};
// Build the getter/setter that maps record.value to record.cleartext.value
Utils.deferGetSet(FooRecord, "cleartext", ["value"]);

As you can see, this subclasses CryptoWrapper (via janky JS subclassing), and then calls a utility function to autobuild the Getter/Setter method, which will store FooRecord.value into "this.cleartext.value". Ok, you probably didn't see that. You'll have to take my word for it. Now, if you had a very complex item where you may not want to exchange every bit of info, you could list a bunch of fields in that array instead of the single "value" (probably with more descriptive labels). This way, you'd send just the bits you need so that things are pleasantly zippy. Since my records are small and not really something you'd want to split up anyway, i opted for a single value.
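
If you'd rather not take my word for it, here's a minimal sketch of what a helper like that does. To be clear, deferGetSetSketch and DemoRecord are made up for this example and this is not the real Utils.deferGetSet source. It just shows the general trick of forwarding a property to an inner holder object.

```javascript
// Hypothetical re-implementation, for illustration only.
// For each name in props, define a getter/setter on the prototype that
// reads and writes this[holder][name] instead of the object itself.
function deferGetSetSketch(ctor, holder, props) {
    props.forEach(function (prop) {
        Object.defineProperty(ctor.prototype, prop, {
            get: function () { return this[holder][prop]; },
            set: function (val) { this[holder][prop] = val; }
        });
    });
}

// A bare-bones record with a "cleartext" holder, standing in for FooRecord.
function DemoRecord() {
    this.cleartext = {};
}
deferGetSetSketch(DemoRecord, 'cleartext', ['value']);

var demo = new DemoRecord();
demo.value = 'hello';   // actually lands in demo.cleartext.value
```

The upshot is that everything you set through the record ends up inside one plain, JSON-friendly object, which is the part that gets marshalled.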

Now that we have a Record, we need something that can hold them. Let's define the Store. In many respects, this is another Controller layer, in that it calls to whatever you're using to actually store your data (the Model). In this case, think of DB as the model store. (DB is the persistent storage that is available to Add-ons, and is pretty cool too.)


function FooStore(moduleName) {
    Store.call(this, moduleName);
}
FooStore.prototype = {
    __proto__: Store.prototype,
    itemExists: function(recordId) {
        return DB[recordId] !== undefined;
    },
    createRecord: function(recordId, moduleName) {
        var record = new FooRecord(moduleName, recordId);
        if (DB[recordId]) {
            /* Again, if we had multiple fields, make sure you set them here.
             */
            record.value = DB[recordId];
            return record;
        }
        return undefined;
    },
    changeItemID: function(oldId, newId) {
        DB[newId] = DB[oldId];
        delete DB[oldId];
    },
    getAllIDs: function() {
        /* It's important that this return an Object (a Dict/Hash) */
        var recordIds = {};
        for (var key in DB) {
            /* Only return keys that are actually pointing to values.
             * DB is a JS object, so there can be all kinds of cruft in there.
             */
            if (key.indexOf('key:') === 0) {
                /* The value stored is arbitrary. Only the key name is important.
                 */
                recordIds[key] = true;
            }
        }
        return recordIds;
    },
    wipe: function() {
        /* Call through "this"; a "self: this" entry in the prototype
         * literal would not point at the store instance.
         */
        for (var i in this.getAllIDs()) {
            delete DB[i];
        }
    },
    /* These are meta function calls to normalize data sets
     * between this machine and some other.
     */
    create: function(record) {
        /* Read back the same "value" field that createRecord sets. */
        DB[record.id] = record.value;
    },
    update: function(record) {
        DB[record.id] = record.value;
    },
    remove: function(record) {
        delete DB[record.id];
    }
};

Again, very simple, but a few caveats in there. One thing that can be confusing is that "createRecord" is different than "create". "createRecord" is more of a "write" function, in that it's called when a record has changed and it needs to be written out to some other device. "create" is the opposite: that's where a new record needs to be imported onto the local machine. That's pretty much the role of Store.
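
Since the two directions are easy to mix up, here's a small stand-alone sketch of them. It assumes nothing about the real Store class: demoDB is a plain object standing in for DB, records are plain objects with an id and a value, and sketchStore is a made-up name.

```javascript
// Stand-ins only: a plain object for storage, plain objects for records.
var demoDB = { 'key:1': 'first', 'key:2': 'second', 'settings': 'not synced' };

var sketchStore = {
    // Outbound: local data -> record headed for another device.
    createRecord: function (recordId) {
        if (demoDB[recordId] === undefined) { return undefined; }
        return { id: recordId, value: demoDB[recordId] };
    },
    // Inbound: record from another device -> local data.
    create: function (record) {
        demoDB[record.id] = record.value;
    },
    // Only 'key:'-prefixed entries count as syncable records.
    getAllIDs: function () {
        var ids = {};
        for (var key in demoDB) {
            if (key.indexOf('key:') === 0) { ids[key] = true; }
        }
        return ids;
    }
};

var outbound = sketchStore.createRecord('key:1');        // going out
sketchStore.create({ id: 'key:3', value: 'third' });     // coming in
var allIDs = Object.keys(sketchStore.getAllIDs()).sort();
```

Note that 'settings' never shows up in allIDs. The prefix check is what keeps the non-record cruft out of the sync.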

Ok, we've got the transfer record and we've got the Storage/controller stuff. Now we need something to spot local changes. That's the role of the Tracker.
/* Module-level handle so the add-on can call track() directly. */
var trackerInstance;
function FooTracker(moduleName) {
    Tracker.call(this, moduleName);
    trackerInstance = this;
}
FooTracker.prototype = {
    __proto__: Tracker.prototype,
    track: function(recordId) {
        /* Add the record to the list of items that have changed.
         */
        this.addChangedID(recordId);
        /* Any dirty records need to be propagated as soon as possible,
         * thus the immediate "100".
         */
        this.score = 100;
    }
};

Really. That's about it. Basically, once data has changed on your side, you indicate to the Tracker "Hey, this changed!". You'll also want to set the score for how important it is to sync this data (0 == ignore this, 100 == OMFG THIS NEEDS TO BE DONE LIKE AN HOUR AGO!). You can set observers to help do this, or just call directly. i took the second option because i know exactly when i need to update and it's kind of important to do it after new record creation.
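
Here's a rough sketch of the "just call directly" approach. It doesn't touch the real Tracker. SketchTracker, shouldSyncNow and SKETCH_THRESHOLD are all hypothetical, and the threshold number is made up; the only point is that bumping the score is what tells whoever is watching that a sync is wanted.

```javascript
// Stand-in tracker. The threshold value is invented for this example.
var SKETCH_THRESHOLD = 100;

function SketchTracker() {
    this.changedIDs = {};
    this.score = 0;
}
SketchTracker.prototype = {
    // Remember which record changed, and when.
    addChangedID: function (id) {
        this.changedIDs[id] = Date.now();
    },
    // What the add-on calls right after it writes a record.
    track: function (id) {
        this.addChangedID(id);
        this.score = 100;
    }
};

// Hypothetical check a scheduler might make before kicking off a sync.
function shouldSyncNow(tracker) {
    return tracker.score >= SKETCH_THRESHOLD;
}

var sketchTracker = new SketchTracker();
var before = shouldSyncNow(sketchTracker);   // nothing has changed yet
sketchTracker.track('key:42');               // direct call after a new record
var after = shouldSyncNow(sketchTracker);
```

A lower score in track() would let changes pile up before anything gets sent, which is fine for data that isn't urgent.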

And finally, we're at the top level, the Engine:

function FooEngine() {
    // Defining "moduleName" here so that it's easy to figure out where
    // it comes from. This is case insensitive.
    var moduleName = "Foo";
    Weave.SyncEngine.call(this, moduleName);
    // turn the engine on
    this.enabled = true;
}
FooEngine.prototype = {
    __proto__: Weave.SyncEngine.prototype,
    _storeObj: FooStore,
    _recordObj: FooRecord,
    _trackerObj: FooTracker,
    version: 1
};

Hey, so that's where the Module name comes from! Module name is used for a number of things, including logging and tracking. Unfortunately, it doesn't show up on the list of active sync engines on the options panel, yet. It does mean that it has to be a fairly unique, yet human readable string, so "Foo" is probably sub-optimal. The other thing you'll need to do is make sure that the engine is enabled (this caught me hard).

You'll also need to define links to the objects you've built, and the version you're using.

And, you're done!

That's it. Granted, you'll want to test this, and debug it, and all the other nonsense, but from a code sense, that's all you'll really need.

By the way, since i mentioned debugging, i want to chime in on the built in tools for Aurora and Nightly. Sadly, there's still no step-debugger, but here's a stupid trick that is INCREDIBLY useful:

  1. go to about:config
  2. Open a Web Console [Ctrl+Shift+K]

You now have a web console that can run JavaScript with the Browser Chrome. That means that you can run things like Components.utils.import('resource://services-sync/util.js'); Utils.sha1('foo'); and get a pretty SHA1 hash of foo on your screen. And yeah, you can do the same trick with any other Mozilla core function you want to play with.

So, now you know, and knowing is half the battle.
The other half involves blue and red lasers and guys with stupid nicknames.


    :: The x500 DRINKing Game

    In a previous life, i worked for a company that was building an x500 MTA. For those not in a fetal position beneath their desk at the very mention of that phrase, an MTA is a Mail Transfer Agent. It's what's responsible for getting your complaint about the new Timeline to zuck@facebook.com. x500 was the "Improved" mail addressing format that didn't presume a user at a site, but instead qualified it by a list of things that described an individual, from least specific to most (e.g.
    Country: US
    Province: California
    City: Mountain View
    Company: Widgetco
    Floor: 3
    FamilyName: Doe
    SurName: John
    DNAFingerprint: ATTACAG...

    etc.)

    (i'll beg your forgiveness if the field names are not correct. The therapy helps blot that crap from my mind.)

    What's important here is that x500 allows fairly arbitrary fields to be inserted into a given record. The MTA only has to pay attention to the ones that it recognizes, so you could build one that only responds to, say, "Company, FamilyName" if you're a place with a dozen, unrelated employees. The rest of the spec is all optional. This led to all sorts of interesting abuses, including one of the things i was working on that involved storing a cryptographic key inside of the record that tied back to the authorizing provider, allowing for a verification path to the central root authority. That meant your headers were often far larger than the actual content of the message, and all of it had to be magically stuffed through a 300baud modem in a timely manner.

    Still, flexibility was a strong selling point for x500. So much so, that the example included the frivolous category of "FavoriteDrink", and provided a helpful collection of alcoholic concoctions for the various fictional individuals to imbibe. HaHa, what merriment shall be had!

    Only, it was in a spec, so that it became codified.

    Now, a few decades later, i fully expect that there are more than a few unit test cases for determining validity of records based on "drink", and some very confused intern wondering what the hell that field has to do with figuring out how to partition their machines. i have been that intern. It was not fun.

    This is why i tend to be a little cautious when folks talk about "loose data definitions" in applications, unless they REALLY MEAN IT. As in, they define methods for how data can be stored, but do not require specific elements to be defined. The application is therefore required to determine if a record can be used or not. That tends to shoot a lot of holes into people's designs, and i'm ok with that. You can't realistically expect anyone else to use something where you hand wave over something as important as "The Data You Use".

    Well, unless you're also providing all the alcohol that they'll need to deal with that stuff.

    :: Communicate With All The Things

    Not too long ago, i heard a story about how a guy was making organs that tweet. As in building a heart that can provide monitoring and status information online. Granted, folks tend to vent their spleens when on the internet, but this could be a great deal more literal.

    Bad jokes aside, when i heard the report, the first thing i thought of was "So, what happens when Twitter goes away?"

    i get the concept at hand here, and what they are talking about. "Tweeting" means posting short messages. It's what folks understand, like xeroxing a kleenex you hoovered up. Still, there are a growing number of services that provide that sort of messaging with little intent of being used by a wider audience. Why? Well, aside from the OAuth handshake, there's something innately appealing about the idea of being able to post info to a simple URL for yourself.

    That's definitely the idea behind the "Internet of Things" you occasionally hear about. A web of devices talking to each other (and you). People are thinking about using sites like Twitter and Facebook as the channels of communication, but i'm not really comfy with the idea of those companies knowing my energy consumption patterns and glucose levels. Or, for that matter, my credit and insurance company getting a hold of them without my approval.

    So that's kind of the idea behind the stuff Jeff and i are working on, Notifications. In short, it's a simple way for sites (or devices) to post short messages to a URL and have them get to you. What's more, you have control over those messages, and can silence or drop a site easily. Devices also have it pretty easy and can either post stuff just to a URL, or pass them through an encryption filter that prevents the carrier from being able to read the contents.

    Internally, i've been kinda selling Notifications as being "Twitter for Mozilla", but i'm not sure that's right. Really, it's "Send and forget" for stuff that needs to talk to other stuff. Even better, is that you can run your own server. Heck, you can even run your own client that blasts these messages to Facebook or Twitter as well. It's not tied to Firefox at all, and that's the beauty of it.

    It's simple, secure, and under your complete control. i think that's kind of cool, and hopefully you will as well.

    :: Talk to Your Parents About Privacy

    This evening Anne Marie got a call from a wireless phone from Boston. The gentleman on the other end of the line asked for "a Conlin", and started asking questions about the Conlin family. Anne Marie (clever girl) knew to hand me the phone.

    i greeted the gent and he asked if we had family from a certain city. i replied that i wasn't going to answer that question. He pressed and i replied "Sorry, but i don't give information like that over the phone."

    "Why not?" he asked, somewhat concerned. "Who'd want to know?"

    "Well, i could give you a few hundred reasons why i wouldn't give someone info about my family." i replied helpfully.

    "Look, i'm a lawyer from Boston…"

    "… and i'm a rocket scientist from Florida. i have no way to prove what you're saying as you have no way to prove what i'm saying." (Frankly, if this guy was a lawyer, he's a pretty bad one.)

    "i'm just trying to find my sister…" and then he gave me a rather long winded story about himself, his sister and looking for family. i listened politely, informed him that i could offer no help, and then offered a few additional things that might help him out.

    He yelled at me to calm down. i know i can speak quickly, but i was absolutely calm. Heck, i was in complete control of the situation and happy to assist in any manner that would not compromise myself or my family. There's lots of Conlins in Michigan, if he was estranged from his folks, chances are someone up there might have more history he could chase down.

    Eventually he hung up, and i did a bit of research on my own. i couldn't find any lawyer in current practice or retired in Boston that matched his name. The cellphone number he provided is less than a year old (some idiot posted it on Facebook about a year ago) and there's scant information that matched any other part of his story. (e.g. high school or college records that match his or his "sister's" name for the approximate years they might have attended. Hell, he even managed to miss on the name of the high school he was supposed to have attended.) For what it's worth, i'm chalking it up as a scammer.

    See, that's the thing about personal information. If my wife wasn't married to a paranoid info junkie like me, she'd probably happily give this guy all sorts of information about my family, that he could then use to commit all sorts of dastardly things.

    People's first action is to try and help. It's someone looking for family, meaning they might be a long lost great cousin. As the monkey said, "Why would anyone care who wasn't family?"

    All that aside, even if he was legit, trying to find his sister by cold calling people in California might explain why the family would have nothing more to do with him.

    :: #IPA Leaves a Bitter Taste

    First off, i Am Not A Lawyer, nor do i play one on TV. i am about as qualified to speak with any form of intelligence on the subject of the legal concept of Intellectual Property as most lawyers are to grasp the concepts of software design. In short: 't ain't my thing.

    Setting that aside, however, Twitter's recent announcement of the Innovators Patent Agreement kinda leaves me feeling odd. In many respects, they're "Doing It Right®". They're establishing a patent policy where folks creating the patents own the ideas, the patents can only be used defensively, the policy remains in effect even if the patents are later sold or transferred, and they're even hashing all this out on github.com. Considering the crapola state of software patents and how they're stifling innovation, it's a good step toward solving this sort of crap.

    But, then there's part of my brain that speaks up. You see, you can't patent things that are in the public domain. In fact, you can modify, enhance or transform a concept, but the patent only applies to that modification (the same is true if you take an existing patent). Even then, your idea has to be a substantial improvement over the previous idea, and not just painting it blue or something.

    So, why patent it at all?

    Patents exist for one reason. To prevent you from doing something. If i can patent something, i can do whatever for the period of the patent, but if you want to do it, i have to let you. Most times, i'll let you if you pay me enough, but that's not always the case. Sometimes i won't grant you the right regardless of offers of compensation just out of spite. Right now, companies are being bought and sold just for their portfolio, which is immediately used against companies that are doing the same thing. It's an arms race.

    If Twitter, or an engineer, were to release an idea into the public domain, it becomes prior art. That's why so many mobile devices have multi-touch (since the idea came out in 1985). Apple owns some additional aspects of how that interface is used within their UI, but that's about it.

    i'm honestly curious about this. Part of this strikes me as painting flowers on warheads. These are still patents. They still exist to prevent you from doing things. If i wanted to create a library that uses concepts contained in these patents, that library is still subject to the holder enforcing those patents and either requesting my library to be removed or filing damages. That's what patents do. Yeah, i could make a very nice looking oven out of that crate of bullets, but i wouldn't want to make a pizza in it. i'd still want to use ideas and methods that are in the public domain so i could avoid those issues. It's why Ogg, bzip and png exist.

    i'm just not sure what to make of this. It's a bit like declaring a Mutual Assured Destruction policy after Mad Max has been driving around for a few decades.

    If you want this stuff to be freely used, why not make it free?

    Three patent attorneys look at the IPA in depth, and ask some of the same questions. Good read to see what points they raise.

    As i noted to @municode, i'm not against the #IPA, i'm just skeptical. Patents have trained me to look deeper than just what someone says.

    Update2: Andy Baio writes his opinion of the #IPA. Also an illuminating read.
