
A lot of people are aware of the JSLint JavaScript syntax checker, and if they're not, they should be! JSLint, created by Douglas Crockford, is a wonderful utility that can save you a lot of grief when you're writing JavaScript. Why? Well, because JavaScript is designed to be run behind the scenes by your web browser. In order to keep the experience as seamless as possible for the end user, the default method of error reporting is... nothing. This makes a lot of sense, actually, because JavaScript needs to be given all sorts of leeway so that it can degrade gracefully and keep the user from becoming confused. Hence, in order to see your JavaScript errors, you need to install something like Firebug, which integrates with the browser itself.
Firebug is nice enough, but it's geared more towards step-through debugging and data inspection, and its support for JavaScript syntax checking is somewhat rudimentary. Besides, it really helps to catch these sorts of things before they get to the browser. The trick is, of course, that JavaScript is "flexible", to say the least, and some of these errors can be quite pernicious and hard to catch. For instance, this causes problems for some browsers:
var myPerson = {
    name: "Andy Walker",
    init: function() {
        alert('YAWN!');
    },
    Height: '180cm',
    Weight: '99.5kg',
};
Whereas this is perfectly fine:
var myPerson = {
    name: "Andy Walker",
    init: function() {
        alert('YAWN!');
    },
    Height: '180cm',
    Weight: '99.5kg'
};
The problem? That final comma in the object literal. Now, try finding that in 1500 lines of code with nothing to go on.
And then there's for/in which, while a very useful tool, is fraught with danger. Here's an example about two seconds of searching turned up online:
// function to initialise the login stuff
function initLogin () {
    var login = ["login-username", "login-password"];
    for ( var i in login ) {
        alert(login[i]);
    }
}
This code, the poor programmer complained, was outputting the first two items "just fine", but then it suddenly "starts going mad and alerts a whole load of random functions". Oh noes!
Ignoring for a moment that debugging with alert() like this is a terrible idea, what could have been happening? Well, it turns out that another programmer had been including a framework in each page which, in turn, was adding inherited properties he wasn't aware of, and for/in walks inherited properties as well as the array's own. Easy enough to figure out if you know what you're doing and IF you have a visual cue like this that tells you you're getting more than you expected! [for more information on for/in check out this page].
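To make that concrete, here's a minimal sketch of the failure and two ways around it. The frameworkHelper method is a made-up stand-in for whatever the framework was really adding; I don't know what it actually was.

```javascript
// Hypothetical stand-in for what a framework might do behind your back:
// add an enumerable method to every array via the prototype.
Array.prototype.frameworkHelper = function () {
    return "added by a framework";
};

var login = ["login-username", "login-password"];

// Unguarded for/in: visits the two indexes AND the inherited method name.
var unguarded = [];
for (var i in login) {
    unguarded.push(i);
}

// Guarded for/in: hasOwnProperty skips anything inherited from the prototype.
var guarded = [];
for (var j in login) {
    if (Object.prototype.hasOwnProperty.call(login, j)) {
        guarded.push(j);
    }
}

// A plain counting loop never looks at the prototype chain at all.
var counted = [];
for (var k = 0; k < login.length; k += 1) {
    counted.push(login[k]);
}

// Undo the pollution so the rest of the program is unaffected.
delete Array.prototype.frameworkHelper;
```

Here unguarded ends up holding "0", "1" and "frameworkHelper", while guarded holds only "0" and "1". For arrays, the counting loop is usually the simplest choice.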
Gosh, wouldn't it be nice to spot some of this stuff ahead of time? By running JSLint on all of our JavaScript files as part of our development process, we can find these nasty bugs before they start! A flexible way to do this would be to use a commandline utility that can take a file as input and then output any problems it finds. Unfortunately, JSLint is itself written in JavaScript, which makes it kind of difficult to leverage from the commandline.
There are a few different ways to run JavaScript at the commandline, each of which has its own little quirks, especially since JSLint needs to somehow access the file it's going to parse, and JavaScript wasn't really designed to be a console scripting language.
So, what are our options? Well, it turns out there are four of them, each with its own problems:
Rhino is, like the site says, "an open-source implementation of JavaScript written entirely in Java." Mr. Crockford is aware of this option, and has thoughtfully provided a version of JSLint that can be run through Rhino directly from the commandline. Initially, this seems like a pretty good solution: because Rhino is written in Java, it's portable. Unfortunately, this is also a drawback. As dom (who hates software) has noted, it's excessive to fire up the JVM each time we want to scan a JavaScript file, and Rhino isn't really intended to run quick one-off scripts like this. It's more for embedding the power and flexibility of JavaScript into Java applications. Imagine extending this solution to a case where you want to scan through 50-100 or more files as part of a nightly smoke-test or something, and it starts to look less and less promising.
WSH, or the Windows Script Host, is actually a pretty spiffy utility that lets you run a number of different scripting languages from the commandline and can help you automate various tasks on your Windows system. Unfortunately, this leaves Mac and *nix users out in the cold, and I don't think Microsoft is going to port it any time soon (though there is a version of JSLint for WSH).
NJS, according to its SourceForge site, is "an independent implementation of the JavaScript language developed by Netscape and standardized by ECMA." Hey, neat! This sounds like a great option! Let's just click on the site and downlo... oh. Shit. Looks like the site is down. Looks like it's been down for a long time. Looks like this isn't an option after all.
Spidermonkey, according to the terse little blurb on its site, is "the code-name for the Mozilla's C implementation of JavaScript." Well, this seems promising. This is the engine they use in the browser, after all, so it has to be as fast as possible. So... that's nice. And it looks like there are versions for Linux, BSD, and even Windows! (Sorta, but you could always build your own binary or use WSH.)
So, okay, it's fast and portable, what's the catch? Well, the catch is that Spidermonkey doesn't have any kind of file reading enabled by default. The Rhino version of JSLint leverages Rhino's readFile() extension to JavaScript in order to read the file into a variable, but Spidermonkey's corresponding File type isn't enabled in most distributions (certainly not the standard Linux one) because it can potentially cause security problems and isn't well tested. So, Spidermonkey works, in theory, but it has no facility to read files from the disk, so it seems useless to us. Is there anything that can be done?
Well, there have been a few attempts to get around this, but they involve weirdness in passing the file into another javascript script and they also require you to parse html output and, in general, can be frustrating, so I thought I'd try and find a better way.
Well, it turns out that, while Spidermonkey does not support the File object, it DOES support reading in from stdin with readline(), which is a start. The problem with this initially is that readline() is intended for interactive input, meaning prompted and one line at a time. So, while this provides us with a method to get something into the interpreter, there's no easy way to tell when the input is complete or to simply say "read this entire file X into this variable Y". The first way I found to get around this is to use a simple loop like this:
var input="";
var line="";
while (line=readline()){
    input += line;
    input += "\n";
}
print (input);
Which works... sort of. Sure, I have to add a newline to replace the newline that's chopped off by the shell interaction, but it reads the file in. Just not all the way. The problem is that the test (line=readline()) fails when readline() returns nothing, which happens the first time a blank line comes out of the file, so when I run:
# js testread.js < myfile
or
# cat myfile | js testread.js
I only get the first few lines of my file, up until the first blank line.
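If you want to see the failure without a Spidermonkey shell handy, here's a rough simulation. makeReadline is a hypothetical helper I'm introducing purely for illustration; it assumes readline() hands back an empty string both for a blank line and once the input has run out, which is the behaviour described in this post.

```javascript
// Hypothetical helper: builds a readline() look-alike that returns one line
// per call and then empty strings forever (an assumption, see above).
function makeReadline(lines) {
    var next = 0;
    return function () {
        var line = next < lines.length ? lines[next] : "";
        next += 1;
        return line;
    };
}

// The naive loop, with readline passed in so it can run anywhere.
function readNaively(readline) {
    var input = "";
    var line = "";
    while ((line = readline())) {   // "" is falsy, so a blank line ends the loop
        input += line + "\n";
    }
    return input;
}

var fileLines = ["var a = 1;", "", "var b = 2;"];
var result = readNaively(makeReadline(fileLines));
// result is "var a = 1;\n": everything after the blank line is lost.
```

The culprit is simply that an empty string is falsy in JavaScript, so the loop condition can't tell "blank line" from "no more input".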
To solve this I had to figure out some way to read the blank lines in the file and still terminate when the file was done. The way I figured it out is actually really simple: I just wait until readline() returns nothing 10 times in a row:
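var input = "";
var line = "";
var blankcount = 0;   // consecutive empty reads seen so far
while (blankcount < 10) {
    line = readline();
    if (line == "")
        blankcount++;
    else
        blankcount = 0;
    if (line == "END") break;   // manual end marker for interactive use
    input += line;
    input += "\n";
}
// Trim the newlines that were added for the trailing run of empty reads.
input = input.substring(0, input.length - blankcount);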
The number 10 itself is rather arbitrary and it could be anything, really, but I wanted a number that was high enough that it probably wouldn't occur in any kind of valid context, yet small enough so that it would be almost instantaneous to parse. I could probably even have gotten away with 5, but I could conceivably see some over-zealous newbie separating blocks of code with this number of newlines, so 10 seemed like a good number.
This plays off of the fact that readline() will just continuously return nothing when it's reading from an empty pipe (like if you pipe to it or redirect to it as above). I added in the if (line=="END") break; so that you could conceivably run it directly from the commandline and then type Javascript into it or paste to it and then tell it that you were done directly, rather than hitting enter 10 times.
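Here's the same idea as a small, testable sketch, with the blank-line limit pulled out as a parameter so the trade-off is easy to poke at. readAll and makeReadline are names I've made up for this illustration, and the fake readline() assumes empty strings come back forever once the input is exhausted.

```javascript
// Hypothetical stand-in for the shell's readline(): one line per call,
// then empty strings forever (an assumption about the shell's behaviour).
function makeReadline(lines) {
    var next = 0;
    return function () {
        var line = next < lines.length ? lines[next] : "";
        next += 1;
        return line;
    };
}

// Read until maxBlanks empty reads in a row, or until an "END" line.
function readAll(readline, maxBlanks) {
    var input = "";
    var blankcount = 0;
    var line;
    while (blankcount < maxBlanks) {
        line = readline();
        if (line === "") {
            blankcount += 1;
        } else {
            blankcount = 0;
        }
        if (line === "END") {
            break;
        }
        input += line + "\n";
    }
    // Drop the newlines contributed by the trailing run of empty reads.
    return input.substring(0, input.length - blankcount);
}

// Blank lines inside the file survive; the trailing run is trimmed.
var piped = readAll(makeReadline(["var a = 1;", "", "", "var b = 2;"]), 10);
// "END" stops the read immediately; later lines are never consumed.
var typed = readAll(makeReadline(["var c = 3;", "END", "ignored"]), 10);
// With too low a limit, a run of blank lines is mistaken for end of input.
var cutShort = readAll(makeReadline(["a", "", "", "b"]), 2);
```

The last call shows why the limit matters: with a limit of 2, the file is cut off after "a" and the "b" line is never read, which is exactly the risk that picking a generous number like 10 is meant to avoid.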
A nifty side effect of this is that you can use it in a couple of different ways directly from the commandline. Also, Spidermonkey (unlike Rhino) can be used as a script interpreter, so you can just take the modified JSLint and slap #!/usr/bin/js on the beginning, make it executable, and run it or pipe things to it directly, just like any other script.
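For a rough idea of what the tail end of such a script might look like, here's a sketch of the reporting step. It assumes the lint function behaves like the JSLINT of that era (returns true for clean source, otherwise fills an errors array with line, character and reason fields); treat that shape as an assumption and check it against your copy of JSLint. lintReport and fakeLint are hypothetical names, and fakeLint is only a stub so the sketch runs on its own.

```javascript
// Sketch only: `lint` is assumed to return true when the source is clean
// and otherwise to leave a list of problems in lint.errors.
function lintReport(lint, source) {
    var lines = [];
    var i, e;
    if (lint(source)) {
        lines.push("jslint: No problems found.");
        return lines;
    }
    for (i = 0; i < lint.errors.length; i += 1) {
        e = lint.errors[i];
        if (e) {   // skip empty slots, e.g. if the scan stopped early (assumption)
            lines.push("Lint at line " + e.line + " character " +
                e.character + ": " + e.reason);
        }
    }
    return lines;
}

// Hypothetical stub standing in for JSLINT; it only spots a comma
// directly before a closing brace on the next line.
function fakeLint(source) {
    fakeLint.errors = [];
    if (source.indexOf(",\n}") !== -1) {
        fakeLint.errors.push({ line: 1, character: 1, reason: "Extra comma." });
        fakeLint.errors.push(null);
    }
    return fakeLint.errors.length === 0;
}

var clean = lintReport(fakeLint, "var a = {\nb: 1\n};");
var dirty = lintReport(fakeLint, "var a = {\nb: 1,\n};");
```

In the real script you would put the #!/usr/bin/js line first, read stdin into input with the loop shown earlier, pass the real JSLINT function instead of the stub, and print() each line that comes back.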
Turns out, Spidermonkey isn't just faster than Rhino, it's a lot faster (at least to start up). Let's try it on a 556-line javascript file I have:
$ time rhino ~/bin/jslint-rhino.js MyFile.js
jslint: No problems found in MyFile.js

real 0m2.143s
user 0m5.292s
sys 0m0.140s
$ time cat MyFile.js | ~/bin/jslint
jslint: No problems found.

real 0m0.511s
user 0m0.500s
sys 0m0.016s
$
Which is pretty impressive, but let's simulate it with 50 files, as if we were scanning an entire tree as part of a build acceptance test or something:
$ time for i in `seq 1 50` ; do rhino ~/bin/jslint-rhino.js MyFile.js ; done
jslint: No problems found in MyFile.js
jslint: No problems found in MyFile.js
jslint: No problems found in MyFile.js
. . .
jslint: No problems found in MyFile.js

real 1m50.899s
user 4m54.166s
sys 0m6.184s
$ time for i in `seq 1 50` ; do cat MyFile.js | ~/bin/jslint ; done
jslint: No problems found.
jslint: No problems found.
jslint: No problems found.
. . .
jslint: No problems found.

real 0m25.727s
user 0m25.330s
sys 0m0.456s
$
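Gah! More than four times as long in wall-clock time (about 111 seconds versus 26), and over eleven times the CPU time (about 294 seconds of user time versus 25)!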
Of course, benchmarking is evil and unrealistic, but when the spread is this big, there's usually something to it.
So, that's enough of my little adventure. Here's the modified version of JSLint that I came up with (which is really just a small hack, but it works and it's faster than the accepted alternative and doesn't require you to install any bulky frameworks to get your job done).
It could certainly use some polishing and tweaking to make it behave more like a good console app should. My next step is to wrap it in a simple Perl script that maps the different options to commandline arguments and passes them into the script. This will enable more fine-tuned control over what JSLint checks for, as well as making it behave more like a modern console app and less like a pager. Expect that soon, but if you beat me to it, feel free to email!
Enjoy!