Doug Crockford gave a talk at the University of Waterloo last night as part
of the Yahoo! Hack-U (University Hack Day), on the same topic as his new book,
JavaScript: The Good Parts. I was lucky enough to get a seat, and have tried to
condense my nine pages of notes into an overview of the highlights of his talk. Though I've
rewritten parts, all of the content and wisdom is Doug's -- I'm just the
humble scribe. Of course, any errors of transcription are mine.
Introduction
Believe it or not, JavaScript has good parts! It's an odd language, because
it contains some of the best and some of the worst ideas in programming
language design, and has managed to become both the most popular and most
reviled programming language out there.
Of all languages, JavaScript probably has the broadest range of skills
among its users. It appeals to both computer scientists and cut'n'paste
beginners with no clear idea of what they are doing. It's pretty much the only
language that people use without ever learning! That's both the cause of a lot
of the awful code out there, and an astonishing testament to the fact that
it's actually possible to do that.
JavaScript is not what makes in-browser programming awful -- it's the DOM.
Any language would be painful if you had to use it to interact with the DOM.
It's also what makes things slow and inefficient.
The history of the language is incredibly diverse: it's been influenced by
Scheme (lambdas, loose typing), Self (prototypal inheritance, dynamic
objects), Java (syntax), and Perl (regular expressions).
The Bad Parts
Global variables: This is bad for all the same reasons as
in other languages, but in addition, JavaScript will implicitly declare your
typo-ed identifiers, and silently carry on.
+ both adds and concatenates: You can get away with this
in Java because of the type restrictions, but in JavaScript you'll blithely
try to add a number to, say, a form input which looks like a number but is
actually a string.
Semicolon insertion: This seems like a nice,
beginner-friendly feature, at first. It's implemented by the parser running
along until it hits a syntax error, at which point it rewinds a little,
inserts a semicolon in a likely place, and tries again. This should scare
you.
typeof: What's the typeof an
object? 'object'. Of an array? 'object'. Of null? 'object'.
with and eval: Security
implications are bad. eval is probably the single most misused
feature. If you find that you want to use it, step away from the
computer.
Fake arrays: They're actually hash tables, which is OK if
that's what you want, and if that's what you call them.
== and != do type coercion. Unfortunately nobody can
figure out exactly when or how. For example, 0 == '0' is true,
but false == 'false' is false, and '' == '0' is
false, but 0 == '' is true. Thankfully, you can just always use
===.
false, null, undefined, NaN: These are all almost the
same, but not quite.
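A few of these can be checked directly in any JavaScript console. The snippet below is only an illustrative sketch: the variable names are mine rather than Doug's, formValue stands in for a hypothetical form input, and Array.isArray assumes an ECMAScript 5 engine.

```javascript
// + concatenates as soon as one operand is a string.
var formValue = '5';                      // hypothetical value read from a form input
var concatenated = 1 + formValue;         // the string '15', not the number 6
var added = 1 + Number(formValue);        // explicit conversion gives the number 6

// typeof reports 'object' for objects, arrays and null alike.
var reportedTypes = [typeof {}, typeof [], typeof null];
var reallyAnArray = Array.isArray([]);    // assumes ES5; true only for real arrays

// == coerces its operands; === never does.
var looseResults = [0 == '0', false == 'false', '' == '0', 0 == ''];
var strictResults = [0 === '0', 0 === ''];

// The "almost the same" values are not interchangeable.
var nullVsUndefined = [null == undefined, null === undefined, null == false];
var nanEqualsItself = (NaN === NaN);      // NaN is not equal to anything, itself included
```

With the comparisons in looseResults, only the first and last come out true, which matches the list above; every entry in strictResults is false.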
Bad Genetics
There is also a good deal of bad behaviour that's inherited and shared with
other languages: block-less statements (e.g. one-line if and so
on); expression statements (lone expressions on a line that will be evaluated
and discarded); IEEE floating point (0.1 + 0.2 != 0.3); ++ and -- (leads
developers into clever behaviour); and fall-through of switch blocks.
Doug had an amusing anecdote about an episode in the development of JSLint:
A user contacted him suggesting that fall-through of cases be flagged as bad
behaviour. Doug replied with an explanation of the elegance of nicely
structured, intentional fall-through, which convinced the user to retract his
feature request. In the user's response, in addition to withdrawing the
feature request, he reported another bug. When Doug investigated it, it turned
out to be... yup, unwanted switch fall-through. At that moment, Doug says,
"I was enlightened."
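Two of these inherited problems are easy to reproduce. The following is a rough sketch under my own assumptions (the describe function and its cases are hypothetical, not taken from the talk or from JSLint): it shows the floating point surprise and an accidental fall-through of the kind in the anecdote.

```javascript
// IEEE floating point: the sum is very close to 0.3, but not equal to it.
var floatSum = 0.1 + 0.2;
var exactlyEqual = (floatSum === 0.3);
var closeEnough = Math.abs(floatSum - 0.3) < 1e-9;   // compare with a tolerance instead

// Hypothetical function with a forgotten break in the first case.
function describe(n) {
    var words = [];
    switch (n) {
    case 1:
        words.push('one');
        // no break here, so execution falls through into case 2
    case 2:
        words.push('two');
        break;
    default:
        words.push('many');
    }
    return words;
}
```

Calling describe(1) collects both 'one' and 'two', while describe(2) collects only 'two'.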
Good Parts
JavaScript was the first really mainstream language with lambda and
first-class functions, which other languages are now adopting. This makes JS
an influential language!
Dynamic objects are simple containers that can grow or shrink, and since
they are based on prototypal inheritance, they aren't limited to just being
instances of a class. This is a strictly more powerful object model, but it
takes some getting used to for most people.
Loose typing is one of the controversial parts, which some people would
consider one of the bad things. However, Doug's conclusion is that the added
expressiveness and ease of use is well worth it, since the kind of bugs
avoided by strict typing are usually easy to fix anyway.
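To make these ideas a little more concrete, here is a minimal sketch of my own (none of these names come from the talk). It assumes an engine that provides Object.create, which is part of ECMAScript 5; on older engines a small helper would be needed to do the same job.

```javascript
// First-class functions: twice takes a function and returns a new one.
function twice(f) {
    return function (x) {
        return f(f(x));
    };
}
var addTwo = twice(function (x) { return x + 1; });

// Prototypal inheritance: cat inherits directly from another object,
// with no class involved.
var animal = {
    describe: function () {
        return this.name + ' says ' + this.sound;
    }
};
var cat = Object.create(animal);   // assumes ES5
cat.name = 'Cat';
cat.sound = 'meow';

// Dynamic objects: properties can be added and removed at any time.
cat.legs = 4;
delete cat.legs;
```

Here addTwo(3) gives 5, and cat.describe() finds describe on the prototype while reading name and sound from cat itself.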
Gotchas
Globals
Consider the code:
var names = ['zero', 'one', 'two', ...];
var digit_name = function (n) {
    return names[n];
};
Though it works, it makes use of the nasty, global, names, which
could lead to all kinds of nonsense. We could move the variable into the scope
of the function instead, but that would be rather inefficient. Instead, try
this:
var digit_name = (function () {
    var names = ['zero', 'one', 'two', ...];
    return function (n) {
        return names[n];
    };
})();
This is an example of one of the good parts: closures. We can define the
variable just once, and then have our function close over it, preserving
state. The trailing () at the very end causes the anonymous function to be
executed right away, binding the returned function to our variable. This is
awesome.
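For anyone who wants to try this in a console, here is a runnable variant. It is only a sketch: the three-entry list is a shortened stand-in of my own for the elided one above.

```javascript
var digit_name = (function () {
    // Created once, when the outer function runs; invisible outside it.
    var names = ['zero', 'one', 'two'];   // shortened stand-in list
    return function (n) {
        return names[n];
    };
})();

var second = digit_name(1);              // looks the name up in the enclosed array
var leaked = (typeof names !== 'undefined');   // checks whether names escaped into the global scope
```

The lookup works as before, and leaked stays false because names lives only inside the closure.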
Style Isn't Subjective
Brace positioning is more or less a holy war without any right
answer -- except in JavaScript, where same-line braces are
right and you should always use them. Here's why:
return
{
    ok: false
};

return {
    ok: true
};
What's the difference between these two snippets? Well, in the first one,
you silently get something completely different than what you wanted. The lone
return gets mangled by the semicolon insertion process (remember that from the
list of Bad Parts?), becomes return; and returns nothing. The rest
of the code becomes a plain old block statement, with ok: becoming
a label (of all things)! Having a label there might make
sense in C, where you can goto, but in JavaScript, it makes no
sense in this context. And what happens to false? It becomes one of
those expression statements mentioned in the Bad Parts: it gets evaluated and
completely ignored. Finally, the trailing semicolon -- what about that? Do we
at least get a syntax error there? Nope: empty statement, like in C.
Use same-line braces, folks.
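The difference is easy to confirm by wrapping each form in a function. This is a small sketch with hypothetical function names of my own:

```javascript
// The brace on its own line: semicolon insertion turns this into "return;".
function brokenStatus() {
    return
    {
        ok: false
    };
}

// The brace on the same line: the object literal is returned as intended.
function workingStatus() {
    return {
        ok: true
    };
}

var brokenResult = brokenStatus();
var workingResult = workingStatus();
```

brokenResult is undefined, with no error or warning from the language itself, while workingResult is the object with ok set to true.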
JSLint
JSLint defines a professional subset of JavaScript, and imposes programming
discipline. You should do everything it tells you, even if it hurts your
feelings. Doug says "JSLint is smarter about JavaScript than I am, and
probably smarter than you are too."
History and Future
AJAX and the resurgence in popularity of JavaScript could have happened way
earlier, but Netscape 4 and the other browsers of the time were so
awful. Netscape 4 was a crime against humanity. IE 6 was the
best browser in the world -- and think of just how bad it is.
However, all that may have been good for JavaScript: had anything happened,
it would have been thrown out and replaced with something much better!
JavaScript would have died with Netscape if not for Microsoft diligently
duplicating it, bugs and all.
Perhaps the very best part of JavaScript: stability! No new design errors
since 1999! Also, no new versions.
Thankfully, ECMAScript Fifth Edition is in the works (and is actually
readable), with nice features like support for object hardening and a strict
mode (invoked with "use strict";, which is an expression
statement under older versions).
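As a rough idea of what these two features look like in practice, here is a sketch that assumes an engine implementing the Fifth Edition. The function and variable names are hypothetical, and I am using Object.freeze as one example of object hardening.

```javascript
// Strict mode: a typo-ed identifier is an error instead of a new global.
function strictTypo() {
    "use strict";
    var total = 0;
    try {
        totl = total + 1;          // misspelled on purpose
    } catch (e) {
        return e.name;
    }
    return 'no error';
}

// Object hardening: a frozen object rejects changes to its properties.
var settings = Object.freeze({ retries: 3 });

function strictFrozenWrite() {
    "use strict";
    try {
        settings.retries = 10;     // not allowed on a frozen object
    } catch (e) {
        return e.name;
    }
    return 'no error';
}
```

In a conforming engine strictTypo() reports a ReferenceError and strictFrozenWrite() a TypeError, and settings.retries is still 3 afterwards.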
Unfortunately, we're still waiting on implementations. Microsoft will
likely have the first working version, but they won't ship it until whenever
IE 9 comes out. Mozilla seem to just be waiting to see what Microsoft does,
and they'll react to that. Apple can't comment on future products.
Google will just do whatever Apple does (UPDATE: Now that I am an intern at
Google, I should probably clarify that that's Doug Crockford's statement; I
know nothing about any Google plans on the matter).
The Really Good Parts
If you use JavaScript, you have a potential audience of billions. It's the
most widespread -- and despite the bugs, the most cross-platform -- system you
can use.
It is possible to write really good code. In fact, it is mandatory
if you want to maintain sanity.
If you avoid the bad parts, it works really well. It's not just usable and
pleasant; there is brilliance in it.
Misc. Q&A
At the Q&A afterwards, there were a few interesting gems:
- The people in charge of the language (ECMA), and the people in charge of
the DOM (the W3C), have never had a joint session or meeting, but he's
trying to change that.
- He thinks the DOM is awful, and HTML 5 is taking it in exactly the
wrong direction.
The Book
I'm going to wrap this up the same way he did: with a plug for his new
book, JavaScript: The Good Parts (Amazon.com, Amazon.ca).
If you do any JavaScript development, get a copy! It contains all of the above
wisdom, and much more.
Now excuse me; I'm off to do some JavaScript.
Makefiles are the granddaddy of build systems. Though falling out of favour relative to more modern systems like SCons and ant, make is still the lingua franca of software builds, particularly in the C and C++ parts of the open source world. Because of this, it is imperative to have at least a basic understanding of makefiles and their use.
There are plenty of tutorials introducing the fundamentals of makefile syntax, and a handful that show off some advanced features. There are very few, however, that actually show how to write a useful makefile, or that introduce makefile conventions and patterns. For me, this meant that writing makefiles became an arduous process of stringing together snippets from various places, and hoping they interoperated harmoniously. Frustratingly, I'd often learn of a new feature months later and rip out half of the file and replace it with a single line. Worst of all, I had no idea if what I was doing was conventional or even passable as a serious makefile.
I therefore want to put out this guide to basic makefile usage and conventions, and in the process, develop a basic makefile template that can be used for most small projects or as a starting point for more elaborate build systems. The resulting makefile will also roughly adhere to the GNU makefile conventions, but only where it makes sense for a small project and where support is not too onerous. For the purposes of the guide, we'll be writing a makefile for a C program, but the ideas are easily applicable to other languages. So if you'll oblige me by firing up your text editors, I'll get started.
Build Variables
At the top of our makefile, we will want to declare the variables used in the build process. Keeping everything in variables allows for easy modification of multiple build rules at once, as well as exporting variables from higher-level makefiles in the case of recursive builds (ones where this makefile is just building a particular component). For portability, we can start out by declaring our SHELL and compiler. These two variables are among Make's many special names, and are used implicitly in certain situations, so it is good practice to specify them. We can do so with the following snippet:
SHELL = /bin/sh
CC = gcc
Next we'll define variables for the actual compiler flags used for building. My personal system is to break these up into four parts: FLAGS, used for mandatory flags without which the project will not build, RELEASEFLAGS and DEBUGFLAGS, for public release and debugging flags, respectively, and CFLAGS for user-defined C compiler settings. This last one is a standard that some users like to define for themselves, and so it should always use that name. I use it for things that are not essential but that I would always like to have in place when building. For this makefile, I've defined these variables as follows:
FLAGS = -std=gnu99 -Iinclude
CFLAGS = -pedantic -Wall -Wextra -march=native -ggdb3
DEBUGFLAGS = -O0 -D _DEBUG
RELEASEFLAGS = -O2 -D NDEBUG -combine -fwhole-program
You can see above the usage of the different variables. Without the FLAGS settings (which specify to use the GNU variant of the C99 standard, and to look for #included files in the include directory, respectively) the hypothetical code would likely not compile correctly (we assume that this hypothetical code does in fact keep header files in that location, and does make use of C99 extensions). The debug and release flags, on the other hand, contain various optimization directives and declarations: the NDEBUG definition causes assert()s to be taken out (among other things); the -combine and -fwhole-program flags instruct GCC to assume that the files it is working on comprise the whole program, and to optimize accordingly (this only works for C at present); and the number after -O specifies the level of optimization to apply. Finally, CFLAGS holds user-optional choices, as promised. In this example, I have chosen to make the compiler very strict about errors and warnings (-pedantic, -Wall, -Wextra), instructed it to tune the output program for my specific machine architecture (-march=native), and finally asked for the inclusion of copious amounts of GDB-specific debugger information (-ggdb3). For maximum portability you should not assume GDB, but in practice it is fine for me.
Now let's define some variables to hold important files related to the build. We'll need the name of the program we're building, which I've called foomatic-widget, a list of source files, header files, common headers on which all files depend, and the object files that our sources will compile to. The application name and common headers we can just specify, but keeping track of all our source and header files could be a pain. I've therefore used a Make feature where we can call out to the shell, in this case to get a list of all files ending in .c and .h. Likewise, the list of object files is built by taking all the source files, and replacing their extensions with .o. This all looks like this:
TARGET = foomatic-widget
SOURCES = $(shell echo src/*.c)
COMMON = include/definitions.h include/debug.h
HEADERS = $(shell echo include/*.h)
OBJECTS = $(SOURCES:.c=.o)
Finally, we define some paths used for installing our program in a more permanent fashion. By convention, the DESTDIR variable is used, even though we don't declare it, as this allows the user to test installation to any directory by specifying a DESTDIR on the command-line. These variables are defined this way:
PREFIX = $(DESTDIR)/usr/local
BINDIR = $(PREFIX)/bin
Build Targets
Now we get on to the main business of Make: building things. Make uses the concept of targets to represent sets of instructions that you want it to run. The first target listed is the default one used if Make is invoked without specifying a target. Otherwise, you can run a different one with make targetname. By convention, the all target builds the project fully, and is the default. Targets can also have prerequisites: targets that will be processed prior to the current one, or files that, if changed, will cause the target to be rerun. If the target itself is a file, Make intelligently determines whether it needs to be rerun based on its prerequisites' times of last modification.
I typically just make all depend on the actual name of the executable, defined in $(TARGET) above. This ensures that the executable is built if you simply run Make. Optionally, you can also define other targets as prerequisites; for example, I often include a run of the indent or cppcheck utilities, depending on the nature of the project.
Now we have to let Make know how to build the $(TARGET) that all depends on. We do this by defining it as a new target, which depends on the $(OBJECTS) files of each component, as well as the common headers. This target, however, actually contains a rule on how to build it, which will be run when all the prerequisites have been satisfied. This rule is on a new line, and must be indented with a tab character. This is a common pitfall, though many editors will make sure that you don't accidentally use spaces here unless you really want to. The rule simply consists of a call to our compiler, defined above, with all the flags that we also defined, and a list of the object files to link. The first two rules then look like this:
all: $(TARGET)

$(TARGET): $(OBJECTS) $(COMMON)
	$(CC) $(FLAGS) $(CFLAGS) $(DEBUGFLAGS) -o $(TARGET) $(OBJECTS)
Now, you may have noticed that we're building with the debug settings. How then, do you produce something for day-to-day usage? Why, with another target, invoked with make release and looking like this:
release: $(SOURCES) $(HEADERS) $(COMMON)
	$(CC) $(FLAGS) $(CFLAGS) $(RELEASEFLAGS) -o $(TARGET) $(SOURCES)
You may also, later in the development cycle, wish to compile your program with profiling information. The way I've implemented this functionality is with another feature of Make, namely target-specific variable values. The first line below causes the CFLAGS variable to include a profiling option whenever the profile target is being built, and the second line makes that target depend on the application, which is then built with the new set of flags.
profile: CFLAGS += -pg
profile: $(TARGET)
Administrative Targets
We should also define some administrative targets, which will let us move files around or remove them as needed. A subset of the ones suggested by the GNU Makefile conventions are below:
install: release
	install -D $(TARGET) $(BINDIR)/$(TARGET)

install-strip: release
	install -D -s $(TARGET) $(BINDIR)/$(TARGET)

uninstall:
	-rm $(BINDIR)/$(TARGET)

clean:
	-rm -f $(OBJECTS)
	-rm -f gmon.out

distclean: clean
	-rm -f $(TARGET)
The install and install-strip targets provide us with a mechanism to put our final built binary in some appropriate path, as defined in the variables above, and using the standard install utility (the naming is a bit confusing: we have both an install Make target and a system utility). The latter option strips debugging symbols from the binary in the process. Both targets depend on the release target, so we can expect that to be built as per the process described above. Uninstall provides the reverse functionality.
The two cleaning-related options are also standard; they differ only in that distclean restores the directory to the pristine state it would be distributed in, i.e. the compiled binary is also removed. The commands in these targets are preceded by a minus sign, telling Make to continue even if the command yields an error (like if the files don't exist).
With these targets in place, we should also take a moment to consider what would happen if we were to actually create a file named, for example, release or install. Make would start deciding whether to run these targets based on the freshness of those files -- clearly not the behaviour we want. We can work around this by defining these targets as PHONY, which tells Make to always execute them (solving our problem) and to not bother searching for prerequisites (slightly improving performance). We do this as follows:
.PHONY : all profile release \
install install-strip uninstall clean distclean
Objects
Our application target above depends on a whole bunch of object files. We could list them all individually, or we could allow Make to build them implicitly (it's pretty smart and can mostly figure it out), but we can do even better. We can define a wildcard rule that will match all object files, and build them just the way we want. We could also define one or two object files individually, if they were special cases for some reason.
This wildcard rule makes use of a few special variables. The first one you'll see is %.o. That is the actual wildcard that matches object files. We can use a similar syntax to make it depend on the right source file as a prerequisite. We also need to know about the $@ and $< variables, which refer to the current target and the first prerequisite, respectively. The rule can then be built like this:
%.o: %.c $(HEADERS) $(COMMON)
	$(CC) $(FLAGS) $(CFLAGS) $(DEBUGFLAGS) -c -o $@ $<
You may have noticed that the above rule has all header files as a prerequisite. This is to be on the safe side, in case other parts of the program that are relevant to that file were changed. Depending on the size of your project, that may represent a significant amount of time wasted needlessly. If you're not averse to some really gruesome syntax, and want to rectify the problem, and if you're using GNU Make only, you can do better.
Using a feature of GNU Make known as second expansion, you can dynamically determine the specific headers to care about by calling out to GCC with the -MM option, which makes it list the headers included by a particular file. Second expansion allows us to evaluate variables a second time, later on in their lifecycle, where the surrounding context may have changed. For details on the deep magic going on here, consult the actual manual, but you should be able to get a rough idea of what's going on from the following implementation:
.SECONDEXPANSION:

$(foreach OBJ,$(OBJECTS),$(eval $(OBJ)_DEPS = $(shell gcc -MM $(OBJ:.o=.c) | sed s/.*://)))
%.o: %.c $$($$@_DEPS)
	$(CC) $(FLAGS) $(CFLAGS) $(DEBUGFLAGS) -c -o $@ $<
The final product
Our shiny new makefile is reproduced below in its entirety:
SHELL = /bin/sh
CC    = gcc

FLAGS        = -std=gnu99 -Iinclude
CFLAGS       = -pedantic -Wall -Wextra -march=native -ggdb3
DEBUGFLAGS   = -O0 -D _DEBUG
RELEASEFLAGS = -O2 -D NDEBUG -combine -fwhole-program

TARGET  = foomatic-widget
SOURCES = $(shell echo src/*.c)
COMMON  = include/definitions.h include/debug.h
HEADERS = $(shell echo include/*.h)
OBJECTS = $(SOURCES:.c=.o)

PREFIX = $(DESTDIR)/usr/local
BINDIR = $(PREFIX)/bin

all: $(TARGET)

$(TARGET): $(OBJECTS) $(COMMON)
	$(CC) $(FLAGS) $(CFLAGS) $(DEBUGFLAGS) -o $(TARGET) $(OBJECTS)

release: $(SOURCES) $(HEADERS) $(COMMON)
	$(CC) $(FLAGS) $(CFLAGS) $(RELEASEFLAGS) -o $(TARGET) $(SOURCES)

profile: CFLAGS += -pg
profile: $(TARGET)

install: release
	install -D $(TARGET) $(BINDIR)/$(TARGET)

install-strip: release
	install -D -s $(TARGET) $(BINDIR)/$(TARGET)

uninstall:
	-rm $(BINDIR)/$(TARGET)

clean:
	-rm -f $(OBJECTS)
	-rm -f gmon.out

distclean: clean
	-rm -f $(TARGET)

.SECONDEXPANSION:

$(foreach OBJ,$(OBJECTS),$(eval $(OBJ)_DEPS = $(shell gcc -MM $(OBJ:.o=.c) | sed s/.*://)))
%.o: %.c $$($$@_DEPS)
	$(CC) $(FLAGS) $(CFLAGS) $(DEBUGFLAGS) -c -o $@ $<

# %.o: %.c $(HEADERS) $(COMMON)
#	$(CC) $(FLAGS) $(CFLAGS) $(DEBUGFLAGS) -c -o $@ $<

.PHONY : all profile release \
	install install-strip uninstall clean distclean
For more detailed documentation, consult the GNU Make Manual or the GNU Makefile Conventions document.
A few days ago there were reports that Korea, already a leader in
telecommunications infrastructure, would be pursuing plans to provide 1 Gbps
Internet connectivity across the country by 2012. An excerpt from the Slashdot
summary:
The entire country is gearing up to have 1 Gbps service by
2012, or at least that is what the Korea Communications Commission (KCC) is
claiming. 'Currently, Koreans can get speeds up to 100 Mbps, which is still
nearly double the speed of Charter's new 60 Mbps service. The new plan by the
KCC will cost 34.1 trillion won ($24.6 billion USD) over the next five years. The
central government will put up 1.3 trillion won, with the remainder coming
from private telecom operators.'
Now, whenever facts like this are mentioned, people ask why we in Canada
and the US are stuck with paltry two to ten Mbps connections that also suffer
from ISP bandwidth throttling and traffic shaping policies. Usually at least one
response points out that the US and Canada are vastly larger countries, and it
is therefore not economically feasible to cover the entire country in high-speed
fibre-optic links. An unusually mild example is this comment to the Slashdot story:
Korea is roughly 1/100th the size of the US. If we estimate a
similar plan in the US based on size only, it would cost $2.46 trillion USD.
The Korean government is paying 1.3 trillion of the 34.1 total (or roughly
4%). If the US government did something similar, it would be about $100
billion USD.
Population, not area

Although the above argument is technically correct, it confuses coverage of
landmass with coverage of people. The fact is, there is no
need to provide high speed internet to vast tracts of US and Canadian
wilderness, or even rural regions. There are inhabited areas in both
countries that have no broadband connectivity whatsoever, and likely more than
a few villages that lack even dial-up. The point of expanding the capabilities
of North American Internet infrastructure is not to provide
everywhere with high-speed connections, but to provide them to as
many people as possible. Focussing on the densely populated metropolitan
centres of both countries reveals what a specious argument comparing areas
is.
First, some background statistics to frame the discussion: The area of
South Korea is almost exactly 100,000 square km. The US and Canada cover
approximately 9,826,600 and 9,984,700 square km, respectively. The estimated
population of the US is a shade under 306 million, while Canada is home to 33
and a half million souls. The GDP of Korea is just under one trillion US$; the
US's a bit more than 14 trillion, and Canada's is almost exactly one tenth of
that, at 1.4 trillion.
If the US government and telecoms would invest in providing a similar level
of coverage to just the five most populated cities and surrounding areas (New
York, Los Angeles, Chicago, Dallas-Fort Worth, Philadelphia), it would
represent an area of 85,966 square kilometres (so, well under the area of
Korea), and would provide coverage to 53,189,247 people. Furthermore, there
are a number of areas that I suspect state governments and even local
corporations would be willing to help finance the buildout; San Diego, Irvine,
and San Francisco come to mind, as do Washington D.C. and Seattle. On top of
that, if we use GDP as a very rough measure of the relative investment
potential of the two nations, it seems clear that the US should be able to
afford an investment around 15 times as large in the first place. Adding up
all these factors, it's clear that the US could easily afford to extend
coverage well beyond those five areas, and provide coverage to many millions
more, as well as most of the country's technology hubs.
In Canada, the situation is even more extreme. The top five metropolitan
areas (Toronto, Montreal, Vancouver, Ottawa, and Calgary), cover just 24,687
square kilometres, and contain just over 13 million of Canada's 33 and a half
million inhabitants. In other words, almost 40% of the population in less than
a quarter of South Korea's area. Extending coverage to the top ten
municipalities would likely produce quickly diminishing returns, but would
probably still encompass less territory than the Korean plan, while providing
coverage to over half the population. Given that Canada's GDP is roughly 1.5
times that of South Korea, the proportional size of the investment would be
even smaller.
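The comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this post; the variable names are my own, the national populations are the rounded values given above, and the Canadian metropolitan population is rounded down to 13 million.

```javascript
var KOREA_AREA_KM2 = 100000;   // "almost exactly 100,000 square km"

var regions = [
    { name: 'US top five metro areas',     areaKm2: 85966, covered: 53189247, national: 306e6 },
    { name: 'Canada top five metro areas', areaKm2: 24687, covered: 13e6,     national: 33.5e6 }
];

// For each region: its area relative to South Korea, the share of the
// national population it covers, and the resulting people per square km.
function summarize(region) {
    return {
        name: region.name,
        areaVsKorea: region.areaKm2 / KOREA_AREA_KM2,
        populationShare: region.covered / region.national,
        peoplePerKm2: region.covered / region.areaKm2
    };
}

var summaries = regions.map(summarize);
```

The US regions come out at roughly 86% of South Korea's area and about 17% of the US population; the Canadian ones at just under a quarter of the area and almost 39% of the population.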
No need to go overboard
Now, 1 Gbps may be an investment in the future, but in this context one
must certainly mean the distant future; for the fact is that 1 Gbps is not
just extremely fast, it is gratuitously fast. To put it in
perspective, a network connection of that speed would be able to
simultaneously carry between 50 and 200 HDTV channels (depending on quality
and compression). An investment in Canada or the US to provide connectivity at
100 Mbps (the current Korean high-end class of connectivity) would require a
much lower cost, while still providing connections 10 to 50 times faster than
the current residential standard of 2 to 10 Mbps. I'd settle for that. So why
doesn't it happen?
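For what it's worth, the ratios quoted above work out as follows. This is just the arithmetic from the paragraph above written as a short sketch, with variable names of my own:

```javascript
var GIGABIT_MBPS = 1000;          // 1 Gbps expressed in Mbps
var KOREAN_HIGH_END_MBPS = 100;   // current Korean high-end service

// Bandwidth left per HDTV channel if 50 to 200 channels share 1 Gbps.
var mbpsPerChannel = {
    at200Channels: GIGABIT_MBPS / 200,
    at50Channels: GIGABIT_MBPS / 50
};

// Speed-up of 100 Mbps over today's 2 to 10 Mbps residential connections.
var speedup = {
    overTenMbps: KOREAN_HIGH_END_MBPS / 10,
    overTwoMbps: KOREAN_HIGH_END_MBPS / 2
};
```

That is 5 to 20 Mbps per channel, and a 10- to 50-fold improvement for the 100 Mbps option.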