On Javascript Email Obfuscation
These techniques all use Javascript to first synthesize the mailto: link and then dynamically insert it into the document. They all assume that address extractors cannot evaluate Javascript. I suspect that is currently true, but it will surely change. Someday, some spamware author will grab the HTML rendering engine from Mozilla and integrate it into an address extractor. I think Javascript encoding offers acceptable protection in the near term, but it will eventually fail.
I don't have a problem with that. The arms race metaphor is apt. A method that defends against current modes of attack is valuable. It offers protection for some amount of time, and drives up the cost and difficulty of spam.
My problem with these techniques is that they demand fully functional, enabled Javascript, and they do not degrade gracefully in environments where it is missing or turned off. I'd like, for instance, to be sure that blind people using a reading machine can access my web page completely. If I used one of these techniques, I'm not sure they could.
I think these solutions all take the wrong approach. They all try to make the information inaccessible without Javascript. I think a better approach is to present the information in some obfuscated fashion that is easy for people to understand but hard for address extractors to process, and then offer a Javascript mechanism to de-obfuscate the address.
Here is what I mean. Imagine a web page with a link such as:
<a href="mailto:spamkiller_chip_at_remove-this_dot_unicom_dot_com_dot_nowhere">mail me</a>
In fact, here it is: mail me
Go ahead and click on it (if your browser supports mailto: links). What you see in your window is ugly, but you could probably figure it out. An address extractor, however, probably couldn't. Sure, it would be easy enough to encode rules to process the "dot" and "at" keywords, but how could the address extractor distinguish between the "noise words" (such as spamkiller) and the "signal words" (such as chip)?
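To make the point concrete, here is a minimal sketch of what such an extractor rule might look like. The function name naiveExtract is hypothetical and is not taken from any real spamware; it assumes the extractor knows only about the "at" and "dot" keywords.

```javascript
// Hypothetical extractor rule: it understands the "at" and "dot"
// keywords but has no idea which of the other words are noise.
function naiveExtract(obfuscated) {
  return obfuscated
    .replace(/_at_/g, "@")   // "_at_"  becomes "@"
    .replace(/_dot_/g, "."); // "_dot_" becomes "."
}

const harvested = naiveExtract(
  "spamkiller_chip_at_remove-this_dot_unicom_dot_com_dot_nowhere"
);
console.log(harvested);
```

The result still carries spamkiller, remove-this and nowhere, so it looks like an address but presumably would not deliver mail to me.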
Now, let's take this one step further. Imagine a Javascript procedure that would scan the loaded document, locate the links, de-obfuscate them and re-write the document with clear addresses. I can program the list of "noise words" into the script, so it can do what an automatic extractor cannot. When somebody clicks on the link, it opens a window to mail chip [at] unicom [dot] com, with all the mumbledygook removed.
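A rough sketch of how such a procedure might work follows. It is only an illustration under stated assumptions, not a finished tool: the names NOISE_WORDS, deobfuscateAddress and deobfuscateLinks are hypothetical, and it assumes the words in an address are joined by underscores, as in the example link above.

```javascript
// Hypothetical noise-word list; the page author picks these words.
const NOISE_WORDS = ["spamkiller", "remove-this", "nowhere"];

// Turn "spamkiller_chip_at_remove-this_dot_unicom_dot_com_dot_nowhere"
// into a clean address by dropping noise words and decoding keywords.
function deobfuscateAddress(obfuscated, noiseWords = NOISE_WORDS) {
  const noise = new Set(noiseWords.map((w) => w.toLowerCase()));
  return obfuscated
    .split("_")
    .filter((token) => token !== "" && !noise.has(token.toLowerCase()))
    .map((token) => {
      if (token === "at") return "@";
      if (token === "dot") return ".";
      return token;
    })
    .join("")
    .replace(/\.{2,}/g, ".")         // collapse runs of dots left by removed words
    .replace(/@\./g, "@")            // drop a dot stranded right after the "@"
    .replace(/\.@/g, "@")            // drop a dot stranded right before the "@"
    .replace(/^[.@]+|[.@]+$/g, "");  // trim stray separators at either end
}

// Rewrite every obfuscated mailto: link under the given root node.
function deobfuscateLinks(root, noiseWords = NOISE_WORDS) {
  for (const link of root.querySelectorAll('a[href^="mailto:"]')) {
    const address = link.getAttribute("href").slice("mailto:".length);
    if (address.includes("@")) continue; // already a clear address; leave it alone
    link.setAttribute("href", "mailto:" + deobfuscateAddress(address, noiseWords));
  }
}

// In a browser, run once the document has loaded. Without Javascript the
// obfuscated link stays in place, ugly but still readable by a person.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => deobfuscateLinks(document));
}
```

The design choice here is that the script only improves a link that already works for a human reader, so a visitor without Javascript loses convenience but not the information.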
In a forthcoming article, I'll post a Javascript tool that does exactly this.