On Javascript Email Obfuscation


These techniques all use Javascript to first synthesize the mailto: link and then dynamically insert it into the document. They all assume that address extractors cannot evaluate Javascript. My suspicion is that this is currently true, but it will surely change. Someday, some spamware author will grab the HTML rendering engine from Mozilla and integrate it into an address extractor. I think Javascript encoding offers acceptable protection in the near term, but it will eventually fail.

I don't have a problem with that. The arms race metaphor is apt. A method that defends against current modes of attack is valuable. It offers protection for some amount of time, and drives up the cost and difficulty of spam.

My problem with these techniques is that they demand fully functional and enabled Javascript, and do not degrade gracefully in environments where Javascript is missing or turned off. I'd like, for instance, to be sure that blind people using a reading machine can access my web page completely. If I used one of these techniques, I'm not sure they could.

I think these solutions all take the wrong approach. They all try to make the information inaccessible without Javascript. I think a better approach is to present the information in some obfuscated fashion that is easy for people to understand but hard for address extractors to process, and then offer a Javascript mechanism to de-obfuscate the address.

Here is what I mean. Imagine a web page with a link such as:

<a href="mailto:spamkiller_chip_at_remove-this_dot_unicom_dot_com_dot_nowhere">mail me</a>

In fact, here it is: mail me

Go ahead and click on it (if your browser supports mailto: links). What you see in your window is ugly, but you probably could figure it out. An address extractor, however, probably couldn't. Sure, it would be easy enough to encode rules to process the dot and at keywords, but how could the address extractor distinguish between the "noise words" (such as spamkiller) and the "signal words" (such as chip)?
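To make that concrete, here is a rough sketch of what an extractor gets if it only knows the at and dot rules. The function name naiveExtract is made up for illustration, and I'm assuming the keywords are always wrapped in underscores as in the link above.

```javascript
// Hypothetical extractor rule set: it understands "_at_" and "_dot_"
// but has no list of noise words to strip.
function naiveExtract(href) {
  return href
    .replace(/^mailto:/, "")   // drop the scheme
    .replace(/_at_/g, "@")     // at keyword -> @
    .replace(/_dot_/g, ".");   // dot keyword -> .
}

const obfuscatedHref =
  "mailto:spamkiller_chip_at_remove-this_dot_unicom_dot_com_dot_nowhere";

console.log(naiveExtract(obfuscatedHref));
// prints "spamkiller_chip@remove-this.unicom.com.nowhere"
```

The result looks like an address, but the noise words are still in it, so mail sent there goes nowhere.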

Now, let's take this one step further. Imagine a Javascript procedure that would scan the loaded document, locate the links, de-obfuscate them and re-write the document with clear addresses. I can program the list of "noise words" into the script, so it can do what an automatic extractor cannot. When somebody clicks on the link, it opens a window to mail chip [at] unicom [dot] com, with all the mumbledygook removed.
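As a rough sketch of the idea (not the finished tool), something along these lines would do it. The names NOISE_WORDS, deobfuscate and rewriteMailtoLinks are placeholders I've made up here, and I'm assuming the address is laid out as in the example link: words separated by underscores, with at and dot spelled out.

```javascript
// Assumed noise-word list for the example link; a real page would
// carry its own list.
const NOISE_WORDS = ["spamkiller", "remove-this", "nowhere"];

// Turn an obfuscated mailto: href into a clear one.
// Anything that doesn't match the expected pattern is returned unchanged.
function deobfuscate(href, noiseWords) {
  const prefix = "mailto:";
  if (!href.startsWith(prefix)) return href;

  const body = href.slice(prefix.length);
  const atIndex = body.indexOf("_at_");
  if (atIndex === -1) return href;

  // A word is "signal" if it is non-empty and not on the noise list.
  const isSignal = (word) => word !== "" && !noiseWords.includes(word);

  // Part before "_at_": words separated by "_".
  const local = body.slice(0, atIndex).split("_").filter(isSignal).join("_");
  // Part after "_at_": labels separated by "_dot_".
  const domain = body
    .slice(atIndex + "_at_".length)
    .split("_dot_")
    .filter(isSignal)
    .join(".");

  return prefix + local + "@" + domain;
}

// Scan the loaded document and rewrite every mailto: link in place.
function rewriteMailtoLinks(doc, noiseWords) {
  for (const link of doc.querySelectorAll('a[href^="mailto:"]')) {
    link.setAttribute("href", deobfuscate(link.getAttribute("href"), noiseWords));
  }
}

// Only hook into the page when running in a browser.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () =>
    rewriteMailtoLinks(document, NOISE_WORDS)
  );
}
```

If Javascript is off, nothing runs and the visitor still gets the ugly but readable link, which is the whole point. Note that this sketch only handles the dot keyword in the domain part, not before the at.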

In a forthcoming article, I'll post a Javascript tool that does exactly this.



re: On Javascript Email Obfuscation

Sounds like a really cool idea.

I'd really like to hear/read your thoughts on TMDA systems. I suspect I already know where you are as we've talked about whitelists before.

re: On Javascript Email Obfuscation

Two things:

1. It's possible to use NOSCRIPT in conjunction with a JS-generated e-mail address to insert a human-comprehensible address that a spambot is less likely to pick up on (as I showed in the sample code posted on the previous entry).

2. If spambots are going to start interpreting JS, which seems possible, how is dynamic de-munging safer than dynamic generation?

re: On Javascript Email Obfuscation

Adam, I'm not arguing point two. Once address extractors add Javascript interpreters, both methods are similarly susceptible to failure.

re: On Javascript Email Obfuscation

I long ago gave up on attempting to keep my email address secure. It doesn't help that I use a domain that others like to use for pretend addresses. Still, I went to great efforts to conceal a different, personal address from the public web, giving it only to those I know in person. Thanks to one of these people getting a Windows virus, even it has been compromised and now gets dozens of spams daily.

SpamBayes is the only thing preventing suffocation.

re: On Javascript Email Obfuscation

This morning, I was given cause to wonder if javascript-based address obfuscation has already been defeated.

I received a form letter from the creepily "helpful" folks at internetseer.com, telling me that one of my links had gone bad. I am pretty sure my e-mail address cannot be found anywhere on my site in the clear, but they found it.

It is possible that a real human being did click through.

re: On Javascript Email Obfuscation

It's true that a spammer could incorporate a JavaScript interpreter in their tools.

The next move would be for developers to publish links that went to infinite loops, infinite recursion, and so on, in order to cripple the spammers' software. You'd tuck these links away inside a display:none DIV or something similar, so normal users wouldn't click on them.

I talked to the author of the Hiveware Enkoder and he was chuckling about his ability to do something like this.

Of course, the spammers could work around this, but my point is that it's not difficult to stay a step ahead.

re: On Javascript Email Obfuscation

In response to Brian's question:

TMDA-type spam blockers (really more of a challenge/response authenticator) are more effective at killing spam than filters. However, the problem with TMDA-type blockers is that your system actually receives all the spam. It just doesn't deliver it to your inbox until the email is validated. The spam is still stored on your machine, and depending on your system, this can be quite taxing. If you're thinking of using a TMDA-type blocker, I'd suggest implementing some sort of simple Bayesian filter (SpamAssassin is good) to prefilter email. This will keep the load on TMDA low, and the combination of filter and delivery agent will almost guarantee a spam-free inbox.

re: On Javascript Email Obfuscation

Okay, you've convinced me. I'm adding a Javascript option to my email obfuscation tool. Actually, I added it a short time after your post above.

Results? No spam yet ... I say yet ...