420 Creative - Portland Web Design Studio

The wrong way to create web forms

Oct 13 2007

Angie Herrera

Web Development

No thanks to the proliferation of spammers out in cyberspace, it's become a necessity to obfuscate email addresses on websites to avoid adding to our inbox clutter. Some businesses and website owners leave email addresses off their sites altogether and opt for a web form instead. That's not a bad approach at all, but there's a right way to do it and a wrong way.

### How your email address is harvested

Before we get into the wrong way to create forms, let's back up and make sure we understand how your email address is "harvested". Spammers use automated "spiders", "spambots" or "crawlers" to grab email addresses in large quantities off of thousands of websites at a time. They also guess at obvious email addresses, banking on the likelihood that you have one of them: for instance, info@yourdomain.com or contact@yourdomain.com.

These spiders crawl your website and search for email addresses. Specifically, they're looking for the exact email address syntax: a prefix, plus "@", and your domain. It's easy for some of us to forget, though, that spammers' automated crawlers don't read what we read on the screen. They read the code that forms what we see.

![Webforms 1](/assets/img/blog/webforms_1.jpg)

So placing an email address on a site sometimes looks like this in the code:

`<a href="mailto:yourname@yourdomain.com">yourname@yourdomain.com</a>`

What we see, as we're browsing a site, is this:

yourname@yourdomain.com

However, in the code, that lists an email address _twice_. So web designers will sometimes do this:

`<a href="mailto:yourname@yourdomain.com">Email Your Name</a>`

The difference there is what we see on the front end:

Email Your Name

The bad news is that your email address is still readable in the code and can therefore be scooped up by spambots. So it's now common to use scripts to "scramble" the email address in the code so that spammers can't harvest it. It's not 100% foolproof but it does help.

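To make the harvesting idea concrete, here is a minimal Python sketch of what a spambot might do with a page's source. The regular expression is a deliberately simplified stand-in for the "prefix, plus @, plus domain" pattern, and the `harvest` and `scramble` helpers are hypothetical names invented for this illustration. The scrambling shown (HTML numeric entities) is just one assumed technique, not necessarily the one any particular script uses.

```python
import re

# Simplified "prefix@domain" pattern; a real harvester may use something broader.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(page_source):
    """Return every address-shaped string found in the raw page source."""
    return set(EMAIL_PATTERN.findall(page_source))


def scramble(address):
    """Encode each character as an HTML numeric entity (one possible obfuscation)."""
    return "".join(f"&#{ord(ch)};" for ch in address)


plain_link = '<a href="mailto:yourname@yourdomain.com">yourname@yourdomain.com</a>'
labeled_link = '<a href="mailto:yourname@yourdomain.com">Email Your Name</a>'
scrambled_link = (
    f'<a href="mailto:{scramble("yourname@yourdomain.com")}">Email Your Name</a>'
)

print(harvest(plain_link))      # address found, even though it appears twice
print(harvest(labeled_link))    # still found: it is in the href
print(harvest(scrambled_link))  # this naive pattern finds nothing
```

A browser decodes the entities and the link still works for visitors, but a bot that decodes entities before searching would still find the address, which is why scrambling helps without being foolproof.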
### Forms are still code

Because posting your email address on your site leaves it vulnerable, including a contact form has become a more common approach. However, like anything you see on a web page, it's still code.

![Webforms 2](/assets/img/blog/webforms_2.jpg)

_Web form code from our website_

The way the form's information gets to a recipient isn't trivial, and it depends somewhat on the script that the form is using. But therein lies the problem. Sort of. Forms should use solid scripts. Not only should they hide email addresses within the code, they need to prevent other more complex "hacks".

The problem lies with the inexperienced or not very knowledgeable web designer. The old-school way of sending a form's contents to the site owner was simply to add the email address to the action="" attribute. (That attribute tells the form what to do when a user clicks the submit button.) Scripts completely remove the email address from the action attribute and place it somewhere else, _usually_ hidden from spambots (and curious people looking at source code).

### The wrong way

That, in a nutshell, is how forms _should_ work. It's a bit more complex than that but you probably get the idea. So it boggles my mind that in 2007 I still run across this sort of thing:

![Webforms 3 Small](/assets/img/blog/webforms_3_small.jpg)

If you take a close look you'll see that there's no email address in the action attribute. However, this web designer has added several "hidden input fields". One of them (highlighted in yellow) contains an email address in the value="" attribute (we've blurred it on purpose). A hidden input field only hides its contents from our view when we're looking at the form through a web browser. **It does not hide it from any kind of site crawler, including spambots.** If you look at a page's source code and see an email address anywhere in the code, **it is susceptible to being harvested** by spambots. This form, therefore, does **nothing** to protect the site owner's email address.

### The right way

The right way is simple. Web designers (those who code, not necessarily the ones who only create the design/layout) _need to stay informed of code trends and vulnerabilities_. In this particular case, finding a good, solid script (whether JavaScript or PHP or what have you) would have prevented not just the vulnerability in the form, but this whole article.

I'm not saying we need to know how to solve every problem, but we do need to know what the problems are. Knowing these things and staying ahead of the curve is what makes us good at what we do and, more importantly, gives us credibility that leads to trust from clients.
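
As a rough illustration of what a "solid script" does differently from the hidden-field form, here is a minimal server-side sketch in Python (standing in for the "what have you"). It is only a sketch under stated assumptions: the `RECIPIENT` and sender addresses, the form field names, and the `build_message` helper are all hypothetical, and actually sending the message is left out.

```python
from email.message import EmailMessage

# Hypothetical server-side setting: it never appears in the page's HTML.
RECIPIENT = "owner@example.com"


def build_message(form):
    """Turn submitted form fields into an email addressed from server-side config."""
    name = form.get("name", "").strip()
    reply_to = form.get("email", "").strip()
    body = form.get("message", "").strip()

    # Reject line breaks in header-bound fields, one of the more complex
    # "hacks" (header injection) that a form script should guard against.
    for value in (name, reply_to):
        if "\r" in value or "\n" in value:
            raise ValueError("line breaks are not allowed in name or email")
    if not body:
        raise ValueError("message is required")

    msg = EmailMessage()
    msg["To"] = RECIPIENT  # any "recipient" field sent by the browser is ignored
    msg["From"] = "webform@example.com"  # hypothetical sender address
    if reply_to:
        msg["Reply-To"] = reply_to
    msg["Subject"] = f"Contact form message from {name}"
    msg.set_content(body)
    return msg
```

Because the recipient lives only on the server, the form's HTML needs no email address at all, hidden or otherwise, so there is nothing in the page source for a spambot to harvest.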