Validating Common Form Input - Part 6
Published on 28th of September 2009. Copyright Tavs Dokkedahl.
Validating website URLs
When validating website addresses, only validate the domain part. Anything appearing after the domain (a path or filename) can contain almost any character, which makes it pointless to set up validation rules for it.
So we are looking for a string of the form www.example.com. The example part can consist of the lowercase letters a-z, digits and the hyphen (-), and we allow it to be from 1 to 255 characters long. (255 is the limit for a complete domain name, so this is a generous upper bound.)
function validate(form) {
  // Shortcut to save writing
  var url = form.elements.website.value;
  // Check basic domain name
  var rgx = /^\s*www\.[a-z\d\-]{1,255}\.com\s*$/;
  if(!rgx.test(url))
    return false;
  return true;
}
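To get a feel for what this first pattern accepts, it can be tried against plain strings outside of any form. The following is only a small sketch: the helper name isWwwDotCom is made up for illustration and is not part of the tutorial's validate function.

```javascript
// Same pattern as in the listing above
var rgx = /^\s*www\.[a-z\d\-]{1,255}\.com\s*$/;

// Hypothetical helper: takes the string directly instead of a form
function isWwwDotCom(value) {
  return rgx.test(value);
}

console.log(isWwwDotCom("www.example.com"));            // true
console.log(isWwwDotCom("  www.my-site.com  "));        // true, surrounding whitespace is allowed
console.log(isWwwDotCom("example.com"));                // false, www. is required at this stage
console.log(isWwwDotCom("www.Example.com"));            // false, only lowercase letters are allowed
console.log(isWwwDotCom("www.example.com/index.html")); // false, nothing may follow the domain
```

Note the last case: since the pattern is anchored at both ends, the user is expected to enter the domain only, without a path or filename.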
A domain name is of course not restricted to .com. It can end in any country code (such as .dk or .uk) along with other names like .gov, .aero, .edu etc. New names are added from time to time, so it makes more sense to just check for at least 2 and at most 6 letters (.museum currently being the longest).
function validate(form) {
  // Shortcut to save writing
  var url = form.elements.website.value;
  // Check basic domain name
  var rgx = /^\s*www\.[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;
  if(!rgx.test(url))
    return false;
  return true;
}
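The effect of replacing com with [a-z]{2,6} is easiest to see by running both patterns over the same inputs. This is a rough sketch; the variable names comOnly and anyTld and the sample addresses are chosen only for illustration.

```javascript
// The two patterns from the listings above
var comOnly = /^\s*www\.[a-z\d\-]{1,255}\.com\s*$/;
var anyTld  = /^\s*www\.[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;

// Illustrative sample inputs
var samples = [
  "www.example.com",    // passes both
  "www.example.dk",     // passes only the generalised pattern
  "www.example.museum", // six letters, the upper bound
  "www.example.c",      // one letter is too short
  "www.example.co.uk"   // an extra dot-separated part is not handled yet
];

for (var i = 0; i < samples.length; i++) {
  console.log(samples[i], comOnly.test(samples[i]), anyTld.test(samples[i]));
}
// www.example.com true true
// www.example.dk false true
// www.example.museum false true
// www.example.c false false
// www.example.co.uk false false
```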
A URL need not start with www. In fact, the first part can be anything consisting of the same class of characters as the domain name itself. From what I understand, any subdomain (like www) can be up to 63 characters long. Adding this we get
function validate(form) {
  // Shortcut to save writing
  var url = form.elements.website.value;
  // Check basic domain name
  var rgx = /^\s*[a-z\d\-]{1,63}\.[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;
  if(!rgx.test(url))
    return false;
  return true;
}
But URLs can also have multiple subdomains, as in student.en.stanford.edu. If you want to allow for multiple subdomains you can write
function validate(form) {
  // Shortcut to save writing
  var url = form.elements.website.value;
  // Check basic domain name
  var rgx = /^\s*([a-z\d\-]{1,63}\.)*[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;
  if(!rgx.test(url))
    return false;
  return true;
}
This way you can have as many subdomains as you want, or none at all, as example.com is also a valid domain name.
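The difference between requiring exactly one subdomain and allowing any number of them can be checked side by side. The sketch below assumes nothing beyond the two patterns already shown; the names oneSub, anySubs and compare are made up for the example.

```javascript
// Exactly one subdomain versus zero or more subdomains
var oneSub  = /^\s*[a-z\d\-]{1,63}\.[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;
var anySubs = /^\s*([a-z\d\-]{1,63}\.)*[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;

// Hypothetical helper returning the result of both patterns
function compare(value) {
  return { oneSub: oneSub.test(value), anySubs: anySubs.test(value) };
}

console.log(compare("www.example.com"));         // { oneSub: true, anySubs: true }
console.log(compare("example.com"));             // { oneSub: false, anySubs: true }
console.log(compare("student.en.stanford.edu")); // { oneSub: false, anySubs: true }
console.log(compare("com"));                     // { oneSub: false, anySubs: false }
```

The group followed by * is what makes the subdomain part repeatable, including zero repetitions, while the domain name and the top level name remain mandatory.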
Finally, some users will enter http:// so we should allow for this as an optional prefix:
function validate(form) {
  // Shortcut to save writing
  var url = form.elements.website.value;
  // Check basic domain name
  var rgx = /^\s*(http\:\/\/)?([a-z\d\-]{1,63}\.)*[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;
  if(!rgx.test(url))
    return false;
  return true;
}
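Because validate only reads form.elements.website.value, it can be exercised without a browser by handing it a plain object of the same shape. The sketch below repeats the final function in condensed form; fakeForm is a hypothetical stand-in for a real form element and exists only for testing.

```javascript
// Final version of validate, condensed to return the test result directly
function validate(form) {
  var url = form.elements.website.value;
  var rgx = /^\s*(http\:\/\/)?([a-z\d\-]{1,63}\.)*[a-z\d\-]{1,255}\.[a-z]{2,6}\s*$/;
  return rgx.test(url);
}

// Hypothetical stand-in for a form with a field named website
function fakeForm(value) {
  return { elements: { website: { value: value } } };
}

console.log(validate(fakeForm("http://www.example.com")));  // true
console.log(validate(fakeForm("example.com")));             // true
console.log(validate(fakeForm("https://example.com")));     // false, only http:// is allowed
console.log(validate(fakeForm("http://example")));          // false, the top level name is missing
console.log(validate(fakeForm("http://www.example.com/"))); // false, nothing may follow the domain
```

If https:// should also be accepted, one possible change (an assumption, not something the tutorial covers) is to write the prefix as (https?\:\/\/)?.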
URLs can be even more complicated, but the above should cover all but the most exotic cases. As a general rule I would go for a domain name of the form subdomain.domain.tld, with an optional http:// and an optional subdomain.
For the last part of this tutorial we will look at email validation.
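For readers who want to tighten the check a little, here is one possible sketch that goes beyond a single regular expression. It is not the tutorial's method: the function name isPlausibleHost is hypothetical, and the extra rules are assumptions taken from the usual host name conventions (no part longer than 63 characters, no part starting or ending with a hyphen, upper and lower case treated alike, at most 253 characters in total).

```javascript
// Hypothetical helper, not part of the tutorial's validate() function
function isPlausibleHost(input) {
  // Trim whitespace, ignore case, and drop an optional leading http://
  var host = input.replace(/^\s+|\s+$/g, "").toLowerCase().replace(/^http:\/\//, "");

  // Assumed overall limit for a domain name written in dotted form
  if (host.length < 1 || host.length > 253) return false;

  var labels = host.split(".");
  // Require at least a domain name and a top level name, as in the tutorial
  if (labels.length < 2) return false;

  // The last part is the top level name: 2 to 6 letters, as in the tutorial
  var tld = labels[labels.length - 1];
  if (!/^[a-z]{2,6}$/.test(tld)) return false;

  // Every other part: 1 to 63 characters, no hyphen first or last
  for (var i = 0; i < labels.length - 1; i++) {
    if (!/^[a-z\d]([a-z\d\-]{0,61}[a-z\d])?$/.test(labels[i])) return false;
  }
  return true;
}

console.log(isPlausibleHost("http://www.Example.com")); // true
console.log(isPlausibleHost("-bad.example.com"));       // false
console.log(isPlausibleHost("example..com"));           // false
```

The trade-off is more code in exchange for rejecting inputs such as -bad.example.com, which the single pattern above accepts.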

Hello, thanks for the useful explanation! It has been really useful to me, both for its functionality and for learning a bit more about the powerful world of regular expressions. However, in line 5, from the 3rd example onwards, there is an extra "]" after {1,63}, which makes your script not recognize subdomains properly :-) Regards,
@ Albert: Thanks. I corrected the mistake.