niravparekh
How to validate Email as per RFC 2822 standards?
Jul 12, 2007 06:12 PM
Hi,
I have to validate email as per RFC 2822. Can someone please give me a General Function OR Regular Expression to validate Email as per RFC 2822?
( Actually I am looking for both the things, a vb.net Function to validate email as well as Regular Expression, but please provide me whichever is possible. )
Thanks in advance,
Nirav
dvallone
Jul 12, 2007 06:42 PM
Umm..I think you may be underestimating the complexity of what you are asking. To "validate" a message against the RFC 2822 spec, you have to read through the message, parse out all the header fields, make sure the header fields are valid headers, and then parse the data for each respective field to make sure it is valid (This includes character set, legal/illegal characters, and whether or not the data for that header field is an allowable value). There are around 25 header fields or so that you will have to check (although less than half of them are commonly used I believe).
There is no one function or regular expression that will do this. If you read through the spec you will see that there is nothing overly complicated about it, but it will require several hundred lines of code at a minimum. If you are looking for a quick solution you may be better off investigating some 3rd party products.
niravparekh
Jul 12, 2007 07:59 PM
Hi,
I think you misunderstood one thing. I just want to validate an Email Address, only the Email Address.
That's my mistake: I didn't write "Email Address" explicitly. Sorry for the confusion.
OK, so I just want to validate an Email Address, e.g. nirav@nirav.com
dvallone
Jul 12, 2007 08:13 PM
Ooh, that is much, much different (and easier). :)
Try this link:
http://msdn2.microsoft.com/en-us/library/ms998267.aspx
niravparekh
Jul 12, 2007 08:21 PM
Hi,
Thanks for this, but actually I want to make sure that this validation is as per RFC 2822 standards.
That's why I wanted either a Function or a Regular Expression to validate an Email Address as per RFC 2822.
Does the expression on the given link validate as per this standard?
niravparekh
Jul 12, 2007 08:47 PM
I have to validate using these standards.
According to RFC 2822, the local-part of the address may use any of these ASCII characters:
! # $ % & ' * + - / = ? ^ _ ` { | } ~
Here is the link stating validations for Email Address: http://en.wikipedia.org/wiki/Email_address
The above is for the local-part; there are also some rules for validating the domain name.
I am in need of a vb.net Function or Regular expression to validate Email Address as per these standards.
dvallone
Jul 12, 2007 08:56 PM
This will give you the info you need
The Official Standard: RFC 2822
Maybe you're wondering why there's no "official" foolproof regex to match email addresses. Well, there is an official definition, but it's hardly foolproof.
The official standard is known as RFC 2822. It describes the syntax that valid email addresses must adhere to. You can (but you shouldn't--read on) implement it with this regular expression:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
This regex has two parts: the part before the @, and the part after the @. There are two alternatives for the part before the @. It can consist of a series of letters, digits and certain symbols, including one or more dots; however, dots may not appear consecutively or at the start or end of that part. Alternatively, the part before the @ can be enclosed in double quotes, allowing any string of ASCII characters between the quotes. Whitespace characters, double quotes and backslashes must be escaped with backslashes.
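To make the unquoted alternative concrete, here is a minimal sketch in Python (used purely for illustration; it is not VB.NET). It checks only the part before the @. The names ATOM, LOCAL_PART and is_unquoted_local_part are made up for this example, the character class is copied from the first branch of the regex above, and case-insensitive matching is an assumption.

```python
import re

# Characters allowed in the unquoted form, copied from the regex quoted above.
ATOM = r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+"

# One or more atoms separated by single dots, so no leading, trailing
# or consecutive dots can match.
LOCAL_PART = re.compile(ATOM + r"(?:\." + ATOM + r")*", re.IGNORECASE)


def is_unquoted_local_part(text: str) -> bool:
    """Hypothetical helper: True if text is a valid unquoted part before the @."""
    return LOCAL_PART.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["john.smith", "o'brien+tag", "john..smith", ".john", "john."]:
        print(candidate, is_unquoted_local_part(candidate))
```

Using fullmatch rather than search matters here: an unanchored search would happily find "john" inside "john..smith" and report a match.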
The part after the @ also has two alternatives. It can either be a fully qualified domain name (e.g. regular-expressions.info), or it can be a literal Internet address between square brackets. The literal Internet address can either be an IP address, or a domain-specific routing address.
The reason you shouldn't use this regex is that it only checks the basic syntax of email addresses. john@aol.com.nospam would be considered a valid email address according to RFC 2822. Obviously, this email address won't work, since there's no "nospam" top-level domain. It also doesn't guarantee your email software will be able to handle it. Not all applications support the syntax using double quotes or square brackets. In fact, RFC 2822 itself marks the notation using square brackets as obsolete.
We get a more practical implementation of RFC 2822 if we omit the syntax using double quotes and square brackets. It will still match 99.99% of all email addresses in actual use today.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
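A quick way to see what this simplified pattern accepts and rejects is to run it against a few addresses. The sketch below uses Python only for illustration; PRACTICAL and looks_like_address are hypothetical names, and it assumes the pattern is meant to match the whole string case-insensitively.

```python
import re

# The simplified pattern from the post above, split across lines for readability.
PRACTICAL = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?",
    re.IGNORECASE,
)


def looks_like_address(text: str) -> bool:
    """Hypothetical helper: syntax check only, the address may still not exist."""
    return PRACTICAL.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "nirav@nirav.com",
        "john@aol.com.nospam",        # syntactically fine, even though it cannot work
        '"john smith"@example.com',   # quoted form was dropped from this pattern
        "user@[192.168.0.1]",         # bracketed literal was dropped as well
    ]
    for sample in samples:
        print(sample, looks_like_address(sample))
```

Note that the domain part requires at least one dot, so a bare host name such as user@localhost is rejected by this pattern.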
A further change you could make is to allow any two-letter country code top level domain, and only specific generic top level domains. This regex filters dummy email addresses like asdf@adsf.adsf. You will need to update it as new top-level domains are added.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}|com|org|net|gov|biz|info|name|aero|jobs|museum)\b
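The effect of restricting the top-level domain can be checked the same way. This is a sketch under assumptions: Python is used only for illustration, TLD_RESTRICTED and has_known_tld are hypothetical names, matching is assumed to be case-insensitive and against the whole string, and the generic domain list is the one from the post, which will not cover domains added later.

```python
import re

# Two-letter country codes plus the generic domains listed in the post.
TLD_RESTRICTED = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+"
    r"(?:[A-Z]{2}|com|org|net|gov|biz|info|name|aero|jobs|museum)\b",
    re.IGNORECASE,
)


def has_known_tld(text: str) -> bool:
    """Hypothetical helper: syntax check with the top-level domain restricted."""
    return TLD_RESTRICTED.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["nirav@nirav.com", "user@example.de", "asdf@adsf.adsf", "john@aol.com.nospam"]:
        print(sample, has_known_tld(sample))
```

The trade-off is maintenance: any address under a top-level domain missing from the list is rejected until the pattern is updated.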
So even when following official standards, there are still trade-offs to be made. Don't blindly copy regular expressions from online libraries or discussion forums. Always test them on your own data and with your own applications.
niravparekh
Jul 12, 2007 09:21 PM
wow, Really thanks for this one.
I used the last one: [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}|com|org|net|gov|biz|info|name|aero|jobs|museum)\b
and it worked fine except in one case.
If I write my email in uppercase, it tells me the email is invalid. I think that, as per those standards, both uppercase and lowercase are allowed.
Can you tell me what I should add to allow uppercase characters?
Thanks again,
Nirav
dvallone
Jul 12, 2007 09:38 PM
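Just change [a-z0-9 to [A-Za-z0-9 wherever it occurs. That will include upper case characters in the name and domain parts. The top-level-domain group at the end, (?:[A-Z]{2}|com|org|...), is not touched by that replacement: as posted, it accepts two-letter country codes only in upper case and the generic domains only in lower case, so it needs the same care.
Edit: You can also just convert the email string you are validating to lower case (in which case [A-Z]{2} must become [a-z]{2}), or match case-insensitively, e.g. with RegexOptions.IgnoreCase in .NET.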
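To compare these options side by side, here is a small sketch. It is written in Python purely for illustration (the .NET counterpart of re.IGNORECASE is RegexOptions.IgnoreCase), the names POSTED, WIDENED and check are hypothetical, and it assumes the pattern must match the whole input.

```python
import re

# The TLD-restricted pattern from earlier in the thread.
POSTED = (
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+"
    r"(?:[A-Z]{2}|com|org|net|gov|biz|info|name|aero|jobs|museum)\b"
)

# Option 1: widen every [a-z0-9 class by hand. The TLD group stays as it is.
WIDENED = POSTED.replace("[a-z0-9", "[A-Za-z0-9")

case_sensitive = re.compile(POSTED)
widened = re.compile(WIDENED)
ignore_case = re.compile(POSTED, re.IGNORECASE)  # Option 2: one flag for the whole pattern


def check(pattern: re.Pattern, address: str) -> bool:
    """Hypothetical helper: True if the whole address matches the pattern."""
    return pattern.fullmatch(address) is not None


if __name__ == "__main__":
    for address in ["nirav@nirav.com", "NIRAV@NIRAV.COM", "nirav@nirav.de"]:
        print(
            address,
            check(case_sensitive, address),
            check(widened, address),
            check(ignore_case, address),
            check(case_sensitive, address.lower()),  # Option 3: lower-case the input first
        )
```

In this sketch only the case-insensitive flag accepts all three sample addresses. Hand-widening the classes and lower-casing the input both leave the top-level-domain group case-sensitive, so each still rejects some of them.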
niravparekh
Jul 13, 2007 07:28 PM
Thanks a lot for all of your help.
It looks perfect now.
I really appreciate your help.
Nirav