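This article walks through regular expressions for validating email addresses on Linux, across a range of languages and tools. It is meant as a reference, so read on if that is useful to you.
The specification for email addresses comes from RFC 5322. A website, emailregex.com, collects email-validation regular expressions for a variety of programming languages. Many of them are of a complexity I had heard about but never actually seen. Whoever built that site must have been badly burned by email validation!
In fact, in a production environment you generally don't need such a complex regular expression to be 99.99% correct. In terms of execution efficiency and test coverage, a simple version is usually enough: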
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
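As a rough illustration, the sketch below applies this simple pattern in Python. The helper name `is_plausible_email` and the sample addresses are made up for the example, and `re.IGNORECASE` stands in for the `/i` flag. Note that `{2,4}` caps the last label at four letters, so an address under a longer top-level domain such as `.museum` is rejected.

```python
import re

# The simple pattern from above; re.IGNORECASE plays the role of the /i flag.
SIMPLE_EMAIL = re.compile(r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$", re.IGNORECASE)


def is_plausible_email(address: str) -> bool:
    """Return True if the address matches the simple pattern (hypothetical helper)."""
    return SIMPLE_EMAIL.match(address) is not None


if __name__ == "__main__":
    samples = [
        "alice@example.com",
        "bob.smith+tag@mail.example.org",
        "no-at-sign.example.com",
        "curator@gallery.museum",  # rejected: the last label is longer than 4 letters
    ]
    for sample in samples:
        print(sample, is_plausible_email(sample))
```

If longer top-level domains matter for your data, widening the `{2,4}` quantifier is the obvious adjustment.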
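Now let's look at the stricter, more complex regular expressions:
A general-purpose regular expression for validating email addresses (conforming to RFC 5322)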
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
Because languages differ in their regex support, syntax, and coverage, the regular expression differs from language to language:
Python
This is a simple version:
r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)"
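Here is a minimal sketch of how this pattern might be used, assuming it is written as a raw string. The function name `looks_like_email` is hypothetical. The samples also show one looseness of the pattern: the final character class includes `.`, so a domain with two consecutive dots still passes.

```python
import re

# The simple Python pattern from above, written as a raw string.
EMAIL_RE = re.compile(r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)")


def looks_like_email(address: str) -> bool:
    """Return True if the address matches the simple pattern (hypothetical helper)."""
    return EMAIL_RE.match(address) is not None


if __name__ == "__main__":
    samples = [
        "user.name+tag@example.co.uk",
        "user@localhost",          # no dot after the @, so it is rejected
        "user name@example.com",   # a space is not in the allowed set
        "user@example..com",       # accepted, although the domain is malformed
    ]
    for sample in samples:
        print(sample, looks_like_email(sample))
```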
Javascript
This one is a bit more complex:
/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9_][-a-z0-9_]*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z][a-z])|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(:[0-9]{1,5})?$/i
Swift
[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,6}
PHP
The PHP version is more complex still, and covers more cases:
/^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))$/iD
Perl / Ruby
Perl and Ruby were not going to be outdone by the PHP version and can be stricter still:
(?:(?:\r\n)?[ \t])*(?:(?:(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[\t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*\ (?:(?:\r\n)?[ \t])*(?:@(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[\t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[\t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ 
.\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\ (?:(?:\r\n)?[ \t])*)|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^() @,;:\\ .\[\]\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*\ (?:(?:\r\n)?[ \t])*(?:@(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\]\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\ (?:(?:\r\n)?[ \t])*)(?:,\s*(?:(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*\ (?:(?:\r\n)?[ \t])*(?:@(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[\t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ 
\t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\ (?:(?:\r\n)?[ \t])*))*)?;\s
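Perl 5.10 and later
The version above reads like hieroglyphics to me, and I have no intention of deciphering it. Newer versions of Perl do have a more readable version (are you serious?):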
/(?(DEFINE)
   (?<address>         (?&mailbox) | (?&group))
   (?<mailbox>         (?&name_addr) | (?&addr_spec))
   (?<name_addr>       (?&display_name)? (?&angle_addr))
   (?<angle_addr>      (?&CFWS)? < (?&addr_spec) > (?&CFWS)?)
   (?<group>           (?&display_name) : (?:(?&mailbox_list) | (?&CFWS))? ;
                                          (?&CFWS)?)
   (?<display_name>    (?&phrase))
   (?<mailbox_list>    (?&mailbox) (?: , (?&mailbox))*)

   (?<addr_spec>       (?&local_part) \@ (?&domain))
   (?<local_part>      (?&dot_atom) | (?&quoted_string))
   (?<domain>          (?&dot_atom) | (?&domain_literal))
   (?<domain_literal>  (?&CFWS)? \[ (?: (?&FWS)? (?&dcontent))* (?&FWS)?
                                 \] (?&CFWS)?)
   (?<dcontent>        (?&dtext) | (?&quoted_pair))
   (?<dtext>           (?&NO_WS_CTL) | [\x21-\x5a\x5e-\x7e])

   (?<atext>           (?&ALPHA) | (?&DIGIT) | [!#\$%&'*+-/=?^_`{|}~])
   (?<atom>            (?&CFWS)? (?&atext)+ (?&CFWS)?)
   (?<dot_atom>        (?&CFWS)? (?&dot_atom_text) (?&CFWS)?)
   (?<dot_atom_text>   (?&atext)+ (?: \. (?&atext)+)*)

   (?<text>            [\x01-\x09\x0b\x0c\x0e-\x7f])
   (?<quoted_pair>     \\ (?&text))

   (?<qtext>           (?&NO_WS_CTL) | [\x21\x23-\x5b\x5d-\x7e])
   (?<qcontent>        (?&qtext) | (?&quoted_pair))
   (?<quoted_string>   (?&CFWS)? (?&DQUOTE) (?:(?&FWS)? (?&qcontent))*
                        (?&FWS)? (?&DQUOTE) (?&CFWS)?)

   (?<word>            (?&atom) | (?&quoted_string))
   (?<phrase>          (?&word)+)

   # Folding white space
   (?<FWS>             (?: (?&WSP)* (?&CRLF))? (?&WSP)+)
   (?<ctext>           (?&NO_WS_CTL) | [\x21-\x27\x2a-\x5b\x5d-\x7e])
   (?<ccontent>        (?&ctext) | (?&quoted_pair) | (?&comment))
   (?<comment>         \( (?: (?&FWS)? (?&ccontent))* (?&FWS)? \) )
   (?<CFWS>            (?: (?&FWS)? (?&comment))*
                       (?: (?:(?&FWS)? (?&comment)) | (?&FWS)))

   # No whitespace control
   (?<NO_WS_CTL>       [\x01-\x08\x0b\x0c\x0e-\x1f\x7f])

   (?<ALPHA>           [A-Za-z])
   (?<DIGIT>           [0-9])
   (?<CRLF>            \x0d \x0a)
   (?<DQUOTE>          ")
   (?<WSP>             [\x20\x09])
 )

 (?&address)/x
Ruby (simple version)
Ruby says it actually has a simple version too:
/\A([\w+\-].?)+@[a-z\d\-]+(\.[a-z]+)*\.[a-z]+\z/i
.NET
"Who doesn't have a version like that?" says .NET:
^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$
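The grep command
To find email addresses in a file with grep, you presumably won't write a regular expression several lines long; something approximate will do:
$ grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" filename.txt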
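For comparison, roughly the same extraction can be sketched in Python with `re.findall`, which returns every non-overlapping match much as `grep -o` prints each one. The function name `extract_emails` and the sample text are invented for the example.

```python
import re
from typing import List

# Same pattern as the grep example; \b marks a word boundary on each side.
EMAIL_IN_TEXT = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b")


def extract_emails(text: str) -> List[str]:
    """Return every substring of text that matches the pattern (hypothetical helper)."""
    return EMAIL_IN_TEXT.findall(text)


if __name__ == "__main__":
    sample = "Contact alice@example.com or bob@mail.example.org; ignore foo@bar."
    print(extract_emails(sample))
```

To scan a file instead, read its contents into a string and pass that to `extract_emails`.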
SQL Server
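SQL Server can do pattern matching as well, here with PATINDEX and LIKE wildcard patterns rather than a full regular expression. This snippet apparently comes from a production environment, so it also thoughtfully catches people who mistype their email address: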
select email
from table_name where
patindex ('%[ &'',":;!+=\/()<>]%', email) > 0 -- Invalid characters
or patindex ('[@.-_]%', email) > 0 -- Valid but cannot be starting character
or patindex ('%[@.-_]', email) > 0 -- Valid but cannot be ending character
or email not like '%@%.%' -- Must contain at least one @ and one .
or email like '%..%' -- Cannot have two periods in a row
or email like '%@%@%' -- Cannot have two @ anywhere
or email like '%.@%' or email like '%@.%' -- Cannot have @ and . next to each other
or email like '%.cm' or email like '%.co' -- Cameroon or Colombia? Typos.
or email like '%.or' or email like '%.ne' -- Missing last letter
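The same style of rule-by-rule checking can be sketched in Python. This is only a loose analogue of the query, not a translation of T-SQL semantics: the function name `email_problems`, the messages, and the exact set of characters treated as invalid are my assumptions.

```python
from typing import List

# Assumed set of characters treated as invalid anywhere in the address.
INVALID_CHARS = set(" &',\":;!+=\\/()<>")
# Characters that may appear in an address but not as its first or last character.
EDGE_CHARS = ("@", ".", "-", "_")


def email_problems(email: str) -> List[str]:
    """Return a list of reasons the address looks wrong (empty if none were found)."""
    problems = []
    if any(ch in INVALID_CHARS for ch in email):
        problems.append("invalid character")
    if email.startswith(EDGE_CHARS):
        problems.append("bad starting character")
    if email.endswith(EDGE_CHARS):
        problems.append("bad ending character")
    at = email.find("@")
    if at == -1 or "." not in email[at + 1:]:
        problems.append("needs an @ followed by a .")
    if ".." in email:
        problems.append("two periods in a row")
    if email.count("@") > 1:
        problems.append("more than one @")
    if ".@" in email or "@." in email:
        problems.append("@ and . next to each other")
    if email.endswith((".cm", ".co")):
        problems.append("possible typo for .com")
    if email.endswith((".or", ".ne")):
        problems.append("missing last letter")
    return problems


if __name__ == "__main__":
    for sample in ["user@example.com", "user@example.co", "a@@b.com"]:
        print(sample, email_problems(sample))
```

Like the query, this flags suspicious addresses rather than proving that an address is valid.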
Oracle PL/SQL
Isn't this one a bit lazy, especially after all those "complex" regular expressions?
SELECT email FROM table_name WHERE REGEXP_LIKE (email, '[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,4}');
MySQL
Well, it looks like MySQL is just as lazy:
SELECT * FROM `users` WHERE `email` NOT REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$';
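To try the idea of this query without a MySQL server, here is a sketch using Python's built-in `sqlite3` module. SQLite has no REGEXP implementation by default, so the example registers one with `create_function`. The table contents are invented, and `re.IGNORECASE` is used on the assumption that the MySQL column has a case-insensitive collation.

```python
import re
import sqlite3

PATTERN = r"^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$"


def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X): the pattern arrives first.
    if value is None:
        return 0
    return int(re.search(pattern, value, re.IGNORECASE) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("alice@example.com",), ("not-an-email",), ("bob@example",)],
)

# As in the MySQL query, select the rows whose email does NOT match the pattern.
invalid = [
    row[0]
    for row in conn.execute(
        "SELECT email FROM users WHERE email NOT REGEXP ? ORDER BY email",
        (PATTERN,),
    )
]
print(invalid)
conn.close()
```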