Regular expressions are one of those skills that separate productive developers from everyone else. You either know the pattern off the top of your head, or you spend 15 minutes on Stack Overflow piecing together something that half-works.
This guide is your permanent reference. We have collected 20 regex patterns that cover the most common validation and extraction tasks you will face in real-world development. Every pattern has been tested, every edge case is explained, and every gotcha is called out so you do not ship broken validation to production.
Better yet, you can test every single pattern right now without installing anything.
How to Use This Guide
Each pattern below follows the same structure: the regex itself in a code block, a plain-English breakdown of what each part does, example matches (and non-matches), and common pitfalls. Patterns are grouped by category so you can jump to what you need.
A quick note on notation: all patterns are shown without delimiters. If you are using JavaScript, wrap them in forward slashes (/pattern/flags). In Python, pass them as raw strings (r"pattern"). In PHP, use delimiters like #pattern#flags.
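As a rough illustration of the Python notation, the sketch below wraps one pattern from this guide (the username pattern from section 13) in a raw string. The function name is_valid_username is a placeholder chosen for this example. It uses fullmatch rather than match because in Python $ also matches just before a trailing newline.

```python
import re

# Raw string (r"...") so backslashes reach the regex engine unchanged.
USERNAME = re.compile(r"^[a-zA-Z0-9_-]{3,20}$")


def is_valid_username(value: str) -> bool:
    """Return True only when the entire string satisfies the pattern."""
    # fullmatch rejects "john_doe\n", which match() with $ would accept.
    return USERNAME.fullmatch(value) is not None


print(is_valid_username("john_doe"))    # True
print(is_valid_username("user name"))   # False
print(is_valid_username("john_doe\n"))  # False
```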
Pro tip: Bookmark this page. These are the patterns you will keep coming back to — and having them all in one place with explanations saves hours over a career.
1. Email Address Validation
Email validation is one of the most common regex questions on developer forums. Here is a practical version that handles the vast majority of real-world email addresses:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Breakdown:
- ^[a-zA-Z0-9._%+-]+ - Local part: letters, digits, dots, underscores, percent signs, plus signs, hyphens (one or more)
- @ - Literal at sign
- [a-zA-Z0-9.-]+ - Domain: letters, digits, dots, hyphens
- \.[a-zA-Z]{2,}$ - TLD: dot followed by two or more letters
Matches: user@example.com, john.doe+work@company.co.uk, admin@sub.domain.org
Does not match: user@.com, @example.com, user@com
Gotcha: The full RFC 5322 email spec is absurdly complex (it allows quoted strings, comments, and even IP address literals in the domain). Do not try to implement the full spec with regex. This pattern covers nearly every address you are likely to encounter in production, but it also accepts some malformed ones, such as addresses with consecutive dots (a..b@example..com). For critical systems, send a verification email instead of relying solely on regex.
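A minimal sketch of how this pattern might be applied in Python follows. The helper name looks_like_email is made up for this example, and the last call shows the looseness described in the gotcha.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def looks_like_email(value: str) -> bool:
    """Format check only; it says nothing about whether the mailbox exists."""
    return EMAIL.fullmatch(value) is not None


print(looks_like_email("john.doe+work@company.co.uk"))  # True
print(looks_like_email("user@.com"))                    # False
print(looks_like_email("user@com"))                     # False
# Consecutive dots slip through, which is why a verification email matters.
print(looks_like_email("a..b@example..com"))            # True
```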
2. URL Matching (HTTP/HTTPS)
^https?:\/\/(www\.)?[a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&/=]*)$
Breakdown:
- ^https?:\/\/ - Protocol: http or https followed by ://
- (www\.)? - Optional www. prefix
- [a-zA-Z0-9@:%._+~#=]{1,256} - Domain name (up to 256 characters)
- \.[a-zA-Z0-9()]{1,6} - TLD (up to 6 characters)
- \b([-a-zA-Z0-9()@:%_+.~#?&/=]*)$ - Optional path, query string, fragment
Matches: https://example.com, http://www.site.co.uk/page?q=1, https://app.domain.io/path/to/resource
Does not match: ftp://files.example.com, example.com (no protocol), http:// (no domain)
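Gotcha: As written, the domain character class contains no hyphen, so hyphenated hosts such as https://my-site.com are rejected, and the {1,6} limit rejects longer TLDs such as .technology. Add - to the class and raise the limit if you need them. If you are working in JavaScript, consider using the URL constructor with a try/catch block instead. It handles edge cases like internationalized domain names (IDNs) that regex cannot practically cover.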
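The parser-based approach carries over to Python. The sketch below uses urlsplit from the standard library; is_http_url is a hypothetical helper name, and the check is deliberately loose (an http or https scheme plus a non-empty host), so it accepts more than the regex above does.

```python
from urllib.parse import urlsplit


def is_http_url(value: str) -> bool:
    """Loose check: http(s) scheme and a non-empty host name."""
    try:
        parts = urlsplit(value)
        host = parts.hostname
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs
        # (for example an unclosed IPv6 bracket).
        return False
    return parts.scheme in ("http", "https") and bool(host)


print(is_http_url("https://my-site.com"))      # True
print(is_http_url("ftp://files.example.com"))  # False
print(is_http_url("example.com"))              # False
print(is_http_url("http://"))                  # False
```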
3. IPv4 Address
^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
Breakdown:
- 25[0-5] - Matches 250-255
- 2[0-4]\d - Matches 200-249
- [01]?\d\d? - Matches 0-199
- (...\.){3} - The octet plus a dot, repeated three times
- Final octet without a trailing dot
Matches: 192.168.1.1, 0.0.0.0, 255.255.255.255
Does not match: 256.1.1.1, 192.168.1, 192.168.1.1.1
Gotcha: A simple \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} will match invalid addresses like 999.999.999.999. Always validate each octet is in the 0-255 range as shown above.
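To make the difference concrete, here is a small sketch comparing the loose pattern with the range-checked one. The names LOOSE and STRICT are chosen for this example. The re.ASCII flag is an addition: without it, \d in Python also matches non-ASCII Unicode digits.

```python
import re

# re.ASCII restricts \d to 0-9.
LOOSE = re.compile(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$", re.ASCII)
STRICT = re.compile(
    r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$",
    re.ASCII,
)

for candidate in ["192.168.1.1", "999.999.999.999", "256.1.1.1", "192.168.1"]:
    loose_ok = LOOSE.fullmatch(candidate) is not None
    strict_ok = STRICT.fullmatch(candidate) is not None
    print(f"{candidate:<16} loose={loose_ok} strict={strict_ok}")
```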
4. IPv6 Address (Simplified)
^([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}$
This matches the full (non-abbreviated) form of IPv6. For production use where you need to handle the :: shorthand notation, use a library instead of regex.
Matches: 2001:0db8:85a3:0000:0000:8a2e:0370:7334
Does not match: 2001:db8::1 (abbreviated form), 192.168.1.1 (IPv4)
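For the library route, Python's standard ipaddress module is one option. The sketch below (is_ipv6 is a placeholder name) accepts both the full form and the :: shorthand.

```python
import ipaddress


def is_ipv6(value: str) -> bool:
    """Return True for any valid IPv6 address, abbreviated or not."""
    try:
        ipaddress.IPv6Address(value)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    return True


print(is_ipv6("2001:0db8:85a3:0000:0000:8a2e:0370:7334"))  # True
print(is_ipv6("2001:db8::1"))                              # True
print(is_ipv6("192.168.1.1"))                              # False
```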
5. Phone Numbers (International)
^\+?(\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9}$
Breakdown:
- ^\+?(\d{1,3})? - Optional plus and country code (1-3 digits)
- [-.\s]? - Optional separator (dash, dot, or space)
- \(?\d{1,4}\)? - Optional parentheses around area code
- [-.\s]?\d{1,4}[-.\s]?\d{1,9}$ - Remaining digits with optional separators
Matches: +1-555-123-4567, (555) 123-4567, +44 20 7946 0958, 5551234567
Gotcha: Phone number formats vary wildly across countries. For production applications, use a library like Google's libphonenumber. This regex is a solid starting point for loose validation, but it will not catch all invalid combinations.
6. Password Strength (Minimum 8 chars, upper, lower, digit, special)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
Breakdown:
- (?=.*[a-z]) - At least one lowercase letter (lookahead)
- (?=.*[A-Z]) - At least one uppercase letter
- (?=.*\d) - At least one digit
- (?=.*[@$!%*?&]) - At least one special character
- [A-Za-z\d@$!%*?&]{8,}$ - At least 8 characters total from the allowed set
Matches: MyP@ss1word, Str0ng!Pass
Does not match: password (no upper, digit, special), SHORT1! (too short, no lowercase)
Gotcha: Modern NIST guidelines (SP 800-63B) recommend against mandatory character-composition rules. They favor length (a minimum of 8 characters in Revision 3, with support for much longer passphrases) and checking candidates against lists of known breached passwords. Consider whether complexity rules actually improve security for your use case.
Need to generate strong passwords to test against? Try the password generator.
7. Date Formats (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Breakdown:
- \d{4} - Four-digit year
- (0[1-9]|1[0-2]) - Month: 01-12
- (0[1-9]|[12]\d|3[01]) - Day: 01-31
Matches: 2026-02-09, 1999-12-31, 2000-01-01
Does not match: 2026-13-01 (invalid month), 02/09/2026 (wrong format), 2026-2-9 (needs zero padding)
Gotcha: This regex does not validate whether the day is actually valid for the month. It will accept 2026-02-31 even though February never has 31 days. Always do secondary validation with actual date parsing after the regex match.
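The two-step check can be sketched as follows: the regex confirms the shape, then datetime.date confirms the calendar date. The function name parse_iso_date is hypothetical, and the pattern gains a capture group around the year so all three parts can be read from the match.

```python
import re
from datetime import date
from typing import Optional

# Same pattern as above, with a capture group added around the year.
ISO_DATE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$", re.ASCII)


def parse_iso_date(value: str) -> Optional[date]:
    """Return a date for valid YYYY-MM-DD input, otherwise None."""
    match = ISO_DATE.fullmatch(value)
    if match is None:
        return None  # wrong format
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        return None  # right format, impossible date (e.g. February 31)


print(parse_iso_date("2026-02-09"))  # 2026-02-09
print(parse_iso_date("2026-02-31"))  # None
```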
8. Date Formats (DD/MM/YYYY or MM/DD/YYYY)
^(0[1-9]|[12]\d|3[01])[\/\-](0[1-9]|1[0-2])[\/\-]\d{4}$
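This matches day-first dates (DD/MM/YYYY) with slash or dash separators. For month-first dates (MM/DD/YYYY), swap the first two groups: ^(0[1-9]|1[0-2])[\/\-](0[1-9]|[12]\d|3[01])[\/\-]\d{4}$. Neither pattern can tell the two conventions apart for ambiguous values such as 03/04/2026, and both accept mixed separators such as 03/04-2026. The same caveat applies: regex validates the format, not the actual calendar date. Parse with a date library for full validation.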
9. Credit Card Number (Major Networks)
^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|3[47]\d{13}|3(?:0[0-5]|[68]\d)\d{11}|6(?:011|5\d{2})\d{12}|(?:2131|1800|35\d{3})\d{11})$
Breakdown by network:
- 4\d{12}(?:\d{3})? - Visa: starts with 4, 13 or 16 digits
- 5[1-5]\d{14} - Mastercard: starts with 51-55, 16 digits
- 3[47]\d{13} - Amex: starts with 34 or 37, 15 digits
- 3(?:0[0-5]|[68]\d)\d{11} - Diners Club: starts with 300-305, 36, or 38
- 6(?:011|5\d{2})\d{12} - Discover: starts with 6011 or 65
- (?:2131|1800|35\d{3})\d{11} - JCB: starts with 2131, 1800, or 35
Gotcha: Always strip spaces and dashes before matching. Users commonly type card numbers as 4111 1111 1111 1111 or 4111-1111-1111-1111. Also, regex only checks the format, not validity. Use the Luhn algorithm for actual card number validation. Note that the Mastercard branch only covers the 51-55 prefixes, so cards from the newer 2221-2720 range are rejected.
Security note: Never log or store raw credit card numbers. If you need to validate card numbers client-side, do it for UX only and always process payments through a PCI-compliant provider like Stripe or PayPal.
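The stripping and Luhn steps mentioned in the gotcha can be sketched like this. Both function names are placeholders for this example. The Luhn check doubles every second digit counting from the right, subtracts 9 from any result above 9, and requires the total to be divisible by 10.

```python
import re


def normalize_card_number(raw: str) -> str:
    """Remove the spaces and dashes users commonly type."""
    return re.sub(r"[\s-]", "", raw)


def luhn_valid(number: str) -> bool:
    """Return True when the digit string passes the Luhn checksum."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:       # every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid(normalize_card_number("4111 1111 1111 1111")))  # True
print(luhn_valid(normalize_card_number("4111-1111-1111-1112")))  # False
```

Run the format regex and the Luhn check together: the regex catches the wrong length or prefix, and the checksum catches most typos.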
10. HTML Tags
<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>
Breakdown:
- <([a-zA-Z][a-zA-Z0-9]*) - Opening tag name (captured in group 1)
- \b[^>]*> - Optional attributes, then closing bracket
- (.*?) - Content between tags (non-greedy)
- <\/\1> - Closing tag matching the opening tag (backreference)
Matches: <div>content</div>, <p class="text">hello</p>
Gotcha: This is the big one. You cannot reliably parse HTML with regex. Nested tags, self-closing tags, attributes with angle brackets, comments — they all break regex-based parsing. Use a DOM parser (like DOMParser in JavaScript or BeautifulSoup in Python) for real HTML processing. This pattern is only useful for simple, known-structure extraction tasks.
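To show the nesting problem, the sketch below runs the pattern against nested div elements and then does the same job with Python's standard html.parser module. The class TextByTag and the helper text_inside are names invented for this example, and BeautifulSoup would be an equally reasonable choice.

```python
import re
from html.parser import HTMLParser

TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>")


class TextByTag(HTMLParser):
    """Collect the text inside each outermost occurrence of one tag."""

    def __init__(self, tag: str) -> None:
        super().__init__()
        self.tag = tag
        self.depth = 0      # number of target tags currently open
        self.chunks = []    # one string per outermost element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1
            if self.depth == 1:
                self.chunks.append("")

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks[-1] += data


def text_inside(tag: str, html: str) -> list:
    parser = TextByTag(tag)
    parser.feed(html)
    parser.close()
    return parser.chunks


sample = "<div>outer <div>inner</div> tail</div><p>x</p>"
# The regex stops at the first closing tag and cuts the outer div short.
print(TAG.search(sample).group(2))  # outer <div>inner
# The parser tracks nesting depth and keeps the whole element together.
print(text_inside("div", sample))   # ['outer inner tail']
```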
11. Hex Color Codes
^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$
Breakdown:
- #? - Optional hash symbol
- [a-fA-F0-9]{6} - Six-character hex (e.g., FF5733)
- [a-fA-F0-9]{3} - Three-character shorthand (e.g., F00)
Matches: #FF5733, #fff, A1B2C3, 000
Does not match: #GGHHII, #12345 (5 digits), #1234567 (7 digits)
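Once a string matches, the captured group is easy to convert. A possible sketch, with hex_to_rgb as an illustrative name, expands the three-character shorthand and returns an RGB tuple:

```python
import re

HEX_COLOR = re.compile(r"^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$")


def hex_to_rgb(value: str):
    """Return (r, g, b) for a valid hex color, otherwise None."""
    match = HEX_COLOR.fullmatch(value)
    if match is None:
        return None
    digits = match.group(1)
    if len(digits) == 3:
        # Shorthand: each digit is doubled, so F00 becomes FF0000.
        digits = "".join(ch * 2 for ch in digits)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb("#FF5733"))  # (255, 87, 51)
print(hex_to_rgb("F00"))      # (255, 0, 0)
print(hex_to_rgb("#12345"))   # None
```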
12. File Extensions
^[\w,\s-]+\.([a-zA-Z]{2,10})$
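This matches a filename followed by an extension of 2-10 letters. Extensions containing digits (mp3), single-letter extensions (c), and names with more than one dot (archive.tar.gz) are rejected. For filtering specific file types: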
^.+\.(jpg|jpeg|png|gif|svg|webp)$
Matches: photo.jpg, document.pdf, my-file_v2.png
Does not match: .gitignore (no filename before dot), noextension
13. Username Validation
^[a-zA-Z0-9_-]{3,20}$
Rules: 3-20 characters, letters, numbers, underscores, and hyphens only. No spaces, no special characters, no leading/trailing whitespace.
Matches: john_doe, user-123, dev42
Does not match: ab (too short), user name (space), user@name (special char)
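Gotcha: Decide early whether usernames are case-sensitive. If not, always normalize to lowercase before matching and storing. Also consider whether to allow Unicode characters for international users. If you do, replace a-zA-Z inside the character class with \p{L} (the Unicode letter property). Support varies by engine: JavaScript requires the u flag, and Python's built-in re module does not support \p{L} at all (the third-party regex package does).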
14. Slug (URL-Friendly String)
^[a-z0-9]+(?:-[a-z0-9]+)*$
Matches: my-blog-post, regex-patterns-2026, hello
Does not match: -leading-dash, UPPERCASE, double--dash, trailing-
This ensures clean URL slugs: lowercase letters and digits, separated by single hyphens, no leading or trailing hyphens.
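Producing a slug that satisfies this pattern takes only a couple of steps. The sketch below is one simple approach and assumes ASCII-only output is acceptable, so non-ASCII letters are dropped rather than transliterated. The name slugify is chosen for this example.

```python
import re

SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def slugify(title: str) -> str:
    """Lowercase, turn every other run of characters into one hyphen, trim."""
    lowered = title.lower()
    hyphenated = re.sub(r"[^a-z0-9]+", "-", lowered)
    return hyphenated.strip("-")


slug = slugify("  Regex Patterns 2026!  ")
print(slug)                              # regex-patterns-2026
print(SLUG.fullmatch(slug) is not None)  # True
```

An input with no letters or digits produces an empty string, which the pattern rejects, so validate the result before using it.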
15. Social Security Number (US SSN)
^(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)\d{4}$
Breakdown:
- (?!000|666|9\d{2})\d{3} - Area number: 001-665, 667-899 (SSA rules exclude 000, 666, and 900-999)
- (?!00)\d{2} - Group number: 01-99
- (?!0000)\d{4} - Serial number: 0001-9999
Gotcha: Handle SSN data with extreme care. Never log, store in plaintext, or transmit without encryption. For display purposes, mask all but the last four digits: ***-**-1234.
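The masking advice might be implemented along these lines. The function name mask_ssn is hypothetical, and the pattern gains a capture group around the serial number so the last four digits can be reused.

```python
import re

# Same pattern as above, with a capture group added around the serial number.
SSN = re.compile(r"^(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)(\d{4})$", re.ASCII)


def mask_ssn(value: str):
    """Return a display-safe mask, or None when the input is not a valid SSN."""
    match = SSN.fullmatch(value)
    if match is None:
        return None
    return f"***-**-{match.group(1)}"


print(mask_ssn("123-45-6789"))  # ***-**-6789
print(mask_ssn("666-12-3456"))  # None
```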
16. Whitespace Trimming and Normalization
^\s+|\s+$
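This matches leading and trailing whitespace for trimming. Replace every match with an empty string; in JavaScript that requires the g flag, otherwise only the first match (the leading whitespace) is removed. To normalize multiple spaces into a single space within a string: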
\s{2,}
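Replace matches of \s{2,} with a single space to clean up inconsistent spacing in user input, scraped content, or imported data. A lone tab or newline is left untouched; use \s+ instead if every run of whitespace should collapse to one space.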
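Both replacements can be chained. In this sketch, clean_spacing is a placeholder name; Python's re.sub replaces every match by default, so no extra flag is needed.

```python
import re


def clean_spacing(text: str) -> str:
    """Trim both ends, then collapse runs of 2+ whitespace characters."""
    trimmed = re.sub(r"^\s+|\s+$", "", text)
    return re.sub(r"\s{2,}", " ", trimmed)


print(repr(clean_spacing("  hello   world \n")))  # 'hello world'
print(repr(clean_spacing("a\tb")))                # 'a\tb' (single tab kept)
```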
17. IP Address with Port
^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?):(\d{1,5})$
Builds on the IPv4 pattern by adding a colon and port number (1-5 digits). To also validate the port range (1-65535), check programmatically after the regex match since regex-based numeric range validation gets unwieldy fast.
Matches: 192.168.1.1:8080, 10.0.0.1:443, 127.0.0.1:3000
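The programmatic port check might look like the following sketch, where the regex handles the shape and a plain integer comparison handles the 1-65535 range. The name parse_endpoint is an assumption made for this example.

```python
import re

IP_PORT = re.compile(
    r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?):(\d{1,5})$",
    re.ASCII,
)


def parse_endpoint(value: str):
    """Return (host, port) for a valid IPv4:port string, otherwise None."""
    if IP_PORT.fullmatch(value) is None:
        return None
    host, _, port_text = value.rpartition(":")
    port = int(port_text)
    if not 1 <= port <= 65535:
        return None  # five digits can still exceed the valid range
    return host, port


print(parse_endpoint("192.168.1.1:8080"))  # ('192.168.1.1', 8080)
print(parse_endpoint("10.0.0.1:70000"))    # None
```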
18. Markdown Links
\[([^\]]+)\]\(([^)]+)\)
Breakdown:
- \[([^\]]+)\] - Link text in square brackets (captured)
- \(([^)]+)\) - URL in parentheses (captured)
Matches: [NexTool](https://nextool.app), [click here](/page)
Group 1 gives you the link text, group 2 gives you the URL. Useful for converting Markdown to HTML or extracting links from Markdown documents.
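As a short sketch of the extraction use case, re.findall returns one (text, url) tuple per link when the pattern has two capture groups. The helper name extract_links is illustrative.

```python
import re

MD_LINK = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def extract_links(markdown: str):
    """Return a list of (link text, url) tuples in document order."""
    return MD_LINK.findall(markdown)


text = "See [NexTool](https://nextool.app) or [click here](/page)."
for label, url in extract_links(text):
    print(f"{label} -> {url}")
# NexTool -> https://nextool.app
# click here -> /page
```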
19. Semantic Versioning (SemVer)
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
This is the numbered-capture-group regex suggested on semver.org.
Matches: 1.0.0, 2.1.3-alpha.1, 0.0.1+build.123, 10.20.30-beta+exp.sha.5114f85
Does not match: 1.0 (missing patch), 01.0.0 (leading zero), v1.0.0 (prefix v)
Gotcha: Many projects use a v prefix (like v2.1.0). If you need to accommodate that, insert v? immediately after the ^ anchor.
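Here is a sketch of the pattern in use, with the optional v? placed right after the ^ anchor. The function name parse_version is made up for this example, and the five capture groups are major, minor, patch, pre-release, and build metadata.

```python
import re

SEMVER = re.compile(
    r"^v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$",
    re.ASCII,
)


def parse_version(text: str):
    """Return (major, minor, patch, prerelease, build) or None."""
    match = SEMVER.fullmatch(text)
    if match is None:
        return None
    major, minor, patch, prerelease, build = match.groups()
    return int(major), int(minor), int(patch), prerelease, build


print(parse_version("v2.1.0"))  # (2, 1, 0, None, None)
print(parse_version("10.20.30-beta+exp.sha.5114f85"))
# (10, 20, 30, 'beta', 'exp.sha.5114f85')
print(parse_version("01.0.0"))  # None
```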
20. JWT (JSON Web Token)
^[A-Za-z0-9_-]{2,}(?:\.[A-Za-z0-9_-]{2,}){2}$
A JWT consists of three Base64URL-encoded parts separated by dots: header, payload, and signature.
Matches: eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U
Gotcha: This only validates the structure, not the signature or payload. Always verify JWTs server-side using a proper library. Never trust a JWT just because it matches the regex format.
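To see what the structure check does and does not give you, the sketch below decodes the header segment using only the standard library. It performs no signature verification, so treat the result as untrusted input. The function name peek_jwt_header is invented for this example.

```python
import base64
import json
import re

JWT_SHAPE = re.compile(r"^[A-Za-z0-9_-]{2,}(?:\.[A-Za-z0-9_-]{2,}){2}$")


def peek_jwt_header(token: str):
    """Decode the header WITHOUT verifying anything; None if not decodable."""
    if JWT_SHAPE.fullmatch(token) is None:
        return None
    header_b64 = token.split(".")[0]
    # Base64URL in JWTs omits padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    try:
        return json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # covers bad Base64, bad UTF-8, and bad JSON
        return None


token = (
    "eyJhbGciOiJIUzI1NiJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIn0."
    "dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U"
)
print(peek_jwt_header(token))       # {'alg': 'HS256'}
print(peek_jwt_header("aa.bb.cc"))  # None (right shape, not a real header)
```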
Common Mistakes and Gotchas
Even experienced developers fall into these regex traps. Here are the most common mistakes and how to avoid them.
1. Forgetting to Escape the Dot
The dot (.) matches any character in regex, not just a literal period. This means a pattern like example.com will also match exampleXcom, example5com, and so on. Always use \. when you mean a literal dot.
# Wrong: matches "example" + any char + "com"
example.com
# Right: matches "example" + literal dot + "com"
example\.com
2. Greedy vs. Lazy Quantifiers
By default, quantifiers like * and + are greedy — they match as much as possible. This causes problems when you want to match the smallest possible segment.
# Greedy: matches from first <div> to LAST </div>
<div>.*</div>
# Lazy: matches from first <div> to NEAREST </div>
<div>.*?</div>
Add a ? after any quantifier to make it lazy (non-greedy).
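A quick way to see the difference is to run both versions over the same input. This is a demonstration only; as noted earlier, real HTML belongs in a parser.

```python
import re

html = "<div>one</div><div>two</div>"

greedy = re.findall(r"<div>.*</div>", html)
lazy = re.findall(r"<div>.*?</div>", html)

print(greedy)  # ['<div>one</div><div>two</div>']
print(lazy)    # ['<div>one</div>', '<div>two</div>']
```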
3. Missing Anchors
Without ^ (start) and $ (end) anchors, your pattern matches anywhere within the string. A pattern like \d{3} will match inside abc123def. If you want to validate that the entire string is exactly three digits, use ^\d{3}$.
4. Catastrophic Backtracking
Nested quantifiers like (a+)+ or (a|a)* can cause exponential processing time on certain inputs. This is called catastrophic backtracking and it can freeze your application or server.
# Dangerous: can cause catastrophic backtracking
^(a+)+$
# Safe: equivalent but without nested quantifiers
^a+$
Always test your patterns against adversarial inputs. If a pattern takes more than a few milliseconds on a short string, it probably has a backtracking problem.
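One rough way to test a pattern is to time it against an input built to fail at the very end. The sketch below does that with Python's backtracking re engine. The helper time_match and the input length of 18 are arbitrary choices for this example, and the absolute timings will vary by machine. Each additional a roughly doubles the work for the nested pattern.

```python
import re
import time


def time_match(pattern: str, text: str) -> float:
    """Return the seconds one match attempt takes."""
    compiled = re.compile(pattern)
    start = time.perf_counter()
    compiled.match(text)
    return time.perf_counter() - start


# A run of "a" followed by a character that forces the match to fail.
attack = "a" * 18 + "b"

nested = time_match(r"^(a+)+$", attack)
flat = time_match(r"^a+$", attack)

print(f"nested quantifier: {nested:.6f}s")
print(f"flat quantifier:   {flat:.6f}s")
```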
5. Over-Engineering with Regex
Not everything needs regex. If you just need to check whether a string contains a substring, use string.includes() in JavaScript or the in operator in Python. Regex adds complexity, is harder to debug, and is slower than simple string methods for straightforward checks.
Rule of thumb: If you can explain the pattern requirement in one sentence without using the word "or," you probably do not need regex.
Quick Reference Table
Here is a condensed cheat sheet of all 20 patterns for quick copy-paste access:
# 1. Email ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
# 2. URL ^https?:\/\/(www\.)?[a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&/=]*)$
# 3. IPv4 ^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
# 4. IPv6 ^([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}$
# 5. Phone ^\+?(\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9}$
# 6. Password ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
# 7. Date ISO ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
# 8. Date DD/MM ^(0[1-9]|[12]\d|3[01])[\/\-](0[1-9]|1[0-2])[\/\-]\d{4}$
# 9. Credit Card ^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|3[47]\d{13}|...)$
# 10. HTML Tags <([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>
# 11. Hex Color ^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$
# 12. File Ext ^[\w,\s-]+\.([a-zA-Z]{2,10})$
# 13. Username ^[a-zA-Z0-9_-]{3,20}$
# 14. Slug ^[a-z0-9]+(?:-[a-z0-9]+)*$
# 15. US SSN ^(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)\d{4}$
# 16. Trim Space ^\s+|\s+$
# 17. IP + Port ^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?):(\d{1,5})$
# 18. MD Links \[([^\]]+)\]\(([^)]+)\)
# 19. SemVer ^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-...)?(?:\+...)?$
# 20. JWT ^[A-Za-z0-9_-]{2,}(?:\.[A-Za-z0-9_-]{2,}){2}$
How to Actually Learn Regex
Copying patterns from an article is useful, but understanding why they work makes you dangerous. Here is the learning path that actually sticks:
- Start with literals. A regex is just a search pattern: hello matches the string "hello." That is it. No magic.
- Add character classes. Learn \d (digit), \w (word character), \s (whitespace), and their negated versions (\D, \W, \S).
- Learn quantifiers. * (zero or more), + (one or more), ? (zero or one), {n,m} (between n and m).
- Master groups and alternation. Parentheses () group parts, and | means "or."
- Understand anchors and boundaries. ^, $, and \b control where matches can occur.
- Tackle lookaheads and lookbehinds last. (?=...) and (?<=...) are powerful but rarely needed.
The most important step? Practice with real text. Open the regex tester, paste some sample data, and start writing patterns. There is no substitute for hands-on experimentation.
When Not to Use Regex
Regex is a tool, not a religion. Here are situations where you should reach for something else:
- Parsing HTML or XML: Use a DOM parser. Always. The famous Stack Overflow answer about parsing HTML with regex is not a joke: nested structures break regex fundamentally.
- Complex JSON manipulation: Use JSON.parse() and work with objects.
- Email validation in production: Send a verification email. The only way to know an email is valid is to deliver a message to it.
- URL parsing: Use the URL constructor. It handles edge cases regex cannot.
- Date validation: Use a date library. February 30th passes format validation but is not a real date.
Regex excels at pattern matching, extraction, and search-and-replace within known text structures. It falls apart when the structure is recursive, ambiguous, or context-dependent.
Wrapping Up
These 20 patterns cover the vast majority of regex tasks you will encounter in day-to-day development. Bookmark this page, keep the quick reference table handy, and remember: the best way to get comfortable with regex is to test patterns against real data.
Every pattern in this article is ready to copy, paste, and test. Open the NexTool Regex Tester, drop in your test strings, and see exactly what matches. No installation, no sign-up, instant feedback.
For string transformations like slugification, case conversion, or whitespace cleanup, the String Utilities tool handles the heavy lifting so you do not have to write your own replacement logic.
And if you are working with text data at scale — analyzing word frequency, character distributions, or readability scores — the Text Analyzer gives you instant metrics without writing a single line of code.
Regex does not have to be painful. With the right patterns and the right tools, it is one of the most powerful skills in your toolkit.