Regex Patterns
Practical regex patterns across languages and use cases. Use when validating input (email, URL, IP), parsing log lines, extracting data from text, refactoring code with search-and-replace, or debuggin
Practical regex patterns across languages and use cases. Use when validating input (email, URL, IP), parsing log lines, extracting data from text, refactoring code with search-and-replace, or debuggin
Real data. Real impact.
Emerging
Developers
Per week
Open source
Skills give you superpowers. Install in 30 seconds.
Practical regular expression cookbook. Patterns for validation, parsing, extraction, and refactoring across JavaScript, Python, Go, and command-line tools.
| Pattern | Matches | Example |
|---|---|---|
| Any character (except newline) | matches , |
| Digit | matches |
| Word char | matches |
| Whitespace | matches spaces/tabs |
| Word boundary | matches not |
| Start of line | matches line starting with Error |
| End of line | matches line ending with .js |
, , | Negated: non-digit, non-word, non-space |
| Pattern | Meaning |
|---|---|
| 0 or more (greedy) |
| 1 or more (greedy) |
| 0 or 1 (optional) |
| Exactly 3 |
| Between 2 and 5 |
| 3 or more |
, | Lazy (match as few as possible) |
| Pattern | Meaning |
|---|---|
| Capture group |
| Non-capturing group |
| Named group (Python) |
| Named group (JS/Go) |
| Alternation (a or b) |
| Character class (a, b, or c) |
| Negated class (not a, b, or c) |
| Range |
| Pattern | Meaning |
|---|---|
| Positive lookahead (followed by abc) |
| Negative lookahead (not followed by abc) |
| Positive lookbehind (preceded by abc) |
| Negative lookbehind (not preceded by abc) |
# Basic (covers 99% of real emails) ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$Stricter (no consecutive dots, no leading/trailing dots in local part)
^a-zA-Z0-9?@a-zA-Z0-9?(.[a-zA-Z]{2,})+$
# HTTP/HTTPS URLs https?://[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?)*(/[^\s]*)?With optional port and query
https?://[^\s/]+(/[^\s?])?(?[^\s#])?(#[^\s]*)?
# IPv4 \b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\bIPv4 (simple, allows invalid like 999.999.999.999)
\b\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\b
IPv6 (simplified)
(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}
# US phone (various formats) (?:\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4} # Matches: +1 (555) 123-4567, 555.123.4567, 5551234567International (E.164)
+[1-9]\d{6,14}
# ISO 8601 date \d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])ISO 8601 datetime
\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:.\d+)?(?:Z|[+-]\d{2}:\d{2})
US date (MM/DD/YYYY)
(?:0[1-9]|1[0-2])/(?:0[1-9]|[12]\d|3[01])/\d{4}
Time (HH:MM:SS, 24h)
(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d
# At least 8 chars, 1 upper, 1 lower, 1 digit, 1 special ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*()_+=-]).{8,}$
[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}
\bv?(\d+)\.(\d+)\.(\d+)(?:-([\w.]+))?(?:\+([\w.]+))?\b # Captures: major, minor, patch, prerelease, build # Matches: 1.2.3, v1.0.0-beta.1, 2.0.0+build.123
# Apache/Nginx access log # Format: IP - - [date] "METHOD /path HTTP/x.x" status size grep -oP '(\S+) - - \[([^\]]+)\] "(\w+) (\S+) \S+" (\d+) (\d+)' access.logExtract IP and status code
grep -oP '^\S+|"\s\K\d{3}' access.log
Syslog format
Format: Mon DD HH:MM:SS hostname process[pid]: message
grep -oP '^\w+\s+\d+\s[\d:]+\s(\S+)\s(\S+)[(\d+)]:\s(.*)' syslog
JSON log — extract a field
grep -oP '"level"\s*:\s*"\K[^"]+' app.log grep -oP '"message"\s*:\s*"\K[^"]+' app.log
# Find function definitions (JavaScript/TypeScript) grep -nP '(?:function\s+\w+|(?:const|let|var)\s+\w+\s*=\s*(?:async\s*)?\([^)]*\)\s*=>|(?:async\s+)?function\s*\()' src/*.tsFind class definitions
grep -nP 'class\s+\w+(?:\s+extends\s+\w+)?' src/*.ts
Find import statements
grep -nP '^import\s+.\s+from\s+' src/.ts
Find TODO/FIXME/HACK comments
grep -rnP '(?:TODO|FIXME|HACK|XXX|WARN)(?:([^)]+))?:?\s+' src/
Find console.log left in code
grep -rnP 'console.(log|debug|info|warn|error)(' src/ --include='.ts' --include='.js'
# Extract all email addresses from a file grep -oP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' file.txtExtract all URLs
grep -oP 'https?://[^\s<>"]+' file.html
Extract all quoted strings
grep -oP '"[^"\](?:\.[^"\])*"' file.json
Extract numbers (integer and decimal)
grep -oP '-?\d+.?\d*' data.txt
Extract key-value pairs (key=value)
grep -oP '\b(\w+)=([^\s&]+)' query.txt
Extract hashtags
grep -oP '#\w+' posts.txt
Extract hex colors
grep -oP '#[0-9a-fA-F]{3,8}\b' styles.css
// Test if a string matches const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/; emailRegex.test('user@example.com'); // true// Extract with capture groups const match = '2026-02-03T12:30:00Z'.match(/(\d{4})-(\d{2})-(\d{2})/); // match[1] = '2026', match[2] = '02', match[3] = '03'
// Named groups const m = 'John Doe, age 30'.match(/(?<name>[A-Za-z ]+), age (?<age>\d+)/); // m.groups.name = 'John Doe', m.groups.age = '30'
// Find all matches (matchAll returns iterator) const text = 'Call 555-1234 or 555-5678'; const matches = [...text.matchAll(/\d{3}-\d{4}/g)]; // [{0: '555-1234', index: 5}, {0: '555-5678', index: 18}]
// Replace with callback 'hello world'.replace(/\b\w/g, c => c.toUpperCase()); // 'Hello World'
// Replace with named groups '2026-02-03'.replace(/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/, '$<m>/$<d>/$<y>'); // '02/03/2026'
// Split with regex 'one, two; three'.split(/[,;]\s*/); // ['one', 'two', 'three']
import reMatch (anchored to start)
m = re.match(r'^(\w+)@(\w+).(\w+)$', 'user@example.com') if m: print(m.group(1)) # 'user'
Search (find first match anywhere)
m = re.search(r'\d{3}-\d{4}', 'Call 555-1234 today') print(m.group()) # '555-1234'
Find all matches
emails = re.findall(r'[\w.+-]+@[\w.-]+.\w{2,}', text)
Named groups
m = re.match(r'(?P<name>\w+)\s+(?P<age>\d+)', 'Alice 30') print(m.group('name')) # 'Alice'
Substitution
result = re.sub(r'\bfoo\b', 'bar', 'foo foobar foo')
'bar foobar bar'
Sub with callback
result = re.sub(r'\b\w', lambda m: m.group().upper(), 'hello world')
'Hello World'
Compile for reuse (faster in loops)
pattern = re.compile(r'\d{4}-\d{2}-\d{2}') dates = pattern.findall(log_text)
Multiline and DOTALL
re.findall(r'^ERROR.*$', text, re.MULTILINE) # ^ and $ match line boundaries re.search(r'start.*end', text, re.DOTALL) # . matches newlines
Verbose mode (readable complex patterns)
pattern = re.compile(r''' ^ # Start of string (?P<year>\d{4}) # Year -(?P<month>\d{2}) # Month -(?P<day>\d{2}) # Day $ # End of string ''', re.VERBOSE)
import "regexp"// Compile pattern (panics on invalid regex) re := regexp.MustCompile(
)\d{4}-\d{2}-\d{2}// Match test re.MatchString("2026-02-03") // true
// Find first match re.FindString("Date: 2026-02-03 and 2026-03-01") // "2026-02-03"
// Find all matches re.FindAllString(text, -1) // []string of all matches
// Capture groups re := regexp.MustCompile(
) match := re.FindStringSubmatch("user@example.com") // match[0] = "user@example.com", match[1] = "user", match[2] = "example"(\w+)@(\w+)\.(\w+)// Named groups re := regexp.MustCompile(
) match := re.FindStringSubmatch("2026-02-03") for i, name := range re.SubexpNames() { if name != "" { fmt.Printf("%s: %s\n", name, match[i]) } }(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})// Replace re.ReplaceAllString("foo123bar", "NUM") // "fooNUMbar"
// Replace with function re.ReplaceAllStringFunc(text, strings.ToUpper)
// Note: Go uses RE2 syntax — no lookahead/lookbehind
# grep -P uses PCRE (Perl-compatible — full features) # grep -E uses Extended regex (no lookahead/lookbehind)Find lines matching a pattern
grep -P '\d{3}-\d{4}' file.txt
Extract only the matching part
grep -oP '\d{3}-\d{4}' file.txt
Invert match (lines NOT matching)
grep -vP 'DEBUG|TRACE' app.log
sed replacement
sed 's/oldPattern/newText/g' file.txt # Basic sed -E 's/foo_([a-z]+)/bar_\1/g' file.txt # Extended with capture group
Perl one-liner (most powerful)
perl -pe 's/(?<=price:\s)\d+/0/g' file.txt # Lookbehind works in Perl
# Rename a variable across files grep -rlP '\boldName\b' src/ | xargs sed -i 's/\boldName\b/newName/g'Convert var to const (JavaScript)
sed -i -E 's/\bvar\b/const/g' src/*.js
Convert single quotes to double quotes
sed -i "s/'/"/g" src/*.ts
Add trailing commas to object properties
sed -i -E 's/^(\s+\w+:.+[^,])$/\1,/' config.json
Update import paths
sed -i 's|from '''../old-path/|from '''../new-path/|g' src/*.ts
Convert snake_case to camelCase (Python → JavaScript naming)
perl -pe 's/_([a-z])/uc($1)/ge' file.txt
# Remove trailing whitespace sed -i 's/[[:space:]]*$//' file.txtRemove blank lines
sed -i '/^$/d' file.txt
Remove duplicate blank lines (keep at most one)
sed -i '/^$/N;/^\n$/d' file.txt
Trim leading and trailing whitespace from each line
sed -i 's/^[[:space:]]//;s/[[:space:]]$//' file.txt
Remove HTML tags
sed 's/<[^>]*>//g' file.html
Remove ANSI color codes
sed 's/\x1b[[0-9;]*m//g' output.txt
Pattern: <.*> Input: <b>bold</b> Greedy matches: <b>bold</b> (entire string between first < and last >) Lazy matches: <b> (stops at first >) Pattern: <.*?> (lazy version)
Characters that need escaping in regex: . * + ? ^ $ { } [ ] ( ) | \ In character classes []: only ] - ^ \ need escapingTo match a literal dot: .
To match a literal *: *
To match a literal : \
To match [ or ]: [ or ]
By default . does NOT match newline. By default ^ and $ match start/end of STRING.To make . match newlines:
JavaScript: /pattern/s (dotAll flag) Python: re.DOTALL or re.S Go: (?s) inline flag
To make ^ $ match line boundaries:
JavaScript: /pattern/m (multiline flag) Python: re.MULTILINE or re.M Go: (?m) inline flag
# Catastrophic backtracking (avoid these patterns on untrusted input): (a+)+ # Nested quantifiers (a|a)+ # Overlapping alternation (.*a){10} # Ambiguous .* with repetitionSafe alternatives:
[a]+ # Instead of (a+)+ a+ # Instead of (a|a)+ [^a]*a # Possessive/atomic instead of .*a
\d+ is almost always enough — you rarely need [0-9]+.(?:...) when you don't need the captured value. It's slightly faster and cleaner.g flag for matchAll and global replace. Without it, only the first match is found/replaced.regexp package uses RE2 (no lookahead/lookbehind). If you need those, use a different approach or the regexp2 package.grep -P (PCRE) is the most powerful command-line regex. Use it over grep -E when you need lookahead, \d, or \b.re.VERBOSE in Python, /x in Perl) with comments explaining each part.No automatic installation available. Please visit the source repository for installation instructions.
View Installation Instructions1,500+ AI skills, agents & workflows. Install in 30 seconds. Part of the Torly.ai family.
© 2026 Torly.ai. All rights reserved.