Regular Expressions: A Brief Introduction

July 4, 2022·2 min read

A regular expression, or regex, is a string of characters that specifies a search pattern. Regexes are used in text search, string validation, and lexical analysis. In the formal sense, the languages that regular expressions describe are exactly those recognized by finite-state automata.

Most programming languages support regex, either in the standard library or through widely used libraries: Python, C, C++, Java, JavaScript, and Dart among them.

Basics of regex

A single character is itself a regular expression: the pattern a matches the string "a". The alternation (boolean or) operator | matches either of two patterns, so cat|dog matches "cat" or "dog".
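To make these two rules concrete, here is a minimal sketch using Python's built-in re module. The helper name matches and the sample patterns are illustrative assumptions, not part of the original post.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern` (hypothetical helper)."""
    # re.fullmatch succeeds only when the pattern covers the entire string.
    return re.fullmatch(pattern, text) is not None


# A single character is a regex that matches itself.
single_char = matches("a", "a")

# The | operator accepts either alternative, and nothing else.
accepts_cat = matches("cat|dog", "cat")
accepts_dog = matches("cat|dog", "dog")
accepts_cow = matches("cat|dog", "cow")

print(single_char, accepts_cat, accepts_dog, accepts_cow)
```

Here fullmatch is used rather than search so that each result reflects the whole string, which keeps the behavior of | easy to see.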

Key operators

| Pattern | Meaning |
|---------|---------|
| [a-z] | Any lowercase letter |
| ^word | String begins with "word" |
| word$ | String ends with "word" |
| \d | Any digit |
| . | Any character |
| o{2} | Exactly two occurrences of "o" |
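The operators in the table can be tried out with a short script. This is a sketch in Python's re module; the sample strings and the helper name found are assumptions chosen for illustration.

```python
import re

# Each entry: (pattern, sample text, whether a match is expected somewhere in the text).
examples = [
    (r"[a-z]", "A1b", True),      # "b" is a lowercase letter
    (r"^word", "wordplay", True),  # text begins with "word"
    (r"^word", "password", False), # "word" is present, but not at the start
    (r"word$", "password", True),  # text ends with "word"
    (r"\d", "room 7", True),       # "7" is a digit
    (r".", "x", True),             # any single character
    (r"o{2}", "book", True),       # two consecutive "o" characters
    (r"o{2}", "bok", False),       # only one "o"
]


def found(pattern: str, text: str) -> bool:
    """Return True if `pattern` matches anywhere in `text` (hypothetical helper)."""
    return re.search(pattern, text) is not None


# Compare each actual result with the expectation listed above.
results = [found(pattern, text) == expected for pattern, text, expected in examples]
print(all(results))
```

Note that re.search looks for the pattern anywhere in the text, so only the ^ and $ anchors tie a match to the start or end of the string.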

Regex for validation

Say we want to validate emails for a specific domain, allowing only - and _ as special characters:

[A-Za-z0-9_-]+@mycompany\.com

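The + quantifier matches one or more occurrences of the element before it, here the character class. The class [A-Za-z0-9_-] matches alphanumeric characters plus underscore and dash; the dash sits last in the class so that it is read literally rather than as a range. The \. escapes the dot, since . alone matches any character. To validate a whole string rather than find the pattern somewhere inside it, anchor the pattern with ^ and $ or use your engine's full-match function.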
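As a rough sketch of how this pattern might be used for validation, the following Python snippet wraps it in a function. The name is_company_email is hypothetical, and the choice of fullmatch (so that the entire string must match) is an assumption about what "validate" should mean here.

```python
import re

# The pattern from the text: one or more allowed characters, then the fixed domain.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9_-]+@mycompany\.com")


def is_company_email(address: str) -> bool:
    """Return True if `address` consists solely of a valid local part and the domain."""
    # fullmatch rejects any extra text before or after the pattern.
    return EMAIL_PATTERN.fullmatch(address) is not None


candidates = [
    "jane_doe-1@mycompany.com",     # allowed characters only
    "jane.doe@mycompany.com",       # "." is not in the character class
    "jane@mycompanyXcom",           # the escaped dot must be a literal "."
    "jane@mycompany.com.evil.org",  # trailing text after the domain
    "@mycompany.com",               # + requires at least one character
]

for address in candidates:
    print(address, is_company_email(address))
```

With re.search instead of fullmatch, the fourth candidate would be accepted, because the pattern does occur inside it. That difference is why anchoring matters for validation.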

You can interact with this example on regex101.