Regular expressions, or regex, are “patterns” that allow you to search for words with certain properties. Here are some examples of regular expressions:
- ion: searches for words that contain the string “ion” in any position
- ion$: words ending with “ion” ($=end of word)
- ^anti: searches for all words beginning with “anti” (^=beginning of word)
- ^house$: searches exactly for the word “house”
- ^p.*r$: words starting with “p” and ending with “r” (*=repetitions – 0 or more – of the previous character, here ‘.’, meaning any character)
- [aeiou][aeiou]: words containing (at least) 2 successive vowels ([]=character class; [abc] means the same as (a|b|c))
- [ptkbdg][ptkbdg]: words containing a consonant cluster formed of two plosives (search in the phon field)
- oid|ion|ein: searches for words containing (at least) one of the three strings “oid”, “ion” or “ein” (|=or). For example, you can search for words that contain dog, cat or rabbit with the regex dog|cat|rabbit
- ^(day|night|morning|evening)$: exactly the four words “day”, “night”, “morning” or “evening” (useful for searching a list of words)
- p.t: searches for words containing a “p”, followed by any single character, then a “t” (the dot corresponds to any character)
- ^p...r$: words beginning with “p”, followed by exactly three characters, and ending with “r” (each . in a regex corresponds to one character of any kind)
- ^[aeiou]: words beginning with a vowel
- ^[aeiou]+$: words containing only vowels
- ^[^aeiou]: words not beginning with a vowel
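Several of the patterns above can be tried out in a few lines of Python. This is only a sketch: the word list and the helper name `matching` are made up for illustration, and Python's `re.search` is used because an unanchored pattern is meant to match anywhere inside the word.

```python
import re

# Hypothetical word list, chosen only to illustrate the patterns above.
WORDS = ["house", "houses", "antihero", "nation", "poster",
         "paper", "pit", "aoi", "tree", "day"]

def matching(pattern, words=WORDS):
    """Return the words in which `pattern` matches somewhere."""
    regex = re.compile(pattern)
    return [w for w in words if regex.search(w)]

print(matching(r"ion$"))            # words ending with "ion"
print(matching(r"^anti"))           # words beginning with "anti"
print(matching(r"^house$"))         # exactly the word "house"
print(matching(r"^p.*r$"))          # starts with "p", ends with "r"
print(matching(r"^p...r$"))         # "p", three characters, "r"
print(matching(r"[aeiou][aeiou]"))  # two successive vowels
print(matching(r"^[aeiou]+$"))      # only vowels
print(matching(r"^[^aeiou]"))       # does not begin with a vowel
```

With this list, `^p.*r$` keeps both "poster" and "paper", while `^p...r$` keeps only "paper", because "poster" has four characters between the "p" and the "r".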
There are many tutorials on regex on the web, such as this page or this video. The following course is also recommended: http://regextutorials.com/intro.html
The bible on regex is the book Mastering Regular Expressions, by Jeffrey E. F. Friedl.
A regex describes a finite-state automaton. The site https://regexper.com/ allows you to visualize the automaton associated with your regex. For example:
[ptk].*[aiou][aiou].?ion$ corresponds to the finite automaton:
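The behaviour of this pattern can also be checked directly in code. The following Python sketch is only an illustration: the test words and the helper name `accepts` are assumptions, not part of the original example.

```python
import re

# The pattern from the example above: a "p", "t" or "k", then anything,
# then two successive vowels among a/i/o/u, an optional character,
# and finally "ion" at the end of the word.
PATTERN = re.compile(r"[ptk].*[aiou][aiou].?ion$")

def accepts(word):
    """Return True if the pattern matches somewhere in `word`."""
    return PATTERN.search(word) is not None

# Hypothetical test words.
for word in ["tuition", "situation", "precaution",
             "nation", "exhaustion", "opinion"]:
    print(word, accepts(word))
```

Here "tuition", "situation" and "precaution" are accepted (for instance "precaution" splits as p + rec + a + u + t + ion), while "nation", "exhaustion" and "opinion" are rejected because no p/t/k is followed by two successive vowels in the right position before the final "ion".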
