Notepad++ is an awesome text editor and can do so much. The Find and Replace tools let you search and replace with regular expressions: set the Search Mode to "Regular expression" in the dialog (this is built in, so you do not need the TextFX plugin). This makes for a nearly limitless set of options! But you need to know how to use it… ha!
These are a few things I tend to run into or get hung up on, so I am passing them along.
Wildcard Match
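Let's say you want to find all the anchor tags in your code. You can use a wildcard in place of where the href URL would be.

    .*?

That wildcard expression says "get me anything that is in this position." The dot matches any single character, the asterisk repeats it zero or more times, and the question mark makes the match lazy, so it stops at the first point where the rest of the pattern can match.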
Example
    # Find
    <a href=".*?">

    # Matches
    <a href="http://www.example.com">

    # Does not match (single quotes)
    <a href='http://www.example.com'>
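If you want to try the wildcard outside of Notepad++, here is a minimal sketch using Python's `re` module. Python's regex engine is not the one Notepad++ uses, but for a simple pattern like this one I would expect them to behave the same. The two-link sample line is made up for illustration; it shows why the trailing `?` matters.

```python
import re

# The pattern from the example above, written as a Python raw string.
lazy_pattern = re.compile(r'<a href=".*?">')

samples = [
    '<a href="http://www.example.com">',  # double quotes
    "<a href='http://www.example.com'>",  # single quotes
]
matches = [bool(lazy_pattern.search(s)) for s in samples]
print(matches)  # [True, False]

# Two links on one line (hypothetical sample text).
line = '<a href="a.html">one</a> <a href="b.html">two</a>'

# Lazy: each match stops at the first closing ">
lazy = re.findall(r'<a href=".*?">', line)
# Greedy: without the "?", the match runs on to the last "> on the line.
greedy = re.findall(r'<a href=".*">', line)

print(lazy)    # ['<a href="a.html">', '<a href="b.html">']
print(greedy)  # ['<a href="a.html">one</a> <a href="b.html">']
```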
Grouping
You can then group things together using parentheses. Each group becomes a numbered item in the output (used when doing a find/replace). I'll explain how that works later on.
Continuing with the href URL example above, we will improve our regular expression a bit to handle whether the <a> tag uses single or double quotes.

    # Before
    ".*?"

    # After
    ("|').*?("|')

The double quotes in the regex become ("|'). The parentheses create a group. That group has two options, a double quote or a single quote, separated by a pipe. The pipe means "or". You can allow more options by adding another pipe and another alternative.
Example
    # Find
    <a href=("|'|&).*?("|'|&)>

    # Matches
    <a href="http://www.example.com">
    <a href='http://www.example.com'>
    <a href=&http://www.example.com&>

    # Does not match (* is not one of the options)
    <a href=*http://www.example.com*>
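The same check can be sketched in Python, again assuming its `re` module behaves like Notepad++ for this pattern. The last line uses a made-up tag with mismatched quotes to show one thing worth knowing: the two groups are independent, so the closing separator does not have to be the same character as the opening one.

```python
import re

# Three allowed separators on each side of the URL.
pattern = re.compile(r'''<a href=("|'|&).*?("|'|&)>''')

samples = [
    '<a href="http://www.example.com">',
    "<a href='http://www.example.com'>",
    '<a href=&http://www.example.com&>',
    '<a href=*http://www.example.com*>',  # * is not one of the options
]
results = [bool(pattern.search(s)) for s in samples]
print(results)  # [True, True, True, False]

# Hypothetical tag that opens with " and closes with ': it still matches.
mixed = bool(pattern.search('''<a href="http://www.example.com'>'''))
print(mixed)  # True
```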
Output
It is great that we can find stuff, but now let's make this more powerful. Let's do a find and a replace, which allows us to bulk edit a document.
I mentioned above that each time you group something with parentheses it creates an output group. This means you can use that output group in the replace portion of your find/replace tool in Notepad++.
To illustrate this, let's modify our last regular expression one more time by adding a third grouping around the URL.

    <a href=("|'|&)(.*?)("|'|&)>

So we created three groupings that can be used in the output. Using the groupings is fairly straightforward: you use a backslash followed by the number of the output group you'd like to use. Output numbers start at 1 and follow the order of the opening parentheses, so here the URL is group 2.
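To see which piece of text lands in which group, here is a small sketch in Python (an assumption on my part that you have Python handy; Notepad++ itself does not show the groups this way). `groups()` returns the three captured pieces in order, and `group(2)` picks out the URL.

```python
import re

# Separator, URL, separator: three groups, numbered from 1.
pattern = re.compile(r'''<a href=("|'|&)(.*?)("|'|&)>''')

m = pattern.search("<a href='http://www.example.com'>")

groups = m.groups()
url = m.group(2)

print(groups)  # ("'", 'http://www.example.com', "'")
print(url)     # http://www.example.com
```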
Example
    # Find
    <a href=("|'|&)(.*?)("|'|&)>

    # Replace
    <a href="\2" target="_blank">

Find/Replace #1

    <a href="http://www.example.com">
    # Output Group 1: "
    # Output Group 2: http://www.example.com
    # Output Group 3: "

    # Result
    <a href="http://www.example.com" target="_blank">

Find/Replace #2

    <a href='http://www.example.com'>
    # Output Group 1: '
    # Output Group 2: http://www.example.com
    # Output Group 3: '

    # Result
    <a href="http://www.example.com" target="_blank">

Find/Replace #3

    <a href=&http://www.example.com&>
    # Output Group 1: &
    # Output Group 2: http://www.example.com
    # Output Group 3: &

    # Result
    <a href="http://www.example.com" target="_blank">

Find/Replace #4

    <a href=*http://www.example.com*>
    # Does not match, so the line is left unchanged
In this example, we are looking for one of three separators to wrap the URL, but our replacement string always writes double quotes. We then use the output from group #2 to keep the same URL. Lastly, we add target="_blank" to the <a> tag.
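The same find/replace can be sketched with Python's `re.sub`, which also accepts `\2` in the replacement string. This is only a way to check the expected results outside the editor, not how Notepad++ works internally.

```python
import re

find = r'''<a href=("|'|&)(.*?)("|'|&)>'''
# Always write double quotes, keep the URL from group 2.
replace = r'<a href="\2" target="_blank">'

samples = [
    '<a href="http://www.example.com">',
    "<a href='http://www.example.com'>",
    '<a href=&http://www.example.com&>',
    '<a href=*http://www.example.com*>',  # no match, stays as it is
]

results = [re.sub(find, replace, s) for s in samples]
for line in results:
    print(line)
# <a href="http://www.example.com" target="_blank">
# <a href="http://www.example.com" target="_blank">
# <a href="http://www.example.com" target="_blank">
# <a href=*http://www.example.com*>
```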
If you wanted to keep the same URL wrapping separators you could easily do this as well:
    # Find
    <a href=("|'|&)(.*?)("|'|&)>

    # Replace
    <a href=\1\2\3 target="_blank">

Find/Replace #1

    <a href="http://www.example.com">
    # Output Group 1: "
    # Output Group 2: http://www.example.com
    # Output Group 3: "

    # Result
    <a href="http://www.example.com" target="_blank">

Find/Replace #2

    <a href='http://www.example.com'>
    # Output Group 1: '
    # Output Group 2: http://www.example.com
    # Output Group 3: '

    # Result
    <a href='http://www.example.com' target="_blank">

Find/Replace #3

    <a href=&http://www.example.com&>
    # Output Group 1: &
    # Output Group 2: http://www.example.com
    # Output Group 3: &

    # Result
    <a href=&http://www.example.com& target="_blank">

Find/Replace #4

    <a href=*http://www.example.com*>
    # Does not match, so the line is left unchanged
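Here is the separator-preserving version as a Python sketch, under the same assumption that `re.sub` treats `\1\2\3` the way Notepad++ does. Each tag keeps whatever separators it started with.

```python
import re

find = r'''<a href=("|'|&)(.*?)("|'|&)>'''
# Put back the opening separator, the URL, and the closing separator.
replace = r'<a href=\1\2\3 target="_blank">'

samples = [
    '<a href="http://www.example.com">',
    "<a href='http://www.example.com'>",
    '<a href=&http://www.example.com&>',
    '<a href=*http://www.example.com*>',  # no match, stays as it is
]

results = [re.sub(find, replace, s) for s in samples]
for line in results:
    print(line)
# <a href="http://www.example.com" target="_blank">
# <a href='http://www.example.com' target="_blank">
# <a href=&http://www.example.com& target="_blank">
# <a href=*http://www.example.com*>
```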
Remove the URL from an <a> HREF Tag
OK, let's do one last thing. Let's say we just want to remove the URL from the href of the <a> anchor tags. Simple enough once again.
Example Using Same HREF Separators
    # Find
    <a href=("|'|&).*?("|'|&)>

    # Replace (the URL is not grouped here, so \1 and \2 are the two separators)
    <a href=\1\2>

Find/Replace #1

    <a href="http://www.example.com">

    # Result
    <a href="">

Find/Replace #2

    <a href='http://www.example.com'>

    # Result
    <a href=''>

Find/Replace #3

    <a href=&http://www.example.com&>

    # Result
    <a href=&&>

Find/Replace #4

    <a href=*http://www.example.com*>
    # Does not match, so the line is left unchanged
Example Changing the HREF Separators
    Regular Expression (Find)
    <a href=("|'|&).*?("|'|&)>

    Regular Expression (Replace)
    <a href="">

    <a href="http://www.example.com"> - Matches
    Replaced Output: <a href="">

    <a href='http://www.example.com'> - Matches
    Replaced Output: <a href="">

    <a href=&http://www.example.com&> - Matches
    Replaced Output: <a href="">

    <a href=*http://www.example.com*> - Does Not Match
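Both ways of emptying the href can be sketched in Python as well (same caveat as before: this is `re.sub`, not Notepad++ itself). The first replacement keeps each tag's own separators, the second always writes double quotes.

```python
import re

# Only the separators are grouped; the URL is matched but not captured.
find = r'''<a href=("|'|&).*?("|'|&)>'''

samples = [
    '<a href="http://www.example.com">',
    "<a href='http://www.example.com'>",
    '<a href=&http://www.example.com&>',
    '<a href=*http://www.example.com*>',  # no match, stays as it is
]

# Keep the original separators: \1 is the opening one, \2 the closing one.
same_separators = [re.sub(find, r'<a href=\1\2>', s) for s in samples]
# Always use double quotes: no group references needed.
double_quotes = [re.sub(find, '<a href="">', s) for s in samples]

print(same_separators)
print(double_quotes)
```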
I hope this helps to clear up some confusion!
Other RegEx Patterns
Numbers
Let’s say you have a numbered list such as the following, and you want the text only.
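#1: - First Item
#2: - Second Item
#3: - Third Item

    Regular Expression (Find)
    #([0-9]*): - 

    Regular Expression (Replace)
    (this space intentionally left blank)

    Result:
    First Item
    Second Item
    Third Item

Include the space after the hyphen in the Find pattern if you do not want a leading space left in front of each item. The pattern uses a plain hyphen, so it only matches if the list itself uses plain hyphens.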
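As a quick check, the numbered-list cleanup can be sketched in Python too. I am assuming here that the list uses plain hyphens and that the pattern ends with a space after the hyphen; an empty replacement string stands in for leaving the Replace box blank.

```python
import re

text = "#1: - First Item\n#2: - Second Item\n#3: - Third Item"

# "#", any digits, ": - " (with the trailing space), replaced by nothing.
cleaned = re.sub(r'#([0-9]*): - ', '', text)

print(cleaned)
# First Item
# Second Item
# Third Item
```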