Download as pdf or txt
Download as pdf or txt
You are on page 1of 2

5/10/2020 Using Regular Expressions in Java | Reading 3: Regular Expressions & Grammars | 6.005.

2x Courseware | edX

Course  Readings  Readin…  Using R…

Using Regular Expressions in Java


Using regular expressions in Java
Regular expressions (“regexes”) are widely used in programming, and you should have
them in your toolbox.

In Java, you can use regexes for manipulating strings (see String.split, String.matches,
java.util.regex.Pattern). They're built-in as a rst-class feature of modern scripting
languages like Python, Ruby, and Javascript, and you can use them in many text editors
for nd and replace. Regular expressions are your friend! Most of the time. Here are
some examples.

Replace all runs of spaces with a single space:

String singleSpacedString = string.replaceAll(" +", " ");

Match a URL:

Pattern regex = Pattern.compile("http://([a-z]+\\.)+[a-z]+(:[0-9]+)?/");


Matcher m = regex.matcher(string);
if (m.matches()) {
// then string is a url
}

Extract part of an HTML tag:

https://courses.edx.org/courses/course-v1:MITx+6.005.2x+1T2017/courseware/Readings/03-Regular-Expressions-Grammars/?child=first 1/2
5/10/2020 Using Regular Expressions in Java | Reading 3: Regular Expressions & Grammars | 6.005.2x Courseware | edX

Pattern regex = Pattern.compile("<a href=['\"]([^'\"]*)['\"]>");


Matcher m = regex.matcher(string);
if (m.matches()) {
String url = m.group(1);
// Matcher.group(n) returns the nth parenthesized part of the regex
}

Notice the backslashes in the URL and HTML tag examples. In the URL example, we
want to match a literal period ., so we have to rst escape it as \. to protect it from
being interpreted as the regex match-any-character operator, and then we have to
further escape it as \\. to protect the backslash from being interpreted as a Java string
escape character. In the HTML example, we have to escape the quote mark " as \" to
keep it from ending the string. The frequency of backslash escapes makes regexes still
less readable.

Discussion Hide Discussion


Topic: Reading 3 / Using Regular Expressions in Java

Show all posts by recent activity

 Why escape twice in the URL example? 4


What does it mean by "then we have to further escape it as \\. to protect the backslash from bein…

 Extract Part of an HTML Tag


4
Hello All, In the example provided (see bottom of post) I believe I understand why we are escapin…

 Match href attribute content - unclear about the middle text part of anchor tag
regex 2
If it is allowed to use a single or double quote in the rst and last section after href= why would t…

 HTML Tag 3
Tried the exapmle for the HTML part, but it only works if I use m. nd(), m.matches() returns alwa…

 Regular expressions are your friend! 1

© All Rights Reserved

https://courses.edx.org/courses/course-v1:MITx+6.005.2x+1T2017/courseware/Readings/03-Regular-Expressions-Grammars/?child=first 2/2

You might also like