Rust lang series episode #31 — regex (#rust-series)
Hi again everyone, here comes another Rust lang series episode, this time about Rust and regular expressions. Regular expressions (regex) are a standard tool for powerful string operations.
Although the String type provides methods like find and replace to search and replace text, whenever we need a more generic pattern, we need regular expressions. If you are not familiar with regular expressions, I advise you to get some basics first, for example here.
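For comparison, here is a minimal sketch of the plain find and replace methods from the standard library, which only handle literal text. The helper name find_and_replace is made up for this illustration and is not part of any library.

```rust
/// Hypothetical helper (not from the original post): looks up `from` in `text`
/// and returns its byte offset together with a copy of `text` in which every
/// occurrence of `from` is replaced by `to`.
fn find_and_replace(text: &str, from: &str, to: &str) -> (Option<usize>, String) {
    // find returns the byte offset of the first occurrence, or None.
    let position = text.find(from);
    // replace returns a new String; the original text is left untouched.
    let replaced = text.replace(from, to);
    (position, replaced)
}

fn main() {
    let text = "After all Steemit is a great platform";
    let (position, replaced) = find_and_replace(text, "great", "superb");
    println!("found at {:?}", position);
    println!("{}", replaced);
}
```

This works well for fixed strings, but it cannot express something like "four digits, a dash, two digits", which is where regular expressions come in.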
Dependencies
To use regular expressions in Rust, we must add the regex crate to our dependencies in Cargo.toml, because it is not part of the standard library.
[dependencies]
regex = "0.1.77"
Matching strings
extern crate regex;

use regex::Regex;

fn main() {
    // Four digits, a dash, two digits, a dash, two digits.
    let date_pattern = r"^\d{4}-\d{2}-\d{2}$";
    // Compiling can fail on an invalid pattern, so new() returns a Result.
    let res = Regex::new(date_pattern);
    if let Ok(regex) = res {
        println!("regex ok, trying to match...");
        let text = "2016-01-01";
        let matches = regex.is_match(text);
        if matches {
            println!("{} matches pattern {}", text, date_pattern);
        } else {
            println!("{} doesn't match pattern {}", text, date_pattern);
        }
    } else {
        // Print the compile error.
        println!("{:?}", res);
    }
}
Output
regex ok, trying to match...
2016-01-01 matches pattern ^\d{4}-\d{2}-\d{2}$
Breaking down
regex is a crate providing the regex::Regex API.
Regex::new() compiles a regular expression that can be used for further processing.
The is_match() method returns true if the pattern matches the given string, false otherwise.
Finding pattern
Regex provides multiple methods. One of the most useful is the find method, which locates a pattern inside a text. Let's look for "Steemit is", followed by a single word character, followed by "great".
extern crate regex;

use regex::Regex;

fn main() {
    // \s matches one whitespace character, \w one word character.
    let steem_pattern = r"Steemit is\s\w\sgreat";
    let res = Regex::new(steem_pattern);
    if let Ok(regex) = res {
        println!("regex ok, trying to find text");
        let text = "After all Steemit is a great platform";
        // find() returns the position of the first match, if there is one.
        let found = regex.find(text);
        if let Some(pos) = found {
            println!("'{}' pattern found in '{}' at {:?}", steem_pattern, text, pos);
        } else {
            println!("'{}' pattern not found in '{}'", steem_pattern, text);
        }
    } else {
        // Print the compile error.
        println!("{:?}", res);
    }
}
Output
regex ok, trying to find text
'Steemit is\s\w\sgreat' pattern found in 'After all Steemit is a great platform' at (10, 28)
Breaking down
Everything is much the same as in the previous example, except this time we call the find() method, which returns an Option. If the pattern is found, the result is Some with a tuple containing the start and end byte offsets of the match; otherwise it is None.
Other useful Regex methods
There are some other useful methods in Regex besides these two most commonly used ones. The most useful are:
- split - can split string based on regular expression
- replace - provides string replacement based on regular expression
- captures - provides string search with sub-matches (capture groups)
Postfix
That's all for now, thank you for your appreciation, feel free to comment and point out possible mistakes (the first 24 hours work best, but any time is fine). May Jesus bless your programming skills, use them wisely and see you next time.
Meanwhile you can also check the official documentation for additional related information: