Understanding Regular Expression

in #regex, 7 years ago

Hello, today I am going to help newbies and beginners understand:

What is a regular expression?
Why is it used?
How is it used?

So, let's get started.

First, let's understand what regex, or regular expression, means.

To my knowledge, a regular expression (regex) helps us describe complex patterns in text. Once the pattern is defined, it can be used for many things, such as searching, replacing, extracting, modifying, and validating text data.
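To make these uses concrete, here is a minimal sketch in Python (my choice of language, since any language with regex support would do). The sample text and the digit pattern are made up for illustration only.

```python
import re

# Hypothetical sample text, only for illustration.
text = "Order 66 shipped, order 7 pending"

# Searching: find the first run of digits.
first_number = re.search(r"[0-9]+", text).group()
print(first_number)  # 66

# Extracting: collect every run of digits.
all_numbers = re.findall(r"[0-9]+", text)
print(all_numbers)  # ['66', '7']

# Replacing / modifying: mask every run of digits.
masked = re.sub(r"[0-9]+", "#", text)
print(masked)  # Order # shipped, order # pending

# Validating: check that a whole string consists of digits only.
is_number = re.fullmatch(r"[0-9]+", "2024") is not None
not_number = re.fullmatch(r"[0-9]+", "20x4") is not None
print(is_number, not_number)  # True False
```

The same pattern is reused for all four jobs, which is the main attraction of regex: define the pattern once, then apply it in different ways.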

As we all know, structured data usually follows a specific pattern. If this still doesn't make sense, let's look at an example.

Let's consider a website URL like: www.steemit.com

If we observe closely, we can see that it has a pattern. Let's break that pattern down:

1st) www
2nd) The domain name; in our example it is steemit.
3rd) The extension of the domain; it can be com, net, edu, in, org, etc.

Note: They are all separated by a "."
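The three parts above could be written as a pattern roughly like the one below. This is only a sketch in Python: the characters I allow in the domain name (letters, digits, hyphen) and the helper name split_url are my own assumptions, and real URLs allow far more than this.

```python
import re

# www, then a domain name, then one of the listed extensions,
# each separated by a literal dot (written as \. in regex).
# The allowed domain characters are an assumption for this sketch.
URL_PATTERN = re.compile(r"^www\.([a-zA-Z0-9-]+)\.(com|net|edu|in|org)$")

def split_url(url):
    """Return (domain, extension) if url fits the pattern, else None."""
    match = URL_PATTERN.search(url)
    return match.groups() if match else None

print(split_url("www.steemit.com"))  # ('steemit', 'com')
print(split_url("steemit.com"))      # None (the www part is missing)
print(split_url("www.steemit.xyz"))  # None (xyz is not in the list)
```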

Now the biggest issue that we all face is how to define the regex pattern.

For example, below is a regex for validating an email address.

^[a-zA-Z0-9]{1,10}@[a-zA-Z]{1,10}\.(com|org)$

Since this is targeted at newbies and beginners, they might be scratching their heads wondering what the hell this is, and it probably doesn't make any sense to them at the moment. But let me tell you that this is a regular expression which can be used to validate an email address.

Note: Let me make one thing clear: a regular expression can only check the syntax of a pattern. It doesn't guarantee that the thing actually exists.

For example, [email protected] is a valid expression, but regex will not tell you whether this email ID actually exists or not.
Whereas xyz@@@xyz.com is an invalid email ID, as it doesn't match the expression.

So, let's dissect the expression so that it makes sense to you all.

Many regular expressions, no matter how complex they look, are built mostly from 3 kinds of symbols,
which are known as the BCD:

B stands for Brackets
C stands for Caret
D stands for Dollar

Now let's dive deeper into it.

As we now know, B stands for Brackets.

But brackets in regular expressions come in 3 types. They are as follows:

Square Brackets [], Curly Brackets {}, Round Brackets ()

Square brackets [] specify the characters which need to be matched.
Curly brackets {} specify how many characters are allowed. Lastly, round brackets () are used for grouping certain words together.
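Here is a small Python sketch of the three bracket types in action. The patterns and the helper name matches are made up for illustration; re.fullmatch is used so that the whole text has to fit the pattern.

```python
import re

def matches(pattern, text):
    """True if the whole text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

# Square brackets: which characters are allowed (here a, b or c).
print(matches(r"[abc]", "b"))  # True
print(matches(r"[abc]", "d"))  # False

# Curly brackets: how many characters are allowed (here 2 to 4 digits).
print(matches(r"[0-9]{2,4}", "123"))    # True
print(matches(r"[0-9]{2,4}", "1"))      # False
print(matches(r"[0-9]{2,4}", "12345"))  # False

# Round brackets: group alternatives together (cat or dog, then s).
print(matches(r"(cat|dog)s", "dogs"))   # True
print(matches(r"(cat|dog)s", "birds"))  # False
```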

So, that was all about Brackets, the B in regular expression. Now let's move ahead.

C stands for Caret ^, which marks the start of the text that the regular expression must match.

And lastly, D stands for Dollar $, which marks the end of the text that the regular expression must match.
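To see what the caret and the dollar change, compare the same pattern with and without them. This is a Python sketch with made-up sample strings.

```python
import re

# Without ^ and $ the pattern may match anywhere inside the text.
anywhere = re.search(r"[0-9]{3}", "abc12345")
print(anywhere.group())  # 123

# With ^ and $ the whole text must be exactly 3 digits.
anchored = re.search(r"^[0-9]{3}$", "abc12345")
print(anchored)  # None

exact = re.search(r"^[0-9]{3}$", "123")
print(exact.group())  # 123
```

One Python detail worth knowing: $ also matches just before a single trailing newline, so re.fullmatch is a stricter alternative when that matters.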

So, to sum up, the picture below shows a regular expression using the BCD of regular expression:

bcd.png

So, I hope the expression I showed you earlier for validating an email address is now making some sense to you. If not, let's understand it by dividing it into the BCD of regular expression.

^[a-zA-Z0-9]{1,10}@[a-zA-Z]{1,10}\.(com|org)$

This is our regular expression.

We now know that the caret "^" and the dollar "$" mark the start and the end of a regular expression, which can also be seen in the regex above.

After the caret we have square brackets [], which denote the characters that are allowed. In this case we have lower case a-z, upper case A-Z, and the numeric values 0-9. Any other character results in an invalid entry: if we pass any of !@#$%^&*(), it is treated as invalid, because only letters and digits are allowed in the square brackets.

After that we have curly brackets {}, which denote the number of characters allowed. We see 2 numbers separated by a comma, 1 and 10. This means a minimum of 1 character and a maximum of 10 characters are allowed.

Next comes a literal @, followed by a second pair of square and curly brackets that allows 1 to 10 letters. One caveat about the dot that follows: an unescaped . matches any single character, so it has to be written as \. to match only a literal dot.

Before the dollar we have round brackets (), which hold 2 domain extensions separated by the pipe sign |. The pipe means either of the two: com or org is allowed. And in the end we have the dollar $, which denotes the end of the expression.
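The breakdown above can be written out piece by piece in Python using the re.VERBOSE flag, which lets each part of a pattern carry its own comment. The names EMAIL and LOOSE and the sample addresses are hypothetical, chosen only for this sketch. The dot is escaped as \. in EMAIL; LOOSE keeps it unescaped to show the difference.

```python
import re

EMAIL = re.compile(r"""
    ^                  # Caret: start of the text
    [a-zA-Z0-9]{1,10}  # 1 to 10 letters or digits
    @                  # a literal at sign
    [a-zA-Z]{1,10}     # 1 to 10 letters
    \.                 # a literal dot (escaped)
    (com|org)          # either com or org
    $                  # Dollar: end of the text
""", re.VERBOSE)

# Same pattern, but with an unescaped dot, which matches ANY character.
LOOSE = re.compile(r"^[a-zA-Z0-9]{1,10}@[a-zA-Z]{1,10}.(com|org)$")

print(bool(EMAIL.search("user1@example.com")))  # True
print(bool(EMAIL.search("user1@exampleXcom")))  # False
print(bool(LOOSE.search("user1@exampleXcom")))  # True (X is accepted as the dot)
```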

So, let's sum it up.

If we evaluate these email addresses: [email protected] or [email protected] or [email protected].

These are all valid.

But if we try @[email protected] or [email protected]

These are invalid.

This is because our first square bracket only allows [a-zA-Z0-9] characters and our second square bracket only allows [a-zA-Z] characters. And only the com or org extensions are allowed, nothing else.
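If you want to try this yourself, here is a small Python sketch. The function name is_valid_email and all sample addresses except xyz@@@xyz.com are made up for illustration, and the dot in the pattern is escaped as \. so that only a real dot is accepted.

```python
import re

EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9]{1,10}@[a-zA-Z]{1,10}\.(com|org)$")

def is_valid_email(address):
    """Check only the syntax; this says nothing about whether the mailbox exists."""
    return EMAIL_PATTERN.search(address) is not None

# Hypothetical sample addresses with the reason for each result.
samples = [
    ("john@gmail.com", "letters, @, letters, .com"),
    ("user123@example.org", "digits are fine before the @"),
    ("xyz@@@xyz.com", "extra @ signs are not in the second bracket"),
    ("john.doe@gmail.com", "a dot is not in the first bracket"),
    ("john@mail1.com", "a digit is not in the second bracket"),
    ("john@gmail.net", "only com or org is allowed"),
]

for address, reason in samples:
    verdict = "valid" if is_valid_email(address) else "invalid"
    print(f"{address}: {verdict} ({reason})")
```

The first two samples come out valid and the remaining four invalid, each for the reason given next to it.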

So, I hope this is helpful. Kindly comment if anything does not make sense, and do post any constructive feedback or tell me which areas I need to focus on more, as this is my first article.

Regards

It is good to know that not everything could be 'matched' using regular expressions.

Well, not exactly. I think we can validate/match many things using regex. Do post some examples which cannot be matched using regular expressions.
