Learn Python Series (#3) - Handling Strings Part 2

in #utopian-io6 years ago (edited)

What Will I Learn?

  • You will learn about some more sequence type operations,
  • about how to use built-in string manipulation methods,
  • and about built-in string Truth Value Testing methods.

Requirements

  • A working modern computer running macOS, Windows or Ubuntu
  • An installed Python 3(.6) distribution, such as (for example) the Anaconda Distribution
  • The ambition to learn Python programming

Difficulty

Basic / Intermediate

Curriculum (of the Learn Python Series):

Learn Python Series (#3) - Handling Strings Part 2

In the previous tutorial episode about how to handle strings in Python 3, we covered the creation of strings, how to concatenate and repeat them to form longer strings, how to index and slice substrings, and how to format the printing of strings via C-like printing and the newer format() method.

In this tutorial episode we'll expand our knowledge by looking into (some of) the built-in sequence type and/or string manipulation functions / methods.

Some more common sequence operations

In the Intro episode we covered a lot, but regarding common sequence operations, the following ones were not explicitly covered yet:

The count() method

The count(x) method returns the number of occurrences of the argument x in the sequence on which count() is called. For example:

# Count how often the character 'l'
# is found in the word 'hello'
word = 'hello'
num_l = word.count('l')
print(num_l)
# 2

# This also works on longer substrings
# In this case, we're counting the occurrences
# of the substring 'll' in the 'hello'
num_ll = word.count('ll')
print(num_ll)
# 1
2
1
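Because count() is a general sequence operation, it can be called on lists as well. The sketch below (with made-up variable names) also shows a detail that's easy to miss: on strings, occurrences are counted without overlap.

```python
# count() on a list: how often does the value 3 occur?
numbers = [1, 3, 3, 7, 3]
num_threes = numbers.count(3)
print(num_threes)
# 3

# On strings, occurrences are counted without overlap:
# 'aaaa' contains 'aa' at index 0 and at index 2,
# the overlapping match at index 1 is not counted
num_aa = 'aaaa'.count('aa')
print(num_aa)
# 2
```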

The in operation

We haven't yet explicitly covered the in operation, although it was used when introducing for loops already.
For example:

some_numbers = [1, 3, 5, 7, 9]
for i in some_numbers:
    print(i)

# 1
# 3
# 5
# 7
# 9
1
3
5
7
9

The statement for i in some_numbers:, or in pseudo-code for each_element in sequence:, means that Python traverses every element contained in that sequence and assigns each element's value, in turn, to a variable name (i, in this case). That name can then be used throughout the loop body.

But the in keyword can also be used without a for loop. It then tests if the sequence contains a value, and it returns True or False. For example:

some_numbers = [1, 3, 5, 7, 9]
test1 = 1 in some_numbers
test2 = 2 in some_numbers

print(test1, test2)
# True False

if 3 in some_numbers:
    print('Hi'*3)
# HiHiHi
True False
HiHiHi
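Since strings are sequences too, in also works on strings, where it tests for a substring rather than for a single element. A small sketch (the example sentence is just an assumption for illustration):

```python
sentence = 'I love Python!'

# Test if a substring is contained in the string
has_python = 'Python' in sentence
has_java = 'Java' in sentence
print(has_python, has_java)
# True False

# The test is case-sensitive
has_lower = 'python' in sentence
print(has_lower)
# False
```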

The not in operation

Returns the opposite of in: True if the sequence does not contain a value, and False if it does.

some_numbers = [1, 3, 5, 7, 9]
test1 = 1 not in some_numbers
test2 = 2 not in some_numbers

print(test1, test2)
# False True
False True

Built-in string methods

str.find()

Usage: str.find(sub[, start[, end]])

find() searches for a substring inside a larger string. It is similar to the in operator, but instead of True or False it returns the lowest index (position) at which the substring was found, searching from left to right. If find() doesn't find the substring, it returns -1. The optional start and end arguments limit the search to that part of the string.

str1 = "I love Python!"
index = str1.find('o')
print(index)
# 3
# Notice there are two 'o's in the string,
# index 3 is the lowest index an 'o' was found.

index = str1.find('Py')
print(index)
# 7
3
7
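The optional start argument and the -1 return value were not shown above, so here is a minimal sketch of both, reusing the same example string:

```python
str1 = "I love Python!"

# Start searching at index 4, so the 'o' at index 3 is skipped
second_o = str1.find('o', 4)
print(second_o)
# 11

# A substring that isn't in the string returns -1
missing = str1.find('Java')
print(missing)
# -1

# A common pattern: test for -1 before using the index
if missing == -1:
    print('Not found')
# Not found
```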

str.rfind()

Usage: str.rfind(sub[, start[, end]])

Just like str.find(), but rfind() returns the highest index at which the substring was found. It also returns -1 if the substring isn't found.

str1 = "I love Python!"
index = str1.rfind('o')
print(index)
# 11
# Notice there are two 'o's in the string,
# index 11 is the highest index an 'o' was found.
11
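One situation where the highest index is handy is when only the last occurrence matters. The sketch below combines rfind() with slicing from Part 1 to get the part of a file name after its last dot. The file name is a made-up example.

```python
filename = 'archive.tar.gz'

# Find the position of the last dot
last_dot = filename.rfind('.')
print(last_dot)
# 11

# Slice everything after that dot
extension = filename[last_dot + 1:]
print(extension)
# gz
```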

str.replace()

Usage: str.replace(old,new[,count])

The replace() method searches the string for occurrences of the substring old and returns a copy of the string in which those occurrences are replaced by the substring new. The original string itself is not changed.

Optionally we can add a count (integer) argument, to only replace the first count occurrences of the substring found.

str_original = 'Mary had a little lamb'

# Let's change `lamb` with `dog`
str_new1 = str_original.replace('lamb', 'dog')
print(str_new1)
# Mary had a little dog

# Let's search for a substring that isn't in the string
str_new2 = str_original.replace('cat', 'fish')
print(str_new2)
# PS: nothing was replaced
# Mary had a little lamb

# Let's replace only the first `a` with an `o`
str_new3 = str_original.replace('a', 'o', 1)
print(str_new3)
# PS: Only Mary's name will be changed to Mory now...
# Mory had a little lamb
Mary had a little dog
Mary had a little lamb
Mory had a little lamb
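To make the "copy" behavior explicit, the following sketch replaces every `a` (no count argument) and then prints the original string again to show it was left untouched:

```python
str_original = 'Mary had a little lamb'

# Without a `count` argument, every occurrence is replaced
str_all = str_original.replace('a', 'o')
print(str_all)
# Mory hod o little lomb

# The original string is unchanged
print(str_original)
# Mary had a little lamb
```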

str.join()

Usage: str.join(iterable)

The join() method iterates over its iterable argument, which can be any iterable as long as it contains only strings, and concatenates the elements it finds (for a string: its characters, for a list: its elements) into one new string. The string on which join() is called is used as the separator between those elements.

# Let's add a space separator
str_original = 'Let the sun shine'
str_new = ' '.join(str_original)
print(str_new)
# L e t   t h e   s u n   s h i n e

# Let's add `*-*` as the separator
str_original = 'Stars and Stripes'
str_new = '*-*'.join(str_original)
print(str_new)
# PS: please notice no separators are added after the final
# iterated character.
# S*-*t*-*a*-*r*-*s*-* *-*a*-*n*-*d*-* *-*S*-*t*-*r*-*i*-*p*-*e*-*s

# If we pass in a list of strings, not every character
# but every list element is split by the separator string
list_original = ['Python', 'Is', 'Fun']
str_new = ', '.join(list_original)
print(type(str_new), str_new)
# And indeed, a string is returned, not a list:
# <class 'str'> Python, Is, Fun
L e t   t h e   s u n   s h i n e
S*-*t*-*a*-*r*-*s*-* *-*a*-*n*-*d*-* *-*S*-*t*-*r*-*i*-*p*-*e*-*s
<class 'str'> Python, Is, Fun
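The "as long as it contains strings" requirement is worth a quick illustration. In this sketch, joining a list of integers raises a TypeError, and converting each number with str() first avoids that. The try/except block and the generator expression have not been covered in this series yet; they are only used here to keep the example runnable.

```python
numbers = [1, 3, 5]

# Joining non-string elements raises a TypeError
error_raised = False
try:
    ', '.join(numbers)
except TypeError:
    error_raised = True
print(error_raised)
# True

# Convert every element to a string first
str_numbers = ', '.join(str(n) for n in numbers)
print(str_numbers)
# 1, 3, 5
```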

str.split()

Usage: str.split(sep=None, maxsplit=-1)

The split() method is, more or less, the opposite of join(). It splits the string on which it is called at every occurrence of the separator sep and returns the resulting pieces as a list. The optional maxsplit argument limits the number of splits.

str_original = 'Steem on, Dream on'

# Split string into words
# by not passing in a separator (`sep`) argument,
# therefore runs of whitespace are used as the separator
list_new1 = str_original.split()
print(list_new1)
# Notice that the comma behind the word `on` is 
# treated as part of the word `on`
# ['Steem', 'on,', 'Dream', 'on']

# Now let's split by the string `', '`
list_new2 = str_original.split(sep=', ')
print(list_new2)
# ['Steem on', 'Dream on']
['Steem', 'on,', 'Dream', 'on']
['Steem on', 'Dream on']
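A few more details of split(), sketched with the same example string: the maxsplit argument, the difference between passing no separator and passing a single space, and a round trip back through join().

```python
str_original = 'Steem on, Dream on'

# Split at most once, on the first space
first_split = str_original.split(' ', 1)
print(first_split)
# ['Steem', 'on, Dream on']

# Without `sep`, runs of whitespace count as one separator,
# with sep=' ' every single space splits
spaced = 'a  b'
no_sep = spaced.split()
space_sep = spaced.split(' ')
print(no_sep, space_sep)
# ['a', 'b'] ['a', '', 'b']

# Splitting into words and joining with a space again
# gives back the original string here
rejoined = ' '.join(str_original.split())
print(rejoined == str_original)
# True
```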

str.strip()

Usage: str.strip([chars])

The strip() method can be used to "strip away" unwanted characters from a string. The chars argument is optional and can be omitted. If omitted, all whitespace (spaces, tabs, newlines) at the beginning and at the end of the string will be removed, but not the whitespace in between words inside the string.

The strip() method can seem a little "strange" the first time you use it with the chars argument, because chars is treated as a set of individual characters, not as a prefix or suffix. It works as follows:

  • strip() checks the string from the beginning, character by character, and removes every character that is contained in the chars string,
  • the order of the characters inside chars doesn't matter, and a character that occurs several times in a row is removed every time,
  • as soon as a character is found that is not contained in the chars string, the stripping stops,
  • the same is then done from the end of the string, working backwards.

Hopefully the following examples make things a bit clearer:

# Using `strip()` without a `chars` argument, 
# to remove spaces from the beginning and the end.
orig = '    This  is  an example     string.    '
stripped = orig.strip()
print(stripped)
# This  is  an example     string.

# This would work to strip to a domain
url = 'www.steemit.com'
domain = url.strip('w.')
print(domain)
# steemit.com

# And this as well
# PS: I now added a trailing `/` slash
url = 'http://www.steemit.com/'
domain = url.strip('htp:/w.')
print(domain)
# steemit.com

# But this would not
url = 'https://www.steemit.com'
domain = url.strip('htps:/w.')
print(domain)
# eemit.com
This  is  an example     string.
steemit.com
steemit.com
eemit.com
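Python also has one-sided variants, lstrip() and rstrip(), which only strip from the beginning or only from the end. The sketch below uses repr() so that the remaining whitespace is visible in the output. The example strings are made up.

```python
padded = '\t  hello world \n'

# strip() removes tabs and newlines too, not only spaces
both = padded.strip()
print(repr(both))
# 'hello world'

# lstrip() only strips from the beginning
left = padded.lstrip()
print(repr(left))
# 'hello world \n'

# rstrip() only strips from the end
right = padded.rstrip()
print(repr(right))
# '\t  hello world'

# With a `chars` argument, only the left side is touched
domain = 'www.steemit.com'.lstrip('w.')
print(domain)
# steemit.com
```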

str.lower()

Converts and returns a string where all cased characters become lower case.

message = 'Welcome to Utopian.io!'
lower_case = message.lower()
print(lower_case)
# welcome to utopian.io!
welcome to utopian.io!

str.upper()

Converts and returns a string where all cased characters become upper case.

message = 'Welcome to Utopian.io!'
upper_case = message.upper()
print(upper_case)
# WELCOME TO UTOPIAN.IO!
WELCOME TO UTOPIAN.IO!
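A typical use for lower() (or upper()) is comparing text without caring about case. A minimal sketch, assuming a made-up `answer` string:

```python
answer = 'YES'

# Compare in lower case, so 'YES', 'Yes' and 'yes' all match
is_yes = answer.lower() == 'yes'
print(is_yes)
# True

# The original string is not changed by lower()
print(answer)
# YES
```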

Truth Value Testing String methods

str.isdigit()

Returns True if the string contains only digits (and at least one character), False otherwise.

str1 = "ABC is easy as 123"
test1 = str1.isdigit()
print(test1)
# False

str2 = "1234567890"
test2 = str2.isdigit()
print(test2)
# True
False
True
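isdigit() is often used to check a string before converting it with int(). The sketch below uses a made-up `user_input` value, and also shows that a minus sign, a decimal point and an empty string all make isdigit() return False:

```python
user_input = '42'

# Only convert when the string consists of digits
number = None
if user_input.isdigit():
    number = int(user_input)
print(number)
# 42

# A minus sign or a decimal point is not a digit,
# and an empty string contains no digits at all
negative = '-5'.isdigit()
decimal = '3.14'.isdigit()
empty = ''.isdigit()
print(negative, decimal, empty)
# False False False
```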

str.isalpha()

Returns True if the string contains only alphabetic characters, False otherwise.

str3 = 'Three is the magic number'
test3 = str3.isalpha()
print(test3)
# False <= there are also spaces in the string!

str4 = 'Three'
test4 = str4.isalpha()
print(test4)
# True
False
True
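If the spaces are the only thing standing in the way, one option is to combine isalpha() with split() from earlier in this episode and test every word separately. This is just a sketch of that idea:

```python
sentence = 'Three is the magic number'

# Test every word separately, so the spaces don't count
all_words_alpha = True
for word in sentence.split():
    if not word.isalpha():
        all_words_alpha = False
print(all_words_alpha)
# True
```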

str.islower()

Returns True if all cased characters ("letters") inside the string are of lower case, False otherwise.

str5 = '123 this is a test'
test5 = str5.islower()
print(test5)
# True

str6 = 'Testing Testing 123'
test6 = str6.islower()
print(test6)
# False
True
False

str.isupper()

Returns True if all cased characters ("letters") inside the string are of upper case, False otherwise.

str7 = 'JAMES BOND 007'
test7 = str7.isupper()
print(test7)
# True

str8 = 'Testing Testing 123'
test8 = str8.isupper()
print(test8)
# False
True
False
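Note that islower() and isupper() are not each other's opposite. As this sketch shows, both return False for a string without any cased characters, and also for a string that mixes upper and lower case:

```python
# No cased characters at all
digits_only = '12345'
digits_lower = digits_only.islower()
digits_upper = digits_only.isupper()
print(digits_lower, digits_upper)
# False False

# Mixed upper and lower case
mixed = 'Testing'
mixed_lower = mixed.islower()
mixed_upper = mixed.isupper()
print(mixed_lower, mixed_upper)
# False False
```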

What did we learn, hopefully?

Handling Strings Part 1 covered the indexing, slicing, (repeated) concatenation and formatting of strings, and the Intro episode covered the most essential Python mechanisms. On top of that we've now covered some more (general) sequence operations: in, not in and the occurrence-counting count() method, which can of course be applied to strings as well. We've also discussed some built-in string manipulation and truth value testing methods.

That's quite a lot we've learned already! In the next episode, which is a Round-Up episode, I'll show you what you can already create using only what's been covered so far! Come and find out! See you there!

Thank you for your time!



Posted on Utopian.io - Rewarding Open Source Contributors

Thank you for the contribution. It has been approved.

Somehow I never realised that is how str.strip([chars]) works (no idea why I thought it worked differently), so thanks for enlightening me!

You can contact us on Discord.
[utopian-moderator]

Yup! It scans & strips from the left, and it scans & strips from the right. Although I don't see too many use-cases for it actually, at least not in the way I code. But that could just be me I suppose! ;-)

Great post! I didn't know about that behavior of the strip function (I had only been using it to erase spaces from the start and the end of a string), but I can see it being useful. For example in text formatting and validation, if you want to strip the occurrence of some phrase at the beginning or the end of the string.

Without that function you could do

"the phrase here, and then the all text".replace ("the phrase here", "")

but it will erase ALL occurrences of the phrase in the text. So you could end up doing

"".join("the phrase here, and then the all text".split("the phrase here")[1:])

But what if you want to strip a phrase at the end of the text? Here the strip function is more useful, because without it we would need to reverse the text and also the phrase to be stripped so it can work. Code:

"".join("the all text, the phrase at the end"[::-1].split("the phrase at the end"[::-1]))[::-1]

In the last case it's much better to just use

"the phrase here, and then the all text".rstrip("the phrase here")

By the way, there are also the lstrip (left strip) and rstrip (right strip) functions, if you only want to strip phrases at the beginning or the end of the string respectively, like I did in the last example ;)

Hi, thx for your lengthy response, and welcome to Steemit by the way!
Besides strip(), I do know about lstrip() and rstrip(). ;-)

I don't see too many use-cases for strip() and its left/right brother and sister. Because when working with lots of (unstructured) textual data, you don't know upfront what can or cannot be stripped without inspecting the data, either manually or with a bunch of algos, and for the latter other methods are far more convenient / precise.

Ref. (as written above):

url = 'https://steemit.com'
domain = url.strip('htps:/')
print(domain)
# eemit.com

When trying to use strip() for parsing URL strings for example, in order to remove the protocol / scheme from the URL, strip() fails when the domain begins with either an 'h', 't', 'p', or 's'.
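For comparison, here is a minimal sketch of one alternative that only uses find() and slicing. It is not the parse_url() function from the Round-Up episode, just an illustration, and it assumes the scheme is always followed by `://`.

```python
url = 'https://steemit.com'
marker = '://'

# Look for the end of the protocol / scheme
pos = url.find(marker)
if pos != -1:
    # Keep everything after the '://' marker
    domain = url[pos + len(marker):]
else:
    # No scheme found, keep the string as it is
    domain = url
print(domain)
# steemit.com
```

Because the slice starts right after the marker, it doesn't matter which letter the domain begins with.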

By the way, have you seen Learn Python Series (#4) - Round-Up #1? In there I've explained how to develop a parse_url() function using only what's been covered in the first three episodes! Without using strip()! :-)

(Also in there, using a for else clause ! :P )
@scipio

Thank you :)

Great and useful exercise. I haven't seen it yet, but I'm going to take time to read your posts soon :) I'm also an enthusiastic Pythonista, and it's always a good idea to keep reading and learning new tricks ;)

Honestly, I have been inspired by you; I'm going to post tutorials about great tools, techniques and ideas. I hope you'll soon see my first tutorials here on Steemit =)

By the way, I didn't know that the for statement also has an else clause. Great, man! Thank you for the good content

Hey scipio thank you for another great tutorial. I must say I am now wondering what I will use all these str. .... for but one day I might think back and say "Thats Why".

Tnx for sharing!

The "That's why" would be a perfect string! ;-)

Thanks this is great;
I am already into Python, also in relation to several 3D programs (Blender and Cinema4D), but I get stuck trying to get the steem library installed under Windows as well as Linux (Ubuntu 16.04), both with Python 3.6....

Any tips in that direction? I found out that I am not the only one with this problem....

thx, @vjbasil

Yeah, don't. Specifically on Windows 10, it really completely refuses to install. In particular pycrypt will blow up repeatedly because it doesn't have a decent modern build for the library and will simply crash when trying to unify them.

Beem, on the other hand, seems to be doing everything that steem-python can, but doing so as a fresh build. Definitely worth checking out, even if documentation is scant.

OK.... thx for the tip

I will check out beem asap!

so the installing of the libraries works well with beem?

windows or linux?

The install went perfectly easily, interestingly enough. Windows 10, because that's just what my heavy-duty processing system happens to be.

The project is under active development, so it certainly doesn't hurt to subscribe to the github updates and the developer on Steemit.

Thanx I will look into it then

Hi, thx!

Well, yes, installing steem-python can get a bit "hairy" because of all the dependencies involved from the multiple modules on which it is built. On various systems:

  • I was once missing the python-dev package,
  • the toml dependency includes a subversion which apparently doesn't exist,
  • setting nodes=['https://api.steemit.com'] might help

Ok thx I will give it another try then...

I just released two episodes of my Learn Python Series book! ;-)

But neither of them is about steem-python , much has been covered about that already, although I will cover it in a mini-series later on. Have a look at my curriculum!

Dutch would have been fine too, by the way ;-)

OK, so you're a Nedersteemer as well.

Thanks for your contribution. It looks good; nice.

grt, @vjbasil

And indeed, as @lextenebris pointed out, beem is another option as well. But since I haven't had any (severe) problems (that I couldn't solve) using steem-python, I haven't put much time on tinkering with beem. Let me know how it works out for you!

Hey @scipio I am @utopian-io. I have just upvoted you!

Achievements

  • Seems like you contribute quite often. AMAZING!

Community-Driven Witness!

I am the first and only Steem Community-Driven Witness. Participate on Discord. Lets GROW TOGETHER!

Up-vote this comment to grow my power and help Open Source contributions like this one. Want to chat? Join me on Discord https://discord.gg/Pc8HG9x

Thx @utopian-io ! Beep! Beep!

Thanks @scipio, this is great. I would love to get into Python. Any ideas in that direction? I am a lover of various 3D programs.

In Blender you can do a lot with python!

The same Python coding I'm covering in the entire Learn Python Series can be used elsewhere as well! So follow along! ;-) I will be covering a lot!!
