Tutorial on GUI (continued) and an Example Web Crawler Combining the PyQt5 and BeautifulSoup Packages in Python #Tutorial-4

in utopian-io •  last year

What Will I Learn?

This tutorial covers combining the PyQt5 and BeautifulSoup packages to build a small web crawler with a GUI in Python. This is the second part of the PyQt5 package tutorial. You can find the first part of the PyQt5 package tutorial here.

  • First, you will learn how to position and size the content inside your GUI by using the .move and .resize methods.
  • Then, you will learn how to create a label inside your GUI with the QLabel class and how to change its style with .setStyleSheet.
  • Then, you will learn how to read the text from labels, line edits, etc. inside the GUI by using the .text method.
  • Then, you will learn how to use the re package to find numbers (you can also find strings) for the example given in the last section.
  • Then, you will learn how to write functions that run when your buttons are used.
  • Then, you will learn how to combine the PyQt5 and BeautifulSoup packages to get information from a webpage and write it to your GUI.
  • Finally, we will write an example similar to a small version of "https://www.steemnow.com". (This will be a GUI version, not a webpage.)

Requirements

  • Windows 8 or higher.
  • Python 3.6.4.
  • PyCharm Community Edition 2017.3.4 x64.
  • PyQt5 package (for this package you also need to upgrade pip to version 9.0.3; we could not install PyQt5 with pip versions below 9.0.3).
  • sys package (part of the Python standard library)
  • BeautifulSoup package
  • re package (part of the Python standard library)
  • requests package

Difficulty

This tutorial is at an intermediate level.

Tutorial Contents

In this tutorial, we will work with the PyQt5 and BeautifulSoup packages to create a GUI and a simple web crawler in a new window.

First, to work with PyQt5, BeautifulSoup and requests we need to install these packages (sys and re ship with Python). To install them we will use PyCharm Community Edition 2017.3.4 x64; the installation steps are explained in detail in our previous tutorials (listed in the Curriculum section).

We will start our code by importing the packages which will be used in the next parts.

import sys
from PyQt5 import QtWidgets
from bs4 import BeautifulSoup
import requests
from PyQt5.QtWidgets import QMainWindow, QApplication, QWidget, QPushButton, QAction, QLineEdit,QInputDialog
import re

Then, we create our class, which is called Main (you can find the explanation of this in our previous tutorial):

class Main(QtWidgets.QWidget):

Then, we use the __init__() constructor method to initialize the inherited class (QWidget) and to call initUI(), as explained in our previous tutorial.

    def __init__(self):
        QtWidgets.QWidget.__init__(self)
        self.initUI()

We are ready to write our main body. First we want to create the title, labels, line edits and buttons (radio and push), and lay out our window:

    def initUI(self):

        self.setWindowTitle('SteemNow @onderakcaalan')

Let's explain the methods:

  • .setWindowTitle is a method that sets the title of our window:

foto1.png

        self.lineHeader=QtWidgets.QLabel("Last 5 posts",self)
        self.lineHeader.setStyleSheet("color:blue;")
        self.lineHeader.move(0, 35)
        self.lineHeader.resize(100, 25)

  • QLabel is a class that creates a label inside your window. You can position it with the .move method and size it with .resize, and you need to create a separate QLabel for each label, such as:

foto2.png

As you can see, the labels are written in blue. This is done with the .setStyleSheet method. We used "color:blue", but you can use any color you want.
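The same method accepts other Qt style sheet properties as well. As a minimal sketch, the hypothetical helper build_style below (it is not part of the tutorial code) joins property/value pairs into a string that could be passed to .setStyleSheet:

```python
def build_style(properties):
    """Join CSS-like property/value pairs into one Qt style sheet string.

    build_style is a hypothetical helper used only for illustration.
    """
    return " ".join(f"{name}: {value};" for name, value in properties.items())


# Named colors and hex colors are both accepted by Qt style sheets.
header_style = build_style({"color": "blue"})
accent_style = build_style({"color": "#cc0000", "font-weight": "bold"})
print(header_style)
print(accent_style)
```

For example, self.lineHeader.setStyleSheet(accent_style) would show the header in bold red.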

        self.lineRepPoints = QtWidgets.QLineEdit(self)
        self.lineRepPoints.move(610, 35)
        self.lineRepPoints.setReadOnly(True)
        self.lineRepPoints.resize(100, 25)

  • To show the information we fetch, we use the QLineEdit class, and we call .setReadOnly(True) so that users cannot change the content. You need to create a separate line edit for each value, because each line will hold a specific piece of information.

foto3.png
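The five "last post" lines in the complete code differ only in their vertical position (65, 95, 125, 155, 185), so they could also be created in a loop. The following is only a sketch under that assumption: row_geometry and add_post_lines are hypothetical names, and PyQt5 is imported inside the second function so that the geometry helper can be tried without a GUI.

```python
def row_geometry(count, top=65, step=30, width=1300, height=25):
    """Return (x, y, width, height) tuples for `count` stacked rows."""
    return [(0, top + i * step, width, height) for i in range(count)]


def add_post_lines(parent, count=5):
    """Create `count` read-only QLineEdit rows on `parent` and return them."""
    # Imported here so that row_geometry can be used without PyQt5 installed.
    from PyQt5 import QtWidgets

    lines = []
    for x, y, w, h in row_geometry(count):
        line = QtWidgets.QLineEdit(parent)
        line.move(x, y)
        line.resize(w, h)
        line.setReadOnly(True)
        lines.append(line)
    return lines


# The y positions match the values used for lineLastPost1 to lineLastPost5.
print([y for _, y, _, _ in row_geometry(5)])
```

Inside initUI, self.postLines = add_post_lines(self) would then replace the five hand-written blocks, at the cost of addressing the lines by index instead of by name.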

Up to here, we have only created labels and line edits that are not connected to any function. Now we will create some buttons that trigger functions, such as getting information from websites. For example:

  • To create a push button we use the QPushButton class:

        getInfo = QtWidgets.QPushButton("Get Info", self)
        getInfo.move(305, 5)
        getInfo.resize(295, 25)
        getInfo.clicked.connect(self.getText)

foto4.png

  • To create a radio button we use the QRadioButton class:

        getInfo2 = QtWidgets.QRadioButton("Cal. Budget in $", self)
        getInfo2.toggled.connect(self.OnDollars)
        getInfo2.move(970, 10)
        getInfo2.resize(150, 45)

foto5.png

As seen, we connected functions to our buttons getInfo and getInfo2: getText runs when the push button is clicked, and OnDollars runs when the radio button is toggled. We will write the code inside these functions to get the information.

Let's explain the getText and OnDollars functions.

    def getText(self):
        text, OKPressed = QInputDialog.getText(self, "Get text", "Your name:", QLineEdit.Normal, "")
        if OKPressed and text != '':

QInputDialog.getText returns two values: the entered text and OKPressed, which is True when the OK button was pressed. When we push the Get Info button, a new window appears in which we write the username (for this example), and if we push the OK button in that window, the body of the if statement runs. To open the new window we used the .getText method, and we passed a title and a label that tell the user what to write.

foto6.png
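Because the dialog returns exactly what was typed, input such as "@rdvn" or " rdvn " would lead to a wrong address once the code adds its own "@". A minimal sketch of cleaning the input first, assuming a hypothetical helper named profile_url that is not part of the tutorial code:

```python
def profile_url(raw_username):
    """Build a steemit.com profile URL from whatever the user typed.

    Returns None when nothing usable was entered.
    """
    # Remove surrounding spaces and any "@" the user typed themselves.
    username = raw_username.strip().lstrip("@")
    if not username:
        return None
    return "https://steemit.com/@" + username


print(profile_url("  @rdvn "))
print(profile_url("scipio"))
print(profile_url("   "))
```

In getText, the condition text != '' could then be replaced by a check that profile_url(text) is not None.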

Then, if we enter a username from steemit.com, the new window disappears and the information is shown in the main window. The code is as follows:

            self.lineProfileName.setText("@"+text)
            quote_page = "https://steemit.com/"+"@"+text

            page = requests.get(quote_page)
            soup = BeautifulSoup(page.content, "html.parser")
            Data=[]
            for a in soup.find_all('a', attrs={"class": "timestamp__link"}, href=True):
                Data.append(str(a['href']))
            self.lineLastPost1.setText(Data[0])
            self.lineLastPost2.setText(Data[1])
            self.lineLastPost3.setText(Data[2])
            self.lineLastPost4.setText(Data[3])
            self.lineLastPost5.setText(Data[4])

            for a in soup.find_all('span', attrs={"class": "UserProfile__rep"}):
                RepPoint = a.text.strip()
            self.lineRepPoints.setText(RepPoint)

            quote_page2 = "https://steemit.com/"+"@"+text+"/transfers"
            page = requests.get(quote_page2)
            soup = BeautifulSoup(page.content, "html.parser")

            TotalSteem=[]
            for a in soup.find_all('div', attrs={"class": "column small-12 medium-4"}):
                TotalSteem.append(a.text.strip())
            self.lineTotalSteem.setText(TotalSteem[1])

Inside the if statement we fetch some information from the Steemit website. To do this we use the BeautifulSoup package, which is covered in our previous tutorials; some more advanced details are given here. First, we build the URL of the page and fetch it with the requests package.

Requests allows you to send organic, grass-fed HTTP/1.1 requests, without the need for manual labor. There's no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, thanks to urllib3.[1]

Then we use BeautifulSoup to parse the page source ("view-source:https://steemit.com/@onderakcaalan"). When you examine the source you will see HTML tags such as a, div and span. You can select these tags with the .find_all method and read what is inside them. For example, the content of one span is "<!-- react-text: 155 -->(<!-- /react-text --><!-- react-text: 156 -->52<!-- /react-text --><!-- react-text: 157 -->)<!-- /react-text -->", and we can extract whatever we want from it. For example, if you want the reputation score, you select the span by its class:

soup.find_all('span', attrs={"class": "UserProfile__rep"})

If a search returns more than one item, you can collect the items in a list by using the .append method.
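Indexing such a list directly (Data[0] to Data[4]) raises an IndexError when an account has fewer than five posts or when the page layout changes. As a hedged sketch, a hypothetical helper first_n can pad the list so that every line in the window always receives a value; the sample links are made up.

```python
def first_n(items, count=5, placeholder=""):
    """Return exactly `count` entries: the first items, padded with placeholder."""
    padded = list(items[:count])
    padded += [placeholder] * (count - len(padded))
    return padded


links = ["/a/@user/post-1", "/b/@user/post-2"]   # made-up sample data
rows = first_n(links)
print(rows)
```

With the five post lines collected in a list (a hypothetical post_lines), the window could then be filled with: for line, link in zip(post_lines, first_n(Data)): line.setText(link).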

Also, when you check the source of the page, you will see that the actual data is shown in black between the comments. For example, in "<!-- react-text: 155 -->(<!-- /react-text --><!-- react-text: 156 -->52<!-- /react-text --><!-- react-text: 157 -->)<!-- /react-text -->" the main data is 52. Calling .text on the tag returns only the text and leaves out the comments, and .strip() removes the surrounding whitespace. Tag attributes such as "href" can be read with dictionary-style access, for example a['href'].
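To try this without downloading anything, the same calls can be run on a small HTML string. The fragment below is made up to mimic the structure described above; the real page may contain more attributes and may have changed since this tutorial was written.

```python
from bs4 import BeautifulSoup

# Made-up HTML fragment that mimics the structure described above.
html = """
<span class="UserProfile__rep"><!-- react-text: 155 -->(<!-- /react-text --><!-- react-text: 156 -->52<!-- /react-text --><!-- react-text: 157 -->)<!-- /react-text --></span>
<a class="timestamp__link" href="/utopian-io/@user/sample-post">2 days ago</a>
"""

soup = BeautifulSoup(html, "html.parser")

# .text skips the comment nodes and keeps only the visible text.
rep_tag = soup.find("span", attrs={"class": "UserProfile__rep"})
rep_text = rep_tag.text.strip()
print(rep_text)

# Attributes are read with dictionary-style access.
link_tag = soup.find("a", attrs={"class": "timestamp__link"}, href=True)
print(link_tag["href"])
```

Here the text still includes the parentheses around the number, so a further step (for example with the re package) is needed if only the digits are wanted.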

  • For the radio button we use a function such as:

    def OnDollars(self):
        Checked = self.sender()

        if Checked.isChecked():

As seen, our function is called OnDollars, because we connected our radio button to it with "getInfo2.toggled.connect(self.OnDollars)".

When this radio button is checked, we also fetch some information from "https://coinmarketcap.com/". After getting the information we need to make some calculations. However, the values taken from the website and from our window are strings, so to get integers or floats we use the re package and its .findall method, which returns a list of matches. At first we had a problem because the numbers contain "." and our first pattern dropped it; we then found a regular expression that handles these cases and extracts the numbers correctly. As seen, we also read values from the line edits in our window by using the .text method. This is a convenient way to reuse information that is already displayed, without writing code to fetch it again:

            TotalSteem=self.lineTotalSteem.text()
            aa = re.findall(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", TotalSteem)
            tt = float(aa[0])

            quote_page = "https://coinmarketcap.com/"

            page = requests.get(quote_page)
            soup = BeautifulSoup(page.content, "html.parser")
            for a in soup.find_all('a', attrs={"class": "price"}, href="/currencies/steem/#markets"):
                WorthSteem = a.text.strip()
            WSteem = re.findall(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", WorthSteem)
            WSteem = float(WSteem[0])
            totalSteem=int(tt*WSteem)
            self.lineDollars.setText(str(totalSteem))
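
One detail to watch: the pattern can match thousands separators such as "1,234.56", but float() raises a ValueError for strings that contain commas. The sketch below reuses the same pattern and removes the commas before converting; parse_number and value_in_dollars are hypothetical helper names, and the sample strings are made up.

```python
import re

# Same pattern as in the tutorial, written as a raw string.
NUMBER = re.compile(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?")


def parse_number(text):
    """Return the first number in text as a float, or None if there is none."""
    match = NUMBER.search(text)
    if match is None:
        return None
    # float() does not accept thousands separators, so drop the commas.
    return float(match.group(0).replace(",", ""))


def value_in_dollars(balance_text, price_text):
    """Multiply a balance by a price, both given as display strings."""
    balance = parse_number(balance_text)
    price = parse_number(price_text)
    if balance is None or price is None:
        return None
    return round(balance * price, 2)


print(parse_number("1,234.56 STEEM"))
print(value_in_dollars("1,000.000 STEEM", "$2.50"))
```

Returning None instead of raising lets the calling code decide what to show in the window when a value could not be read. Using round keeps the cents, whereas int() in the code above drops them.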

Here is the complete code:

import sys
from PyQt5 import QtWidgets
from bs4 import BeautifulSoup
import requests
from PyQt5.QtWidgets import QMainWindow, QApplication, QWidget, QPushButton, QAction, QLineEdit,QInputDialog
import re

class Main(QtWidgets.QWidget):

    def __init__(self):
        QtWidgets.QWidget.__init__(self)
        self.initUI()

    def initUI(self):

        self.setWindowTitle('SteemNow @onderakcaalan')
        self.lineHeader=QtWidgets.QLabel("Last 5 posts",self)
        self.lineHeader.setStyleSheet("color:blue;")
        self.lineHeader.move(0, 35)
        self.lineHeader.resize(100, 25)

        self.lineRep = QtWidgets.QLabel("Reputation", self)
        self.lineRep.setStyleSheet("color:blue")
        self.lineRep.move(610, 5)
        self.lineRep.resize(100, 25)


        self.lineTotalSP = QtWidgets.QLabel("Total S.P. (Delegated S.P.)", self)
        self.lineTotalSP.setStyleSheet("color:blue;")
        self.lineTotalSP.move(715, 5)
        self.lineTotalSP.resize(250, 25)

        self.lineRepPoints = QtWidgets.QLineEdit(self)
        self.lineRepPoints.move(610, 35)
        self.lineRepPoints.setReadOnly(True)
        self.lineRepPoints.resize(100, 25)


        self.lineTotalSteem = QtWidgets.QLineEdit(self)
        self.lineTotalSteem.move(715, 35)
        self.lineTotalSteem.setReadOnly(True)
        self.lineTotalSteem.resize(250, 25)

        self.lineLastPost1 = QtWidgets.QLineEdit(self)
        self.lineLastPost1.move(0, 65)
        self.lineLastPost1.setReadOnly(True)
        self.lineLastPost1.resize(1300, 25)

        self.lineLastPost2 = QtWidgets.QLineEdit(self)
        self.lineLastPost2.move(0, 95)
        self.lineLastPost2.setReadOnly(True)
        self.lineLastPost2.resize(1300, 25)

        self.lineLastPost3 = QtWidgets.QLineEdit(self)
        self.lineLastPost3.move(0, 125)
        self.lineLastPost3.setReadOnly(True)
        self.lineLastPost3.resize(1300, 25)

        self.lineLastPost4 = QtWidgets.QLineEdit(self)
        self.lineLastPost4.move(0, 155)
        self.lineLastPost4.setReadOnly(True)
        self.lineLastPost4.resize(1300, 25)

        self.lineLastPost5 = QtWidgets.QLineEdit(self)
        self.lineLastPost5.move(0, 185)
        self.lineLastPost5.setReadOnly(True)
        self.lineLastPost5.resize(1300, 25)

        self.lineProfileName = QtWidgets.QLineEdit(self)
        self.lineProfileName.move(0, 5)
        self.lineProfileName.setReadOnly(True)
        self.lineProfileName.resize(300, 25)


        self.lineDollars = QtWidgets.QLineEdit(self)
        self.lineDollars.move(1175, 10)
        self.lineDollars.setReadOnly(True)
        self.lineDollars.resize(100, 45)

        getInfo = QtWidgets.QPushButton("Get Info", self)
        getInfo.move(305, 5)
        getInfo.resize(295, 25)
        getInfo.clicked.connect(self.getText)

        getInfo2 = QtWidgets.QRadioButton("Cal. Budget in $", self)
        getInfo2.toggled.connect(self.OnDollars)
        getInfo2.move(970, 10)
        getInfo2.resize(150, 45)


    def getText(self):
        text, OKPressed = QInputDialog.getText(self, "Write a username", "Username:", QLineEdit.Normal, "")
        if OKPressed and text != '':
            self.lineProfileName.setText("@"+text)
            quote_page = "https://steemit.com/"+"@"+text

            page = requests.get(quote_page)
            soup = BeautifulSoup(page.content, "html.parser")
            Data=[]
            for a in soup.find_all('a', attrs={"class": "timestamp__link"}, href=True):
                Data.append(str(a['href']))
            self.lineLastPost1.setText(Data[0])
            self.lineLastPost2.setText(Data[1])
            self.lineLastPost3.setText(Data[2])
            self.lineLastPost4.setText(Data[3])
            self.lineLastPost5.setText(Data[4])

            for a in soup.find_all('span', attrs={"class": "UserProfile__rep"}):
                RepPoint = a.text.strip()
            self.lineRepPoints.setText(RepPoint)

            quote_page2 = "https://steemit.com/"+"@"+text+"/transfers"
            page = requests.get(quote_page2)
            soup = BeautifulSoup(page.content, "html.parser")

            TotalSteem=[]
            for a in soup.find_all('div', attrs={"class": "column small-12 medium-4"}):
                TotalSteem.append(a.text.strip())
            self.lineTotalSteem.setText(TotalSteem[1])


    def OnDollars(self):
        Checked = self.sender()

        if Checked.isChecked():
            TotalSteem=self.lineTotalSteem.text()
            aa = re.findall(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", TotalSteem)
            tt = float(aa[0])

            quote_page = "https://coinmarketcap.com/"

            page = requests.get(quote_page)
            soup = BeautifulSoup(page.content, "html.parser")
            for a in soup.find_all('a', attrs={"class": "price"}, href="/currencies/steem/#markets"):
                WorthSteem = a.text.strip()
            WSteem = re.findall(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", WorthSteem)
            WSteem = float(WSteem[0])
            totalSteem=int(tt*WSteem)
            self.lineDollars.setText(str(totalSteem))

def main():
    app = QtWidgets.QApplication(sys.argv)
    main = Main()
    main.show()

    sys.exit(app.exec_())

if __name__ == "__main__":
    main()

These are some results for @rdvn and @scipio, who are moderators for @utopian-io:

For @rdvn:

foto7.png

For @scipio:

foto8.png

Curriculum

Here is the list of related tutorials we have already shared on Utopian that make up a Course Curriculum



Posted on Utopian.io - Rewarding Open Source Contributors
