MLB Pitchers Analysis Program [BETA]

in #programming, 9 years ago (edited)

Extracting pitcher data for an MLB expert system. In this case I have focused on Max Scherzer; these are his ESPN numbers:

                           Max Scherzer
+---------+-------+-------+-------+------+---+----+------------+-----+
| Results | P_Fis | P_Int | P_Emo | Runs | H | BB |    Date    |  Vs |
+---------+-------+-------+-------+------+---+----+------------+-----+
|    W    | 98.95 |  15.5 |  50.0 |  2   | 4 | 2  | 2017-04-07 | PHI |
|    L    | 69.92 | 59.46 |  4.95 |  3   | 4 | 2  | 2017-04-12 | STL |
|    W    |  2.89 | 98.59 | 18.83 |  0   | 2 | 3  | 2017-04-18 | ATL |
|    W    | 24.02 | 87.79 | 71.69 |  3   | 5 | 1  | 2017-04-23 | NYM |
|    L    | 86.54 | 45.25 | 100.0 |  5   | 9 | 1  | 2017-04-28 | NYM |
|    W    | 81.55 |  2.75 | 61.13 |  1   | 2 | 2  | 2017-05-04 | ARI |
|    L    | 18.45 |  9.27 | 10.91 |  2   | 4 | 2  | 2017-05-09 | BAL |
|    W    |  5.61 |  50.0 |  4.95 |  3   | 9 | 0  | 2017-05-14 | PHI |
|    L    | 75.98 | 95.48 | 61.13 |  3   | 4 | 3  | 2017-05-20 | ATL |
|    W    | 90.85 | 87.79 | 100.0 |  1   | 3 | 2  | 2017-05-26 |  SD |
|    W    | 30.08 | 45.25 | 71.69 |  1   | 5 | 0  | 2017-05-31 |  SF |
|    W    |  5.61 |  2.75 | 10.91 |  1   | 3 | 2  | 2017-06-06 | LAD |
|    L    | 63.49 |  9.27 |  4.95 |  3   | 3 | 1  | 2017-06-11 | TEX |
|    W    | 99.88 |  50.0 |  50.0 |  1   | 4 | 2  | 2017-06-16 | NYM |
|    L    | 56.81 | 90.73 | 95.05 |  2   | 2 | 1  | 2017-06-21 | MIA |
+---------+-------+-------+-------+------+---+----+------------+-----+




This table shows all of Max Scherzer's games: which team he faced and how many runs he allowed, among other things. What matters, though, is the biorhythm I added. Baseball seems like a simple game, but once we add the biorhythm variable we realize how complicated it gets. We would also need a value for the opposing team to be able to analyze this player's biorhythm a little more deeply for each game ...
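For reference, the P_Fis, P_Int and P_Emo columns come from the biorhythm formula used in the script at the end of this post: a sine wave over the pitcher's days of life, with periods of 23, 28 and 33 days, rescaled to 0-100. Here is a minimal Python 3 sketch of just that calculation. The function name `biorhythm` is mine, and the birth date (July 27, 1984) is an assumption added for illustration, since the script scrapes it from ESPN instead of stating it.

```python
import math
from datetime import date

# Cycle lengths in days, as used in the script at the end of the post.
PERIODS = {"P_Fis": 23, "P_Int": 33, "P_Emo": 28}


def biorhythm(birth, game_day):
    """Return the three biorhythm values (0-100) for a given game day."""
    days_alive = (game_day - birth).days
    values = {}
    for name, period in PERIODS.items():
        wave = math.sin(2 * math.pi * days_alive / period)  # between -1 and 1
        values[name] = round(100 * (wave + 1) / 2, 2)       # rescaled to 0..100
    return values


# Assumed birth date for Max Scherzer (not stated in the post itself).
print(biorhythm(date(1984, 7, 27), date(2017, 4, 7)))
```

With that birth date the sketch prints `{'P_Fis': 98.95, 'P_Int': 15.5, 'P_Emo': 50.0}`, which matches the first row of the table (the 2017-04-07 game against PHI).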

We still need more data, but even that we can get by scraping, or by digging through the trash, hahahaha. ESPN scraping!!!

Looking a little closer, we see that NYM scored 9 of the 31 runs he allowed, as the following table shows:
+---------+-------+-------+-------+------+---+----+------------+-----+
| Results | P_Fis | P_Int | P_Emo | Runs | H | BB |    Date    |  Vs |
+---------+-------+-------+-------+------+---+----+------------+-----+
|    W    | 24.02 | 87.79 | 71.69 |  3   | 5 | 1  | 2017-04-23 | NYM |
|    L    | 86.54 | 45.25 | 100.0 |  5   | 9 | 1  | 2017-04-28 | NYM |
|    W    | 99.88 |  50.0 |  50.0 |  1   | 4 | 2  | 2017-06-16 | NYM |
+---------+-------+-------+-------+------+---+----+------------+-----+
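The same count can be done for every opponent at once. This is a minimal Python 3 sketch, assuming the (Vs, Runs) pairs are copied by hand from the first table; the names `games` and `runs_by_team` are mine and are not part of the script.

```python
from collections import defaultdict

# (opponent, runs allowed) for each start, copied from the first table.
games = [
    ("PHI", 2), ("STL", 3), ("ATL", 0), ("NYM", 3), ("NYM", 5),
    ("ARI", 1), ("BAL", 2), ("PHI", 3), ("ATL", 3), ("SD", 1),
    ("SF", 1), ("LAD", 1), ("TEX", 3), ("NYM", 1), ("MIA", 2),
]

runs_by_team = defaultdict(int)
for team, runs in games:
    runs_by_team[team] += runs

total = sum(runs_by_team.values())
# Opponents sorted from most to fewest runs scored against the pitcher.
for team, runs in sorted(runs_by_team.items(), key=lambda item: -item[1]):
    print(f"{team}: {runs} of {total}")
```

For this table NYM comes out on top with 9 runs, more than any other opponent.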


We could say that, according to my analysis, the Euclidean distance between the Intellectual and Emotional values is the most important factor for the pitcher, but it would be superficial to leave it at that.
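One simple way to put a number on this, sketched in Python 3: for a single game the two values are plain numbers, so the Euclidean distance between them reduces to the absolute difference. The rows are the three NYM games from the table above. Whether this per-game gap is the distance the analysis really needs is my assumption, and the helper name `int_emo_gap` is hypothetical.

```python
# (result, P_Int, P_Emo, date) for the three games against NYM.
nym_games = [
    ("W", 87.79, 71.69, "2017-04-23"),
    ("L", 45.25, 100.0, "2017-04-28"),
    ("W", 50.0, 50.0, "2017-06-16"),
]


def int_emo_gap(p_int, p_emo):
    """Euclidean distance between two scalars, i.e. their absolute difference."""
    return round(abs(p_int - p_emo), 2)


for result, p_int, p_emo, day in nym_games:
    print(day, result, int_emo_gap(p_int, p_emo))
```

In these three games the loss has the largest gap (54.75) and the two wins the smallest (16.1 and 0.0), but three games are far too few to draw any conclusion.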

We must also analyze the wear and tear under which the runs were allowed; that is to say, each game is an analysis of its own, and, God willing, we will do it in the coming days.

We could also check whether the Physical, Intellectual and Emotional points form a triangle, and what type of triangle it is; that could give us some interesting data. I will take it into account in the graphs.


Next comes another angle, where we must take into account how the defense played that day: what they call Fit, the help the defense gives the pitcher.
 

Here we will learn sabermetrics, my friend, one way or another; once we get into the graphs we will learn something.

+---------+-------+-------+-------+------+---+----+------------+-----+
| Results | P_Fis | P_Int | P_Emo | Runs | H | BB |    Date    |  Vs |
+---------+-------+-------+-------+------+---+----+------------+-----+
|    W    | 98.95 |  15.5 |  50.0 |  2   | 4 | 2  | 2017-04-07 | PHI |
|    W    |  5.61 |  50.0 |  4.95 |  3   | 9 | 0  | 2017-05-14 | PHI |
+---------+-------+-------+-------+------+---+----+------------+-----+

Well, here is the code; just change the player id and you will get the table for whichever pitcher you want ...

May God bless you, see you soon :D

Python code:

# -*- coding: utf-8 -*-
import math
from datetime import datetime
from urllib.request import urlopen

from bs4 import BeautifulSoup
from prettytable import PrettyTable

url_mlb_lanzador_ficha = "http://www.espn.com/mlb/player/gamelog/_/id/"
# Change the player id here to get the table for another pitcher.
lista_lanzadores = [url_mlb_lanzador_ficha + "28976"]

# Month abbreviations used in the ESPN dates, mapped to two-digit numbers.
MESES = {'Jan': '01', 'Feb': '02', 'Mar': '03', 'Apr': '04',
         'May': '05', 'Jun': '06', 'Jul': '07', 'Aug': '08',
         'Sep': '09', 'Oct': '10', 'Nov': '11', 'Dec': '12'}
formatter_string = "%d%m%Y"


def porcentaje_biorritmo(dias_de_vida, periodo):
    # Sine wave with the given period in days, rescaled from [-1, 1] to [0, 100].
    seno = math.sin(2 * math.pi * ((dias_de_vida % periodo) / float(periodo)))
    return round(100 * ((seno + 1) / 2), 2)


for url_lanzador in lista_lanzadores:
    list_game = []
    page = urlopen(url_lanzador)
    soup = BeautifulSoup(page, "lxml")
    datos_lanzador = {}
    datos_lanzador["nombre"] = soup.find('h1').text

    # Player metadata: each <li> holds a <span> label followed by the value.
    name_box = soup.find('ul', attrs={'class': 'player-metadata floatleft'})
    campos = ["Birth_Date", "Birthplace", "Experience", "College"]
    for campo, row in zip(campos, name_box):
        row = str(row)
        valor = row[row.find("</span>") + 7:row.find("</li>")]
        if campo == "Birth_Date":
            # Drop the age in parentheses that follows the date.
            valor = valor[0:valor.find("(")]
        datos_lanzador[campo] = valor

    # Jersey number and team, sliced out of the raw HTML of the general-info list.
    name_box_general1 = str(soup.find('ul', attrs={'class': 'general-info'}))
    find_span = name_box_general1.find('class="first">')
    find_li = name_box_general1.find("</li>")
    datos_lanzador["numero"] = name_box_general1[find_span + 14:find_li]
    find_name = name_box_general1.find('_/name/')
    equipo = name_box_general1[find_name + 11:]
    datos_lanzador["equipo"] = equipo[equipo.find('">') + 2:equipo.find('</a>')]
    print(datos_lanzador)

    # Game log rows; odd and even rows carry different CSS classes.
    name_boxaaa = soup.find('table', attrs={'class': 'tablehead mod-player-stats'})
    results = []
    for clase in ('oddrow', 'evenrow'):
        for row1 in name_boxaaa.find_all('tr', attrs={'class': clase}):
            table_data = row1.find_all('td')
            # Skip empty rows and the "Monthly Totals" summary rows.
            if not table_data or 'Monthly Totals' in table_data[0]:
                continue
            results.append([data.get_text() for data in table_data])

    # Birth date (month abbreviation, day, comma, year) converted to ddmmyyyy.
    Birth_Date = datos_lanzador["Birth_Date"]
    search_com = Birth_Date.find(",")
    day = Birth_Date[search_com - 3:search_com].strip()
    birthday = day.zfill(2) + MESES[Birth_Date[0:3]] + Birth_Date[-5:].strip()
    datetime_birth = datetime.strptime(birthday, formatter_string).date()

    for game in results:
        # The game log shows only month and day, so the 2017 season is hard-coded.
        day = game[0][3:].strip()
        date_game = day.zfill(2) + MESES[game[0][0:3]] + "2017"
        datetime_object = datetime.strptime(date_game, formatter_string).date()
        dias_de_vida = (datetime_object - datetime_birth).days

        localia = "Guest" if "@" in game[1] else "Home"
        dict_G_to_G = {'date_game': datetime_object,
                       'localia': localia,
                       'versus_equip': game[1][-3:],
                       'score': game[2][1:],
                       'resulted': game[2][0],
                       'IP': game[3], 'H': game[4], 'R': game[5],
                       'ER': game[6], 'HR': game[7], 'BB': game[8],
                       'SO': game[9], 'GB': game[10], 'FB': game[11],
                       'Pit': game[12], 'BF': game[13], 'GSc': game[14],
                       # Cycles: intellectual 33, physical 23, emotional 28 days.
                       'porc_inte': porcentaje_biorritmo(dias_de_vida, 33),
                       'porc_fis': porcentaje_biorritmo(dias_de_vida, 23),
                       'porc_emoc': porcentaje_biorritmo(dias_de_vida, 28)}
        list_game.append(dict_G_to_G)

    datos_lanzador["list_game"] = list_game
    print(datos_lanzador)

    table = PrettyTable(["Results", "P_Fis", "P_Int", "P_Emo", "Runs", "H", "BB", "Date", "Vs"])
    for game in list_game:
        table.add_row([game["resulted"],
                       game["porc_fis"],
                       game["porc_inte"],
                       game["porc_emoc"],
                       game["R"],
                       game["H"],
                       game["BB"],
                       game["date_game"],
                       game["versus_equip"]])
    print(table.get_string(sortby="Date", reversesort=False))
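The script converts month abbreviations such as 'Apr' into numbers by hand before building each date. As an alternative sketch in Python 3, `datetime.strptime` can parse the abbreviation directly with the `%b` directive. The helper name `parse_game_date` is hypothetical, and the sample strings assume the ESPN date cell looks like 'Apr 7', which is only inferred from the slicing done in the script.

```python
from datetime import datetime


def parse_game_date(text, year=2017):
    """Parse a month-abbreviation date such as 'Apr 7' for the given season."""
    # %b matches the abbreviated month name in the current locale
    # (English names like 'Apr' under the default C locale).
    return datetime.strptime(f"{text.strip()} {year}", "%b %d %Y").date()


print(parse_game_date("Apr 7"))
print(parse_game_date("Jun 21"))
```

Because `%b` depends on the locale, the hand-written month table in the script is the safer choice on a machine configured for another language.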



