LinuxQuestions.org


DeuceNegative 05-27-2007 12:10 AM

real time MLB baseball scores at the command line
 
I often want to check the baseball score without opening a browser. (That's right, it's too much work. And sometimes not appropriate at work, considering how much I check it.) So I wrote this script to fetch and parse the ESPN MLB scoreboard page.

Here's some example output:
Code:

$ ./mlbscores boston
        1  2  3  4  5  6  7  8  9    R  H  E
Boston  0  0  0  2  0  5  0  0  0    7  8  1
Texas   0  1  0  0  3  0  0  0  0    4  6  1

(If the game wasn't over, those inning entries would be blank.)

Here's the code:
Code:

#!/usr/bin/env python

#NAME
#        mlbscores - report baseball scores in realtime by reading the ESPN MLB scoreboard webpage
#SYNOPSIS
#        mlbscores [team]
#DESCRIPTION
#        run the script with no arguments to see all the scores
#        run the script with one argument to show only games involving a team whose name contains that phrase (case insensitive)
#        team names are as they appear on the ESPN website (city names actually, and/or team names if ambiguous)
#        the special phrase 'all' will show all games (useful when default team is hardcoded)
#        use watch for realtime updates
#BUGS
#        python's sgmllib.SGMLParser complains about ESPN's webpage as late as python 2.2.3
#        I've seen ESPN put a 0 in the top of the 1st for a postponed game, making it appear as if it's in progress (this 0 is not rendered in a web browser)
#        obviously, if ESPN changes their webpage enough, it will break this script


import sgmllib, copy, sys


def rpad(text, length, char=' '): return text+char*(length-len(text))
def lpad(text, length, char=' '): return char*(length-len(text))+text


class ESPNMLBScoresParser(sgmllib.SGMLParser):
        #game data is stored like [['Away Team', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'R', 'H', 'E'], ['Home Team', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'R', 'H', 'E']]
        #1..9 are the runs scored that inning, and RHE are the respective totals
        #everything is a string (all numeric, except 1..9 entries may be '' (inning not yet completed) or 'X' (bottom of the 9th not played))
        #the data is filled in by first creating all entries as '', then filling them in one by one as they're reached when parsing the html source
        #self.getGames() returns a list of these game data structures
        #notes on ESPN html code
        #        <div class=gameContainer> starts a game
        #        <td class=teamLine>, first <a>'s data after that is away team name
        #        <td class=teamLine>, first <a>'s data after that is home team name
        #        <td class=innLine> x9, data are away runs for each inning
        #        <td class=innLine> x9, data are home runs for each inning
        #        <td class=rheLine> x3, data are away RHE
        #        <td class=rheLine> x3, data are home RHE
       
        def __init__(self):
                sgmllib.SGMLParser.__init__(self)
               
                #sgmllib.SGMLParser complains about ESPN's page as late as 2.2.3, but doesn't as early as 2.3.3 (I don't know about anything between)
                verReq = [2,3,3]
                verSys = [sys.version_info[0],sys.version_info[1],sys.version_info[2]]
                if verSys<verReq:
                        print '*** ERROR *** You are using Python '+'.'.join(map(str,verSys))+'.  This script requires Python >= '+'.'.join(map(str,verReq))+'.'
                        sys.exit(1)
       
        def _reset_vars(self):
                self.next_a_is_away_team = False
                self.next_a_is_home_team = False

                self.next_data_is_stat = False
               
                self.team_index = 0  #0 ~ away, 1 ~ home
                self.stat_index = 0  #0 ~ team name, 1..9 ~ runs in each inning, 10-12 ~ RHE

                self.in_a              = False
                self.in_td_that_matters = False

                self.end_game = False

        def reset(self):
                sgmllib.SGMLParser.reset(self)
               
                self.games = []
                self._reset_vars()
       
        def start_div(self, atts):
                for name, value in atts:
                        if name=='class' and value=='gameContainer':
                                self._reset_vars()
                                self.games.append([['']*13,['']*13])
                                break
       
        def start_td(self, atts):
                for name, value in atts:
                        if name=='class':
                                if value=='teamLine':
                                        if  not self.games[-1][0][0]: self.next_a_is_away_team = True
                                        elif not self.games[-1][1][0]: self.next_a_is_home_team = True
                                        break
                                if value in ['innLine', 'rheLine']:
                                        self.in_td_that_matters = True
                                       
                                        #self.team_index and self.stat_index now point to where the last thing went; make them point to where the next thing will go
                                        if self.stat_index==9:
                                                if self.team_index==0:
                                                        self.team_index = 1
                                                        self.stat_index = 1
                                                else:
                                                        self.team_index =  0
                                                        self.stat_index = 10
                                        elif self.stat_index==12 and self.team_index==0:
                                                        self.team_index =  1
                                                        self.stat_index = 10
                                        elif self.stat_index==11 and self.team_index==1:
                                                        self.stat_index = 12
                                                        self.end_game = True
                                        else:
                                                self.stat_index += 1
                                        break
        def end_td(self): self.in_td_that_matters = False
       
        def start_a(self, atts): self.in_a = True
        def end_a(self)        : self.in_a = False
       
        def handle_data(self, data):
                if self.in_a:
                        if self.next_a_is_away_team:
                                self.games[-1][0][0] = data
                                self.next_a_is_away_team = False
                                return
                        if self.next_a_is_home_team:
                                self.games[-1][1][0] = data
                                self.next_a_is_home_team = False
                                return
                if self.in_td_that_matters:
                        self.games[-1][self.team_index][self.stat_index] = data
                        if self.end_game: self._reset_vars()
       
        def getGames(self): return copy.deepcopy(self.games)


if __name__=='__main__':
        import sys, urllib2

        parser = ESPNMLBScoresParser()
        parser.reset()
        parser.feed(urllib2.urlopen('http://sports.espn.go.com/mlb/scoreboard').read())
        parser.close()
       
        games = parser.getGames()

        qry = 'all'  # <-- put your favorite team here
        if len(sys.argv)>1: qry = sys.argv[1]
        if qry.lower()=='all': qry = ''
        games = filter(lambda game: qry.lower() in game[0][0].lower() or qry.lower() in game[1][0].lower(), games)
        if len(games)==0:
                print '*** ERROR *** No teams match ['+qry+'].'
                sys.exit(1)

        team_width = max(map(len, [game[0][0] for game in games]+[game[1][0] for game in games]))
        for game in games:
                print lpad('', team_width+1),
                for i in range(1,10): print rpad(str(i), 2),
                print ' ',
                print 'R  H  E',
                print

                for i in range(2):
                        print rpad(game[i][0], team_width),
                        for j in range( 1,10): print lpad(game[i][j], 2),
                        print ' ',
                        for j in range(10,13): print lpad(game[i][j], 2),
                        print
               
                if len(games)!=1: print

I hope someone else finds this useful!
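
For readers on Python 3, where sgmllib and urllib2 no longer exist, the parsing idea above can be sketched with the standard html.parser module instead. This is only a sketch, not the script's actual method: the names LineScoreParser, parse_games, _tds and SAMPLE_HTML are made up for illustration, and the sample markup is an assumption built from the class names listed in the script's comments (gameContainer, teamLine, innLine, rheLine), not a copy of ESPN's page. As in the script, a slot is reserved when a cell opens, so an empty cell (an inning not yet completed) still keeps the columns aligned.

```python
from html.parser import HTMLParser


class LineScoreParser(HTMLParser):
    """Collect cell text per game, keyed by the td class names from the notes."""

    WANTED = ("teamLine", "innLine", "rheLine")

    def __init__(self):
        super().__init__()
        self.games = []      # one dict of cell lists per gameContainer div
        self.current = None  # cell list currently receiving text, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "gameContainer":
            self.games.append({name: [] for name in self.WANTED})
        elif tag == "td" and cls in self.WANTED and self.games:
            self.current = self.games[-1][cls]
            self.current.append("")  # reserve a slot so empty cells stay aligned

    def handle_endtag(self, tag):
        if tag == "td":
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.current[-1] += data.strip()


def parse_games(html):
    """Return games in the first script's layout: [away_row, home_row]."""
    parser = LineScoreParser()
    parser.feed(html)
    parser.close()
    games = []
    for g in parser.games:
        names, inn, rhe = g["teamLine"], g["innLine"], g["rheLine"]
        if len(names) < 2:
            continue  # skip containers that do not look like a game
        games.append([[names[0]] + inn[:9] + rhe[:3],
                      [names[1]] + inn[9:18] + rhe[3:6]])
    return games


def _tds(cls, values):
    """Build <td> cells for the made-up sample page below."""
    return "".join('<td class="%s">%s</td>' % (cls, v) for v in values)


# Hypothetical markup following the structure described in the comments above.
SAMPLE_HTML = (
    '<div class="gameContainer"><table><tr>'
    '<td class="teamLine"><a href="#">Boston</a></td>'
    '<td class="teamLine"><a href="#">Texas</a></td>'
    + _tds("innLine", ["0", "0", "0", "2", "0", "5", "0", "0", "0"])
    + _tds("innLine", ["0", "1", "0", "0", "3", "0", "0", "0", ""])
    + _tds("rheLine", ["7", "8", "1"])
    + _tds("rheLine", ["4", "6", "1"])
    + "</tr></table></div>"
)

games = parse_games(SAMPLE_HTML)
query = "bos"  # case-insensitive team filter, as in the script
matches = [g for g in games if any(query.lower() in row[0].lower() for row in g)]
for away_row, home_row in matches:
    print(away_row)
    print(home_row)
```

Keeping one list per cell class and slicing afterwards avoids the team_index/stat_index bookkeeping, at the cost of assuming the cells always arrive in the order the comments describe.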

MS3FGX 05-27-2007 01:44 AM

I'm not a sports fan myself, but excellent work here. Very clever, I am sure there are many people who would be interested in such a thing.

I wonder if you couldn't put this up on Sourceforge and maybe start to expand it a bit. Say, have it run in the background checking scores and email the user when such and such team wins; that sort of thing. Everyone loves niche software.
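
A background notifier like this would first need to decide when a game is over and who won. The following is a rough Python 3 sketch of that one step, assuming the first script's data layout (each row is the team name, nine inning cells, then R, H, E, all as strings). The function name final_winner is hypothetical, and the "all nine innings filled in" test is an assumption about how the page marks a finished game; rain-shortened and extra-inning games are not handled.

```python
def final_winner(game):
    """Return the winning team's name, or None if the game does not look final.

    `game` is [away_row, home_row] in the first script's layout.
    """
    away, home = game
    if "" in away[1:10] or home[9] == "":
        return None  # some inning is still blank, so the game is in progress
    if not (away[10].isdigit() and home[10].isdigit()):
        return None  # run totals missing or not numeric
    away_runs, home_runs = int(away[10]), int(home[10])
    if away_runs == home_runs:
        return None  # tied after nine; this layout cannot show extra innings
    return away[0] if away_runs > home_runs else home[0]


# The finished game from the example output at the top of the thread.
finished = [
    ["Boston", "0", "0", "0", "2", "0", "5", "0", "0", "0", "7", "8", "1"],
    ["Texas", "0", "1", "0", "0", "3", "0", "0", "0", "0", "4", "6", "1"],
]
print(final_winner(finished))
```

A polling loop could call this on each refresh and send the email only the first time it returns a name for the team being followed.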

XavierP 05-27-2007 11:48 AM

I wonder if it could be expanded to other sports? Regardless, good program and thanks for sharing with us, there are bound to be a lot of people here who would find it useful. Wonder who will be first to create a GUI for it? :)

AlanL 05-27-2007 11:51 PM

Very nice! How do I save it in the form of a file?

DeuceNegative 05-31-2007 11:25 AM

Thanks for the comments. I think I will look into Sourceforge. I've never contributed before.

Quote:

How do I save it in the form of a file?

Really?

It should work fine if you just copy what's in the text box above and paste it into a text file, say mlbscores. Indentation is significant in Python, so copy it exactly. As long as you have Python installed, you should then be able to run it as

Code:

$ python mlbscores

or, if you give it execute permissions (chmod u+x mlbscores), just

Code:

$ ./mlbscores

Let me know if something else is the problem.

richie314159 04-03-2008 12:43 AM

w00000000t!!!!!!
 
i LOVE this!!!! you are a freaking genius!!!!!!! i bow before you, buddha of the cli.

DeuceNegative 04-04-2008 03:58 AM

Thanks. I'm glad ESPN didn't change their website--my script still works as is from last year. But I had some bug fixes since I originally posted this, e.g. it wasn't reporting extra innings games correctly. Here's the latest:

Code:

#!/usr/bin/env python

#NAME
#        mlbscores - report up to date baseball scores by reading the ESPN MLB scoreboard webpage
#SYNOPSIS
#        mlbscores [team]
#DESCRIPTION
#        Run the script with no arguments to see all the scores.
#        Run the script with one argument to show only games involving a team whose name contains that phrase (case insensitive).
#        Team names are as they appear on the ESPN website (city names, and/or team names if ambiguous).
#        Use the watch program (watch -n 60 mlbscores boston) to make this a realtime scoreboard.
#BUGS
#        python's sgmllib.SGMLParser complains about ESPN's webpage as late as python 2.2.3
#        I've seen ESPN put a 0 in the top of the 1st for a postponed game, making it appear as if it's in progress (this 0 is not rendered in a web browser)
#        obviously, if ESPN changes their webpage enough, it will break this script


import sgmllib, copy, sys


DEBUG = False


class BaseballGameTeamData(object):
        def __init__(self):       
                self.teamName = None
                self.runsPerInning = [None]*9
                self.R = 0
                self.H = 0
                self.E = 0

class BaseballGameData(object):
        def __init__(self):
                self.startInning = None  #greater than 1 when there are extra innings
                self.home = BaseballGameTeamData()
                self.away = BaseballGameTeamData()
       
        def formatLineScore(self, teamWidth):
                s = ''
                def mystr(i):
                        if i is None: return ''
                        return str(i)
                s += '%-*s' % (teamWidth, '')
                for i in range(int(self.startInning), int(self.startInning)+9): s += '%3d' % i
                s += ' %3s%3s%3s' % ('R', 'H', 'E')
                s += '\n'
                for team in self.away, self.home:
                        s += '%-*s' % (teamWidth, team.teamName)
                        for i in map(mystr, team.runsPerInning): s += '%3s' % i
                        s += ' %3s%3s%3s' % (team.R, team.H, team.E)
                        s += '\n'
                return s[:-1]

class ESPNMLBScoreboardParser(sgmllib.SGMLParser):
        #notes on ESPN html code
        #        <div class=gameContainer> starts a game
        #        <td class=teamLine>, first <a>'s data after that is away team name
        #        <td class=teamLine>, first <a>'s data after that is home team name
        #        <td class=innTop> x9, data are inning numbers (nine of them, but only care about first; starts > 1 if extra innings)
        #        <td class=innLine> x9, data are away runs for each inning
        #        <td class=innLine> x9, data are home runs for each inning
        #        <td class=rheLine> x3, data are away RHE
        #        <td class=rheLine> x3, data are home RHE
       
        def __init__(self):
                sgmllib.SGMLParser.__init__(self)
               
                #sgmllib.SGMLParser complains about ESPN's page as late as 2.2.3, but doesn't as early as 2.3.3 (I don't know about anything between)
                verReq = [2,3,3]
                verSys = [sys.version_info[0],sys.version_info[1],sys.version_info[2]]
                if verSys<verReq:
                        sys.stderr.write("*** ERROR *** this script requires Python >= %s (this is Python %s)\n" % ('.'.join(map(str,verReq)), '.'.join(map(str,verSys))))
                        sys.exit(1)
       
        def reset_vars(self):
                self.next_a_is_away_team = False
                self.next_a_is_home_team = False

                self.next_data_is_inning_start = False

                self.team_index = 0  #0 ~ away, 1 ~ home
                self.stat_index = 0  #0 ~ not used, 1..9 ~ runs in each inning, 10-12 ~ RHE

                self.in_a = False
                self.in_td_that_matters = False

                self.end_game = False

        def reset(self):
                sgmllib.SGMLParser.reset(self)
               
                self.games = []
                self.reset_vars()
       
        def start_div(self, atts):
                for name, value in atts:
                        if name=='class' and value=='gameContainer':
                                self.reset_vars()
                                self.games.append(BaseballGameData())
                                break
       
        def start_td(self, atts):
                for name, value in atts:
                        if name=='class':
                                if value=='teamLine':
                                        if  self.games[-1].away.teamName is None: self.next_a_is_away_team = True
                                        elif self.games[-1].home.teamName is None: self.next_a_is_home_team = True
                                        break
                                if value=='innTop':
                                        if self.games[-1].startInning is None:
                                                self.in_td_that_matters = True
                                                self.next_data_is_inning_start = True
                                        break
                                if value in ['innLine', 'rheLine']:
                                        self.in_td_that_matters = True
                                       
                                        #self.team_index and self.stat_index now point to where the last thing went; make them point to where the next thing will go
                                        if self.stat_index==9:
                                                if self.team_index==0:
                                                        self.team_index = 1
                                                        self.stat_index = 1
                                                else:
                                                        self.team_index =  0
                                                        self.stat_index = 10
                                        elif self.stat_index==12 and self.team_index==0:
                                                        self.team_index =  1
                                                        self.stat_index = 10
                                        elif self.stat_index==11 and self.team_index==1:
                                                        self.stat_index = 12
                                                        self.end_game = True
                                        else:
                                                self.stat_index += 1
                                        break
        def end_td(self): self.in_td_that_matters = False
       
        def start_a(self, atts): self.in_a = True
        def end_a(self): self.in_a = False
       
        def handle_data(self, data):
                if self.in_a:
                        if self.next_a_is_away_team:
                                self.games[-1].away.teamName = data
                                self.next_a_is_away_team = False
                                return
                        if self.next_a_is_home_team:
                                self.games[-1].home.teamName = data
                                self.next_a_is_home_team = False
                                return
                if self.in_td_that_matters:
                        if self.next_data_is_inning_start:
                                self.games[-1].startInning = data
                                self.next_data_is_inning_start=False
                                return
                        if  self.team_index==0: team = self.games[-1].away
                        elif self.team_index==1: team = self.games[-1].home
                        if  self.stat_index in range( 1,10): team.runsPerInning[self.stat_index-1] = data
                        elif self.stat_index==10: team.R = data
                        elif self.stat_index==11: team.H = data
                        elif self.stat_index==12: team.E = data
                        if self.end_game: self.reset_vars()
       
        def getGames(self): return copy.deepcopy(self.games)


if __name__=='__main__':
        import sys, urllib2

        try:
                u = urllib2.urlopen('http://sports.espn.go.com/mlb/scoreboard')
        except urllib2.URLError, e:
                sys.stderr.write("*** ERROR *** unable to open the ESPN webpage")
                if hasattr(e, 'args') and type(e.args)==type(()): sys.stderr.write(': %s' % e[0][1])
                else: sys.stderr.write(': %s' % str(e) )
                sys.stderr.write('.\n')
                if DEBUG: raise
                sys.exit(1)

        source = u.read()

        parser = ESPNMLBScoreboardParser()
        parser.reset()
        try:
                parser.feed(source)
                except Exception:  #don't swallow KeyboardInterrupt/SystemExit
                sys.stderr.write('*** ERROR *** unable to parse the ESPN webpage\n')
                if DEBUG: raise
                sys.exit(1)
        parser.close()
       
        games = parser.getGames()

        qry = ''
        if len(sys.argv)>1: qry = sys.argv[1]
        games = filter(lambda game: qry.lower() in game.home.teamName.lower() or qry.lower() in game.away.teamName.lower(), games)
        if len(games)==0:
                sys.stderr.write("*** ERROR *** no teams match [%s]\n" % qry)
                sys.exit(1)

        teamWidth = max(map(len, [game.home.teamName for game in games]+[game.away.teamName for game in games]))
        for game in games:
                print game.formatLineScore(teamWidth)
                if len(games)!=1: print


JMJ_coder 04-07-2008 09:24 PM

Hello,

That is really nice. One thing I noticed is that when it encounters a game with extra innings, it doesn't show the earlier innings - they get cut off. There is enough space in the terminal to be able to show all the innings, even extra innings within reason (no 30 inning games :p).

But, yes a very nice program. Well done!
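
One way to show every inning is to let the number of columns grow with the game instead of fixing it at nine. Here is a minimal Python 3 sketch of that idea. It reuses the column widths from the formatLineScore method above, but the function name format_line_score, the tuple layout of its arguments and the 11-inning sample game are all made up for illustration. The parser would still need changes to collect more than nine cells per team.

```python
def format_line_score(away, home):
    """Format a line score with as many inning columns as the longer team needs.

    Each team is (name, runs_per_inning, (R, H, E)); a run entry of None
    means that half-inning has not been played.
    """
    innings = max(9, len(away[1]), len(home[1]))
    width = max(len(away[0]), len(home[0]))
    header = "%-*s" % (width, "")
    header += "".join("%3d" % i for i in range(1, innings + 1))
    header += " %3s%3s%3s" % ("R", "H", "E")
    lines = [header]
    for name, runs, rhe in (away, home):
        cells = ["" if r is None else str(r) for r in runs]
        cells += [""] * (innings - len(cells))  # pad the shorter row
        line = "%-*s" % (width, name)
        line += "".join("%3s" % c for c in cells)
        line += " %3s%3s%3s" % tuple(rhe)
        lines.append(line)
    return "\n".join(lines)


# Hypothetical 11-inning game, not real data.
away = ("Boston", [0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 2], (5, 9, 0))
home = ("Texas", [0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0], (3, 7, 1))
print(format_line_score(away, home))
```

At three characters per inning, even a 20-inning game fits in an 80-column terminal with a short team name.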

eveostay 04-08-2008 03:44 PM

This is very cool. For contrast, here's the lazy version.

Code:

~>which xps
xps:    aliased to lynx -nonumbers -dump http://scores.espn.go.com/mlb/scoreboard | egrep -i -A12 -B2 "!*"
~>xps phila

  Top 9th
  Philadelphia (3-4, 2-2 away)
  NY Mets (2-3, 0-0 home)
  1 2 3 4 5 6 7 8 9
  0 0 0 0 0 0 3 2 0
  0 1 0 1 0 0 0 0
  R H E
  5 8 0
  2 7 1
  GameCast | Box Score | RealTime [in.gif] | Watch
  Balls: Strikes: Outs:
  Pitching: () 0 IP, 0 ER, 0 K
  Batting: () 0-0
  C Muniz relieved A Heilman.

I am also grateful that ESPN didn't change this page this year. Not so lucky for the MLB.com script I had to tell me the next game, TV channels, and pitchers for various teams. They added in a bunch of javascript gunk which has broken it completely.

(Graphical browsers sux ;)
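
For anyone who wants the same filtering inside a Python program, the job done by egrep -i -A12 -B2 can be approximated in a few lines. This is a sketch under some assumptions: grep_context is a made-up name, it uses Python's re syntax (which is not identical to egrep's), and unlike grep it does not print "--" between separate groups of matches. The sample dump is invented and much shorter than real lynx output.

```python
import re


def grep_context(lines, pattern, before=2, after=12):
    """Return matching lines plus `before`/`after` lines of context, in order."""
    rx = re.compile(pattern, re.IGNORECASE)  # like egrep -i
    keep = set()
    for i, line in enumerate(lines):
        if rx.search(line):
            start = max(0, i - before)
            stop = min(len(lines), i + after + 1)
            keep.update(range(start, stop))
    return [lines[i] for i in sorted(keep)]


# Invented stand-in for a text dump of the scoreboard page.
dump = [
    "Final",
    "Boston",
    "Texas",
    "",
    "Top 9th",
    "Philadelphia (3-4, 2-2 away)",
    "NY Mets (2-3, 0-0 home)",
    "1 2 3 4 5 6 7 8 9",
]
for line in grep_context(dump, "phila", before=1, after=2):
    print(line)
```

Overlapping context windows are merged automatically because the line numbers are collected in a set before printing.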

DeuceNegative 04-09-2008 08:40 PM

Quote:

Originally Posted by eveostay (Post 3115010)
This is very cool. For contrast, here's the lazy version.

Code:

~>which xps
xps:    aliased to lynx -nonumbers -dump http://scores.espn.go.com/mlb/scoreboard | egrep -i -A12 -B2 "!*"


Awesome! I'm going to use your approach from now on.

I still might fix my script's issue with the missing early innings for extra innings games that JMJ_coder mentions, for posterity's sake and in case others find that helpful, but I like this lynx/grep approach MUCH better.

Thanks!

richie314159 04-10-2008 12:58 AM

i don't understand.
 
eveostay,

i've used lynx before but i don't understand how to use your script. and could you explain how it works a little bit? i'm still a noob but i want to learn as much as possible.

thanks,

rich

eveostay 04-10-2008 01:19 PM

Quote:

Originally Posted by richie314159 (Post 3116487)
i've used lynx before but i don't understand how to use your script. and could you explain how it works a little bit? i'm still a noob but i want to learn as much as possible.

Hi. Well, it's a shell alias, not a script. How you create an alias depends on which shell you're using, so check the manual for your shell.

Different versions of lynx might have different command line options, so you might need to read that manual too.

The "!*" part is csh/tcsh alias syntax: it stands for the arguments you type after the alias name, so whatever you supply gets passed on to egrep.

You might have better luck getting it to work from the command line first (without the alias). Try

Code:

lynx -nonumbers -dump http://scores.espn.go.com/mlb/scoreboard | egrep -i -A12 -B2 Phila

You can also remove the egrep and just get the dump from lynx, pipe it to more (or less) or whatever.

HTH.

Quote:

Originally Posted by DeuceNegative
Awesome! I'm going to use your approach from now on.

Wow -- I'm really flattered. Sometimes really simple is good enough, I guess.

Steve


All times are GMT -5.