LinuxQuestions.org Member Success Stories
Just spent four hours configuring your favorite program? Just figured out a Linux problem that has been stumping you for months? Post your Linux Success Stories here.
I often want to check the baseball score without opening a browser. (That's right, it's too much work. And sometimes not appropriate at work, considering how much I check it.) So I wrote this script to fetch and parse the ESPN MLB scoreboard page.
Here's some example output:
Code:
$ ./mlbscores boston
        1  2  3  4  5  6  7  8  9    R  H  E
Boston  0  0  0  2  0  5  0  0  0    7  8  1
Texas   0  1  0  0  3  0  0  0  0    4  6  1
(If the game wasn't over, those inning entries would be blank.)
Here's the code:
Code:
#!/usr/bin/env python
#NAME
# mlbscores - report baseball scores in realtime by reading the ESPN MLB scoreboard webpage
#SYNOPSIS
# mlbscores [team]
#DESCRIPTION
# run the script with no arguments to see all the scores
# run the script with one argument to only show games with teams with that phrase in their name participating (case insensitive)
# team names are as they appear on the ESPN website (city names actually, and/or team names if ambiguous)
# the special phrase 'all' will show all games (useful when default team is hardcoded)
# use watch for realtime updates
#BUGS
# python's sgmllib.SGMLParser complains about ESPN's webpage as late as python 2.2.3
# I've seen ESPN put a 0 in the top of the 1st for a postponed game, making it appear as if it's in progress (this 0 is not rendered in a web browser)
# obviously, if ESPN changes their webpage enough, it will break this script
import sgmllib, copy, sys
def rpad(text, length, char=' '): return text+char*(length-len(text))
def lpad(text, length, char=' '): return char*(length-len(text))+text
class ESPNMLBScoresParser(sgmllib.SGMLParser):
    #game data is stored like [['Away Team', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'R', 'H', 'E'], ['Home Team', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'R', 'H', 'E']]
    #1..9 are the runs scored that inning, and RHE are the respective totals
    #everything is a string (all numeric, except 1..9 entries may be '' (inning not yet completed) or 'X' (bottom of the 9th not played))
    #the data is filled in by first creating all entries as '', then filling them in one by one as they're reached when parsing the html source
    #self.getGames() returns a list of these game data structures
    #notes on ESPN html code
    #    <div class=gameContainer> starts a game
    #    <td class=teamLine>, first <a>'s data after that is away team name
    #    <td class=teamLine>, first <a>'s data after that is home team name
    #    <td class=innLine> x9, data are away runs for each inning
    #    <td class=innLine> x9, data are home runs for each inning
    #    <td class=rheLine> x3, data are away RHE
    #    <td class=rheLine> x3, data are home RHE
    def __init__(self):
        sgmllib.SGMLParser.__init__(self)
        #sgmllib.SGMLParser complains about ESPN's page as late as 2.2.3, but doesn't as early as 2.3.3 (I don't know about anything between)
        verReq = [2,3,3]
        verSys = [sys.version_info[0],sys.version_info[1],sys.version_info[2]]
        if verSys<verReq:
            print '*** ERROR *** You are using Python '+'.'.join(map(str,verSys))+'. This script requires Python >= '+'.'.join(map(str,verReq))+'.'
            sys.exit(1)
    def _reset_vars(self):
        self.next_a_is_away_team = False
        self.next_a_is_home_team = False
        self.next_data_is_stat = False
        self.team_index = 0 #0 ~ away, 1 ~ home
        self.stat_index = 0 #0 ~ team name, 1..9 ~ runs in each inning, 10-12 ~ RHE
        self.in_a = False
        self.in_td_that_matters = False
        self.end_game = False
    def reset(self):
        sgmllib.SGMLParser.reset(self)
        self.games = []
        self._reset_vars()
    def start_div(self, atts):
        for name, value in atts:
            if name=='class' and value=='gameContainer':
                self._reset_vars()
                self.games.append([['']*13,['']*13])
                break
    def start_td(self, atts):
        for name, value in atts:
            if name=='class':
                if value=='teamLine':
                    if not self.games[-1][0][0]: self.next_a_is_away_team = True
                    elif not self.games[-1][1][0]: self.next_a_is_home_team = True
                    break
                if value in ['innLine', 'rheLine']:
                    self.in_td_that_matters = True
                    #self.team_index and self.stat_index now point to where the last thing went; make them point to where the next thing will go
                    if self.stat_index==9:
                        if self.team_index==0:
                            self.team_index = 1
                            self.stat_index = 1
                        else:
                            self.team_index = 0
                            self.stat_index = 10
                    elif self.stat_index==12 and self.team_index==0:
                        self.team_index = 1
                        self.stat_index = 10
                    elif self.stat_index==11 and self.team_index==1:
                        self.stat_index = 12
                        self.end_game = True
                    else:
                        self.stat_index += 1
                    break
    def end_td(self): self.in_td_that_matters = False
    def start_a(self, atts): self.in_a = True
    def end_a(self): self.in_a = False
    def handle_data(self, data):
        if self.in_a:
            if self.next_a_is_away_team:
                self.games[-1][0][0] = data
                self.next_a_is_away_team = False
                return
            if self.next_a_is_home_team:
                self.games[-1][1][0] = data
                self.next_a_is_home_team = False
                return
        if self.in_td_that_matters:
            self.games[-1][self.team_index][self.stat_index] = data
            if self.end_game: self._reset_vars()
    def getGames(self): return copy.deepcopy(self.games)
if __name__=='__main__':
    import sys, urllib2
    parser = ESPNMLBScoresParser()
    parser.reset()
    parser.feed(urllib2.urlopen('http://sports.espn.go.com/mlb/scoreboard').read())
    parser.close()
    games = parser.getGames()
    qry = 'all' # <-- put your favorite team here
    if len(sys.argv)>1: qry = sys.argv[1]
    if qry.lower()=='all': qry = ''
    games = filter(lambda game: qry.lower() in game[0][0].lower() or qry.lower() in game[1][0].lower(), games)
    if len(games)==0:
        print '*** ERROR *** No teams match ['+qry+'].'
        sys.exit(1)
    team_width = max(map(len, [game[0][0] for game in games]+[game[1][0] for game in games]))
    for game in games:
        print lpad('', team_width+1),
        for i in range(1,10): print rpad(`i`, 2),
        print ' ',
        print 'R H E',
        print
        for i in range(2):
            print rpad(game[i][0], team_width),
            for j in range( 1,10): print lpad(game[i][j], 2),
            print ' ',
            for j in range(10,13): print lpad(game[i][j], 2),
            print
        if len(games)!=1: print
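A note for anyone on Python 3: sgmllib was removed there and urllib2 became urllib.request, so the script above only runs on Python 2. Below is a minimal sketch of the same tag-watching idea using the standard html.parser module. The HTML fragment is made up to follow the class names in the notes above (gameContainer, teamLine, rheLine); it is not ESPN's real markup. The ScoreboardSketch class and its fields are hypothetical, and it keeps only team names and R/H/E totals. Unlike the script, it records every link inside a teamLine cell, not just the first one.

```python
from html.parser import HTMLParser

# Made-up fragment that follows the structure described in the script's notes;
# the real ESPN markup is not reproduced here.
SAMPLE = """
<div class="gameContainer">
  <td class="teamLine"><a href="#">Boston</a></td>
  <td class="teamLine"><a href="#">Texas</a></td>
  <td class="rheLine">7</td><td class="rheLine">8</td><td class="rheLine">1</td>
  <td class="rheLine">4</td><td class="rheLine">6</td><td class="rheLine">1</td>
</div>
"""


class ScoreboardSketch(HTMLParser):
    """Collect team names and R/H/E totals per game (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.games = []         # one dict per gameContainer div
        self.cell_class = None  # class of the <td> currently open, if any
        self.in_a = False       # True while inside an <a> element

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "gameContainer":
            self.games.append({"teams": [], "rhe": []})
        elif tag == "td" and cls in ("teamLine", "rheLine"):
            self.cell_class = cls
        elif tag == "a":
            self.in_a = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.cell_class = None
        elif tag == "a":
            self.in_a = False

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.games:
            return
        game = self.games[-1]
        if self.cell_class == "teamLine" and self.in_a:
            game["teams"].append(text)   # away team first, then home team
        elif self.cell_class == "rheLine":
            game["rhe"].append(text)     # away R, H, E, then home R, H, E


parser = ScoreboardSketch()
parser.feed(SAMPLE)
parser.close()
game = parser.games[0]
away, home = game["teams"]
print(away, game["rhe"][:3])
print(home, game["rhe"][3:])
```

Fetching a live page would go through urllib.request.urlopen instead of urllib2.urlopen; whether the class names above still match any current scoreboard page is not something this sketch assumes.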
I'm not a sports fan myself, but excellent work here. Very clever, I am sure there are many people who would be interested in such a thing.
I wonder if you couldn't put this up on Sourceforge and maybe start to expand it a bit. Say, have it run in the background checking scores and email the user when such and such team wins; that sort of thing. Everyone loves niche software.
i wonder if it could be expanded to other sports? Regardless, good program and thanks for sharing with us, there are bound to be a lot of people here who would find it useful. Wonder who will be first to create a GUI for it?
Thanks for the comments. I think I will look into Sourceforge. I've never contributed before.
Quote:
How do I save it in the form of a file?
Really?
It should work fine if you just copy what's in the text box above and paste it into a text file, say mlbscores. Indentation is important in python, so copy it exactly. As long as you have python you should then be able to run it as
Code:
$ python mlbscores
or, if you give it execute permissions (chmod u+x mlbscores), just ./mlbscores.
Thanks. I'm glad ESPN didn't change their website--my script still works as is from last year. But I had some bug fixes since I originally posted this, e.g. it wasn't reporting extra innings games correctly. Here's the latest:
Code:
#!/usr/bin/env python
#NAME
# mlbscores - report up to date baseball scores by reading the ESPN MLB scoreboard webpage
#SYNOPSIS
# mlbscores [team]
#DESCRIPTION
# Run the script with no arguments to see all the scores.
# Run the script with one argument to show only games with teams with that phrase in their name participating (case insensitive).
# Team names are as they appear on the ESPN website (city names, and/or team names if ambiguous).
# Use the watch program (watch -n 60 mlbscores boston) to make this a realtime scoreboard.
#BUGS
# python's sgmllib.SGMLParser complains about ESPN's webpage as late as python 2.2.3
# I've seen ESPN put a 0 in the top of the 1st for a postponed game, making it appear as if it's in progress (this 0 is not rendered in a web browser)
# obviously, if ESPN changes their webpage enough, it will break this script
import sgmllib, copy, sys
DEBUG = False
class BaseballGameTeamData(object):
    def __init__(self):
        self.teamName = None
        self.runsPerInning = [None]*9
        self.R = 0
        self.H = 0
        self.E = 0
class BaseballGameData(object):
    def __init__(self):
        self.startInning = None #greater than 1 when there are extra innings
        self.home = BaseballGameTeamData()
        self.away = BaseballGameTeamData()
    def formatLineScore(self, teamWidth):
        s = ''
        def mystr(i):
            if i is None: return ''
            return str(i)
        s += '%-*s' % (teamWidth, '')
        for i in range(int(self.startInning), int(self.startInning)+9): s += '%3d' % i
        s += ' %3s%3s%3s' % ('R', 'H', 'E')
        s += '\n'
        for team in self.away, self.home:
            s += '%-*s' % (teamWidth, team.teamName)
            for i in map(mystr, team.runsPerInning): s += '%3s' % i
            s += ' %3s%3s%3s' % (team.R, team.H, team.E)
            s += '\n'
        return s[:-1]
class ESPNMLBScoreboardParser(sgmllib.SGMLParser):
    #notes on ESPN html code
    #    <div class=gameContainer> starts a game
    #    <td class=teamLine>, first <a>'s data after that is away team name
    #    <td class=teamLine>, first <a>'s data after that is home team name
    #    <td class=innTop> x9, data are inning numbers (nine of them, but only care about first; starts > 1 if extra innings)
    #    <td class=innLine> x9, data are away runs for each inning
    #    <td class=innLine> x9, data are home runs for each inning
    #    <td class=rheLine> x3, data are away RHE
    #    <td class=rheLine> x3, data are home RHE
    def __init__(self):
        sgmllib.SGMLParser.__init__(self)
        #sgmllib.SGMLParser complains about ESPN's page as late as 2.2.3, but doesn't as early as 2.3.3 (I don't know about anything between)
        verReq = [2,3,3]
        verSys = [sys.version_info[0],sys.version_info[1],sys.version_info[2]]
        if verSys<verReq:
            sys.stderr.write("*** ERROR *** this script requires Python >= %s (this is Python %s)\n" % ('.'.join(map(str,verReq)), '.'.join(map(str,verSys))))
            sys.exit(1)
    def reset_vars(self):
        self.next_a_is_away_team = False
        self.next_a_is_home_team = False
        self.next_data_is_inning_start = False
        self.team_index = 0 #0 ~ away, 1 ~ home
        self.stat_index = 0 #0 ~ not used, 1..9 ~ runs in each inning, 10-12 ~ RHE
        self.in_a = False
        self.in_td_that_matters = False
        self.end_game = False
    def reset(self):
        sgmllib.SGMLParser.reset(self)
        self.games = []
        self.reset_vars()
    def start_div(self, atts):
        for name, value in atts:
            if name=='class' and value=='gameContainer':
                self.reset_vars()
                self.games.append(BaseballGameData())
                break
    def start_td(self, atts):
        for name, value in atts:
            if name=='class':
                if value=='teamLine':
                    if self.games[-1].away.teamName is None: self.next_a_is_away_team = True
                    elif self.games[-1].home.teamName is None: self.next_a_is_home_team = True
                    break
                if value=='innTop':
                    if self.games[-1].startInning is None:
                        self.in_td_that_matters = True
                        self.next_data_is_inning_start = True
                    break
                if value in ['innLine', 'rheLine']:
                    self.in_td_that_matters = True
                    #self.team_index and self.stat_index now point to where the last thing went; make them point to where the next thing will go
                    if self.stat_index==9:
                        if self.team_index==0:
                            self.team_index = 1
                            self.stat_index = 1
                        else:
                            self.team_index = 0
                            self.stat_index = 10
                    elif self.stat_index==12 and self.team_index==0:
                        self.team_index = 1
                        self.stat_index = 10
                    elif self.stat_index==11 and self.team_index==1:
                        self.stat_index = 12
                        self.end_game = True
                    else:
                        self.stat_index += 1
                    break
    def end_td(self): self.in_td_that_matters = False
    def start_a(self, atts): self.in_a = True
    def end_a(self): self.in_a = False
    def handle_data(self, data):
        if self.in_a:
            if self.next_a_is_away_team:
                self.games[-1].away.teamName = data
                self.next_a_is_away_team = False
                return
            if self.next_a_is_home_team:
                self.games[-1].home.teamName = data
                self.next_a_is_home_team = False
                return
        if self.in_td_that_matters:
            if self.next_data_is_inning_start:
                self.games[-1].startInning = data
                self.next_data_is_inning_start = False
                return
            if self.team_index==0: team = self.games[-1].away
            elif self.team_index==1: team = self.games[-1].home
            if self.stat_index in range( 1,10): team.runsPerInning[self.stat_index-1] = data
            elif self.stat_index==10: team.R = data
            elif self.stat_index==11: team.H = data
            elif self.stat_index==12: team.E = data
            if self.end_game: self.reset_vars()
    def getGames(self): return copy.deepcopy(self.games)
if __name__=='__main__':
    import sys, urllib2
    try:
        u = urllib2.urlopen('http://sports.espn.go.com/mlb/scoreboard')
    except urllib2.URLError, e:
        sys.stderr.write("*** ERROR *** unable to open the ESPN webpage")
        if hasattr(e, 'args') and type(e.args)==type(()): sys.stderr.write(': %s' % e[0][1])
        else: sys.stderr.write(': %s' % str(e) )
        sys.stderr.write('.\n')
        if DEBUG: raise
        sys.exit(1)
    source = u.read()
    parser = ESPNMLBScoreboardParser()
    parser.reset()
    try:
        parser.feed(source)
    except:
        sys.stderr.write('*** ERROR *** unable to parse the ESPN webpage\n')
        if DEBUG: raise
        sys.exit(1)
    parser.close()
    games = parser.getGames()
    qry = ''
    if len(sys.argv)>1: qry = sys.argv[1]
    games = filter(lambda game: qry.lower() in game.home.teamName.lower() or qry.lower() in game.away.teamName.lower(), games)
    if len(games)==0:
        sys.stderr.write("*** ERROR *** no teams match [%s]\n" % qry)
        sys.exit(1)
    teamWidth = max(map(len, [game.home.teamName for game in games]+[game.away.teamName for game in games]))
    for game in games:
        print game.formatLineScore(teamWidth)
        if len(games)!=1: print
That is really nice. One thing I noticed is that when it encounters a game with extra innings, it doesn't show the earlier innings; they get cut off. There is enough space in the terminal to show all the innings, even extra innings within reason (no 30-inning games).
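For what it's worth, here is a rough Python 3 sketch of a formatter that prints every inning instead of a fixed window of nine. It assumes the per-inning runs for the whole game are already available as lists of strings, which the script above does not collect (it reads only the nine columns the page shows). The function name format_line_score, its parameters, and the 11-inning game data are all invented for illustration.

```python
def format_line_score(away_name, away_runs, home_name, home_runs, totals):
    """Format a line score showing every inning, however many were played.

    away_runs / home_runs: runs per inning as strings ('' = not yet played,
    'X' = bottom half not needed). totals: {team name: (R, H, E)}.
    All names here are hypothetical, not taken from the script.
    """
    innings = max(len(away_runs), len(home_runs), 9)
    width = max(len(away_name), len(home_name))
    # Header row: blank team column, inning numbers, then R H E.
    lines = [" " * width
             + "".join("%3d" % i for i in range(1, innings + 1))
             + " " + "%3s%3s%3s" % ("R", "H", "E")]
    for name, runs in ((away_name, away_runs), (home_name, home_runs)):
        # Pad short lists so both teams have the same number of columns.
        padded = list(runs) + [""] * (innings - len(runs))
        lines.append("%-*s" % (width, name)
                     + "".join("%3s" % r for r in padded)
                     + " " + "%3s%3s%3s" % totals[name])
    return "\n".join(lines)


# Invented 11-inning game, for illustration only.
away_runs = ["0", "0", "1", "0", "0", "2", "0", "0", "0", "0", "1"]
home_runs = ["0", "1", "0", "0", "0", "0", "2", "0", "0", "0", "0"]
totals = {"Boston": (4, 9, 0), "Texas": (3, 7, 1)}
out = format_line_score("Boston", away_runs, "Texas", home_runs, totals)
print(out)
```

Each inning takes three columns, so an 80-column terminal has room for roughly twenty innings plus the team name and totals.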
This is very cool. For contrast, here's the lazy version.
Code:
~>which xps
xps: aliased to lynx -nonumbers -dump http://scores.espn.go.com/mlb/scoreboard | egrep -i -A12 -B2 "!*"
~>xps phila
Top 9th
Philadelphia (3-4, 2-2 away)
NY Mets (2-3, 0-0 home)
1 2 3 4 5 6 7 8 9
0 0 0 0 0 0 3 2 0
0 1 0 1 0 0 0 0
R H E
5 8 0
2 7 1
GameCast | Box Score | RealTime [in.gif] | Watch
Balls: Strikes: Outs:
Pitching: () 0 IP, 0 ER, 0 K
Batting: () 0-0
C Muniz relieved A Heilman.
I am also grateful that ESPN didn't change this page this year. Not so lucky for the MLB.com script I had to tell me the next game, TV channels, and pitchers for various teams. They added in a bunch of javascript gunk which has broken it completely.
Awesome! I'm going to use your approach from now on.
I still might fix my script's issue with the missing early innings for extra innings games that JMJ_coder mentions, for posterity's sake and in case others find that helpful, but I like this lynx/grep approach MUCH better.
i've used lynx before but i don't understand how to use your script. and could you explain how it works a little bit? i'm still a noob but i want to learn as much as possible.
Hi. Well, it's a shell alias, not a script. How you create an alias depends on which shell you're using, so check the manual for your shell.
Different versions of lynx might have different command line options, so you might need to read that manual too.
The "!*" part passes the argument you supply on to egrep.
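If it helps to see what the egrep half of the alias is doing, here is a small Python 3 sketch of the filtering step. With grep, -i ignores case, -B2 prints two lines before each match, and -A12 prints twelve lines after it. The function grep_context and the sample lines are made up for illustration. It uses a plain substring test where egrep uses a regular expression, and it skips the "--" separators grep prints between separate groups of lines.

```python
def grep_context(lines, pattern, before=2, after=12):
    """Return lines containing pattern (case-insensitive) plus context lines,
    roughly like `egrep -i -A12 -B2` for a plain-text pattern."""
    needle = pattern.lower()
    keep = set()
    for i, line in enumerate(lines):
        if needle in line.lower():
            start = max(0, i - before)
            stop = min(len(lines), i + after + 1)
            keep.update(range(start, stop))
    # Emit kept lines in their original order, without duplicates.
    return [lines[i] for i in sorted(keep)]


# Made-up stand-in for the text that `lynx -dump` would produce.
sample = [
    "line a",
    "Top 9th",
    "Philadelphia (3-4)",
    "NY Mets (2-3)",
    "R H E",
    "unrelated",
]
result = grep_context(sample, "phila", before=1, after=2)
for line in result:
    print(line)
```

In the alias, the text being filtered is whatever lynx dumps from the scoreboard page, and the pattern is the team name you type after xps.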
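You might have better luck just getting it to work from the command line first (without the alias): try running the lynx and egrep pipeline from the alias directly, with your team name in place of "!*".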