LinuxQuestions.org
LinuxQuestions.org Member Success Stories Just spent four hours configuring your favorite program? Just figured out a Linux problem that has been stumping you for months?
Post your Linux Success Stories here.

Old 05-27-2007, 12:10 AM   #1
DeuceNegative
LQ Newbie
 
Registered: May 2006
Location: Boston, MA
Distribution: Gentoo
Posts: 29

Rep: Reputation: 16
real time MLB baseball scores at the command line


I often want to check the baseball score without opening a browser. (That's right, it's too much work. And sometimes not appropriate at work, considering how much I check it.) So I wrote this script to fetch and parse the ESPN MLB scoreboard page.

Here's some example output:
Code:
$ ./mlbscores boston
        1  2  3  4  5  6  7  8  9    R  H  E
Boston  0  0  0  2  0  5  0  0  0    7  8  1
Texas   0  1  0  0  3  0  0  0  0    4  6  1
(If the game weren't over yet, the entries for innings not yet played would be blank.)

Here's the code:
Code:
#!/usr/bin/env python

#NAME
#	mlbscores - report baseball scores in realtime by reading the ESPN MLB scoreboard webpage
#SYNOPSIS
#	mlbscores [team]
#DESCRIPTION
#	run the script with no arguments to see all the scores
#	run the script with one argument to only show games with teams with that phrase in their name participating (case insensitive)
#	team names are as they appear on the ESPN website (city names actually, and/or team names if ambiguous)
#	the special phrase 'all' will show all games (useful when default team is hardcoded)
#	use watch for realtime updates
#BUGS
#	python's sgmllib.SGMLParser complains about ESPN's webpage as late as python 2.2.3
#	I've seen ESPN put a 0 in the top of the 1st for a postponed game, making it appear as if it's in progress (this 0 is not rendered in a web browser)
#	obviously, if ESPN changes their webpage enough, it will break this script


import sgmllib, copy, sys


def rpad(text, length, char=' '): return text+char*(length-len(text))
def lpad(text, length, char=' '): return char*(length-len(text))+text


class ESPNMLBScoresParser(sgmllib.SGMLParser):
	#game data is stored like [['Away Team', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'R', 'H', 'E'], ['Home Team', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'R', 'H', 'E']]
	#1..9 are the runs scored that inning, and RHE are the respective totals
	#everything is a string (all numeric, except 1..9 entries may be '' (inning not yet completed) or 'X' (bottom of the 9th not played))
	#the data is filled in by first creating all entries as '', then filling them in one by one as they're reached when parsing the html source
	#self.getGames() returns a list of these game data structures
	#notes on ESPN html code
	#	<div class=gameContainer> starts a game
	#	<td class=teamLine>, first <a>'s data after that is away team name
	#	<td class=teamLine>, first <a>'s data after that is home team name
	#	<td class=innLine> x9, data are away runs for each inning
	#	<td class=innLine> x9, data are home runs for each inning
	#	<td class=rheLine> x3, data are away RHE
	#	<td class=rheLine> x3, data are home RHE
	
	def __init__(self):
		sgmllib.SGMLParser.__init__(self)
		
		#sgmllib.SGMLParser complains about ESPN's page as late as 2.2.3, but doesn't as early as 2.3.3 (I don't know about anything between)
		verReq = [2,3,3]
		verSys = [sys.version_info[0],sys.version_info[1],sys.version_info[2]]
		if verSys<verReq:
			print '*** ERROR *** You are using Python '+'.'.join(map(str,verSys))+'.  This script requires Python >= '+'.'.join(map(str,verReq))+'.'
			sys.exit(1)
	
	def _reset_vars(self):
		self.next_a_is_away_team = False
		self.next_a_is_home_team = False

		self.next_data_is_stat = False
		
		self.team_index = 0  #0 ~ away, 1 ~ home
		self.stat_index = 0  #0 ~ team name, 1..9 ~ runs in each inning, 10-12 ~ RHE

		self.in_a               = False
		self.in_td_that_matters = False

		self.end_game = False

	def reset(self):
		sgmllib.SGMLParser.reset(self)
		
		self.games = []
		self._reset_vars()
	
	def start_div(self, atts):
		for name, value in atts:
			if name=='class' and value=='gameContainer':
				self._reset_vars()
				self.games.append([['']*13,['']*13])
				break
	
	def start_td(self, atts):
		for name, value in atts:
			if name=='class':
				if value=='teamLine':
					if   not self.games[-1][0][0]: self.next_a_is_away_team = True
					elif not self.games[-1][1][0]: self.next_a_is_home_team = True
					break
				if value in ['innLine', 'rheLine']:
					self.in_td_that_matters = True
					
					#self.team_index and self.stat_index now point to where the last thing went; make them point to where the next thing will go
					if self.stat_index==9:
						if self.team_index==0:
							self.team_index = 1
							self.stat_index = 1
						else:
							self.team_index =  0
							self.stat_index = 10
					elif self.stat_index==12 and self.team_index==0:
							self.team_index =  1
							self.stat_index = 10
					elif self.stat_index==11 and self.team_index==1:
							self.stat_index = 12
							self.end_game = True
					else:
						self.stat_index += 1
					break
	def end_td(self): self.in_td_that_matters = False
	
	def start_a(self, atts): self.in_a = True
	def end_a(self)        : self.in_a = False
	
	def handle_data(self, data):
		if self.in_a:
			if self.next_a_is_away_team:
				self.games[-1][0][0] = data
				self.next_a_is_away_team = False
				return
			if self.next_a_is_home_team:
				self.games[-1][1][0] = data
				self.next_a_is_home_team = False
				return
		if self.in_td_that_matters:
			self.games[-1][self.team_index][self.stat_index] = data
			if self.end_game: self._reset_vars()
	
	def getGames(self): return copy.deepcopy(self.games)


if __name__=='__main__':
	import sys, urllib2

	parser = ESPNMLBScoresParser()
	parser.reset()
	parser.feed(urllib2.urlopen('http://sports.espn.go.com/mlb/scoreboard').read())
	parser.close()
	
	games = parser.getGames()

	qry = 'all'  # <-- put your favorite team here
	if len(sys.argv)>1: qry = sys.argv[1]
	if qry.lower()=='all': qry = ''
	games = filter(lambda game: qry.lower() in game[0][0].lower() or qry.lower() in game[1][0].lower(), games)
	if len(games)==0:
		print '*** ERROR *** No teams match ['+qry+'].'
		sys.exit(1)

	team_width = max(map(len, [game[0][0] for game in games]+[game[1][0] for game in games]))
	for game in games:
		print lpad('', team_width+1),
		for i in range(1,10): print rpad(str(i), 2),
		print ' ',
		print 'R  H  E',
		print

		for i in range(2):
			print rpad(game[i][0], team_width),
			for j in range( 1,10): print lpad(game[i][j], 2),
			print ' ',
			for j in range(10,13): print lpad(game[i][j], 2),
			print
		
		if len(games)!=1: print
I hope someone else finds this useful!
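
For anyone reading through the parser: the index juggling in start_td amounts to visiting 24 table cells per game in a fixed order (away innings, home innings, away R/H/E, home R/H/E). Here is a minimal sketch of that order as a generator. It is written for Python 3, and slot_order is a name made up for this illustration, not something in the script.

```python
def slot_order():
    """Yield (team_index, stat_index) pairs in the order the cells arrive.

    team_index: 0 = away, 1 = home
    stat_index: 1..9 = runs per inning, 10..12 = R, H, E
    """
    for stats in (range(1, 10), range(10, 13)):   # innings first, then R/H/E
        for team in (0, 1):                        # away row before home row
            for stat in stats:
                yield team, stat


slots = list(slot_order())
print(len(slots))     # number of cells filled per game
print(slots[8:10])    # last away inning, then first home inning
```

Precomputing the order like this would let the parser keep a single counter instead of the if/elif chain, at the cost of being a little less obvious about where the row switches happen.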
 
Old 05-27-2007, 01:44 AM   #2
MS3FGX
Guru
 
Registered: Jan 2004
Location: NJ, USA
Distribution: Slackware, Debian
Posts: 5,852

Rep: Reputation: 351
I'm not a sports fan myself, but excellent work here. Very clever; I am sure there are many people who would be interested in such a thing.

I wonder if you couldn't put this up on Sourceforge and maybe start to expand it a bit. Say, have it run in the background checking scores and email the user when such and such team wins; that sort of thing. Everyone loves niche software.
 
Old 05-27-2007, 11:48 AM   #3
XavierP
Moderator
 
Registered: Nov 2002
Location: Kent, England
Distribution: Lubuntu
Posts: 19,174
Blog Entries: 4

Rep: Reputation: 428
I wonder if it could be expanded to other sports? Regardless, good program, and thanks for sharing it with us; there are bound to be a lot of people here who would find it useful. I wonder who will be first to create a GUI for it?
 
Old 05-27-2007, 11:51 PM   #4
AlanL
Member
 
Registered: Dec 2002
Location: New Westminster, B.C.,CANADA
Distribution: Slax. Tinycore. Puppy.
Posts: 109

Rep: Reputation: 15
Very nice! How do I save it in the form of a file?
 
Old 05-31-2007, 11:25 AM   #5
DeuceNegative
LQ Newbie
 
Registered: May 2006
Location: Boston, MA
Distribution: Gentoo
Posts: 29

Original Poster
Rep: Reputation: 16
Thanks for the comments. I think I will look into Sourceforge. I've never contributed before.

Quote:
How do I save it in the form of a file?
Really?

It should work fine if you just copy what's in the text box above and paste it into a text file, say mlbscores. Indentation is important in python, so copy it exactly. As long as you have python you should then be able to run it as

Code:
$ python mlbscores
or, if you give it execute permissions (chmod u+x mlbscores), just

Code:
$ ./mlbscores
Let me know if something else is the problem.
 
Old 04-03-2008, 12:43 AM   #6
richie314159
LQ Newbie
 
Registered: Nov 2005
Distribution: kubuntu feisty
Posts: 10

Rep: Reputation: 0
w00000000t!!!!!!

i LOVE this!!!! you are a freaking genius!!!!!!! i bow before you, buddha of the cli.
 
Old 04-04-2008, 03:58 AM   #7
DeuceNegative
LQ Newbie
 
Registered: May 2006
Location: Boston, MA
Distribution: Gentoo
Posts: 29

Original Poster
Rep: Reputation: 16
Thanks. I'm glad ESPN didn't change their website--my script still works as is from last year. But I've made some bug fixes since I originally posted this, e.g. it wasn't reporting extra-inning games correctly. Here's the latest:

Code:
#!/usr/bin/env python

#NAME
#	mlbscores - report up to date baseball scores by reading the ESPN MLB scoreboard webpage
#SYNOPSIS
#	mlbscores [team]
#DESCRIPTION
#	Run the script with no arguments to see all the scores.
#	Run the script with one argument to show only games with teams with that phrase in their name participating (case insensitive).
#	Team names are as they appear on the ESPN website (city names, and/or team names if ambiguous).
#	Use the watch program (watch -n 60 mlbscores boston) to make this a realtime scoreboard.
#BUGS
#	python's sgmllib.SGMLParser complains about ESPN's webpage as late as python 2.2.3
#	I've seen ESPN put a 0 in the top of the 1st for a postponed game, making it appear as if it's in progress (this 0 is not rendered in a web browser)
#	obviously, if ESPN changes their webpage enough, it will break this script


import sgmllib, copy, sys


DEBUG = False


class BaseballGameTeamData(object):
	def __init__(self):	
		self.teamName = None
		self.runsPerInning = [None]*9
		self.R = 0
		self.H = 0
		self.E = 0

class BaseballGameData(object):
	def __init__(self):
		self.startInning = None  #greater than 1 when there are extra innings
		self.home = BaseballGameTeamData()
		self.away = BaseballGameTeamData()
	
	def formatLineScore(self, teamWidth):
		s = ''
		def mystr(i):
			if i is None: return ''
			return str(i)
		s += '%-*s' % (teamWidth, '')
		for i in range(int(self.startInning), int(self.startInning)+9): s += '%3d' % i
		s += ' %3s%3s%3s' % ('R', 'H', 'E')
		s += '\n'
		for team in self.away, self.home:
			s += '%-*s' % (teamWidth, team.teamName)
			for i in map(mystr, team.runsPerInning): s += '%3s' % i
			s += ' %3s%3s%3s' % (team.R, team.H, team.E)
			s += '\n'
		return s[:-1]

class ESPNMLBScoreboardParser(sgmllib.SGMLParser):
	#notes on ESPN html code
	#	<div class=gameContainer> starts a game
	#	<td class=teamLine>, first <a>'s data after that is away team name
	#	<td class=teamLine>, first <a>'s data after that is home team name
	#	<td class=innTop> x9, data are inning numbers (nine of them, but only care about first; starts > 1 if extra innings)
	#	<td class=innLine> x9, data are away runs for each inning
	#	<td class=innLine> x9, data are home runs for each inning
	#	<td class=rheLine> x3, data are away RHE
	#	<td class=rheLine> x3, data are home RHE
	
	def __init__(self):
		sgmllib.SGMLParser.__init__(self)
		
		#sgmllib.SGMLParser complains about ESPN's page as late as 2.2.3, but doesn't as early as 2.3.3 (I don't know about anything between)
		verReq = [2,3,3]
		verSys = [sys.version_info[0],sys.version_info[1],sys.version_info[2]]
		if verSys<verReq:
			sys.stderr.write("*** ERROR *** this script requires Python >= %s (this is Python %s)\n" % ('.'.join(map(str,verReq)), '.'.join(map(str,verSys))))
			sys.exit(1)
	
	def reset_vars(self):
		self.next_a_is_away_team = False
		self.next_a_is_home_team = False

		self.next_data_is_inning_start = False

		self.team_index = 0  #0 ~ away, 1 ~ home
		self.stat_index = 0  #0 ~ not used, 1..9 ~ runs in each inning, 10-12 ~ RHE

		self.in_a = False
		self.in_td_that_matters = False

		self.end_game = False

	def reset(self):
		sgmllib.SGMLParser.reset(self)
		
		self.games = []
		self.reset_vars()
	
	def start_div(self, atts):
		for name, value in atts:
			if name=='class' and value=='gameContainer':
				self.reset_vars()
				self.games.append(BaseballGameData())
				break
	
	def start_td(self, atts):
		for name, value in atts:
			if name=='class':
				if value=='teamLine':
					if   self.games[-1].away.teamName is None: self.next_a_is_away_team = True
					elif self.games[-1].home.teamName is None: self.next_a_is_home_team = True
					break
				if value=='innTop':
					if self.games[-1].startInning is None:
						self.in_td_that_matters = True
						self.next_data_is_inning_start = True
					break
				if value in ['innLine', 'rheLine']:
					self.in_td_that_matters = True
					
					#self.team_index and self.stat_index now point to where the last thing went; make them point to where the next thing will go
					if self.stat_index==9:
						if self.team_index==0:
							self.team_index = 1
							self.stat_index = 1
						else:
							self.team_index =  0
							self.stat_index = 10
					elif self.stat_index==12 and self.team_index==0:
							self.team_index =  1
							self.stat_index = 10
					elif self.stat_index==11 and self.team_index==1:
							self.stat_index = 12
							self.end_game = True
					else:
						self.stat_index += 1
					break
	def end_td(self): self.in_td_that_matters = False
	
	def start_a(self, atts): self.in_a = True
	def end_a(self): self.in_a = False
	
	def handle_data(self, data):
		if self.in_a:
			if self.next_a_is_away_team:
				self.games[-1].away.teamName = data
				self.next_a_is_away_team = False
				return
			if self.next_a_is_home_team:
				self.games[-1].home.teamName = data
				self.next_a_is_home_team = False
				return
		if self.in_td_that_matters:
			if self.next_data_is_inning_start:
				self.games[-1].startInning = data
				self.next_data_is_inning_start=False
				return
			if   self.team_index==0: team = self.games[-1].away
			elif self.team_index==1: team = self.games[-1].home
			if   self.stat_index in range( 1,10): team.runsPerInning[self.stat_index-1] = data
			elif self.stat_index==10: team.R = data
			elif self.stat_index==11: team.H = data
			elif self.stat_index==12: team.E = data
			if self.end_game: self.reset_vars()
	
	def getGames(self): return copy.deepcopy(self.games)


if __name__=='__main__':
	import sys, urllib2

	try:
		u = urllib2.urlopen('http://sports.espn.go.com/mlb/scoreboard')
	except urllib2.URLError, e:
		sys.stderr.write("*** ERROR *** unable to open the ESPN webpage")
		sys.stderr.write(': %s' % getattr(e, 'reason', e))  #an HTTPError may carry no args, so don't index into e
		sys.stderr.write('.\n')
		if DEBUG: raise
		sys.exit(1)

	source = u.read()

	parser = ESPNMLBScoreboardParser()
	parser.reset()
	try:
		parser.feed(source)
	except Exception:
		sys.stderr.write('*** ERROR *** unable to parse the ESPN webpage\n')
		if DEBUG: raise
		sys.exit(1)
	parser.close()
	
	games = parser.getGames()

	qry = ''
	if len(sys.argv)>1: qry = sys.argv[1]
	games = filter(lambda game: qry.lower() in game.home.teamName.lower() or qry.lower() in game.away.teamName.lower(), games)
	if len(games)==0:
		sys.stderr.write("*** ERROR *** no teams match [%s]\n" % qry)
		sys.exit(1)

	teamWidth = max(map(len, [game.home.teamName for game in games]+[game.away.teamName for game in games]))
	for game in games:
		print game.formatLineScore(teamWidth)
		if len(games)!=1: print
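
A note for anyone trying this on a current system: sgmllib was removed in Python 3, so the script above needs Python 2. Below is a minimal sketch of the same idea on top of html.parser, assuming the page still used the class names from the notes (gameContainer, teamLine, innLine, rheLine). LineScoreParser and SAMPLE are names made up for this illustration, and the HTML is hand-written, not a capture of ESPN's page.

```python
from html.parser import HTMLParser


class LineScoreParser(HTMLParser):
    """Collect team names and table cells, keyed by the class names in the notes."""

    def __init__(self):
        super().__init__()
        self.games = []          # one dict per <div class="gameContainer">
        self._cell = None        # class of the <td> we are inside, if it matters
        self._want_name = False  # True until the first <a> text after a teamLine <td>
        self._in_a = False

    def handle_starttag(self, tag, attrs):
        css = dict(attrs).get('class')
        if tag == 'div' and css == 'gameContainer':
            self.games.append({'teams': [], 'innLine': [], 'rheLine': []})
        elif tag == 'td' and self.games:
            if css == 'teamLine':
                self._want_name = True
            elif css in ('innLine', 'rheLine'):
                self._cell = css
                self.games[-1][css].append('')   # stays blank if no data arrives
        elif tag == 'a':
            self._in_a = True

    def handle_endtag(self, tag):
        if tag == 'td':
            self._cell = None
        elif tag == 'a':
            self._in_a = False

    def handle_data(self, data):
        text = data.strip()
        if not self.games or not text:
            return
        game = self.games[-1]
        if self._in_a and self._want_name:
            game['teams'].append(text)           # away first, then home
            self._want_name = False
        elif self._cell:
            game[self._cell][-1] += text


# Hand-written sample, shortened to three innings for readability.
SAMPLE = """
<div class="gameContainer"><table>
<tr><td class="teamLine"><a href="#">Boston</a></td></tr>
<tr><td class="teamLine"><a href="#">Texas</a></td></tr>
<tr><td class="innLine">0</td><td class="innLine">2</td><td class="innLine"></td></tr>
<tr><td class="innLine">1</td><td class="innLine">0</td><td class="innLine"></td></tr>
<tr><td class="rheLine">2</td><td class="rheLine">5</td><td class="rheLine">0</td></tr>
<tr><td class="rheLine">1</td><td class="rheLine">3</td><td class="rheLine">1</td></tr>
</table></div>
"""

parser = LineScoreParser()
parser.feed(SAMPLE)
parser.close()
print(parser.games[0]['teams'])
print(parser.games[0]['innLine'])
```

Unlike the script, this sketch just appends cells in arrival order and leaves splitting them into away and home halves to the caller, which keeps the parser short but moves the bookkeeping elsewhere.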
 
Old 04-07-2008, 09:24 PM   #8
JMJ_coder
Member
 
Registered: Apr 2006
Distribution: Fedora
Posts: 478

Rep: Reputation: 30
Hello,

That is really nice. One thing I noticed is that when it encounters a game with extra innings, it doesn't show the earlier innings - they get cut off. There is enough space in the terminal to show all the innings, even extra innings within reason (no 30-inning games).

But, yes a very nice program. Well done!
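
One way the full line score could be laid out, as a rough sketch only: it assumes the runs for every inning have already been collected into two lists (the posted script only reads nine inning cells per team, so that part is not covered here). It is written for Python 3, and format_line_score is a made-up name.

```python
def format_line_score(away, home, away_runs, home_runs):
    """Return a three-line box with one column per inning, however many there are.

    away_runs and home_runs are lists of ints or strings; None or '' means
    the half-inning has not been played.
    """
    innings = max(len(away_runs), len(home_runs), 9)   # never fewer than 9 columns
    width = max(len(away), len(home))

    def row(label, cells):
        cells = list(cells) + [''] * (innings - len(cells))   # pad short rows
        body = ''.join('%3s' % ('' if c is None else c) for c in cells)
        return '%-*s%s' % (width, label, body)

    header = row('', range(1, innings + 1))
    return '\n'.join([header, row(away, away_runs), row(home, home_runs)])


# An 11-inning game: the away team scores once in the 11th.
print(format_line_score('Boston', 'Texas', [0] * 10 + [1], [0] * 11))
```

At three characters per inning, a 20-inning game still fits comfortably in an 80-column terminal. The R/H/E totals are left out here because hits and errors cannot be derived from the inning columns.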
 
Old 04-08-2008, 03:44 PM   #9
eveostay
LQ Newbie
 
Registered: Apr 2008
Posts: 12

Rep: Reputation: 0
This is very cool. For contrast, here's the lazy version.

Code:
~>which xps
xps:     aliased to lynx -nonumbers -dump http://scores.espn.go.com/mlb/scoreboard | egrep -i -A12 -B2 "!*"
~>xps phila

   Top 9th
   Philadelphia (3-4, 2-2 away)
   NY Mets (2-3, 0-0 home)
   1 2 3 4 5 6 7 8 9
   0 0 0 0 0 0 3 2 0
   0 1 0 1 0 0 0 0
   R H E
   5 8 0
   2 7 1
   GameCast | Box Score | RealTime [in.gif] | Watch
   Balls: Strikes: Outs:
   Pitching: () 0 IP, 0 ER, 0 K
   Batting: () 0-0
   C Muniz relieved A Heilman.
I am also grateful that ESPN didn't change this page this year. I wasn't so lucky with my MLB.com script, which told me the next game, TV channels, and pitchers for various teams. They added a bunch of JavaScript gunk that has broken it completely.

(Graphical browsers sux.)
 
Old 04-09-2008, 08:40 PM   #10
DeuceNegative
LQ Newbie
 
Registered: May 2006
Location: Boston, MA
Distribution: Gentoo
Posts: 29

Original Poster
Rep: Reputation: 16
Quote:
Originally Posted by eveostay
This is very cool. For contrast, here's the lazy version.

Code:
~>which xps
xps:     aliased to lynx -nonumbers -dump http://scores.espn.go.com/mlb/scoreboard | egrep -i -A12 -B2 "!*"
Awesome! I'm going to use your approach from now on.

I still might fix my script's issue with the missing early innings for extra innings games that JMJ_coder mentions, for posterity's sake and in case others find that helpful, but I like this lynx/grep approach MUCH better.

Thanks!
 
Old 04-10-2008, 12:58 AM   #11
richie314159
LQ Newbie
 
Registered: Nov 2005
Distribution: kubuntu feisty
Posts: 10

Rep: Reputation: 0
i don't understand.

eveostay,

i've used lynx before but i don't understand how to use your script. and could you explain how it works a little bit? i'm still a noob but i want to learn as much as possible.

thanks,

rich
 
Old 04-10-2008, 01:19 PM   #12
eveostay
LQ Newbie
 
Registered: Apr 2008
Posts: 12

Rep: Reputation: 0
Quote:
Originally Posted by richie314159
i've used lynx before but i don't understand how to use your script. and could you explain how it works a little bit? i'm still a noob but i want to learn as much as possible.
Hi. Well, it's a shell alias, not a script. How you create an alias depends on which shell you're using, so you'll need to check the manual for your shell.

Different versions of lynx might have different command line options, so you might need to read that manual too.

The "!*" part passes the arguments you supply (here, phila) on to egrep.

You might have better luck getting it to work from the command line first (without the alias). Try:

Code:
lynx -nonumbers -dump http://scores.espn.go.com/mlb/scoreboard | egrep -i -A12 -B2 Phila
You can also remove the egrep and just get the dump from lynx, pipe it to more (or less) or whatever.

HTH.
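
For anyone curious what the egrep options are doing: -i ignores case, -B2 prints two lines before each match, and -A12 prints twelve lines after it. Here is a minimal Python 3 sketch of that behaviour, not a replacement for grep. grep_context is a made-up name, and unlike grep it does not print "--" between separate groups.

```python
import re
from collections import deque


def grep_context(lines, pattern, before=2, after=12):
    """Yield lines matching pattern (case-insensitive) plus surrounding context."""
    rx = re.compile(pattern, re.IGNORECASE)
    recent = deque(maxlen=before)   # most recent lines not yet printed
    remaining = 0                   # trailing-context lines still owed
    for line in lines:
        if rx.search(line):
            yield from recent       # leading context not already emitted
            recent.clear()
            yield line
            remaining = after
        elif remaining > 0:
            yield line
            remaining -= 1
        else:
            recent.append(line)


# Made-up stand-in for a few lines of the lynx dump.
page = ["Final", "Top 9th", "Philadelphia", "NY Mets", "1 2 3", "unrelated"]
for line in grep_context(page, "phila", before=1, after=2):
    print(line)
```

The deque with maxlen does the "remember the last N lines" job: once it is full, appending a new line silently drops the oldest one.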

Quote:
Originally Posted by DeuceNegative
Awesome! I'm going to use your approach from now on.
Wow -- I'm really flattered. Sometimes really simple is good enough, I guess.

Steve
 
  

