Musings on technology, philosophy, and life in the corporate world


Hi. I'm a Unix Administrator, mathematics enthusiast, and amateur philosopher. This is where I rant about that which upsets me, laugh about that which amuses me, and jabber about that which holds my interest most: Unix.

I can hear that train a'comin'

Posted 02-01-2011 at 04:42 PM by rocket357

A few months ago the execs where I work gave me a new project. I was to take Version 4.0 of our flagship product and gather all of the performance data collected since 2003 (the beginning of time, as far as Version 4.0 is concerned) so I could run reports on it. Most importantly, I could run reports on *recent* performance data...and give those reports to the execs.

Ok, I'm cool with that. I have nothing to hide. I work late hours to finish up the report, and it's a smashing success. So much so, that they want Version 5.0 to have a comparable report. I get to work...

Let's look at some stats, shall we? Version 4 was written in 2003 (well, it started collecting performance data in 2003). For simplicity's sake, "Version 4" includes all versions from 1.0 through 4.0, because they shared a common architecture. Version 4 is a "true" PHP app in that the browser loads a page, the customer does some work, clicks a button, the browser loads another page, etc...a real "pre-AJAX" kinda application. From 2003 to today, we've logged 425 million "pageloads" (just this one application...that doesn't include our public sites and what-not). Today I'm aggregating Version 5.0 performance data...and I'm starting to get scared.

You see, when Version 4.0 had played out, everyone got together and threw ideas around to make Version 5.0 the best yet. For starters, it would run on a more OSS-friendly platform (yeah, it runs purely on Linux...Version 4.0 was a mixed Windows/Linux environment). Second, it badly needed a facelift...GWT was chosen as the development environment and PHP was retired from mainline use here. Third (and mainly as a side-effect of GWT and our programmers), it needed to be more like a desktop application. Enter AJAX...

So what we have is a web app that only refreshes portions of the page as needed. How can I possibly build a report that compares Version 4.0 (full pageloads) to Version 5.0 (ajaxified partial pageloads)? I decide to just aggregate the data that Version 5.0 has and see where that leads me.

I'm halfway through aggregation on *one* shared database machine, and I've logged 125 million "pageloads". Did I mention that Version 5.0 has only been out since last summer?

Math time. Version 4.0: 425 million in 8 years (96 months) = around 4.4 million pageloads a month. That's around 150k per day (roughly 20k per hour, if you figure an 8-hour business day). Version 5.0: 125 million for 1/2 of 1 server * 2 halves * 20 servers = 5 billion rows in six months, or roughly 800 million per month. That's 2x the logs from Version 4's entire lifetime in a single month. That's 27 million entries a day, or roughly 3.5 million an hour by the same 8-hour reckoning.
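For anyone who wants to check the napkin math, here's a rough sketch in Python. The totals, server count, and time spans come straight from the numbers above; the 30-day month and the 8-hour day are assumptions I'm making so that the per-hour figures land near the rounded values (spread over 24 hours, they'd be about a third as large). The function and variable names are just for illustration.

```python
# Back-of-the-envelope log volume estimate using the figures from the post.
DAYS_PER_MONTH = 30   # assumption: flat 30-day month
HOURS_PER_DAY = 8     # assumption: traffic concentrated in an 8-hour business day


def rates(total_rows, months):
    """Return (per_month, per_day, per_hour) for a total spread over `months`."""
    per_month = total_rows / months
    per_day = per_month / DAYS_PER_MONTH
    per_hour = per_day / HOURS_PER_DAY
    return per_month, per_day, per_hour


# Version 4.0: 425 million pageloads over 8 years.
v4_total = 425_000_000
v4_months = 8 * 12

# Version 5.0: 125 million rows for half of one server, extrapolated
# to both halves of 20 servers, over roughly six months.
v5_total = 125_000_000 * 2 * 20
v5_months = 6

v4 = rates(v4_total, v4_months)
v5 = rates(v5_total, v5_months)

for name, (per_month, per_day, per_hour) in (("Version 4.0", v4), ("Version 5.0", v5)):
    print(f"{name}: {per_month:,.0f}/month, {per_day:,.0f}/day, {per_hour:,.0f}/hour")

# How many Version 4 lifetimes of logs does Version 5.0 produce per month?
print(f"Version 5.0 monthly volume vs. Version 4 lifetime: {v5[0] / v4_total:.2f}x")
```

Keep in mind the Version 5.0 total is an extrapolation from half of one machine, so it only holds if the other servers carry a similar load.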

I just put in a purchase request...I need another drive array, loaded with 15x 1 TB SATA drives...haha.