A Conversation with Patrick Chu
By Scott Mitchell
Patrick Chu is the founder and lead developer of ItsYourTurn.com (IYT), a popular turn-based gaming site that has been in operation since 1998. Developing, testing, and supporting IYT are handled by Patrick, two other full-time software developers, and a part-time support staff. ItsYourTurn.com has over 2,500,000 registered accounts, records around 700,000 game moves per day, serves four million daily page views, and maintains a SQL Server 2005 database with over 470 million rows of game moves. Initially, IYT was powered by custom C++ ISAPI extensions, but it has since moved over to ASP.NET 2.0.
I've been playing games on IYT for years (my two favorite games being Chess and Stack4x4, a variant of Connect4) and had exchanged brief emails with Patrick a couple times in the past. Clearly there's a lot that can be learned, both about technology and business, from a person with the experience and background of Patrick Chu. I recently emailed Patrick some questions about IYT, both from a business and technology standpoint. He was kind enough to write back with some very detailed and valuable answers that he agreed to let me post here. Read on to learn more about ItsYourTurn.com, the technology behind it, Patrick's views on business and the future of web development, and many other interesting topics!
Scott: What has ItsYourTurn.com's growth looked like since its inception? (Perhaps you can provide some stats - page views, bandwidth, moves per month, active users, etc.)
Patrick: ItsYourTurn.com went online in April 1998. In the early days (1998-2000) we grew very fast, but growth has leveled off in recent years. Since our marketing consists solely of buying AdWords on Google, this isn't surprising. Over the last two years we've been quietly restructuring both our hardware and software infrastructure (more on that later), so we've been going light on the marketing. However, starting later this year, that's going to change. Our changes will be done around August, and we'll be marketing far more aggressively than we have in the past (the ad campaign is already under development by an outside design firm).
As for the number of visitors, according to Alexa.com, ItsYourTurn.com is ranked #5601 today (June 30, 2006). By comparison, the 4GuysFromRolla.com site (which I visited a lot for guidance in my early ASP.NET days) is ranked #7085 today. Like yourselves, I would say we're a medium-sized website.
We record about 650,000 - 700,000 game moves every day, so a conservative estimate is about four million page views a day. It takes about four to five page views for each move, and in addition players usually visit their game status page several times and track their tournaments and ladders.
We try to archive each game that's played for several years so that players can return to them and review the games later. While we've lost some of the very early moves due to database corruption (we started with SQL Server 7.0 and also had corruption problems with SQL Server 2000), right now there are still over 470 million rows of live, accessible game boards in our current database. At one point we had over 600 million rows in the games database, but some of that old data is on a backup disk somewhere and not accessible. Right now our active database has games over two years old, and we figure that's as far back as anyone wants to review.
Right now ItsYourTurn.com has two full-time programmers (I'm one of them) and a part-time person who answers the customer-support requests (we try to answer the vast majority of our customer support questions -- the "easy" questions that we've seen before -- in one business day or less). At the peak of the Internet bubble (2000-1) we had five and a half full-time employees and we were running three websites, but we've scaled down since then. We are also currently working with an outside firm on our site redesign, which should be online soon.
Scott: What sort of hardware architecture do you have setup (database version, how many web/database servers, IIS version, ASP.NET version, etc.)? Who do you host with?
Patrick: We're running ASP.NET 2.0 on Windows 2003 (which implies IIS 6.0). Our development work is mostly C# code, with some legacy C++ code for the pages we haven't converted to C#. We develop in Visual Studio 2005, and we run our database on SQL Server 2005. We co-locate with a local ISP called Intrex.net, and we build our own servers in-house (crammed with as much RAM as they'll hold) and move them to Intrex when they're built.
We are slowly moving over to VMware Server virtual machines. This allows us to better utilize our hardware resources and to deploy new servers very quickly; when we have to upgrade our hardware, per-server deployment time drops from one or two days to less than 30 minutes.
Scott: What software architecture do you have? Sounds like it's mostly custom ISAPI extensions, but where do you use ASP.NET? Use any off-the-shelf components/libraries or is it all custom-built?
Patrick: Back in 1998 when we started, the only available technologies for building database-driven websites were Perl and C++ ISAPIs. Even Java at that time wasn't being widely used for database-driven websites (and was very slow). Early versions of classic ASP were available then, but they were far too slow for the 300 MHz Pentium IIs that were the state of the art back then. For performance reasons we chose to go the ISAPI path.
We wrote our own database layer on top of ODBC / OLE-DB, and each page was a separate C++ function. The web server parses the URL and knows which C++ function to call to create the page. The HTML string that you want to output gets assembled in memory in a CString, and then it's streamed back to the web server, which sends it on to the client.
So all our HTML is embedded inside our C++ code. I know, it's insane, but that's all we had to go on back then. This technology is still used in a different form today -- ASP.NET is implemented as an ISAPI DLL. By the way, this is the same architecture that eBay used for its first years before moving over to the Java platform it uses today. Over time we developed our own internal web framework to reduce the amount of repetitive code we had to write for each page, and we wrote our own ad rotator, message boards, tournament and ladder code, and game logic code in C++.
Again, back in 1998, there were no pre-built libraries available, so we had to custom-build everything ourselves, even our database access code. While technologies like ColdFusion and PHP placed database-aware tags into the HTML page itself and replaced those tags with dynamic content, we went the other way: all the HTML was inside the DLL, and we had very few external HTML files. Since the in-house designer worked on the graphics and not on the HTML design (he wasn't familiar with web design at that time), I wrote the design in C++ (and it shows -- the site looks like it was designed by a programmer).
Starting about two years ago, we knew that this technology had reached the end of its useful life, and that we had to move to a different platform if we were going to keep up with the rest of the world. Since we were already using Visual Studio 2003 for our C++ work, it wasn't much of a stretch to start moving some of it over to C# and ASP.NET.
Currently, most of the site is running on C# and ASP.NET. I wrote a custom program (basically, a bunch of regular expressions) to help me port tens of thousands of lines of C++ code into C#, and that effort has been completed. All the game logic, along with the page that displays the game boards and accepts game moves, has been running on C# and ASP.NET for several months. The game board page is responsible for about 95% of our page views, so I would say most of our site is now running on ASP.NET. Many other pages on the site have also been converted to ASP.NET. There are still C++ pages on the site, most notably the game status page (which will be redesigned and rewritten from scratch), the messaging system (which will also be rewritten from scratch), and the ladder section (which we'll probably leave in place, since it's working fine for now).
I should say that we're running on our own strange ASP.NET implementation -- we don't use a lot of the high-level ASP.NET framework. Since all the code to produce the pages was already in the C++ code, it was easier for us to override the Render() method and just stream the raw HTML from our legacy code. Going forward, we are looking into leveraging more of the ASP.NET framework and looking at some of the great pre-built packages out there.
The most compelling reason for us to move to ASP.NET has been the (almost) built-in caching. I say "almost" because I wrote our own caching layer on top of the ASP.NET caching layer. In our ISAPI program, we had to retrieve every small bit of data from the database, even if that information never changed, and that created a lot of unnecessary load on the database. The reason for writing our own caching layer is that we need "data caching".
Unfortunately, ASP.NET 1.1 doesn't have this type of "data caching" built in, at least not in the way we wanted to use it. Page caching and fragment caching are not appropriate for us, since the pages have completely different URLs and contain different information. So we built a custom "data cache" layer, which caches the information coming back from the database. Every query using this layer has metadata that tells the layer how long to cache the information and how high to set the priority. It's keyed by query and parameters rather than per-table. It's a thin cache layer built on top of DataReader -- essentially a stripped-down DataSet class without the bloat -- and it runs many times faster than DataSet. The application calls this cache layer whenever it requests information from the database (it never calls the database directly); if the caching layer finds the information in the cache, it serves it from the cache, and if not, it retrieves the information from the database and inserts it into the cache. The calling function doesn't know or care whether the query was served from the cache or the database.
So database information that rarely changes (lookup tables, for instance) but is also used across many different pages (user profile information, for instance) can be served from RAM, and then flushed from RAM automatically when that specific user is no longer logged on.
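The read-through pattern Patrick describes -- keys built from query plus parameters, a per-entry time-to-live, and a fallback fetch on a miss -- can be sketched roughly as below. This is a generic illustration in C++, not IYT's actual C# layer; the class name, the Fetch callback, and the priority omission are all assumptions for brevity.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <string>

// A read-through cache keyed by query text plus parameters.
// On a hit that hasn't expired, data is served from RAM; on a miss
// (or expiry), the supplied callback fetches from the database and
// the result is stored with its own time-to-live.
class DataCache {
public:
    using Fetch = std::function<std::string()>;

    std::string Get(const std::string& query, const std::string& params,
                    std::chrono::seconds ttl, const Fetch& fetchFromDb) {
        const std::string key = query + "|" + params;  // per-query, not per-table
        const auto now = std::chrono::steady_clock::now();
        auto it = cache_.find(key);
        if (it != cache_.end() && it->second.expires > now) {
            return it->second.value;            // hit: served from RAM
        }
        std::string value = fetchFromDb();      // miss: go to the database
        cache_[key] = Entry{value, now + ttl};
        return value;
    }

private:
    struct Entry {
        std::string value;
        std::chrono::steady_clock::time_point expires;
    };
    std::map<std::string, Entry> cache_;
};
```

As in IYT's layer, the caller simply calls Get() and never learns whether the value came from the cache or the database.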
Before moving over to ASP.NET, we ran a copy of MySQL on every web server, and we used that as a local cache to reduce the load on our primary web server.