Thursday, March 22, 2012

Contrasting MIT's MITx with Stanford's Coursera

I've been really interested in the user experience of highly interactive sites - webapps where the user must interact directly with the site and stay within it for a good chunk of time. Courseware sites are a great example of such user experiences - web applications that engage students in interactive learning. Some big examples have launched within the past few years: Stanford University went beyond providing open access to course materials and began engaging the public at large with interactive courses offered entirely online through Coursera. This year MIT followed suit by creating MITx - and upped the ante not only in student interaction but in how much content was released to the public. I enrolled in both MIT's 6.002x and Stanford's Game Theory class and gave them a spin for a week.

Screenshot of MITx 6.002x Courseware
MIT's 6.002x has been far more intense than the pace Stanford's Coursera classes usually take. The class asks for 10 hours a week across study, lectures, exercises, labs, homework assignments and exams. Some students report that 40 minutes a day is sufficient to get through the lectures and exercises; however, a fair number are putting in the full two hours a day.

Piotr Mitros was introduced as the lead software designer for MITx, and the user experience provided within the site really shines. The rather voluminous textbook is fully available within the site (apparently rendered as an HTML5 canvas), and renders beautifully on a laptop as well as tablets such as the Kindle Fire. In fact, the textbook was actually easier to read on a Kindle Fire than Amazon's own e-books. Lectures are interspersed with interactive exercises that ask you to submit answers to key concepts presented throughout the hour-long video series.

Quizzes, homework and exercises are all presented as forms submitted to the site and validated in JavaScript. There appears to be a rather nice algebraic interpreter behind the courses, as it takes a flexible set of inputs (e.g. V1, 1/3, 0.33333, 0.33) and evaluates them to a uniform solution accurate to a fixed number of decimal places. At times it refuses to acknowledge parentheses or variables and throws syntax or evaluation exceptions, but for the most part it works surprisingly well.
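For a sense of how a checker like that might accept 1/3, 0.33333 and 0.33 as the same answer, here's a minimal sketch in Python. It only covers the numeric-tolerance half of the behavior (not symbolic answers like V1), and grade_answer and its tolerance are hypothetical names and values of my own, not MITx's actual implementation:

```python
import math

def grade_answer(submitted, expected, rel_tol=0.02):
    """Hypothetical checker: evaluate a simple arithmetic expression
    like '1/3' or '0.33' and compare it to the expected value within
    a relative tolerance."""
    try:
        # A real grader would use a proper expression parser; eval() with
        # empty builtins is just a stand-in for this sketch.
        value = float(eval(submitted, {"__builtins__": {}}, {}))
    except Exception:
        # Mirrors the "syntax or evaluation exception" behavior above.
        return False
    return math.isclose(value, expected, rel_tol=rel_tol)

# 1/3, 0.33333 and 0.33 all evaluate to the same accepted answer:
for answer in ("1/3", "0.33333", "0.33"):
    print(answer, "->", grade_answer(answer, 1.0 / 3.0))
```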

Learning is provided through a number of facets. "Tutorials" are given in laboratory format, where one of the MIT professors walks through a live-action example of things such as Kirchhoff's current law (KCL) or Ohm's law. This hands-on style serves to underscore the series of lectures, given two per week, in a format that mirrors a classroom. Unlike the classroom, however, you must respond to the open questions the prof asks of the class. A video lecture segment may proceed for 90 seconds and then halt until you respond to an open question that builds upon preceding concepts. The web application itself was built to have a natural flow of textbook -> lecture -> examples; however, links into the text often pointed to a wildly incorrect chapter. Links are also provided to the open (albeit loosely moderated) discussion forum where students posit solutions and questions amongst themselves.

For as many ways to learn the material as MITx offers, it is often difficult to navigate the course itself. I was often lost trying to understand the sequence the professors wished us to follow - should we read Chapter 2 first, then the lectures, then the labs? Often I would be deep in a lecture series, get completely lost, and only later find we were halfway through a chapter of the text. I didn't even discover the importance of the poorly named "tutorials" (they're more akin to lab lectures) until very late in the game. There were also several algebraic errors throughout the lectures and even within the text... and for someone like me with an already fragile grasp of the subject matter, it was frustrating to learn of an error only later in the discussion forums.

The MITx platform is amazing - I can easily see it becoming the standard for online courseware going forward. If they open-sourced the stack, it could very well lead to an explosion of education opportunities for the lay audience. As far as MIT's 6.002x... the pace was just far too intense for me. I already work 60+ hour weeks, and the extra 10 hours wasn't feasible.

Screenshot of Coursera's Game Theory Lectures
Stanford's Coursera is an entrant that many are already familiar with - it seems last year's Artificial Intelligence class was a HUGE hit with everyone I talk to. I can't throw a pumpkin without hitting an engineer who raves about those online courses... and trust me, I've tried.

Coursera is a bit more low-key than MITx. A simple list of video lectures is provided, along with a discussion forum, quizzes / problem sets and... that's about it. A 90-ish page textbook is available for $5 from a separate publisher, but it isn't key to completing the assignments. Contrast that with 6.002x, where there was generally 100-150 pages of reading a week, and you get an idea of how different the scope is. If 6.002x requires 1-2 hours a day, Game Theory requires 15-30 minutes a day.

There are some similarities between MIT and Stanford's approaches. Just like MITx, Coursera injects comprehension exercises within the video lecture stream. However, instead of being HTML forms the exercises are displayed as Flash forms within the video player itself. On one hand this is a bit more streamlined an experience; on the other hand you lose a lot of interactivity and features. One major annoyance was that exercises can sneak up on you... and often I wanted to rewind 30 seconds to make sure I understood the key concepts being asked about. However, backing up from an exercise causes a 30-60 second delay in the player while it re-buffers video (or somethin'). Backing up often takes the entire lecture off the rails.

One thing Stanford is doing well is holding weekly Screenside (read: Fireside) Chats, where the professors provide an open forum to ask questions. This shows a great level of dedication by the professors offering the class, and I applaud that level of interactivity, especially when there are so many students enrolled in a free course. On occasion associate instructors for MITx would answer questions, but there was no regular schedule.

The very fact that I'm contrasting freely available online courses I'm taking from both Stanford and MIT is enough to make me flip my lid. To have staples of industry like MIT's 6.002 or Stanford's vast catalog of courses open to the general public makes you excited about what the future holds. If MIT were to open their courseware platform, and if stellar CompSci institutions like Stanford continued to offer a battery of courses on such interactive platforms, we would have an entirely new workforce of software engineers on our hands.

Tuesday, March 06, 2012

Filling The Pipeline

My past few work engagements have been centered around cloud computing and big data - doing everything from managing large data centers to machine learning to map/reduce clusters. When I was at VMworld 2011 I took the opportunity to ask the "Big Compute and Big Data Panel" about leveraging vector processing hardware such as NVIDIA's Tesla to do data processing. The five panelists (Amr Awadallah of Cloudera, Clint Green of Data Tactics, Luke Lonergan of EMC, Richard McDougall of VMware and Paul Kent of SAS) largely agreed on a few main sticking points in vector processing for massively parallel systems:
  • The toolset is still relatively immature (maybe three years behind general CISC architectures)
  • The infrastructure has not yet reached commodity level
  • Big Compute works well with vector processing clusters, but not big data, since the latter is all about locality rather than in-memory processing
  • Commodity GPU processing is greatly constrained by memory paging - there's too much latency in transferring large in-memory datasets to GPU memory (a rough back-of-envelope sketch follows this list)
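
To put a rough number on that last point, here's a quick back-of-envelope sketch in Python. The bandwidth and dataset figures are illustrative assumptions (PCI-E 2.0 x16 tops out around 8 GB/s of theoretical one-way bandwidth), not measurements:

```python
# Back-of-envelope: time just to stage a dataset across the PCI-E bus
# before the GPU can touch a single element. Assumed figures, not benchmarks.
PCIE2_X16_GBPS = 8.0   # theoretical one-way bandwidth, PCI-E 2.0 x16
DATASET_GB = 64.0      # a hypothetical large in-memory dataset

transfer_s = DATASET_GB / PCIE2_X16_GBPS
print("one-way host -> GPU copy: ~%.0f s" % transfer_s)        # ~8 s
print("round trip (in and back): ~%.0f s" % (2 * transfer_s))  # ~16 s
```

Eight-plus seconds of dead time on every pass through the data is why keeping datasets resident - or avoiding the copy entirely - matters so much.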

AMD had a few interesting announcements over the past few weeks that may pave the way for making cloud and big data/compute clusters more efficient and more "commodity." The first is their acquisition of SeaMicro, whose emphasis is on massively parallel, low-power computing cores with high-speed interconnects. This addresses one big concern brought up during the panel - that interconnects on big data clusters will become a prevailing issue as data needs to be transferred across nodes more rapidly to keep otherwise idle compute resources busy. CPUs can't crunch data sets if the data takes forever to arrive over the wire.

The next big announcement, which may be a huge sleeper hit, is AMD's unified memory architecture that's supposed to arrive in 2012. The slide on AnandTech shows that in AMD's 2012 product line the "GPU can access CPU memory," which is a HUGE development in vector processing. Imagine a data set being loaded into 64 GB of main memory, having 8 CPU cores clean the data using branch-intensive algorithms, and then that same in-memory dataset being transformed by 512 stream processors. That kind of compute power without the need to stream data across a PCI-E bus could be a really, really big deal.

Still, the issue that remains is the tooling available to make this happen. Very likely a developer would need to write generic C code to do the branching and then launch a separate OpenCL app to transform the data while still sharing memory pointers, so that nothing has to be swapped or paged out. In a world full of enterprise software developers, this kind of software engineering agility isn't exactly easy to find. If Cloudera were able to unleash this kind of power, AMD would have a big hit on their hands. Maybe AMD needs to start looking towards Cloudera as the final stage in the pipeline - an open-source framework that unlocks the potential of their infrastructure.
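
To make the shape of that two-stage pipeline concrete, here's a minimal sketch using Python with NumPy and PyOpenCL rather than raw C - the "cleaning" and "transform" steps are toy stand-ins of my own. Note the COPY_HOST_PTR buffer creation and the enqueue_copy back at the end: that's the PCI-E round trip a unified memory architecture would eliminate, letting the kernel read the very buffer the CPU just cleaned:

```python
import numpy as np
import pyopencl as cl

# Stage 1: branch-heavy "cleaning" on the CPU - here, a toy stand-in that
# clamps negative readings to zero (the conditional logic GPUs dislike).
data = np.random.randn(1000000).astype(np.float32)
data[data < 0] = 0.0

# Stage 2: a data-parallel transform on the GPU via OpenCL. Today the
# dataset has to be copied over the PCI-E bus into device memory first.
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
dev_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=data)

program = cl.Program(ctx, """
__kernel void transform(__global float *a) {
    int gid = get_global_id(0);
    a[gid] = sqrt(a[gid]);   /* toy transform, one work-item per element */
}
""").build()

program.transform(queue, data.shape, None, dev_buf)

result = np.empty_like(data)
cl.enqueue_copy(queue, result, dev_buf)  # ...and copied back over the bus
print(result[:5])
```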