Friday 15 July 2016

Views from a decade at Cambridge Computer Lab

After spending 9 wonderful years at the University of Cambridge Computer Lab, I will be leaving soon. So let me share my views on the projects I have worked on, along with some anecdotal notes about various other things that I found in academia.

Personally, I like building stuff, and the projects here have a good engineering component to them. However, UK academia (I think as a consequence of the REF exercise) is currently focusing more on things like what your next paper is, where you are going to publish it, how many papers you have published this year... I don't think it used to be like that before.

However, I have been "lucky" to be part of a research team that doesn't put too much stress on publications (or even funding). That is simply because Prof. Andy Hopper is the PI: he has his own unique way of doing research and, more importantly, he has (unrestricted) industrial donations on the side from which he funds the team!

From a research perspective, Andy has a vision/agenda that sets our specific projects: computing for the future of the planet. It has different research angles, but I will just focus on what I have worked on.

I came to Cambridge in 2007 to do my PhD. Andy already had a view on what I could work on if I chose to. It was interesting enough and I took it on. I investigated how we could make use of really remote off-grid renewable energy in datacentre computing. We had this crazy idea of putting datacentres in remote locations, connecting them with fiber and moving jobs around according to energy availability. 4 years later I got my PhD and a 2011 HotOS paper along the way.

Fun facts:
-We managed to put together this HotOS paper in less than a week.
-The idea got hammered at that time by a researcher at Google as not practical...but check: Google's floating data center? and Microsoft's Project Natick (that started in 2013).

After finishing my PhD in 2011, Andy offered me a place as a post-doc to join work on data provenance (the FRESCO project): how to track data as they flow through a computer system. For my bit, I suggested that we focus on big data frameworks such as Hadoop and Spark. These systems deal with large amounts of data, and constructing a fine-grained provenance graph for their jobs is technically challenging.

I managed to get provenance capture for MapReduce jobs to a working prototype, with a 2014 HotCloud publication along the way. While it looked promising to continue work on this track (we could even have accepted industrial funds for it), I ended up shifting direction one more time.

At that time my colleague Dr. Rip Sohan and I were shaping up a side project to look at technologies for developing regions. Rip is originally from Kenya and I am Egyptian; we thought we should do research in Africa related to Andy's vision. After 2+ years of negotiations, we managed to convince a cellular operator in Rwanda to let us capture anonymised mobile data traces from their network. Personally, I found it too interesting not to do.

I spent 3 weeks in Kigali during February 2015 to capture these traces and strip out any sensitive content. I managed to return to Cambridge carrying 10 disks with me through airport security! "Easier" data analysis work followed...



I will definitely miss the lab, and my experience working with Andy and the rest of the team has been great. I take this opportunity to thank him for the support he has given me throughout these years. I also want to thank all my colleagues and wish them well.

*disclaimer: this post is my own personal view at the time of writing.