
Use open source platforms to find cloud computing’s energy and emissions footprint


Regular GreenMonk readers will be very aware that I am deeply skeptical about claims that Cloud Computing is Green (or even energy efficient). And that I talk about the significant carbon, water and biodiversity effects cloud computing can have.

One of the biggest issues with any claims of Cloud Computing being energy efficient, or Green, is the lack of transparency from the Cloud Computing providers. Almost none of them are publishing any data about the energy consumption or emissions of their Cloud infrastructure (article updated from “None of them” to “Almost none of them” after comments from Memset and Greenqloud in the comments section below). Without data to back them up, any claims of Cloud Computing being efficient are worthless.

Last week, while at the RackSpace EMEA Analyst day, we were given a potted history of OpenStack, RackSpace’s Cloud Computing platform. OpenStack was jointly developed by NASA and RackSpace, and they open-sourced it under an Apache License in July 2010.

Anyone can download OpenStack and use it to create and host Cloud Computing solutions. Prominent OpenStack users include NASA, RackSpace (not surprisingly), AT&T, Deutsche Telekom, HP and IBM.

What has this got to do with Cloud Computing and energy efficiency I hear you ask?

Well, it occurred to me during the analyst day that, because OpenStack is open source, anyone can fork it and write a version with built-in energy and emissions reporting. What would be really cool is if this functionality, having been written, became part of the core distribution – then anyone deploying OpenStack would have it by default.
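To make the idea concrete, here is a minimal sketch of what built-in per-instance reporting could look like. Everything here is hypothetical – the class name, the wattage estimate and the grid emission factor are illustrative assumptions, not part of OpenStack:

```python
# Hypothetical sketch of per-instance energy/emissions reporting for a
# cloud platform. None of these names exist in OpenStack; the power
# figure and grid emission factor are illustrative assumptions only.

GRID_KG_CO2_PER_KWH = 0.5  # assumed average grid emission factor

class EnergyReporter:
    def __init__(self, grid_factor=GRID_KG_CO2_PER_KWH):
        self.grid_factor = grid_factor
        self.samples = {}  # instance_id -> list of (hours, avg_watts)

    def record(self, instance_id, hours, avg_watts):
        """Log an interval of estimated power draw for one instance."""
        self.samples.setdefault(instance_id, []).append((hours, avg_watts))

    def kwh(self, instance_id):
        """Total energy used by an instance, in kilowatt-hours."""
        return sum(h * w for h, w in self.samples[instance_id]) / 1000.0

    def kg_co2(self, instance_id):
        """Convert that energy into emissions via the grid factor."""
        return self.kwh(instance_id) * self.grid_factor

reporter = EnergyReporter()
reporter.record("vm-42", hours=24, avg_watts=150)  # one day at 150 W
print(reporter.kwh("vm-42"))     # 3.6 kWh
print(reporter.kg_co2("vm-42"))  # 1.8 kg CO2
```

A cloud operator deploying this would expose the figures alongside the usual usage metering, so customers see emissions per VM by default.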

And OpenStack isn’t the only open source Cloud platform – there are two others that I’m aware of: Citrix’s CloudStack and Eucalyptus. Having the software written for one open-source platform should allow reasonably easy porting to the other two.

Of course, with the software written as open source, it could be constantly improved. And as part of one of the cloud platforms, it should achieve widespread distribution quickly.

Having energy and emissions information available will also allow inefficiencies in Cloud infrastructure to be quickly identified and fixed.

So, the next step is to get someone to write the software – anyone up for it?

Or, what are the chances of getting someone like HP, IBM, RackSpace, or even NASA to sponsor a hackathon whose aim is to develop such software?

Photo Credit Jeremy Burgin

Comments

  1. says

    Why the dependency on Open Source? Trust and transparency? If anything proprietary software would be more trustworthy here because it’s harder to modify to report dodgy numbers.

    Definitely an interesting idea though, at least for cloud infrastructure.

  2. says

    I’m in. I’ve been talking with people on the ##cleanweb irc about it too.

@Sam – I don’t think the software is the big risk here. I think where we do the measuring (and we need 3rd-party audits for that) is where we really need to tighten standards. So both the software and the sensor hardware need to be looked at.

  3. says

    Good idea Tom. Here are my 2 cents as a founder of one of the public compute clouds that contributes to open source projects and heavily uses open source solutions.

    I fully agree with the lack of transparency from other cloud providers. I say other providers because GreenQloud does indeed show users their energy and emission statistics (and lack of emissions). We also clearly state that we only use renewable energy from a 100% clean energy grid (in Iceland) so at least there is one cloud provider that is forthcoming on this topic (http://greenqloud.com/trulygreen/).

I would really like there to be at least a common cloud API for accessing the CO2 and energy stats on multiple levels (total, per VM, per project – e.g. the footprint of an app – etc.). We would be interested in participating in such a project, or perhaps we should create an open source API spec from what we have already implemented and see if anyone wants to participate…
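As a sketch of the multi-level roll-up such a common API might do – the endpoint path and field names below are invented for illustration, not anything GreenQloud has published:

```python
# Hypothetical rows as returned by a per-VM stats endpoint, e.g.
# GET /v1/emissions/vm – the path and field names are invented.
vm_stats = [
    {"vm": "web-1", "project": "shop",  "kwh": 2.4},
    {"vm": "db-1",  "project": "shop",  "kwh": 5.1},
    {"vm": "ci-1",  "project": "build", "kwh": 1.2},
]

def per_project(stats):
    """Roll per-VM figures up to project level (footprint of an app)."""
    totals = {}
    for s in stats:
        totals[s["project"]] = totals.get(s["project"], 0.0) + s["kwh"]
    return totals

def total(stats):
    """Fleet-wide total across all VMs."""
    return sum(s["kwh"] for s in stats)

print(per_project(vm_stats))  # per-project energy totals
print(total(vm_stats))        # total across the fleet
```

The same roll-up applied to CO2 fields would give the per-app emissions footprint the comment describes.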

That being said, the tricky part about gathering server energy use and emissions data, and analytics from that data, is that it’s not only a software problem: it also needs real-world measurements, and decisions need to be made about the algorithms used to estimate energy use per user. For example, one thing we learned while coding our platform was that only a handful of server manufacturers build energy estimation capabilities into their servers. Even worse, those estimates are usually based on hard-coded tabular data and don’t really reflect your actual setup, making them inaccurate, e.g. for storage pods or servers with lots of RAM. In our case we ended up lab-testing each of our server specs to learn their energy characteristics, so we could build more accurate profiles to work with in our calculations and to correlate with our total energy readings.
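One common way to turn such lab measurements into a usable per-spec profile is a linear power model – idle power plus utilisation-scaled dynamic power. The wattage figures below are assumed for illustration, not GreenQloud’s measurements:

```python
def power_watts(util, p_idle, p_peak):
    """Commonly used linear power model: idle draw plus the dynamic
    range (peak minus idle) scaled by utilisation in [0, 1]."""
    return p_idle + (p_peak - p_idle) * util

# Illustrative lab-measured figures for one server spec (assumed numbers):
P_IDLE, P_PEAK = 120.0, 280.0

print(power_watts(0.0, P_IDLE, P_PEAK))  # 120.0 W at idle
print(power_watts(0.5, P_IDLE, P_PEAK))  # 200.0 W at half load
```

Correlating the model’s estimates against the total metered draw, as the comment describes, is how such a profile gets validated.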

Admittedly the most accurate way would be to set up a big network of sensors, but we chose (at least for now) to settle on the measurement approach to save on cost, and this is something anyone can do with a little extra effort. Then, once you have the needed energy readings, profiles and the energy composition of your data center, you can start playing with the energy stats and transforming them into emission stats so that they make sense to people. I would say that all of the cloud providers could do this if they wanted to. We even created a very easy to use “energy composition to CO2 emissions” metric that could help, called Green Power Usage Effectiveness: http://greenqloud.com/greenpowerusageeffectiveness-gpue/
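The basic energy-to-emissions transformation described here is a weighted sum over the data centre’s energy composition. The mix shares and per-source emission factors below are illustrative assumptions only:

```python
# Convert an energy reading into emissions using the data centre's
# energy composition. Mix fractions and per-source emission factors
# (kg CO2 per kWh) are illustrative, not real figures for any provider.
ENERGY_MIX = {"coal": 0.6, "gas": 0.3, "hydro": 0.1}   # shares sum to 1
KG_CO2_PER_KWH = {"coal": 0.9, "gas": 0.4, "hydro": 0.0}

def emissions_kg(kwh):
    """Weighted emission factor for the mix, applied to an energy reading."""
    factor = sum(share * KG_CO2_PER_KWH[src]
                 for src, share in ENERGY_MIX.items())
    return kwh * factor

print(emissions_kg(100))  # roughly 66 kg CO2 for 100 kWh with this mix
```

On an all-renewable grid the weighted factor collapses to zero, which is the point GreenQloud makes about Iceland.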

    cheers
    Eiki
    ———–
    Eirikur Hrafnsson, CEO and Co-Founder GreenQloud

  4. says

    Interesting idea. Since starting my new role I’ve become aware of people choosing to switch IaaS cloud providers based on data center location in a bid to avoid “dirty fuel” usage, and thus asking probing questions about hosting, location etc. It’s the exception far more than the rule, but I’ve had more than one Twitter discussion about it.

    One cool hack from the London Green Hackathon back in February was about trying to identify the “greenest” or best cloud host for a Hadoop job http://london.greenhackathon.com/hacks/mastodon/ so it is clear that this is something that is of interest, and something where AMEE has at least some data that could help.

I certainly like the concept – it’s not just a software problem, but having the relevant hooks and data feeds available at the IaaS layer would ease the solution.

  5. says

    Andy,

    We’re going ahead with the Mastodon idea. Our website with the live ratings is here: http://www.mastodonc.com/dashboard

What I’m looking forward to are data centres with good sensor data on current power draw, power production, live carbon load of their energy etc., similar to this: http://gigaom.com/cleantech/a-green-hadoop-could-manage-solar-powered-data-centers/

    Eiki, I think it would be great to start with what you have and with what the BCS DCSG has already done too.

    Joe, would it be possible to get the spreadsheet for estimating CO2 emissions?

  6. says

    I’m sorry?

    “One of the biggest issues with any claims of Cloud Computing being energy efficient, or Green, is the lack of transparency from the Cloud Computing providers. None of them are publishing any data around the energy consumption, or emissions of their Cloud infrastructure.”

    Please see:
    http://www.memset.com/press/memset-prove-it-possible-beat-moores-law-power-consumption/
    and
    http://www.katescomment.com/energy-efficient-cloud-jevons-paradox-vs-moores-law/

    We are *very* open!!

    Kate.

  7. says

    Kate,

    That’s great (and I’m looking forward to reading your PhD when it is published).

    Is there any chance we could see the data on that? It would be really helpful to look at real data from a real data centre and compare with the DCSG models.

    cheers,
    Bruce

  8. says

    Kate,

great article on Jevons paradox vs. Moore’s law. The conclusion is a bit cloudy, but I don’t suppose you are stating with your research that Jevons paradox in cloud computing (prices go down, usage goes up and, more importantly, new usage is created on a bigger scale than was possible before) won’t apply because CPUs are getting more efficient? The main point I make on the subject is that new usage will grow faster than migration (yes, clouds are more efficient than in-house hosting) because of the new economics and the ease of deployment in public cloud computing.

Anyway, you seem to be missing the main point of the article. Most public clouds are powered by fossil-fuel-based electricity generation. No amount of carbon credits or efficiency gains makes that OK, and we need a standardized and open way of measuring and reporting that impact, AND public clouds need to start using renewable energy. Don’t get me started on carbon credits…

    cheers
    Eiki

  9. says

@Eiki Good point – I was only taking issue in one area. As for fossil fuels, well I can’t really talk about some of our projects right now but expect something interesting soon! 😉

To respond to the core point of the piece: we do actually use open source software, with a bunch of in-house modifications, to do our data collection. We sample our power bars via SNMP hourly and log the data in our MySQL core database via Python scripts. We also take hourly samples of load on every VM on every host in our estate – that gets dumped into a separate SQLite database and almost warrants big data techniques (several billion data points now, but nothing a powerful modern server can’t handle if you’re willing to wait a few minutes).
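A stripped-down sketch of that collection loop, with SQLite standing in for the MySQL core database and a stub in place of the real SNMP query – the hostnames and wattage figure are invented:

```python
import sqlite3
import time

def read_power_bar_watts(host):
    """Stand-in for the hourly SNMP read described above; a real
    deployment would query the power bar's OID here. Returns a fixed
    illustrative figure."""
    return 1450.0

# In-memory SQLite as a stand-in for the provider's core database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE power_log (ts INTEGER, host TEXT, watts REAL)")

def sample(hosts):
    """One sampling pass: read every power bar and log the results."""
    now = int(time.time())
    for h in hosts:
        db.execute("INSERT INTO power_log VALUES (?, ?, ?)",
                   (now, h, read_power_bar_watts(h)))
    db.commit()

sample(["pdu-rack1", "pdu-rack2"])  # hypothetical power-bar hostnames
rows = db.execute("SELECT COUNT(*) FROM power_log").fetchone()[0]
print(rows)  # 2
```

Run hourly from a scheduler, this accumulates the time series that per-product footprint estimates would be built on.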

    However, the way we mobilise this data is highly dependent on our particular architecture. If OpenStack Nova ever gets to production stable (it is not there in our view yet) then making a repurposable open source measurement system would be interesting.

The bigger issue in my view is standardisation around measurement methodologies though. We’re doing a lot of work on this at present in Intellect, since the EC are getting in a legislative mood! Although embedded energy is not a large part of the puzzle with servers, it is still a part; and further, simply multiplying by PUE is not a good way to assess the energy intensity of a service! It goes well beyond how much power is used, unfortunately.

    Regardless, we are working towards giving customers the ability to estimate the carbon footprint of all of our products – watch this space!

  10. says

    Hi Tom,

It’s not really Cloud Computing that offers the “Green” credentials – it’s the virtualisation layer. Converting 10 dedicated servers to 10 virtual machines on 1 dedicated server is where you get the true power savings.

OpenStack and CloudStack just help with orchestration of the virtualisation platform, which in future should help with easier adoption of virtualisation.

    • says

Unfortunately, Marlon, it is a lot more complicated than that.

      Sure virtualisation can help reduce power draw, but it is only one form of Cloud (and many don’t consider it Cloud).

      I’m talking more about a situation where someone moves a dedicated in-house server (or app) to an outsourced Cloud solution. If the in-house solution is using 10 servers powered by a local wind farm, and it is outsourced to a Cloud provider which runs mostly on coal, no amount of virtualisation will make this Green.
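A back-of-envelope sketch of that point – the server draw and emission factors are illustrative assumptions, but they show why the grid behind the data centre can swamp any consolidation gains:

```python
# Illustrative comparison: 10 in-house servers on wind power vs. the
# same workload consolidated onto 1 cloud server on a coal-heavy grid.
# All figures are assumed for the sake of the arithmetic.
KWH_PER_SERVER_YEAR = 2600          # roughly 300 W average draw
COAL_KG_PER_KWH = 0.9               # assumed coal-heavy grid factor
WIND_KG_PER_KWH = 0.0               # local wind farm

in_house = 10 * KWH_PER_SERVER_YEAR * WIND_KG_PER_KWH  # 10 servers, wind
cloud = 1 * KWH_PER_SERVER_YEAR * COAL_KG_PER_KWH      # 1 server, coal

print(in_house)  # 0.0 kg CO2/year despite using 10x the servers
print(cloud)     # 2340.0 kg CO2/year despite 10:1 consolidation
```

Even with a 10:1 consolidation ratio, the move produces more emissions than the unconsolidated wind-powered setup – which is the reply’s point.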