Is PUE a Flawed or Inadequate Metric?
That’s what many people say but, up-front, let me say that I think it’s neither flawed nor inadequate. In fact, for what it is intended to represent it is nearly perfect, and it would be perfect were it not for a combination of a lack of understanding, deliberate abuse by marketing folks and the ease with which it can be manipulated if the data centre operator so wishes. We should all know by now (after 10 years) that PUE stands for Power Usage Effectiveness, even though many people still slip into using the word Efficiency in place of Effectiveness. That is a mistake, since all data centres are zero ‘efficient’ unless all you want to do is create heat…
Like all good stories let us start at the beginning and, if you are sceptical by nature, PUE had a slightly ‘agenda’-based start in life. It was devised by members of The Green Grid (TGG) for global consumption and the concept could not have been simpler: the ratio of total data centre annual energy to the net ICT annual energy. Clearly the closer to 1.0 the better, although at the time the average was nearer 2.5. In other words, if you have an ICT load that consumes 10 MWh over a full 12 months and, to support that load, the facility consumes 25 MWh over the same period, then the PUE is 25/10 = 2.5. Simple.
The ‘annualisation’ took care of seasonal changes in cooling energy and averaged out the load. The only real condition was that PUE should be used to chart the improvement over time in an individual facility and never be used to compare facilities.
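As a minimal sketch of the annualised calculation described above, the following Python sums twelve monthly meter readings before taking the ratio. The function name `annualised_pue` and the monthly figures are hypothetical; the figures are invented so that they total the 25 MWh and 10 MWh of the worked example.

```python
def annualised_pue(facility_kwh, ict_kwh):
    """Total facility energy divided by ICT energy over the same 12 months."""
    if len(facility_kwh) != 12 or len(ict_kwh) != 12:
        raise ValueError("PUE needs twelve monthly readings for each series")
    ict_total = sum(ict_kwh)
    if ict_total <= 0:
        raise ValueError("ICT energy must be positive")
    return sum(facility_kwh) / ict_total


# Hypothetical monthly meter readings in kWh, January to December.
# Invented for illustration; they total 25 MWh (facility) and 10 MWh (ICT).
facility_kwh = [1800, 1800, 1900, 2000, 2200, 2400,
                2500, 2500, 2300, 2000, 1800, 1800]
ict_kwh = [800, 800, 800, 850, 850, 850,
           850, 850, 850, 850, 850, 800]

pue = annualised_pue(facility_kwh, ict_kwh)
january_ratio = facility_kwh[0] / ict_kwh[0]  # a single cool month, not a PUE

print(f"Annualised PUE: {pue:.2f}")            # prints "Annualised PUE: 2.50"
print(f"January-only ratio: {january_ratio:.2f}")  # prints "January-only ratio: 2.25"
```

With these made-up readings the January ratio looks better than the annual figure, which is why only the full-year number counts as PUE.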
In that very simple definition four things are immediately obvious:
1. It should have been called EUE as it should be an E (for kWh energy) metric not a P (for kW power) metric - but it is too late to change it now, so, despite being a glaring engineering error, we won’t mention it again
2. It says precisely zero about the ICT load – the ‘one’ in the ‘one point something’ – that is considered as sacrosanct
3. The user should be honest (with him/herself) and include all the overhead energy consumption, including the offices, plantroom small power, security, external lighting and embedded energy in other resources such as diesel fuel-oil and even potable water that is evaporated or discharged into the drains
4. How many times have you read an article that says that a new data centre will have a PUE of, for example, 1.30? How can that be if it hasn’t yet been run for a year and will almost certainly fill up with load slowly?
So, why did I say that PUE had an ‘agenda’ in the beginning? Well, this is a very personal view, although having aired it over the years I haven’t ever had anyone disagree with me: PUE describes the infrastructure effectiveness and took the world’s attention away from the very thing that the ICT industry didn’t want exposed – the very poor server power supply efficiency at low load and the very low average server utilisation, which meant that most servers idled most of the time at a relatively high power. That explained the long-reported condition of data centre power being steady, regardless of how business usage should have affected it. Now, apart from the utilisation, things have improved dramatically and PUE remains a valid metric that is valuable, easy to use and describes only the annualised facility energy overhead.
But let’s have a think about Point 2 above… Modern servers idle (doing no useful IT work) at anywhere between 20% and 80% power draw (with a 2017 average of around 35%) and the average utilisation around the globe (if you exclude homogeneous loads like search engines and HPC clusters) is near to 10%. In other words, most servers idle most of the time and consume an average of 35% of their ‘pedal to the metal’ power, although the ‘worst’ performers idle at nearer 80%. So, a facility with 60% utilisation and a PUE of 2 is a lot more ‘energy effective’ than a facility with 10% utilisation and a PUE of 1.1. However, I don’t regard that as a failure in PUE as it was never intended to be a measure of goodness of the data centre – only a measure of the ‘overhead’: power and cooling losses, lighting, controls etc.
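To put rough numbers on that comparison, here is a small Python sketch. It rests on two assumptions that are mine and not part of PUE: that server power rises linearly from its idle draw to full load, and that useful work is proportional to utilisation. The function names are hypothetical and the result is in arbitrary relative units.

```python
def relative_server_power(utilisation, idle_fraction=0.35):
    """Server draw as a fraction of full-load power.

    Assumes a linear rise from idle_fraction at 0% to 1.0 at 100% utilisation.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be greater than 0 and at most 1")
    return idle_fraction + (1 - idle_fraction) * utilisation


def facility_energy_per_unit_work(pue, utilisation, idle_fraction=0.35):
    """Facility energy spent per unit of useful ICT work (relative units).

    Assumes useful work is proportional to utilisation.
    """
    ict_energy_per_work = relative_server_power(utilisation, idle_fraction) / utilisation
    return pue * ict_energy_per_work


# The two facilities from the text, both with the 35% average idle draw.
busy_site = facility_energy_per_unit_work(pue=2.0, utilisation=0.60)
idle_site = facility_energy_per_unit_work(pue=1.1, utilisation=0.10)

print(f"PUE 2.0 at 60% utilisation: {busy_site:.2f}")
print(f"PUE 1.1 at 10% utilisation: {idle_site:.2f}")
print(f"Ratio: {idle_site / busy_site:.2f}")
```

Under these assumptions the PUE 2 facility needs roughly 2.5 units of facility energy per unit of work against roughly 4.6 for the PUE 1.1 facility, so the ‘worse’ PUE does the same work on a little over half the energy.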
On Point 3, some users have made up their own rules about what to include (or not) when doing the PUE calculation. In fact, a lot of people still say that ‘PUE isn’t well defined’. That may have been true in 2007 but once Version 2 was published by TGG all the holes had been plugged. Since then PUE has been standardised in ISO/IEC 30134-2 and no one should be in any doubt. To be a little critical of the ISO process for a moment, the resulting document is probably not as ‘perfect’ as V2 of TGG’s original document, as it doesn’t include a clear definition of ‘partial PUE’ (pPUE, useful for sub-system contribution to the overall PUE) and ‘instantaneous PUE’ (PUE0, useful for describing the peak kW facility power). Having said that, no one ‘has’ to follow a standard (unless it’s H&S related or embodied in local legislation) and, on the condition that you consistently apply the same rules every year, your PUE improvement plan will be well founded. Of course, if you are reporting your PUE as part of an energy saving scheme, for example as an EU CoC Participant, then a set of common rules is a good thing. Having said all that, there are many examples of ‘PUE abuse’, which brings us onto the fourth point…
Some time ago I coined the phrase ‘PUE abuse’ and here are several common examples:
PUE is an annualised energy ratio so how can any marketing department claim that their brand-new data centre is, for example, 1.3? Only after running for a full seasonal year can you report your PUE
The press-release that stated (on a cold January day in Amsterdam) that their data centre had achieved a PUE of 1.09 for ‘a whole 24-hour period’
The (albeit tiny) cheat that a social-networking site uses when applying LED lighting with Power-over-Ethernet. It certainly saves energy and copper but conveniently allocates the lighting load to the ICT and not the overhead
Don’t behave badly now just so that you can make great leaps in PUE reduction when called upon to do so. I did visit a very large (>15 MW) facility in Germany without any blanking plates in the cabinets and with lots of bypass air – saving up an improvement if and when EU regulation came along…
The claim that someone had achieved a PUE of less than ‘one’! This still happens from time to time but less frequently. The ‘trick’ is always the same – some form of on-site generation, or consumption of gas instead of electricity, that has been netted-off the facility energy. One of the funniest was the Californian desert facility that said they had a PUE of ‘0’. In reality it was a 5 kW ICT load fed 100% by a solar panel array and battery combination with no grid connection
There is a longer, but I think funnier, account of a ‘PUE less than 1 is easy’ data centre presentation by a German consulting engineer at a Middle-East conference, which you can find on the MCF web-site
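The ‘netting-off’ trick is easy to reproduce on paper. The Python sketch below takes the 5 kW ICT load from the desert example and adds a purely hypothetical 1 kW average overhead (my assumption, not a reported figure), then compares the ratio you get from grid imports alone with the ratio from all the energy the facility actually consumed. The name `energy_ratio` is illustrative only.

```python
HOURS_PER_YEAR = 8760


def energy_ratio(facility_kwh, ict_kwh):
    """Facility energy divided by ICT energy over the same period."""
    return facility_kwh / ict_kwh


ict_kwh = 5 * HOURS_PER_YEAR       # 5 kW ICT load running all year
overhead_kwh = 1 * HOURS_PER_YEAR  # hypothetical 1 kW of losses and cooling
consumed_kwh = ict_kwh + overhead_kwh

solar_kwh = consumed_kwh                # everything generated on site
grid_kwh = consumed_kwh - solar_kwh     # no grid connection, so zero import

claimed = energy_ratio(grid_kwh, ict_kwh)     # on-site generation netted off
honest = energy_ratio(consumed_kwh, ict_kwh)  # all consumed energy counted

print(f"Netted-off 'PUE': {claimed:.2f}")  # prints "Netted-off 'PUE': 0.00"
print(f"Honest PUE: {honest:.2f}")         # prints "Honest PUE: 1.20"
```

Since the ICT energy is part of the total, counting everything consumed can never give a figure below 1.0, whatever the source of the power.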
All things considered, PUE is simple and useful. You should pick a set of rules, be that TGG V2 (free to download) or ISO/IEC 30134-2 (must be paid for), and then stick to it. Do not tell anyone else what your PUE is unless it is to get you energy tax relief. Use your PUE to track your improvements and remember that, regardless of the PUE number, the aim is to reduce power consumption – so get rid of those comatose servers!