Data Center War Stories Archives - GreenMonk: the blog

Data Center War Stories talks to the Green Grid EMEA Tech Chair Harqs (aka Harkeeret Singh)

December 13, 2011 by Tom Raftery

And we’re back this week with another instalment in our Data Center War Stories series (sponsored by Sentilla).

In this episode I talked to the Green Grid’s EMEA Tech Chair, Harqs (also known as Harkeeret Singh).

The Green Grid recently published a study on the RoI of energy efficiency upgrades for data center cooling systems [PDF warning]. I asked Harqs to come on the show to discuss the practical implications of this study for data center practitioners.

Here’s the transcript of our conversation:

Tom Raftery: Hi everyone, welcome to GreenMonk TV, this is the Data Centers War Stories series sponsored by Sentilla. The guest on the show today is Harkeeret Singh aka Harqs. And Harqs is the EMEA Tech Chair of Green Grid. Harqs welcome to the show.

Harqs: Thank you Tom.

Tom Raftery: So Harqs as the Tech Chair of the Green Grid in the EMEA region, you get a larger overview of what goes on in data centers than some of the previous interviewees we’ve had on the show.

We have been talking about kind of the war stories, the issues that data centers have come across and ways they have resolved them. I imagine you would have some interesting stories to tell about what can go on in data centers globally. Do you want to talk a bit about some of that stuff?

Harqs: The Green Grid undertook a study which implemented a lot of the good practices that we all talk about: improving air flow management, increasing temperature, and putting in variable frequency drives. And what we did is, after each initiative, we measured the benefit from an energy consumption perspective.

And Kerry Hazelrigg from the Walt Disney Company led the study in a data center in the southeast of the US. And we believe that is representative of most data centers, probably pre-2007, so we think there is a lot of good knowledge in here which others can learn from.

So I am going to take you through some of the changes that were made and some of the expectations, but also some of the findings, some of which we weren’t expecting. There were five different initiatives, and the first initiative was implementing the variable speed drives. They installed new CRAC units and CRAH units which had variable frequency drives as standard, there were 14 of them, and then they retrofitted 24 existing CRAHs, taking out the fixed speed drives and putting in variable speed drives.

The expectation was that we would find a reduction in energy consumption and fan horsepower. There was also a potential concern about whether cooling would always be provided to the right place in the data center. Once we put those in, we found that they didn’t actually introduce any hotspots, which was a positive thing. But some things were a little different from what we expected: the PUE didn’t reflect the savings. That was because there were external factors, things like the external weather, which impacted the PUE figure as well. So you need to bear that in mind as you make changes. You need to look at the average across the year.
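As a rough illustration of why Harqs recommends looking at the average across the year, the sketch below computes PUE (total facility energy divided by IT equipment energy) month by month and for the whole year. The monthly readings are hypothetical numbers chosen only to show a seasonal swing; they are not from the Green Grid study, and the function names are our own.

```python
# Hypothetical monthly meter readings in kWh (illustrative only,
# NOT taken from the Green Grid study).
monthly_total_kwh = [1600, 1600, 1650, 1700, 1800, 1900,
                     1950, 1950, 1850, 1700, 1650, 1600]
monthly_it_kwh = [1000] * 12  # assumption: constant IT load all year


def pue(total_kwh, it_kwh):
    """PUE = total facility energy / IT equipment energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_kwh / it_kwh


def annual_pue(total_by_month, it_by_month):
    """Energy-weighted PUE over the whole period (sum of totals / sum of IT)."""
    return pue(sum(total_by_month), sum(it_by_month))


monthly_pue = [pue(t, i) for t, i in zip(monthly_total_kwh, monthly_it_kwh)]

print("Lowest monthly PUE: ", min(monthly_pue))
print("Highest monthly PUE:", max(monthly_pue))
print("Annual PUE:         ", round(annual_pue(monthly_total_kwh, monthly_it_kwh), 3))
```

With a swing like this, a change made in spring and measured in summer could look like it made PUE worse, which is the weather effect described above. Comparing like-for-like periods, or the full-year figure, avoids that trap.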

The other issue we found was that putting in variable speed drives introduced harmonics to the power systems. That came through in the monitoring tools, and so they put in filtering to help resolve those harmonics.

The last issue was also around skills, so they had to train the data center staff on using variable frequency drives and actually maintaining them. This was the biggest power saving, it was a third of the overall saving. The saving in total was 9.1% of energy consumption, and that’s saved something in the order of $300,000 in terms of real cash, and the PUE went down from 1.87 to 1.64 by doing these five initiatives.
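For a sense of scale, here is a back-of-the-envelope sketch using only the figures quoted above. It assumes that cost scales linearly with energy consumption and that "a third of the overall saving" applies equally to the cash figure; neither assumption is stated in the interview, so treat the outputs as rough orders of magnitude.

```python
# Figures quoted in the interview
total_saving_usd = 300_000      # "in the order of $300,000"
total_saving_fraction = 0.091   # 9.1% of energy consumption
vfd_share = 1 / 3               # drives were "a third of the overall saving"

# Assumption: energy cost is proportional to energy consumption,
# so a 9.1% energy saving is also a 9.1% cost saving.
implied_baseline_cost_usd = total_saving_usd / total_saving_fraction

# Assumption: the one-third share applies to the cash saving as well.
vfd_saving_usd = total_saving_usd * vfd_share

print(f"Implied energy bill before the changes: ${implied_baseline_cost_usd:,.0f}")
print(f"Rough saving from variable speed drives: ${vfd_saving_usd:,.0f}")
```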

The second issue was actually putting in the air flow management. So things like the blanking panels and the floor grommets, putting in the cold tiles where they are supposed to be, and that was for around 7 inch cabinets. The findings were that it reduced the cold aisle temperature, because you have less mixing, and also increased the temperature on the hot aisle in terms of the temperatures going back to the CRAH. So that was interesting.

We saw that being a key enabler to actually increasing temperature, so you have cold in the cold aisles and hot in the hot aisles because of less mixing. There wasn’t any energy saving for this piece in itself, but this airflow management activity is an enabler in that it allows you to then do some optimization and also to increase temperature without risk.

The third activity was relocating the sensors that the CRAHs worked off, away from the return to the CRAC and return to the CRAH, which is what most data centers use today, to sensors on the front of the cabinets. So actually moving from return air to supply, and that’s the specification that ASHRAE provides; that’s what we should be controlling, the temperature and the humidity of the air going into the servers. They themselves say they don’t really care what the temperature is coming out of the back of the servers. Well, the rest of us do, from the point of view of making sure that it’s not too hot for our data center operators.

So what we did was move those sensors to the front of the cabinets, and what that did was optimize the fan speeds and actually start to raise the temperature of the cold air that was required by the servers. It did take them a little while getting the locations right, so moving them around as much as possible, looking at the CFD to make sure they were optimizing and putting them in the right place eventually. And that was a small improvement, but it was again another enabler for increasing temperature. So there was only a few percent improvement by doing that, but what it does is, when you start to look at increasing temperature, you are increasing temperature at the right point.

Tom Raftery: So how much did it increase temperature by — was it like from 20 to 25 or… —

Harqs: That’s a good question, as the next initiative was increasing the temperature, so I was just about to get to that. They went from 18°C, which was their original set point, and they took it up to 22°C. Now obviously that’s still in the middle of the ASHRAE standard. So there is still more scope there to become better. But it wasn’t just increasing the temperature in the room, it was actually increasing the temperature of the chiller plant, which is where the biggest savings were. So if you increase the temperature of the room, that then allows you to increase the temperature of your chiller plant.

They increased the set point of their chiller plant from 6.7°C to just under 8°C. And what they found was that there were significant savings due to the reduction in compressor and condenser fan power. I’m going to do this in degrees F because they calculated in degrees F. So they went from 44°F to 46°F. For every degree F they increased the set point of the chiller, they found that it reduced chiller energy consumption by just over 50 kilowatts.
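The set point figures can be checked with a small sketch. It converts the quoted Fahrenheit set points to Celsius and estimates the chiller power reduction, treating "just over 50 kilowatts" per degree F as a flat 50 kW and assuming the relationship is linear over this small range. Both simplifications are ours, and as Harqs notes next, the figure is specific to this site.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


# "Just over 50 kilowatts" per degree F at the studied site; 50 is used
# here as a conservative round figure (an assumption).
KW_PER_DEG_F = 50.0


def chiller_saving_kw(old_setpoint_f, new_setpoint_f, kw_per_deg_f=KW_PER_DEG_F):
    """Estimated chiller power reduction, assuming a linear relationship."""
    return (new_setpoint_f - old_setpoint_f) * kw_per_deg_f


print(round(f_to_c(44), 1), "°C")            # original chiller set point
print(round(f_to_c(46), 1), "°C")            # new set point, "just under 8"
print(chiller_saving_kw(44, 46), "kW saved (at least)")
```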

Now in terms of other people’s data centers, your mileage may vary depending on the configuration and where you are, but that’s what their significant saving was. What they found was that by doing it this way, where they put the air flow management in place and then increased the temperature in the room and increased the set points of the chiller plant, there was no significant impact on the data center in terms of hot spots or anything like that. So there is no detrimental impact to the data center by doing this. Obviously the saving of the energy was a positive and saved real money.

Tom Raftery: Alright, Harqs that was great, thanks a million for coming on the show.

Data Center War Stories talks to CIX’s Jerry Sweeney

November 28, 2011 by Tom Raftery

And we’re back this week with the third instalment in our Data Center War Stories series (sponsored by Sentilla).

In this episode of the series I am talking to Jerry Sweeney. Jerry is Managing Director of Cork Internet eXchange (CIX). CIX is a small, currently co-lo, data centre located in Cork, Ireland (and full disclosure – I was a co-founder of CIX).

I love Jerry’s story about the chiller compressors coming on for the first time after 12 weeks – free cooling rocks! (watch the video, or see the transcript below!).

Here’s the transcript of our conversation:

Tom Raftery: Hi everyone and welcome to GreenMonk’s DataCenter War Stories sponsored by Sentilla. The guest on the show is CIX’s Jerry Sweeney. Jerry is Director of Cork Internet eXchange. Jerry welcome to the show.

Jerry Sweeney: Thank you for having me Tom.

Tom Raftery: Jerry can you tell me a little bit about Cork Internet eXchange, how old it is, what kind of size you are talking about?

Jerry Sweeney: Cork Internet eXchange was conceived in 2006, in September of 2006, construction occurred in 2007, and it opened for business in March 2008. So it’s 3 years old now.

We have two rooms on the technical floor area, one of them is kitted out, it’s 3,000 square feet and the other one is available for expansion and that’s also 3,000 square feet, as well as there is approximately 7,000 or 8,000 square feet for the services, offices, call center and so on.

So, whatever that works out at 12 and seven, so it’s about 19,000 square feet in total. Eventually it will be a 240 rack facility. At the moment we have about 75 occupied racks. To date it’s exclusively a colocation facility, but we are now getting into the infrastructure as a service and platform as a service business.

Tom Raftery: In the building of a facility of that size what kind of — what are the most pressing kind of issues you come across typically day to day?

Jerry Sweeney: I suppose your question had the concept of size and — so we are a very small data center, and I suppose trying to scale the expenses against our revenue stream is probably an issue with a company this size. So running 24/7 shifts, so I would say scale is probably our biggest single problem, and having people with the right resources, and having the facility occupied. If you have a 1000 racks okay, then you can spread those costs over a greater number of customers and a greater number of racks.

Tom Raftery: Any interesting things that you — any interesting problems you happened to cross and solutions you came up with to solve them?

Jerry Sweeney: We live in a city Tom with 160,000, 170,000 people. We — all of the data centers in Ireland are basically clustered around Dublin, all of the connectivity that comes into Ireland is located or lands in Dublin.

So, remoteness and scale okay were huge problems for us when we started off. And one of the big issues for us okay was to get adequate connectivity into the building so that we would be taken seriously. And we came up with a strategy very early on and the strategy was to — initially before we focused on being a data center that we focused on being a regional internet connectivity center.

And the name of the business is very interesting; the name of the business is Cork Internet eXchange. We registered the URL which was the Cork Data Center, but we never used it, and the reason for that is because Cork Internet eXchange was more vital to us at start up than the Cork Data Center.

So, in order to justify gigabit connectivity and the back-haul costs around that, we had to get serious volumes of IP transit through the building first. And we have a 30 meter mast, it was 24 meters initially, but we just added six meters to it this year. Our address is Hollyhill and that’s a clue: we are on top of a hill. That enabled us to sign up every single wireless internet service provider in the region.

So, all of the homes and businesses in Cork supplied with broadband by non-incumbents take their connectivity out of here, and we see that as being about 20,000 homes and businesses. So that was a huge win for us in early 2009. By the time we got to, say, March 2009, which would be a year after we opened, we had our IP transit up in the gigabits and that made cost effective procurement of transit sensible.

And it was at that time that we noticed that people took us more seriously as a data center, because of the connectivity. We had the resilience designed in; what we didn’t have was connectivity at a price, okay, and at a quality level that made us attractive.

So, I think that probably was the... and if we hadn’t been successful in resolving that connectivity issue, then I don’t think we would have been able to scale as a data center.

Tom Raftery: Can you talk a little bit about some of the interesting concepts that went into the design of the data center?

Jerry Sweeney: The concept of building the data center was started in September 2006, and we made a decision in 2006 to go for cold aisle containment. Today that seems like a really kind of standard idea; the argument now is whether you go for hot aisle or cold aisle containment. But in 2006, even alternate hot and cold aisles were considered novel.

So it seemed like a remarkably unusual thing. So we built it from the ground up with cold aisle containment as a strategy. Also, because we are located in Cork, which has a mild climate, neither hot nor cold, we have 11 degrees as an ambient temperature, average for the year, and the difference between summer and winter is not enormous, so we are able to take advantage of an awful lot of free cooling.

Even in the summer, at night time we can usually do free cooling here, and for much of the winter, okay, our chillers never start. We know that our chillers did not start from November of 2010 until a warm sunny afternoon in February. So free cooling, okay, took us for whatever number of weeks that is, six and six, about 12 weeks, without ever starting a compressor.

We were shocked when the compressor cut in: what’s that noise, okay.
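To make the free cooling story concrete, the sketch below takes a series of ambient temperature readings and reports what fraction of them would allow free cooling, plus the longest unbroken run without a compressor start. The changeover threshold and the sample temperatures are hypothetical; the interview does not say at what ambient temperature CIX’s chillers cut in.

```python
def free_cooling_fraction(ambient_c, threshold_c):
    """Fraction of readings at or below the free cooling threshold."""
    if not ambient_c:
        raise ValueError("need at least one reading")
    return sum(1 for t in ambient_c if t <= threshold_c) / len(ambient_c)


def longest_free_cooling_run(ambient_c, threshold_c):
    """Longest consecutive run of readings at or below the threshold."""
    longest = current = 0
    for t in ambient_c:
        if t <= threshold_c:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a compressor would have to start here
    return longest


# Hypothetical daily peak temperatures in °C (illustrative only)
daily_peak_c = [8, 9, 11, 15, 10, 7, 6, 12, 16, 9]
THRESHOLD_C = 13  # assumed changeover point, not a CIX figure

print(free_cooling_fraction(daily_peak_c, THRESHOLD_C))
print(longest_free_cooling_run(daily_peak_c, THRESHOLD_C))
```

Run against a real weather log with daily readings, a result of around 84 days for the longest run would correspond to the 12 compressor-free weeks Jerry describes.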

Tom Raftery: Jerry it’s been fantastic. Thanks a million for coming on the show.

Jerry Sweeney: Yeah, it’s my pleasure, Tom, thank you.

Data Center War Stories talks to SAP’s Jürgen Burkhardt

November 21, 2011 by Tom Raftery

And we’re back this week with the second instalment in our Data Center War Stories series (sponsored by Sentilla).

This second episode in the series is with Jürgen Burkhardt, Senior Director of Data Center Operations, at SAP‘s HQ in Walldorf, Germany. I love his reference to “the purple server” (watch the video, or see the transcript below!).

Here’s a transcript of our conversation:

Tom Raftery: Hi everyone welcome to GreenMonk TV. Today we are doing a special series called the DataCenter War Stories. This series is sponsored by Sentilla and with me today I have Jürgen Burkhardt. Jürgen if I remember correctly your title is Director of DataCenter Operations for SAP is that correct?

Jürgen Burkhardt: Close. Since I am 45, I am Senior Director of DataCenter Operations yes.

Tom Raftery: So Jürgen can you give us some kind of size of the scale and function of your DataCenter?

Jürgen Burkhardt: All together we have nearly 10,000 square meters of raised floor. We are running 18,000 physical servers and now more than 25,000 virtual servers out of this location. The main purpose is first of all to run the production systems of SAP. The usual stuff, FI, BW, CRM et cetera, then all support systems, so if you have ever been on the SAP Marketplace, our Service Marketplace, this system is running here in Walldorf/Rot; whatever you see from sap.com is running here to a main extent. We are running the majority of all development systems here, and the majority of training, demo and consulting systems worldwide at SAP.

We have more than 20 megawatts of computing power here. I mentioned the 10,000 square meters of raised floor. We have more than 15 petabytes of usable central storage, a backup volume of 350 terabytes a day and more than 13,000 terabytes in our backup library.

Tom Raftery: Can you tell me what are the top issues you come across day to day in running your DataCenter, what are the big ticket items?

Jürgen Burkhardt: So one of the biggest problems we clearly have is the topic of asset management and the whole logistics process. If you have so many new servers coming in, you clearly need a very, very sophisticated process which allows you to find what we call the Purple Server: where is it, where is the special server? What is it used for? Who owns it? How long has it already been in use? Do we still need it? All those kinds of questions are very important for us.

And this is also very important from an infrastructure perspective. We have so much stuff out there that if we start moving servers between locations, or if we try to consolidate racks, server rooms and so on, it’s absolutely required for us to know exactly where something is, who owns it, what it is used for, et cetera. And this is really one of the major challenges we have currently.
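The questions Jürgen lists (where is it, what is it used for, who owns it, how long has it been in use, do we still need it) map naturally onto a simple asset record. The sketch below is purely illustrative: the class names, field names and location format are our own invention and say nothing about how SAP actually tracks its servers.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class ServerAsset:
    """One record per physical server (all field names are hypothetical)."""
    asset_id: str
    location: str                         # e.g. "room/rack", format assumed
    owner: str
    purpose: str
    in_service_since: date
    still_needed: Optional[bool] = None   # None means nobody has confirmed yet

    def age_days(self, today: date) -> int:
        """How long the server has been in use."""
        return (today - self.in_service_since).days


class AssetRegistry:
    """Minimal in-memory registry keyed by asset id."""

    def __init__(self) -> None:
        self._assets: Dict[str, ServerAsset] = {}

    def register(self, asset: ServerAsset) -> None:
        self._assets[asset.asset_id] = asset

    def find(self, asset_id: str) -> Optional[ServerAsset]:
        return self._assets.get(asset_id)

    def move(self, asset_id: str, new_location: str) -> None:
        # Raises KeyError for an unknown id, so a move never goes unrecorded.
        self._assets[asset_id].location = new_location

    def unconfirmed(self) -> List[ServerAsset]:
        """Servers for which 'do we still need it?' is unanswered."""
        return [a for a in self._assets.values() if a.still_needed is None]


registry = AssetRegistry()
registry.register(ServerAsset("srv-0001", "room-A/rack-12", "dev-team-x",
                              "development system", date(2009, 5, 4)))
registry.move("srv-0001", "room-B/rack-03")
print(registry.find("srv-0001"))
```

The point of routing every relocation through `move` is the one made above: once servers shift between racks or rooms, the record is only useful if it changes at the same moment.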

Tom Raftery: Are there any particular stories that come to mind, things, issues that you’ve hit on and you’ve had to scratch your head and you’ve resolved them, that you want to talk about?

Jürgen Burkhardt: I think most people have a problem with their cooling capacity. Even though we are running a very big data center and have a lot of capacity, there was room for improvement. So what we did is we implemented a cold aisle containment system by ourselves.

So there are solutions available, you can purchase from various companies. So what we did is, so first of all we measured our power and cooling capacity and consumption in very detail, and on basis of that we figured out a concept to do that by ourselves.

So the first important thing, which today I think is standard: we had to change our rack positions, especially in the DataCenter which is ten years old, and which has now also got the premium certificate. In that data center, the rack positions were back, front, back, front, back, front, and we had thousands of servers in it.

So what we are now doing, and have already done to some extent in that data center, is change the rack positions to front-to-front in order to implement the cold aisle containment system. IT did that together with facility management. So we had a big project running to move servers, shut down racks, and turn the racks in whole rows front to front. Then, together with some external companies, we built the containment systems using a very, very normal, easy method, buying stock in the next supermarket more or less, and that increased the cooling capacity by 20% where we have implemented it.

Tom Raftery: Is there anything else you want to mention?

Jürgen Burkhardt: Within the last three to four years we crashed against every limit you can imagine from the various types of devices which are available on the market, because of our growth in size. The main driver for our virtualization strategy is the low utilization of our development and training servers. So what we are currently implementing is more or less a corporate cloud.

A few years ago, when we had some cost saving measures, our board said, you know what, we have a nice idea, we shut down everything which has a utilization below 5%, and we said, well, that might not be a good idea, because in that case we have to shut down everything, more or less. And the reason: if you imagine a server with an SAP system running on it and a database for development purposes, maybe a few developers are logging in; from a CPU utilization perspective, you hardly see it, you understand.
So the normal consumption of the database and the system itself are creating most of the load and the little bit development of the developers is really not worth mentioning. And even if they are sometimes running some test cases, it’s not really a lot. The same is true for training, during the training sessions there is a high load on the systems.

But on the other side these systems are utilized maybe 15% or 20% maximum, because the training starts on Monday and runs from 9:00 to 5:00. Some trainings even go only two or three days. So there is a very low utilization. And that was the main reason for us to say, we need virtualization, we need it desperately, and we achieved a lot of savings with that, and currently we are already live with our corporate cloud.
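The training-server figures are easy to sanity-check. The sketch below computes what fraction of a full week a system is in use if it only runs during classroom hours. It assumes "9:00 to 5:00" means eight hours a day and, generously, that the system is fully loaded for all of those hours, so the results are upper bounds rather than measured utilization.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_in_use_fraction(training_days, hours_per_day=8):
    """Fraction of the week a training system is in use at all.

    Assumes the system is busy for the whole class day, so this is an
    upper bound on real CPU utilization.
    """
    return training_days * hours_per_day / HOURS_PER_WEEK


for days in (5, 3, 2):
    print(f"{days}-day course: {weekly_in_use_fraction(days):.1%} of the week")
```

A two or three day course keeps the server busy for roughly 10% to 14% of the week, and even a full five day course for under a quarter of it, which is in the same ballpark as the 15% to 20% maximum quoted above.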

And we are now migrating more and more of our virtual systems and also the physical systems which are now migrated to virtualization into the corporate cloud. With a fully automated self service system and a lot of functionality which allows us to park systems, unpark systems, create masters and also the customers by himself. This is very interesting and this really gives us savings in the area of 50% to 60%.

Tom Raftery: Okay Jürgen that’s been fantastic, thanks a million for coming on the show.

Data Center War Stories – Maxim Samo from UBS

November 14, 2011 by Tom Raftery

Back at the start of the summer I mentioned that Sentilla had asked us to run a series of chats with data center practitioners to talk about their day-to-day challenges.

This was very much a hands-off project from Sentilla – I sourced all the interviewees, conducted all the interviews and am publishing the results without any Sentilla input or oversight. This had advantages and disadvantages – obviously, from an independence perspective, this was perfect but from a delivery point of view it was challenging. It turns out that data center practitioners are remarkably camera shy, so it took far longer than anticipated to produce this series. However, finally I’m ready with the first of the series, with more to come every week in the coming weeks.

This first episode in the series is with Maxim Samo, who is responsible for managing the European data center operations of global financial services firm UBS.

Here’s a transcript of our conversation:

Tom Raftery: Hi everyone and welcome to GreenMonk’s Data Centers War Stories series, sponsored by Sentilla. This week’s guest is Maxim Samo who works for the Swiss financial services company UBS. Maxim, do you want to start off by telling us what you do for UBS and what the kind of size of your datacenter fleet is?

Maxim Samo: Yeah, I run the Swiss and European operation for UBS, at the moment we have five datacenters in Switzerland and three outside of Switzerland spread around Europe, the total size or capacity probably being around six megawatts.

Tom Raftery: Okay and what kind of age is the fleet, is it like you know the last five years or 10? It’s obviously a variety, as you didn’t build all eight in the one go.

Maxim Samo: Right, they were built anywhere between 1980 and 2004; there are a couple of colos that are probably newer than that, but yeah.

Tom Raftery: So if they were built starting in 1980, I mean I assume that this is one of the reasons why you think more in terms of power as opposed to space, because they weren’t optimized around power at that time I’m sure.

Maxim Samo: Oh not at all, exactly. They were built with a density of around 300 watts per square meter or even less, right, I mean they were mainframe datacenters and we kind of... well, we did some refurbishments in there, and as a matter of fact one of those datacenters is undergoing a major renovation right now to increase the amount of power that we can put in there.

Tom Raftery: Power is obviously one of the more pressing issues you guys are running up against, but what are the other kind of issues you have in the datacenters in the day-to-day operations?

Maxim Samo: So the way our datacenters are built in Europe, at least within UBS, is that we don’t have like big data halls, but we have a number of smaller rooms within the datacenter building, and in order to be cost effective, you know, we don’t have every single network available in all the rooms, and we don’t have every single storage device and storage network, in terms of production storage or test or development storage, available in all the rooms.

So some of our constraints are around that: not only do we have to manage the capacity, but we have to figure out which rooms the servers go in, and then try to get adequate forecasts of how much the business and the developers want to put into which datacenter rooms, and try to juggle the capacity around that.

Tom Raftery: We are calling the show the Datacenter War Story. So, have you any interesting problems that you came across in the last number of years and resolved any interesting issues that you hit up against?

Maxim Samo: So, in terms of war stories I guess we are — one thing is we are going to have the interesting issue about switching the electrical network of the datacenter that is undergoing renovation and we are currently looking at the options of how we can do that.

One option would be that we would put both UPSs into utility bypass, run off the utility, and then switch over the network, where of course you have the risk of a power blip coming through which takes down your datacenter. So, in order to mitigate that we are also talking about a full scale shutdown of the datacenter, which right now is being received very well by the people involved, so that’s going to be an interesting one.

Actually, very recently we had a very funny case. What we do is, we conduct black start tests; a black start test is when you more or less pull the plug and see what happens, right. So you literally cut off the utility network, your UPS will carry the power, the diesel generators will start and you make sure everything works smoothly.

The last time we did this test, which was a week ago on the weekend, when the diesel generators started they created so much smoke that a pedestrian out on the street actually called the fire department. We had the fire department come in and a lot of people were panicking and asking what is going on, do we have a fire in the datacenter? Like, no, we just tested our diesel generators. That was a very funny instance.

I can’t really remember a war story in terms of the datacenter going down; luckily that has not happened for a very long time. Well, we did have a partial failure at one point, where a pretty big power switch within the switchgear failed and brought down one side of the power.

However, since most of our servers and IT equipment are dual power attached, it did not have any impact on production.

Tom Raftery: Great, that’s been fantastic. Max, thanks for coming on the show.

Maxim Samo: All right, thanks Tom.

Disclaimer – Maxim asked me to mention that any views he expressed in this video are his own, and not those of his employer, UBS AG.
