The number of data centres is growing steadily, driven by the continuing boom in the IT industry. IT is the backbone of every business: every business needs a presence on the Internet, business operations require faster connectivity, and data storage requirements keep increasing. To meet these needs, businesses around the world are either building their own data centres or acquiring space in large data centres.
One very critical aspect of a data centre is managing its electromechanical operations. In this article, we discuss how managing data centre operations differs from managing normal building operations.
The main objective of this article is to differentiate the electromechanical operations of a data centre from those of a building. We look at the different aspects of electromechanical operations and try to distinguish the significance of each in the data centre environment and in the building operations environment.
Managing and maintaining the health of all critical equipment, maintaining a clean environment on the premises, and doing both in a cost-effective manner are the main aspects of electromechanical operations. These aspects can be categorised into the following functions:
- HVAC operations
- Environment
- Cost effectiveness of operations
Although these functions are the same for data centres and buildings, the approach differs. You cannot take the same approach to data centre operations and building operations. The sections that follow explain the right approach for each.
Air-Conditioning Operations
Air-conditioning is a critical component of electromechanical operations in buildings as well as data centres: besides maintaining the interior climate, it accounts for a major portion of the operating cost. It is therefore increasingly important for the operations team to run the air-conditioning efficiently and cost-effectively. Modern air-conditioning technology provides precisely tuned solutions for different operational requirements and different indoor environments. To discuss how the requirements of data centres and buildings differ, let's divide air-conditioning operations into three categories, as shown in Fig. 1.
Temperature
In normal buildings, we maintain the temperature according to the comfort level of the people occupying the working floors. Generally, 24 to 26 °C is maintained.
In a data centre, we need to maintain the temperature according to the comfort level of the servers and switches housed there. The most common question is: what is the correct set-point for the data centre?
Earlier, each IT equipment manufacturer defined its own thermal requirements for its equipment, typically in the range of 20 to 21 °C. Such stringent set-points result in lower data centre operating efficiency. To increase operating efficiency, it became essential to widen the temperature and humidity ranges.
To widen the operating envelope of temperature and humidity, ASHRAE revised its thermal guidelines in 2008. Under the revised guidelines, the temperature set-points, which used to be 20 to 21 °C, are now 18 to 27 °C.
This new envelope has helped data centre operators run their operations more efficiently (Fig. 2).
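As a rough illustration of the revised envelope, the Python sketch below checks whether a measured rack-inlet temperature falls inside the 18 to 27 °C range quoted above. The function name and the sample readings are hypothetical, and the check covers temperature only, not humidity.

```python
# Temperature envelope quoted in the text (ASHRAE 2008 revision).
ENVELOPE_MIN_C = 18.0
ENVELOPE_MAX_C = 27.0


def inlet_temperature_status(temp_c: float) -> str:
    """Classify a rack-inlet temperature against the 18-27 °C envelope.

    Hypothetical helper for illustration; both limits are treated as inclusive.
    """
    if temp_c < ENVELOPE_MIN_C:
        return "below"
    if temp_c > ENVELOPE_MAX_C:
        return "above"
    return "within"


# Made-up readings, for illustration only.
for reading in (16.5, 22.0, 28.3):
    print(f"{reading:.1f} °C is {inlet_temperature_status(reading)} the envelope")
```

A set-point in the old 20 to 21 °C band still passes this check; the point of the wider envelope is that higher set-points now pass as well.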
Humidity
Humidity is another important factor in HVAC operations, and its treatment differs significantly between data centres and normal buildings.
Humans can be comfortable within a wide range of humidity, from 30 to 70 per cent depending on the temperature, but ideally between 50 and 60 per cent.
Maintaining the right range of humidity is critical for IT equipment. High humidity and temperature can affect the properties of the dielectric materials in printed circuit boards (PCBs). The dielectric provides electrical isolation between board signals. With either increased humidity or higher temperature, transmission line losses increase, which results in signal degradation. In addition, high humidity causes condensation, which may lead to failure of electrical and mechanical devices through short-circuiting and corrosion.
Low humidity, on the other hand, can cause electrostatic discharge (ESD), which can also lead to electronic equipment failure.
Air Flow
The air should reach where it is intended to go. Let's analyse why.
Air flow plays a vital role in providing a comfortable temperature for building occupants. As long as air flow distribution is consistent throughout the air-conditioned area, the comfort level of the occupants will remain high. If the distribution is not consistent, we are bound to have hot pockets inside the air-conditioned area. By hot pockets we mean small areas of the floor where a comfortable temperature is not maintained even though the set-points are right and the equipment is healthy.
The main cause of hot pockets is modification of the floor layout. When modifying the layout, people generally forget about how the temperature will be maintained in the modified area. During any such modification, we need to check whether the air flow is sufficient to cool the modified area or whether additional ducting is needed to avoid creating hot pockets.
Because we need to cool the servers, air flow is critical in a data centre. Racks are generally placed in rows adjacent to each other. We need to provide cool air at the inlet of each rack, while the cooling fans inside the servers expel hot air from the back. Consistent distribution of cold air at the rack inlets is extremely important, and to achieve it we need to contain the cold air. Cold air containment is a hot topic in the data centre industry.
In cold air containment, the data centre infrastructure is arranged so that the area from which cool air reaches the servers (the cold aisle) is enclosed and separated from the area into which the server fans expel hot air (the hot aisle), as shown in Fig. 3.
What happens if the cold air is not contained? In a typical data centre, cold air is supplied from the bottom, and as stated earlier, hot air is expelled at the back of the rack. Without containment, the cold air mixes with the hot air, which raises the temperature at the server level. On the return path, the sensors at the precision air-conditioning (PAC) units sense this mixed air, which disturbs the temperature set-points. This disturbance can result in server failure and also contributes to higher energy consumption by the PAC machines.
Thus, airflow should be managed with the same care as temperature and humidity. Most importantly, airflow is vital to improving the operational efficiency of a data centre: a data centre with cold air containment is much more efficient than one without it.
Environment
Environment is another very critical aspect of operations. In a normal building, the operations team needs to maintain an environment that is healthy for the occupants. In general, indoor air quality tests are conducted in normal buildings, the point of the analysis being that the air should not contain toxic gases or particulates that are harmful to humans.
For indoor air quality, ASHRAE has defined permissible limits for different parameters. If these parameters remain within the permissible limits defined by ASHRAE, we can safely assume that the indoor air quality of the building is safe for humans. (Refer to ASHRAE 62.1-2007 for further details.)
In a data centre, toxic gaseous contaminants in the air lead to copper and silver corrosion, which in turn affects servers and other computer equipment. In fact, a high level of corrosion can lead to failure of IT servers. This problem is very common in India, especially in places with open sewers and gutters.
The impact of a toxic environment has become especially visible since RoHS (Restriction of Hazardous Substances) compliance was implemented.
RoHS (Restriction of Hazardous Substances) compliance came into effect on 1 July 2006 in EU markets for electrical and electronic components, and lead was replaced by silver as a soldering medium. Lead is poisonous and hence hazardous to human health, but it is highly resistant to corrosion. Silver, on the other hand, is non-toxic but corrodes in the presence of certain gases (such as H2S) in an environment with high humidity (> 50%).
Studies and analyses have shown that hardware failure in data centres has two main causes: a bad or toxic environment, and poor power quality.
The focus of the analysis is the presence of hydrogen sulfide (H2S) in the atmosphere. Reactivity testing is commonly carried out to analyse polluted air inside the data centre. As the name suggests, this analysis measures the reaction of hydrogen sulfide with copper and silver. Coupons of copper and silver are installed inside the data centre, and after a set period, generally one month, the corrosion on the coupons is measured to reach a conclusion.
The Instrumentation, Systems, and Automation Society (ISA) standard ISA-71.04-1985, which covers environmental conditions for process measurement and control systems with respect to airborne contaminants, provides general guidelines on permissible limits of contaminants and classifies the environment into four broad categories:
- G1 – Mild: < 300 Å/month
- G2 – Moderate: 300 – 1000 Å/month
- G3 – Harsh: 1000 – 2000 Å/month
- GX – Severe: > 2000 Å/month
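The classification above can be sketched as a small Python lookup that maps a monthly coupon reading to its severity class. The function name and sample readings are hypothetical. The list does not say which class a reading of exactly 300, 1000 or 2000 belongs to, so assigning each boundary value to the harsher class is an assumption made here.

```python
def isa_severity_class(reactivity_per_month: float) -> str:
    """Map a monthly coupon reactivity reading to the G1/G2/G3/GX classes.

    The reading is in the same units as the list above. Boundary values
    are assigned to the harsher class, which is an assumption.
    """
    if reactivity_per_month < 0:
        raise ValueError("reactivity reading cannot be negative")
    if reactivity_per_month < 300:
        return "G1 (Mild)"
    if reactivity_per_month < 1000:
        return "G2 (Moderate)"
    if reactivity_per_month < 2000:
        return "G3 (Harsh)"
    return "GX (Severe)"


# Made-up coupon readings, for illustration only.
for reading in (150, 640, 1500, 2600):
    print(reading, "->", isa_severity_class(reading))
```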
One may ask why the air is polluted when the data centre is air-conditioned and dust-free. The answer is air leakage into the data centre. When air from the outside environment gets into the data centre, it pollutes the internal environment. So the very first step is to seal every point through which outside air enters the data centre. This exercise is called creating positive pressure inside the data centre.
Air purification machines are installed to purify the polluted air inside the data centre. To get the best results from them, positive pressure has to be maintained inside the data centre.
Cost Effectiveness
The cost of operations is a very important indicator of operational effectiveness. Electricity and diesel consumption are generally the main operating expenses. The most important factor in cost-effective operations is the upkeep of all critical equipment. When critical equipment is healthy and in good condition, it delivers its best operating efficiency, which in turn reduces the cost of operations. Consider the following example:
We measure the efficiency of a chiller (capacity 200 TR) in IKW/TR, that is, the input kW consumed per TR (ton of refrigeration).
- (A) Suppose the design IKW/TR of the chiller is 0.65, meaning the chiller needs 0.65 kW to produce one TR.
- (B) After a few years of operation, the IKW/TR is 1.1, meaning the chiller needs 1.1 kW to produce one TR.
For A, the power consumption will be
0.65 * 200 = 130 kW, or 3120 kWh for one day.
For B, the power consumption will be
1.1 * 200 = 220 kW, or 5280 kWh for one day.
The difference is 2160 kWh for just one day of operation. Note that it is very difficult to maintain efficiency at the design value, as efficiency depends on many factors. In the chiller example, the major hurdle in maintaining efficiency is the quality of the water used for condenser cooling. Still, with proper measures and timely maintenance, we can keep equipment efficiency very close to the design efficiency.
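The arithmetic of the chiller example can be reproduced with a short Python sketch. It makes the same simplifying assumption as the example: the chiller runs at its full 200 TR capacity for all 24 hours of the day. The function name is hypothetical.

```python
HOURS_PER_DAY = 24


def daily_energy_kwh(capacity_tr: float, kw_per_tr: float,
                     hours: float = HOURS_PER_DAY) -> float:
    """Energy used in one day, assuming constant full-load operation."""
    power_kw = capacity_tr * kw_per_tr   # electrical power drawn, in kW
    return power_kw * hours              # energy over the period, in kWh


design_kwh = daily_energy_kwh(200, 0.65)    # case A: design efficiency
degraded_kwh = daily_energy_kwh(200, 1.1)   # case B: after a few years

# Rounded for display, since floating-point products are not always exact.
print(f"A: {design_kwh:.0f} kWh/day")
print(f"B: {degraded_kwh:.0f} kWh/day")
print(f"Extra energy: {degraded_kwh - design_kwh:.0f} kWh/day")
```

Passing a smaller `hours` value or a part-load capacity gives a more realistic estimate for a chiller that does not run flat out all day.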
When we talk about the cost of operations in a data centre, we generally calculate two important metrics:
- PUE (Power Usage Effectiveness)
- DCiE (Data Centre Infrastructure Efficiency)
PUE is a measure of how efficiently a data centre uses energy; specifically, how much of the energy is used by the computing equipment (in contrast to cooling and other overhead).
PUE is the ratio of the total energy used by a data centre facility to the energy delivered to the computing equipment. It was developed by a consortium called The Green Grid. PUE is the reciprocal of the Data Centre Infrastructure Efficiency (DCiE). Anything in a data centre that is not a computing device (lighting, cooling, etc.) falls into the category of facility energy consumption. Although the ideal value of PUE is 1.0, it is very difficult to achieve in practice once all these overheads are accounted for. As discussed earlier, equipment health and upkeep help keep the PUE closer to 1.0.
Data Centre Infrastructure Efficiency (DCiE) is a performance metric used to calculate the energy efficiency of a data centre. DCiE is the percentage value obtained by dividing IT equipment power by total facility power.
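The two metrics follow directly from these definitions, as the minimal Python sketch below shows. The function names and the sample power figures are hypothetical and chosen only to make the reciprocal relationship visible.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dcie_percent(total_facility_kw: float, it_equipment_kw: float) -> float:
    """DCiE as a percentage: IT equipment power / total facility power * 100."""
    if total_facility_kw <= 0:
        raise ValueError("total facility power must be positive")
    return it_equipment_kw / total_facility_kw * 100.0


# Made-up figures: 1000 kW of IT load plus 800 kW of cooling, lighting, etc.
total_kw, it_kw = 1800.0, 1000.0
print(f"PUE  = {pue(total_kw, it_kw):.2f}")
print(f"DCiE = {dcie_percent(total_kw, it_kw):.1f} %")
```

With these figures the PUE is 1.8, so only a little over half of the facility power reaches the computing equipment. Reducing the overhead moves PUE towards 1.0 and DCiE towards 100 per cent.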
Conclusion
We have now looked at the different aspects of operations and their significance in data centre and building environments.
One more very important thing to keep in mind while managing a data centre (although people will keep reminding you of it) is the impact of downtime. When you are managing a data centre, downtime means a big financial loss to the organisation.
So we need to take a different approach when managing data centres. At the same time, we should not put ourselves under a lot of pressure: managing a data centre is not rocket science. Only a few aspects of the operations are different and need thorough understanding. The basics of electrical and mechanical engineering remain the same.