China Mobile has identified energy efficiency as a major problem for 5G. Massive MIMO, whether in 4G or 5G, is a major energy saver. The Massive MIMO advances are important but not quite enough to keep up with expanding usage.

China Mobile estimates each upgraded cell requires 50% more power. When you are installing 2 million cells, that adds up. (China plans 600,000-800,000 5G cells in 2020.) I've seen higher estimates.

Of course, each cell carries three to ten times as much data so energy and cost per bit are much lower. The 5G standard aims for a 90% reduction of energy per bit, although I don't believe that has been achieved outside the laboratory.
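The arithmetic behind that claim is easy to check. Here is a minimal sketch that uses only the figures quoted above (50% more power, three to ten times the data); the function name is mine, purely for illustration, and the inputs are rough estimates rather than measurements.

```python
# Relative energy per bit of an upgraded cell versus the cell it replaces.
# Inputs are the rough figures quoted in the text, not measured values.

def relative_energy_per_bit(power_ratio: float, traffic_ratio: float) -> float:
    """Energy per bit of the new cell as a fraction of the old cell's.

    power_ratio   -- new cell power / old cell power (1.5 for "50% more")
    traffic_ratio -- new cell data carried / old cell data carried
    """
    return power_ratio / traffic_ratio


POWER_RATIO = 1.5  # China Mobile's estimate: each upgraded cell needs 50% more power

for traffic_ratio in (3, 10):
    rel = relative_energy_per_bit(POWER_RATIO, traffic_ratio)
    print(f"{traffic_ratio}x data: {rel:.0%} of the old energy per bit "
          f"({1 - rel:.0%} reduction)")
```

On these figures, energy per bit falls by 50% at three times the data and by 85% at ten times the data. With 50% more power per cell, reaching a 90% reduction would take 15 times the data.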

In theory, with an infinite number of antennas and other conditions rarely met in the field, interference can be reduced to very little.

Ngo, Larsson, and Marzetta developed a theoretical model in 2012 that shows much better performance is possible.

Huawei believes their radios use 20% less power. It points out that for many hours a day, a cell is less than 25% utilized. Why not reduce power? Its latest radios do that dynamically. Huawei has also developed a dedicated chip for the base station, more efficient than the FPGAs initially used in 5G.

20 years ago, MIMO inventor Arogyaswami Paulraj predicted a 100X improvement in performance in wireless.

Here are two papers that addressed the efficiency problem in the early days of Massive MIMO. 

Energy and Spectral Efficiency of Very Large Multiuser MIMO Systems

Hien Quoc Ngo, Erik G. Larsson, and Thomas L. Marzetta

Abstract: A multiplicity of autonomous terminals simultaneously transmits data streams to a compact array of antennas. The array uses imperfect channel-state information derived from transmitted pilots to extract the individual data streams. The power radiated by the terminals can be made inversely proportional to the square-root of the number of base station antennas with no reduction in performance. In contrast if perfect channel-state information were available the power could be made inversely proportional to the number of antennas. Lower capacity bounds for maximum-ratio combining (MRC), zero-forcing (ZF) and minimum mean-square error (MMSE) detection are derived. A MRC receiver normally performs worse than ZF and MMSE. However as power levels are reduced, the cross-talk introduced by the inferior maximum-ratio receiver eventually falls below the noise level and this simple receiver becomes a viable option. The tradeoff between the energy efficiency (as measured in bits/J) and spectral efficiency (as measured in bits/channel use/terminal) is quantified. It is shown that the use of moderately large antenna arrays can improve the spectral and energy efficiency with orders of magnitude compared to a single-antenna system.
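To get a feel for the two scaling laws in that abstract, here is a rough sketch. It treats them as pure proportionalities (terminal power falling as 1/sqrt(M) with pilot-based channel estimates and as 1/M with perfect channel knowledge) and ignores every constant in the paper's actual bounds. The function name and the antenna counts are my own illustrative choices.

```python
import math

# Sketch of the scaling laws stated in the abstract, as proportionalities only:
#   imperfect CSI (from pilots): terminal power ~ 1 / sqrt(M)
#   perfect CSI:                 terminal power ~ 1 / M
# Constants from the paper's bounds are ignored; this is not its full model.

def power_saving_db(m_from: int, m_to: int, perfect_csi: bool) -> float:
    """Reduction in terminal transmit power, in dB, when the base station
    grows from m_from to m_to antennas with no loss in performance."""
    exponent = 1.0 if perfect_csi else 0.5
    return 10 * math.log10((m_to / m_from) ** exponent)


# Hypothetical array sizes, chosen only to make the numbers round.
for m_from, m_to in ((16, 64), (64, 256)):
    est = power_saving_db(m_from, m_to, perfect_csi=False)
    ideal = power_saving_db(m_from, m_to, perfect_csi=True)
    print(f"{m_from} -> {m_to} antennas: {est:.1f} dB saved with estimated CSI, "
          f"{ideal:.1f} dB with perfect CSI")
```

Under these assumptions, each quadrupling of the array lets the terminals cut transmit power by about 3 dB (half) with estimated channels, against about 6 dB (a quarter) if the channels were known perfectly.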

Designing Multi-User MIMO for Energy Efficiency: When is Massive MIMO the Answer?

Emil Björnson, Luca Sanguinetti, Jakob Hoydis, and Mérouane Debbah

Abstract: Assume that a multi-user multiple-input multiple-output (MIMO) communication system must be designed to cover a given area with maximal energy efficiency (bit/Joule). What are the optimal values for the number of antennas, active users, and transmit power? By using a new model that describes how these three parameters affect the total energy efficiency of the system, this work provides closed-form expressions for their optimal values and interactions. In sharp contrast to common belief, the transmit power is found to increase (not decrease) with the number of antennas. This implies that energy efficient systems can operate at high signal-to-noise ratio (SNR) regimes in which the use of interference-suppressing precoding schemes is essential. Numerical results show that the maximal energy efficiency is achieved by a massive MIMO setup wherein hundreds of antennas are deployed to serve relatively many users using interference-suppressing regularized zero-forcing precoding.

E. Björnson is funded by an International Postdoc Grant from the Swedish Research Council. L. Sanguinetti is funded by the People Programme (Marie Curie Actions) FP7 PIEF-GA-2012-330731 Dense4Green. This research has been supported by the ERC Starting Grant 305123 MORE. Parts of this work was performed in the framework of the FP7 project ICT-317669 METIS.

I. INTRODUCTION

The design of current wireless networks (e.g., based on the Long-Term Evolution (LTE) standard) have been mainly driven by enabling high spectral efficiency due to the spectrum shortage and rapidly increasing demand for data services [1]. As a result, these networks are characterized by poor energy efficiency (EE) and large disparity between peak and average rates. The EE is defined as the number of bits transferred per Joule of energy and it is affected by many factors such as (just to name a few) network architecture, spectral efficiency, radiated transmit power, and circuit power consumption [1]–[3]. Motivated by environmental and economical costs, green radio is a new research direction that aims at designing wireless networks with better coverage and higher EE [2].

In this work, we consider the downlink of a multi-user MIMO system (broadcast channel) and aim at bringing new insights on how the number M of base station (BS) antennas, the number K of active user equipments (UEs), and the transmit power must be chosen in order to maximize EE. As discussed in [1], a precise power consumption model is crucial to obtain reliable guidelines for EE optimization. For example, the total consumption has been traditionally modeled as a linear or affine function of the transmit power [3]. However, this simple model cannot be adopted in systems where M might be very large as it would lead to an unbounded EE when M → ∞ [4]. This is because the circuit power consumed by digital signal processing and analog filters for radio-frequency (RF) and baseband processing scales with M and K. Hence, it can be taken as a constant in small multi-user MIMO systems while the variability plays a key role when modeling so-called massive MIMO systems in which M ≫ K ≫ 1