Big Data Analytics for Utilities

By Erynn Reitmayer for ZE datawatch

The amount of data available to utilities is set to experience explosive growth as smart grids and other automated technologies become commonplace.

For several years, big data has been something of a buzzword in the world of IT, but for all the chatter surrounding it, there is still a notable lack of clarity as to what “big data” actually entails, along with rampant misconceptions regarding its use and significance.

The many vague definitions of big data describe it, quite simply, as a database that is too large to be managed through traditional data management applications. Conventional databases are usually built entirely on hash and chained structures, which were originally implemented to minimize response time for primary inquiries. However, hashing wastes storage, particularly compared with the indexed structures of a relational database. Even the best hashing algorithms will leave an unreasonably high number of insertions ending up in overflow when the database requires mass storage and density.[1] While the scope of manageable data within such applications varies depending on the level of human resources at an organization’s disposal, there is no doubt that across all sectors, data volume is increasing at an accelerated pace, making storage and accessibility issues of critical importance.
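To make the overflow point concrete, here is a minimal Python sketch. It is not taken from the cited source: it models a hypothetical table with a fixed number of fixed-capacity buckets and counts any record that hashes to an already-full bucket as an overflow. The bucket count, bucket capacity, and key range are illustrative assumptions.

```python
import random

def overflow_fraction(num_records, num_buckets=1000, bucket_capacity=4, seed=0):
    """Fraction of inserted records that land in an already-full bucket."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    fill = [0] * num_buckets           # records currently held by each bucket
    overflowed = 0
    for _ in range(num_records):
        key = rng.randrange(10**9)     # stand-in for a record key (assumption)
        bucket = hash(key) % num_buckets
        if fill[bucket] < bucket_capacity:
            fill[bucket] += 1
        else:
            overflowed += 1            # would spill into a chained overflow area
    return overflowed / num_records if num_records else 0.0

# Total capacity is 1000 buckets * 4 slots = 4000 records.
for load in (0.25, 0.5, 0.75, 1.0):
    n = int(load * 1000 * 4)
    print(f"load factor {load:.2f}: {overflow_fraction(n):.1%} of inserts overflow")
```

The share of overflowing inserts climbs as the table approaches its nominal capacity, even though keys are spread roughly uniformly, which is the density problem the paragraph above describes.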

A recent study by MGI found that 15 out of 17 sectors in the U.S. already have more data stored per company than the entire U.S. Library of Congress.[2] Utilities are currently at the lower end of the spectrum in terms of the data volume being stored per business, with an average of 1,507 terabytes stored per firm with more than 1,000 employees.[3] To put this information in context, it helps to envision a terabyte of data in concrete form: a single terabyte of data contains about the same amount of information as 60 piles of typed paper, stacked as high as the Eiffel Tower.

While this may be a smaller database than those maintained by the government or manufacturing firms, there is no doubt the volume is formidable. Further, the amount of data available to utilities is set to experience explosive growth as smart grids and other automated technologies become commonplace.

In the next five years, enterprise data is expected to increase by more than 650%–and the vast majority of this data will be unstructured.[4] These statistics suggest that even if corporations and utilities are currently able to manage their data initiatives through traditional tools, their methodology will likely need to evolve to enforce order and reap value from this chaotic data flow.
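As a rough worked example (arithmetic only, not a figure from the cited study): an increase of 650% means the final volume is 7.5 times the starting volume, and the sketch below converts that into an implied compound annual rate. It assumes smooth compounding over exactly five years, and the function name is illustrative.

```python
def implied_annual_growth(total_increase_pct, years):
    """Compound annual growth rate implied by a total percentage increase."""
    multiple = 1 + total_increase_pct / 100   # a 650% increase is a 7.5x multiple
    return multiple ** (1 / years) - 1

rate = implied_annual_growth(650, 5)
print(f"{rate:.1%} per year")
```

Under those assumptions the implied rate is a little under 50% per year, which is to say that data volume would grow by about half again every twelve months.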

Indeed, as we look to the future of data and its usefulness in business, structure and order will be crucial. Simply having a large database is worthless; it is business users’ ability to create analyses and gain insight from data that gives an organization’s data collections their value.

The Growth of Grid Data

Technological advancement is unequivocally the force driving data growth. From mobile phones to automobiles, millions of networked sensors are becoming part of the physical world–and these sensors are capturing, creating, and communicating data.[5] For utilities, the implementation of sensors and the automation of energy management is part of a greater concept: the smart grid.

The smart grid is a modern vision of the electrical grid which uses digital information and communications technology to gather information and initiate automated responses to rapid changes in electric demand.[6] This two-way communication between generation sources and the electrical grid is expected to yield significant economic benefits and greatly improve grid efficiency.

The deployment of smart meters and smart appliances is a growing initiative. The U.S. Department of Energy, for example, has implemented funding programs to accelerate investments into grid modernization. As of 2012, private investment and government funding resulted in nearly $8 billion worth of investment into smart grid development projects.[7] Research estimates that the number of connected nodes–sometimes referred to as machine-to-machine devices–at work in the world will grow more than 30% annually within the next decade.[8]
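To give a sense of what such a rate implies, the sketch below computes the doubling time and the ten-year multiple for a quantity growing 30% annually. It assumes steady compounding at exactly 30%, which simplifies the "more than 30%" estimate cited above; the function names are illustrative.

```python
import math

def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual rate."""
    return math.log(2) / math.log(1 + annual_rate)

def growth_multiple(annual_rate, years):
    """Size after `years`, relative to the starting size."""
    return (1 + annual_rate) ** years

print(f"doubling time: {doubling_time(0.30):.2f} years")
print(f"ten-year multiple: {growth_multiple(0.30, 10):.1f}x")
```

At that pace the number of connected nodes would double roughly every 2.6 years and grow nearly fourteenfold over a decade.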

Of course, what we refer to as machine-to-machine communication is simply a transfer of data. Smart grids will expand the scope of data collected within the electric grid to include the usage patterns of customers, outages or other service disruptions, and the condition of equipment within the grid. Moreover, this information is only the tip of the iceberg of new data that smart grid implementation will reveal.

Drivers of Data Collection

Smart grids provide utility companies with much more accurate usage data, enabling improved load forecasting. Accurate forecasts provide the foundation for daily operations and distribution planning, and having advance knowledge of load fluctuations enables utility providers to more readily integrate renewable energy sources into the grid. In fact, this information will increase the efficiency of all energy production, be it from renewable sources, coal, natural gas, or nuclear.
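As an illustration of how interval data feeds a forecast, the following is a minimal sketch of a seasonal-average baseline: each hour of the next day is predicted as the mean of that same hour over recent days. This is a deliberately simple stand-in, not a method described in the article; production forecasts typically also account for weather, holidays, and other drivers. The sample values are made up.

```python
from statistics import mean

def forecast_next_day(hourly_load, days_of_history=7):
    """Forecast each hour of the next day as the mean of that hour
    over the most recent `days_of_history` complete days."""
    if not hourly_load or len(hourly_load) % 24 != 0:
        raise ValueError("expected one or more whole days of hourly readings")
    days = [hourly_load[i:i + 24] for i in range(0, len(hourly_load), 24)]
    recent = days[-days_of_history:]
    return [mean(day[hour] for day in recent) for hour in range(24)]

# Illustrative data: two days of hourly load in MW (made-up values).
day1 = [50 + h for h in range(24)]
day2 = [52 + h for h in range(24)]
forecast = forecast_next_day(day1 + day2)
print(forecast[:3])
```

More granular and more recent meter data improves even a baseline like this, because the averages are built from actual consumption rather than estimates.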

Similarly, the smart grid will provide utilities with more current information on usage, enabling faster demand response. Thus, if the grid experiences unexpected increases or decreases in demand, utility companies can quickly respond by increasing or decreasing generation.
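The response logic described above can be sketched in a few lines. This is a simplified, hypothetical rule, not an actual dispatch algorithm: it compares current demand with current generation and returns the adjustment needed, ignoring small mismatches inside an assumed deadband. Real systems also have to respect ramp rates, reserves, and market signals.

```python
def generation_adjustment(demand_mw, generation_mw, deadband_mw=5.0):
    """Return the MW change needed to match demand, or 0.0 if the
    mismatch is within the deadband (too small to act on)."""
    gap = demand_mw - generation_mw
    if abs(gap) <= deadband_mw:
        return 0.0
    return gap

# (demand, generation) pairs in MW; values are illustrative.
readings = [(1000.0, 1000.0), (1030.0, 1000.0), (960.0, 1000.0)]
for demand, generation in readings:
    change = generation_adjustment(demand, generation)
    print(f"demand={demand} generation={generation} adjust by {change:+.1f} MW")
```

The more current the usage data, the sooner a gap like this is detected and the smaller the correction that is needed.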

Electric vehicle integration is another key endeavor of the smart grid. The Federal Energy Regulatory Commission (FERC) has explained its hope “that smart grid interoperability standards would ultimately accommodate a wide array of advanced vehicle integration with the grid.”[9] If and when electric vehicles become more widespread, smart grids will assist in maintaining reliable operations of the system, and help providers to monitor and control when and how electric vehicles are charged.

The Implications of Growing Data

The growth of unstructured data in grid databases presents further challenges and opportunities. Unstructured data refers to information that does not easily fit into a pre-defined data model, and therefore does not work well with relational tables, which most advanced database systems use. Such data is often text- and numeral-heavy; it is also typically very ambiguous, as it can include such varied data as text from social media channels or voice recordings from customer service calls that have been converted to text. Useful information can be found in unstructured data, but extracting and managing it is fraught with complication due to this data’s irregularity.

If harnessed efficiently, this influx of data can give decision makers tremendous insight that will lead to gains for their companies. Current studies report that one in three business leaders frequently make decisions without the information they would need to make a truly wise, strategic move, whereas companies with mature business analytics and optimization experience 49% higher revenue growth, 20 times the profit growth, and 30% higher return on invested capital than their less-equipped peers.[10] These numbers indicate that while there is significant value to be gained from data, businesses must be prepared to implement advanced analytics.

The Role of Analytics

The need to be able to perform analysis of data increases at the same pace as the data influx itself. For this reason, forward-thinking organizations are investing in software and applications that will facilitate analysis. In the 2011 IBM Global CIO study, “83% of CIO’s surveyed said that applying analytics and business intelligence to their IT operations is the most important element of their strategic growth plans over the next three to five years.”[11] The use of data is becoming one of the key ways for a company to outperform its peers–and as such, data-driven strategies will continue to gain ground as a smart way to innovate and compete.

Utilities are no exception, and the need for robust analytical tools has not gone unnoticed by the industry. In North America, utility spending on data analytics is expected to grow 29% each year, totaling over $2 billion by 2016. Worldwide, this investment is even more staggering, with research indicating cumulative spending on smart grid data analytics will reach well over $34 billion by 2020.[12]

Even so, it is just that–an investment: “Utility analytics experts will tell you that the volume, velocity, and variety of data streaming in from smart meters, transformers, and substations are like gold mines waiting to be tapped.”[13]

The valuable data that will come in from smart grids will enable utilities to:

  • Streamline operational efficiency.
  • Pinpoint equipment failures or other outages.
  • Easily recognize areas of leakage or thefts.
  • Enhance pricing and improve customer relationships.
  • Forecast trends and demand.
  • Improve planning of the use of renewables.
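As one illustration of the leakage and theft item in the list above, the sketch below compares the energy supplied to each feeder with the sum of its customers' metered consumption and flags feeders where the unaccounted-for share is unusually high. The feeder names, readings, and the 8% threshold are hypothetical; an appropriate threshold depends on a network's normal technical losses.

```python
def flag_losses(feeder_supplied_kwh, meter_readings_kwh, max_loss_fraction=0.08):
    """Flag feeders whose unaccounted-for energy share exceeds
    `max_loss_fraction` (an assumed threshold, not an industry standard)."""
    flagged = {}
    for feeder, supplied in feeder_supplied_kwh.items():
        if supplied <= 0:
            continue  # nothing supplied, nothing to compare
        metered = sum(meter_readings_kwh.get(feeder, []))
        loss = (supplied - metered) / supplied
        if loss > max_loss_fraction:
            flagged[feeder] = round(loss, 3)
    return flagged

# Illustrative data: energy delivered per feeder and per-customer meter totals.
supplied = {"feeder_a": 1000.0, "feeder_b": 1000.0}
metered = {"feeder_a": [480.0, 470.0], "feeder_b": [400.0, 390.0]}
print(flag_losses(supplied, metered))
```

Here feeder_a loses 5% of its supplied energy and passes, while feeder_b is flagged because 21% of what was supplied never shows up on a customer meter.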

Regions that have implemented components of smart grids are already reporting these benefits. One of the largest electric delivery companies in Texas, Oncor, is one such utility. Since implementation, Oncor has been alerted to more than 20% of outages before customers called, and many households have noted a 10% decrease in energy usage.[14]

Roadblocks and Challenges

Well-harnessed data can certainly enhance an organization’s strategy and profits, but there are significant challenges to overcome before data can bring value. Companies will not only have to solve the difficulties surrounding data collection and storage, but will also have to find ways of analyzing the information before they can take action that will bring a return on the investment. Thus, there will be an initial expense in implementing smart grid technology and the tools and software required to optimize its use.

While the brunt of costs will be absorbed by distribution companies, followed by transmission and substations, customers will also see an increase in their bills. The percentage increase varies according to the type of customer and region, but general estimates for North America place monthly increases at between 8.4% and 11.8% for residential customers, between 9.1% and 12.8% for commercial customers, and between 0.01% and 1.6% for industrial customers.[15]
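To translate those percentages into money, the short sketch below applies the quoted ranges to a monthly bill. The $100 bill is a hypothetical example and the function name is illustrative; only the percentage ranges come from the text above.

```python
# Percentage ranges quoted in the text above, by customer class.
INCREASE_RANGES_PCT = {
    "residential": (8.4, 11.8),
    "commercial": (9.1, 12.8),
    "industrial": (0.01, 1.6),
}

def bill_increase_range(monthly_bill, customer_class):
    """Return the (low, high) added monthly cost for a given bill."""
    low_pct, high_pct = INCREASE_RANGES_PCT[customer_class]
    return (round(monthly_bill * low_pct / 100, 2),
            round(monthly_bill * high_pct / 100, 2))

low, high = bill_increase_range(100.00, "residential")
print(f"residential: ${low:.2f} to ${high:.2f} more per month")
```

For a residential customer paying $100 a month, the estimate works out to between $8.40 and $11.80 in added cost.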

As a result, some regions have experienced backlashes after implementing smart grid technology. In fact, a total of 57 cities and counties in California alone stand in opposition to smart meter installation.[16] In response to this public outcry, authorities in some regions have ordered utilities to provide customers with the option to not participate.

It will be critical for utilities to explicitly detail the benefits of smart grids to consumers. Pilot projects have shown that customers using smart meters can more readily monitor consumption and therefore more easily curtail unnecessary usage to reduce their bill. However, customers who achieved reductions did so as a result of their individual initiative; if a customer doesn’t care to take advantage of the information available to them to save, then smart meters will likely be of little to no benefit.

Another significant roadblock that utilities must navigate in harnessing big data is that, with the deployment of smart grid technology, the need to integrate data from multiple sources will only increase. The diversity of these vast databases further hinders the ability of business users to understand the scale of the database, and what it is composed of. Most organizations dealing with data will have certain issues surrounding volume, but these matters of scale will be decided by other factors; knowing the content of the database is the most significant issue. The value derived from a database has less to do with its size, and much more to do with selecting the right data to work with.

The reality is that the majority of businesses aren’t making good use of the data they currently have. As Mike Healey of InformationWeek noted in his report “6 Big Data Lies,” more data doesn’t fix bad analysis, and data management is not simply about the collection and storage of data; it’s about what data should be studied and analyzed.[17] Once the content of the data is defined, and users have a clear idea of how it might be used, the remaining challenge will be choosing the proper analytical tools to meet these needs.

Issues and Concerns

Aside from these challenges of use, there are other potentially problematic issues that can arise for organizations dealing with large volumes of data. Key among these concerns is security. Data, like any other corporate asset, has value. As such, there is a threat of data theft or leakage.

Data theft is particularly sensitive when it involves data about customers. Many people view the collection of data about their lives and habits as invasive, so when such data is lost or leaked, the event can become a public concern. The larger the scale of the organization and the more lives it affects, the more this is likely to be true. In the case of utilities, it might seem that customer data is of little interest or importance; however, this data reveals fairly significant details about the day-to-day habits, lifestyles, and comings and goings of each customer.

Moreover, data loss is extremely common and is becoming even more so. According to KPMG’s Data Loss Barometer, incidents of corporate data loss increased by 40% between 2011 and 2012.[18] These incidents can create reputational marks that businesses may never recover from, so it is crucial that businesses involved in big data initiatives address security through technology and policy.

The other critical concern is quality of data. Even if an organization has a database and sophisticated analytics, it still will not see any value from its investment unless the data is accurate. The key to obtaining ROI from data is “selecting the right data, validating it, and delivering it in a consistent and actionable format across the entire organization.”[19]
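What validation might look like in practice can be sketched briefly. The example below is a hypothetical set of checks on smart meter interval readings (well-formed fields, plausible range, no duplicates); it does not describe any particular product's validation logic. The field names and the 100 kWh ceiling are assumptions.

```python
from datetime import datetime

def validate_readings(readings, max_kwh=100.0):
    """Split interval readings into (valid, rejected) lists.
    Each reading is a dict with 'meter_id', 'timestamp' (ISO 8601), 'kwh'."""
    valid, rejected, seen = [], [], set()
    for r in readings:
        try:
            meter_id = r["meter_id"]
            datetime.fromisoformat(r["timestamp"])   # raises if not ISO 8601
            kwh = float(r["kwh"])
        except (KeyError, TypeError, ValueError):
            rejected.append((r, "malformed"))
            continue
        key = (meter_id, r["timestamp"])
        if not 0.0 <= kwh <= max_kwh:
            rejected.append((r, "out of range"))
        elif key in seen:
            rejected.append((r, "duplicate"))
        else:
            seen.add(key)
            valid.append(r)
    return valid, rejected

# Illustrative readings: one good, one duplicate, one negative, one malformed.
sample = [
    {"meter_id": "m1", "timestamp": "2013-01-01T00:00:00", "kwh": 1.2},
    {"meter_id": "m1", "timestamp": "2013-01-01T00:00:00", "kwh": 1.2},
    {"meter_id": "m2", "timestamp": "2013-01-01T00:00:00", "kwh": -5.0},
    {"meter_id": "m3", "timestamp": "not a date", "kwh": 0.8},
]
valid, rejected = validate_readings(sample)
print(len(valid), [reason for _, reason in rejected])
```

Keeping the rejected records together with a reason, rather than silently dropping them, makes it possible to audit how much of the incoming data is actually usable.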

Final Words

There is no doubt that big data is here to stay, so how can utilities ensure that they are managing their data efficiently? Companies must develop an integrated strategy for the entire enterprise that addresses data collection and validation, storage architecture and the integration of disparate data sets, security and compliance, and the analytical needs of the business.

Sophisticated analytics can have a substantial impact on decision making and risk management. Moreover, through in-depth analysis, business users can gain insights that would otherwise be buried in the mysterious mass of data.

ZEMA is an end-to-end enterprise data management system that features robust data collection, integration, and analytical capabilities. All data coming into the ZEMA database is automatically validated and is normalized using a consistent metadata structure. These features enable users to access the data they need quickly and easily, knowing that it is current and correct.

Through its administrative security feature, ZEMA enables secure access to data and management of users and groups. ZEMA creates audit trails and reports on entitlements and usage patterns. Businesses can easily manage internal and external data and set expirations on data access that will keep their databases secure and compliant. To learn more about ZEMA, visit

References and Bibliography

This article is republished by permission from ZE datawatch® All rights reserved.

