Understanding the 6 Dimensions of Data

Water companies face many challenges on a day-to-day basis, including those relating to asset management. Of course, it is easy to point out that better asset management processes will help any water utility achieve goals such as reducing Non-Revenue Water (NRW), optimising asset usage, and reducing expenditures. But there’s a deeper story there. After all, you can’t achieve those goals without data.

This brings us straight to the heart of this article: the importance of Data Registration, Data Quantity, Data Quality, Data Integrity, Data Management and Data Integration, also known as the 6 dimensions of data.

Now, it is no secret that the adage “garbage in – garbage out” still rings true today. So, we need to define the playing field to accurately capture what data is needed. There are many definitions out there, obviously, but this one, in particular, stands out:

“In computing, data is information that has been translated into a form that is efficient for movement or processing. Relative to today’s computers and transmission media, data is information converted into binary digital form. It is acceptable for data to be used as a singular subject or a plural subject. Raw data is a term used to describe data in its most basic digital format.”

In essence, without data there is nothing to move or process, and visualisation, analysis and reporting lose their value. A product might render a beautiful picture, but without sound data behind it, that picture carries no real meaning. That means we have to take on the responsibility of ensuring we have a viable dataset to work with before software tools can be functional and useful.

  1. Data Registration

Data Registration is critical to the success of any software product, from Point-of-Sale solutions through to Spatial Asset Management solutions. Firstly, data registration requires that a proper data acquisition process be defined and applied to a data model based on industry standards. The database should be specifically designed to use industry-standard data models defined by regulatory bodies, associations and ongoing innovation. Those industry standards are available from national and international organisations and provide a framework for organising your data properly. From a Spatial Asset Management perspective, that means having all of the appropriate configurations per asset, from metadata to proper topology to industry-standard geometries and so on. In short, you must define the asset as it is.
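To make that concrete, here is a minimal sketch (in Python, with hypothetical field names and validation rules, not any particular industry standard) of how a water-network asset might be registered against a data model that carries its own metadata, geometry and validation:

```python
from dataclasses import dataclass, field

# A simplified, hypothetical pipe asset record; real industry data models
# defined by regulatory bodies carry far more attributes.
@dataclass
class PipeAsset:
    asset_id: str      # unique identifier in the asset register
    material: str      # e.g. "PVC", "cast iron"
    diameter_mm: int   # nominal diameter
    install_year: int
    geometry: list = field(default_factory=list)  # polyline vertices (x, y)

    def validate(self) -> list:
        """Return a list of registration problems (empty if the record is valid)."""
        problems = []
        if not self.asset_id:
            problems.append("missing asset_id")
        if self.diameter_mm <= 0:
            problems.append("diameter must be positive")
        if len(self.geometry) < 2:
            problems.append("a pipe needs at least two vertices")
        return problems

pipe = PipeAsset("P-001", "PVC", 110, 1998, [(0.0, 0.0), (120.5, 33.2)])
print(pipe.validate())  # → []
```

The point of the sketch is that the asset, its geometry and its validation rules live together in the data model, so every application that touches the record inherits the same definition of a valid asset.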

Now, as data technology evolves and databases become more robust, perspectives are changing. Relational database management systems (RDBMS), once the de facto industry standard, are less suited to their tasks in today’s world. In an object-oriented environment (as we will discuss here, at least), you define all the water and spatial asset management characteristics at the data model level, allowing for multiple geometric representations. Functionality built on that data model is agile and thus easy to enhance, and performance is never compromised. Other solutions require you to manage this at the application level. This means that as these older technologies evolve, a customer’s migration path is much more extensive – because those applications will need to be updated. The result is that your total cost of ownership is not contained.

Lastly, many utilities still have to carry out data acquisition. In many cases, existing data is incomplete and must be supplemented, with utilities looking at mobile solutions to assist in this process. Ensuring this newly acquired data is validated properly before being disseminated within the organisation requires a strong financial commitment and a strong data management process. There are many mobile tools available, but a significant number of them are not readily configured to support such a process.

  2. Data Quantity

Data Quantity cannot drive success in and of itself. Although achieving data quantity is quite easy, simply having a lot of data does not help – especially if that data is not qualified or organised. In fact, it clouds the decision-making process, increasing costs and reducing effectiveness and efficiency.

Once your data model is defined, the process of data migration – if any of that data is available – or data conversion must take place. In either case, you must establish a solid plan beforehand to implement the data acquisition process to an industry standard data model, which can be enhanced to meet your specific requirements.

In doing so, however, it is important to recognise that your utility may face data volume issues. The more data there is, the more you must consider how this can decrease performance, especially when you add more users and factor in availability – in terms of location and continuity (i.e. 24/7 access). However, in properly designed environments, that does not have to be the case. Good architecture will allow you to handle any quantity of data with ease.

When considering data quantity, you must also consider the impact and implications big data will have for your business. Big data, whether machine-generated or real-time, can feed predictive analytics and reveal human behavioural patterns. While the amount of data derived from these environments is obviously quite substantial, it is never organised in a way that provides insights on its own. It must always be processed before raw data becomes information.

Still, from the perspective of Spatial Asset Management, this data can be used to gain valuable insights into asset failure, performance metrics and asset optimisation. Big data can also be used in an object-oriented environment. It can be analysed to create a clearer picture of performance issues and user behaviour, allowing improvements to be achieved by optimising business processes.
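Turning raw data into information is, at its simplest, aggregation and interpretation. A trivial sketch, using hypothetical flow-sensor readings, of how unprocessed machine-generated values become an insight about a possible performance issue:

```python
from statistics import mean

# Raw, machine-generated flow readings (L/s) from a hypothetical sensor.
raw_readings = [14.2, 14.5, 14.1, 31.8, 14.3, 14.4]

# Processing: summarise the raw values and flag readings far above the mean,
# which might indicate a burst or a metering fault worth investigating.
avg = mean(raw_readings)
anomalies = [r for r in raw_readings if r > avg * 1.5]
print(f"average flow: {avg:.1f} L/s, anomalies: {anomalies}")
# → average flow: 17.2 L/s, anomalies: [31.8]
```

The threshold of 1.5 times the mean is an arbitrary illustration; real anomaly detection on asset telemetry would be considerably more sophisticated.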

  3. Data Quality

Achieving a high standard of Data Quality is expensive. The alternative – simply letting the database grow without restriction – seems cheaper, but to judge the true cost of Data Quality, one needs to weigh waste against value.

  • We can define waste as any activity that takes longer (or costs more) because of a low-quality dataset. Waste takes many forms: wasted storage media, space or time; faulty decision-making; or broken business processes.
  • Meanwhile, value is the importance assigned to a particular dataset – its place within the organisation. How valuable is a Spatial Asset Management system populated with data that will not validate or save? If the asset teams are working with antiquated data, how valuable is that information when planning improvements?

When we look at how data is handled, the ‘cheapest’ option is always the most wasteful. It consumes a large portion of the IT budget and results in the lowest value. Worse still, any bad data within a database will spread to all integrated systems, derailing attempts to integrate those systems to improve efficiency. That is why a data quality module that supports data validation is so important. While it does consume more time during registration, you will at least know the data is correct.

Within every business process, data should inform a decision, and a correct one at that. If it cannot immediately do that, you have a data quality problem.

  4. Data Integrity

Data Integrity is, essentially, the maintenance and assurance of data accuracy and consistency over the entire data life cycle. It is, therefore, a critical aspect of the design, implementation and usage of any system that is intended to provide a consistent means of storing, processing and retrieving data.

In an object-oriented database, data is managed in a consistent fashion because all of its characteristics are defined at the data model level – unlike many other solutions, which manage that behaviour at an application level.
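One way to picture “consistency defined at the data model level” is an object that refuses inconsistent state at assignment time, so no application can corrupt it. A hypothetical sketch:

```python
class ValveAsset:
    """Hypothetical asset whose model enforces its own consistency rules."""
    VALID_STATES = {"open", "closed", "throttled"}

    def __init__(self, asset_id: str, state: str = "open"):
        self.asset_id = asset_id
        self.state = state  # routed through the validating setter below

    @property
    def state(self) -> str:
        return self._state

    @state.setter
    def state(self, value: str):
        # Integrity check lives in the model, not in each application.
        if value not in self.VALID_STATES:
            raise ValueError(f"invalid valve state: {value!r}")
        self._state = value

v = ValveAsset("V-042")
v.state = "closed"       # accepted
try:
    v.state = "broken"   # rejected by the model itself
except ValueError as e:
    print(e)             # → invalid valve state: 'broken'
```

When the rule is enforced once in the model, every consumer of the data inherits it; when it is enforced at the application level, each application must re-implement it, and any one of them can get it wrong.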

  5. Data Management

Data Management is the development and implementation of architectures, policies, practices and processes that properly manage the full data life cycle needs of a given entity while also supporting its business processes. Data transactions – both short and long term – must be fully supported; without data validation protocols in place, one cannot support the business processes properly. All too often, becoming enamoured with what is being visualised hides deficiencies in how the data is managed.

Data Management should also facilitate the storage of time-series data. An object-oriented database easily manages time-series data because it is scalable. However, with large amounts of data in a relational format, for example, performance degradation is practically a given. Should an organisation wish to expose that data, data warehouses should be considered.

One of the main advantages of an object-oriented system is its ability to manage different versions of that data. This facilitates phased completion, providing hundreds of thousands of checkpoints, as well as scalability and multiple designs. It also allows simple rollback functions to view historical situations.
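A minimal sketch of version management with rollback, assuming a simple append-only history per record (a real system would checkpoint far more efficiently and track who changed what, and when):

```python
class VersionedRecord:
    """Keep every historical value so earlier situations can be viewed."""

    def __init__(self, initial):
        self._history = [initial]

    def update(self, new_value):
        self._history.append(new_value)

    def current(self):
        return self._history[-1]

    def as_of(self, version: int):
        """View the record as it was at a given checkpoint (0-based)."""
        return self._history[version]

    def rollback(self):
        """Discard the latest version, restoring the previous situation."""
        if len(self._history) > 1:
            self._history.pop()

pipe_status = VersionedRecord({"condition": "good"})
pipe_status.update({"condition": "corroded"})
print(pipe_status.current())   # → {'condition': 'corroded'}
pipe_status.rollback()
print(pipe_status.current())   # → {'condition': 'good'}
```

The same idea – never overwrite, always append – is what lets a versioned asset database answer "what did we believe about this pipe last year?" as easily as "what do we believe now?".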

  6. Data Integration

Data Integration combines data residing in different sources, thereby providing users with a unified view of them – the ultimate single source of truth. For the purpose of this article, however, we should take this one step further.

In the ideal scenario, which is attainable in a data warehouse environment, the integration goes beyond simply combining data. Instead, it leverages this unified view to improve business processes by integrating more qualitative data and making this data readily available at all appropriate levels of your organisation.

This prevents data silos, ensures coordination between departments and helps establish a shared vision for the organisation.
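At its simplest, integration means joining records from separate systems on a shared key to produce that unified view. A sketch with hypothetical source systems (a GIS asset register and a maintenance log) joined on an asset identifier:

```python
# Hypothetical records from two separate systems, keyed by asset_id.
gis_assets = {
    "P-001": {"material": "PVC", "length_m": 120.5},
    "P-002": {"material": "cast iron", "length_m": 88.0},
}
maintenance_log = {
    "P-001": {"last_inspected": "2021-04-12"},
}

# Unified view: every asset carries its GIS attributes plus any
# maintenance history, so no user needs to consult two systems.
unified = {
    asset_id: {**attrs, **maintenance_log.get(asset_id, {})}
    for asset_id, attrs in gis_assets.items()
}
print(unified["P-001"])
# → {'material': 'PVC', 'length_m': 120.5, 'last_inspected': '2021-04-12'}
```

In practice this join happens inside a data warehouse or integration layer rather than in application code, but the principle – one key, one merged record, one source of truth – is the same.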

The value of data mastery

To become future-proof, modern businesses must assess their existing datasets and determine how best to organise that data. During this process, the scalability of graphic and textual information, the number of users, integration possibilities with other systems and version management capabilities must all be carefully investigated to contain the total cost of ownership.

By implementing commercial-off-the-shelf solutions that take care of all these aspects from the outset, utility companies will have a far greater degree of control when it comes to managing their data, processes and total cost of ownership, all while reducing data latency and serving all of their users and constituents. In an increasingly data-driven world, that is the key to success.

Contact us to find out more.