Data Cleaning

Data cleaning is the key to an efficient and accurate fibre rollout

The data cleaning process
01 Remove duplicate data
02 Complete missing data
03 Standardise formats
04 Remove irrelevant data

We search for and remove duplicate data. Data is often recorded multiple times or duplicated across different systems. This can lead to confusion or errors in planning.
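As a rough illustration of this step, the sketch below merges address records from two systems and keeps one record per address. The field names (`street`, `number`, `postcode`) and the matching rule (ignoring case and extra whitespace) are assumptions made for the example, not a description of the actual tooling.

```python
def dedupe_key(record):
    """Build a comparison key that ignores case and extra whitespace."""
    fields = ("street", "number", "postcode")  # assumed field names
    return tuple(" ".join(str(record[f]).split()).lower() for f in fields)


def remove_duplicates(records):
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = dedupe_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records exported from two different systems.
system_a = [{"street": "Main Street", "number": "12", "postcode": "1000"}]
system_b = [
    {"street": "main street ", "number": "12", "postcode": "1000"},
    {"street": "Station Road", "number": "3", "postcode": "2000"},
]

cleaned = remove_duplicates(system_a + system_b)
print(len(cleaned))
```

Exact matching on a normalised key only catches simple duplicates. Records that differ by spelling or abbreviation would need a fuzzier comparison.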

We fill in missing data. Gaps must be completed, because incomplete information can lead to delays or wrong assumptions during the rollout.
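One possible way to approach this step is to first report which required fields are empty and then fill them from a second source where one exists. In the sketch below, the required fields, the record layout and the reference lookup keyed by `address_id` are all hypothetical.

```python
REQUIRED_FIELDS = ("address_id", "postcode", "living_units")  # assumed schema


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


def fill_from_reference(record, reference):
    """Return a copy of the record with gaps filled from a reference source."""
    filled = dict(record)
    fallback = reference.get(record.get("address_id"), {})
    for field in missing_fields(record):
        if field in fallback:
            filled[field] = fallback[field]
    return filled


# Hypothetical input: one record lacks the number of living units.
record = {"address_id": "A-1", "postcode": "1000", "living_units": None}
reference = {"A-1": {"living_units": 4}}

completed = fill_from_reference(record, reference)
print(missing_fields(record), missing_fields(completed))
```

Fields that no reference source can fill stay empty and remain visible through `missing_fields`, so they can be followed up rather than guessed.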

We standardise data formats. Data from different sources can be formatted inconsistently (e.g. address formatting). Standardising the formats makes it easier to work with large datasets.
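For the address example, a minimal sketch of such a standardisation could look like the following. The abbreviation table and the chosen target format (single spaces, title case) are assumptions for illustration only.

```python
import re

# Assumed abbreviation table; a real project would maintain its own list.
ABBREVIATIONS = {"str.": "street", "st.": "street", "rd.": "road", "ave.": "avenue"}


def standardise_address(raw):
    """Normalise whitespace, case and common abbreviations in a street address."""
    text = re.sub(r"\s+", " ", raw.strip()).lower()
    words = [ABBREVIATIONS.get(word, word) for word in text.split(" ")]
    return " ".join(words).title()


print(standardise_address("  main   St. 12"))
print(standardise_address("MAIN STREET 12"))
```

Both calls yield the same string, so records from different sources can be compared directly once they have passed through the same function.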

We remove irrelevant and outdated data. Old or irrelevant data can interfere with the accuracy of the rollout. Outdated data should be cleaned up to ensure that only current and relevant data is used.
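What counts as outdated depends on the project. The sketch below assumes, purely for illustration, that each record carries a `last_updated` date and that anything older than one year is dropped.

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=365)  # assumed cut-off, chosen only for the example


def is_current(record, today):
    """A record counts as current if it was updated within MAX_AGE of today."""
    return today - record["last_updated"] <= MAX_AGE


def drop_outdated(records, today):
    """Keep only the records that are still considered current."""
    return [r for r in records if is_current(r, today)]


# Hypothetical survey records with a last-updated date.
records = [
    {"id": "A-1", "last_updated": date(2024, 5, 1)},
    {"id": "A-2", "last_updated": date(2019, 3, 15)},
]

current = drop_outdated(records, today=date(2024, 6, 1))
print([r["id"] for r in current])
```

Passing `today` explicitly keeps the filter reproducible, which helps when the same cleaning run has to be repeated later.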

Clean data is key to an efficient and accurate fibre rollout.

Data types

Types of data that we clean to ensure a successful rollout.

Geospatial data

Geospatial data includes information about the physical terrain, such as roads and buildings, and about existing infrastructure, such as power grids or other cables. This data is often collected through GIS (Geographic Information Systems), public databases, scanning trucks, helicopters, drones, or satellites. To ensure a smooth fibre deployment, it is important that this data is accurate and up to date. Data cleaning corrects erroneous data, such as incorrect or outdated maps or the incomplete localisation of underground pipes.

Client data

For network providers, accurate customer data is essential, especially to determine which areas to connect first and how to match infrastructure to demand. By cleaning up this data, providers can better determine which households and businesses should be connected, what subscription options they need, and how best to expand the network.

Technical network data

Technical network data includes information about the existing fibre infrastructure, such as where cables and connection points are located, what the capacity of the network is, and the technical specifications of the equipment. Incorrect technical data can not only lead to design and installation problems, such as insufficient network capacity or incorrectly connected cables, but also cause unnecessary costs later, when the network is being maintained. Cleaning this data ensures that engineers and construction partners have reliable and consistent data to work with.

Outcome and results

Innovative partnerships

We partner with innovative, leading data capture and processing companies that know the industry just as we do.

Quality determines success
Data cleaning plays an indispensable role in fibre rollouts. A rollout requires extensive data for planning, design and construction. Whether it is geospatial data, technical network information or customer data, the quality of this data largely determines the success of the rollout. This is why we at Heta attach great importance to data cleaning.

Fast and cost-efficient
Ensuring that all relevant data is accurate, consistent and up to date can prevent errors during the rollout, save costs, and make the rollout faster and smoother. This matters because much of the data is not always accurate or equally available: for example, the number of living units per street, the type of surface, and the pre-existing infrastructure of other utilities.