
FEATURE: DATA PERSPECTIVES

petabytes of data globally within an enterprise and, for most, the challenge can seem daunting. Successful enterprises can navigate these requirements, but doing so can be costly.

Businesses in regulated industries must prove that they can effectively store certain types of sensitive data, but also must be able to prove that, when permitted, this data no longer exists anywhere. If data that has been deleted still shows up in any form or location, it can still be recalled, which can expose the business to litigation. It does not matter whether the enterprise is aware of the existence of the rogue data or not; it is still liable. The Data Life Cycle Management process is the sequence of the creation, use, retention, and eventual erasure of data. In some industries, the duration of this life cycle can span decades or more.
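To make that life cycle concrete, here is a minimal sketch of how the stages named above - creation, use, retention and erasure - might be tracked per data asset, together with a retention deadline. The class, field and stage names are hypothetical, invented purely for illustration; real retention rules are set by regulation and are far more nuanced.

```python
# Hypothetical sketch of the data life cycle stages described above:
# creation -> use -> retention -> erasure. All names are illustrative.
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Stage(Enum):
    CREATED = "created"
    IN_USE = "in_use"
    RETAINED = "retained"
    ERASED = "erased"


@dataclass
class DataAsset:
    name: str
    created_on: date
    retention_years: int          # can span decades in some industries
    stage: Stage = Stage.CREATED

    @property
    def erase_after(self) -> date:
        # Rough retention deadline; real rules vary by regulation.
        return self.created_on + timedelta(days=365 * self.retention_years)

    def may_erase(self, today: date) -> bool:
        # Erasure is only permitted once the retention period has lapsed.
        return self.stage is not Stage.ERASED and today >= self.erase_after


asset = DataAsset("customer-records", date(2003, 1, 1), retention_years=20)
print(asset.erase_after, asset.may_erase(date(2023, 2, 1)))
```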

The associated cost and burden of managing data, tracking its movements, replication and locations can place a massive strain on an organisation's ability to conduct its business. The issue is the "use" phase of the data's life cycle - how do businesses make data useful, accessible, and analysable across a vast web of multinational regulations, without losing track of it? The answer is perhaps simpler than expected - leave the data in place, where it is safe and controllable - leave the original as the original.

Even though analysing data-in-place sounds like an easy solution to this industry problem, it is not the first time this approach has been tried. The problem of network latency comes into play - it is not sufficient simply to access the original data from anywhere, wherever it persists. The race between network latency and data size has been a back-and-forth struggle throughout the history of computer networking. Even as the world gets digitally smaller, network latencies can make data seem too far away to be efficiently analysed by the high-performance analytical database engines already on the market.

WHAT IS THE SOLUTION TO NETWORK LATENCY?

There are three primary types of network latency: latency caused by distance, latency caused by congestion, and latency caused by the network design itself, whether intentionally or by accident. Combinations of these latency types in the same network make the issue much worse. Any of the three can make analytic access to data too slow to be useful, reducing the usable throughput required to gain insight from critical data to intolerable levels.
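To see why latency alone can choke throughput, consider a single TCP stream, whose rate is bounded by its window size divided by the round-trip time, regardless of how fast the link is. The sketch below works through that bound for a few round-trip times; the 1 Gbps link, 64 KiB window and RTT values are illustrative assumptions, not figures from the article.

```python
# Rough estimate of single-stream TCP throughput over a WAN.
# Throughput is capped at (window size / round-trip time), no matter
# how fast the underlying link is. All figures are illustrative.

def tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one TCP stream's throughput in megabits per second."""
    rtt_s = rtt_ms / 1000.0
    return (window_bytes * 8) / rtt_s / 1_000_000

WINDOW = 64 * 1024      # 64 KiB receive window (a common default)
LINK_MBPS = 1000        # 1 Gbps WAN link (assumed for illustration)

for rtt in (1, 10, 50, 100):    # LAN-like through intercontinental RTTs
    cap = tcp_throughput_mbps(WINDOW, rtt)
    usable = min(cap, LINK_MBPS)
    print(f"RTT {rtt:>3} ms -> at most {usable:7.1f} Mbps "
          f"({usable / LINK_MBPS:5.1%} of the link)")
```

Even on a fast link, a long round trip leaves most of the capacity idle unless windows are tuned, streams are parallelised or the traffic is otherwise optimised - which is the gap the combinations of technologies discussed next aim to close.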

The instinctive solution is to place the data near the processing engines, where it is needed. This means copying data to local storage locations to give it local-performance access. However, this creates a whole new set of issues. Keeping track of where all of these copies are located, and removing them from every location once their use is complete, can be difficult and costly. This includes tracking down local backup copies, off-site media copies, and disaster recovery replicas in those remote locations.

The simplest and most practical solution is to leave the original data in place. This is possible today with a combination of technologies already on the market. When businesses choose the right combination, they could optimise latencies in Wide Area Networks (WANs) and potentially increase throughput by over seven times compared with the same WAN on its own.

Businesses could use as much as 95 percent of the WAN connection to analyse data where it is stored, rather than copying and staging the data closer to their analytic engines. That is global analytics with data-in-place and at scale.
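As a back-of-envelope illustration of what those figures could mean, the snippet below compares a hypothetical 1 Gbps WAN at a low, assumed baseline utilisation with the 95 percent utilisation cited above; the baseline value is an assumption chosen only so that the arithmetic lines up with a roughly seven-fold improvement.

```python
# Back-of-envelope comparison of effective WAN throughput.
# The 1 Gbps link and ~13% baseline utilisation are assumptions chosen so
# that a seven-fold improvement lands near the 95% figure cited above.

LINK_MBPS = 1000          # hypothetical 1 Gbps WAN
baseline_util = 0.13      # assumed untuned utilisation
optimised_util = 0.95     # utilisation cited in the article

baseline_mbps = LINK_MBPS * baseline_util
optimised_mbps = LINK_MBPS * optimised_util

print(f"Baseline : {baseline_mbps:6.0f} Mbps")
print(f"Optimised: {optimised_mbps:6.0f} Mbps "
      f"({optimised_mbps / baseline_mbps:.1f}x improvement)")
```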

This frees up IT teams to solve bigger issues, rather than needing to keep track of where sensitive data is being copied. They can manage and control data where they need to. It also has the potential to minimise regulatory requirements, as some regulations allow the transient, in-flight use of data rather than its persistence in other countries.

The cost savings and reduced management overhead could also play a role in planning data access methods. Combining the right technology is perfect for on-premises, private, hybrid, public and multi-cloud environments where long network latency might otherwise prevent enterprises from fully leveraging access to their sensitive data. NC

