
STORAGE MAGAZINE
The UK's number one in IT Storage
July/August 2020 - Vol 20, Issue 4

YOUR FLEXIBLE FRIEND: The benefits of Cloud Data Warehousing
STRATEGY: Hardware-defined storage is dead
RESEARCH: Covid-19 increases pressures on IT
TECHNOLOGY: What happens when your SSD dies?

COMMENT - NEWS - NEWS ANALYSIS - CASE STUDIES - OPINION - PRODUCT REVIEWS

Plug-and-Protect Direct-to-Cloud Backup Appliance - Try-then-Buy Program - FREE for 45 Days*


CONTENTS - July/August 2020, Vol 20, Issue 4

COMMENT (page 4) - Something for everyone
HARDWARE-DEFINED STORAGE IS DEAD (page 6) - Enterprises should not be afraid to look past the limitations of block- and file-based storage and to the revolutionary potential of modern storage systems, argues Jerome M. Wendt of analyst firm DCIG
CASE STUDY: UNIVERSITY OF READING (page 8)
STRATEGY: CLOUD (page 10) - Gareth John of Q Associates examines the issues around migrating systems to the cloud, and the growing shift towards a hybrid multi-cloud model
RESEARCH: STORAGE TRENDS (page 12)
CASE STUDY: TORIX (page 13)
REVIEW: KINGSTON TECHNOLOGY DATA CENTER DC1000M (page 14)
MANAGEMENT: DATA PROTECTION (page 16) - Sarah Doherty of iland underlines the threats to organisational data and the need to future-proof infrastructure with resilient data protection strategies
INDUSTRY FOCUS: MEDIA (page 18) - Nick Pearce-Tomenius of Object Matrix looks at some of the potential compliance issues surrounding long-term storage of raw footage for TV and media production companies
BACKUP TO THE FUTURE (page 20) - Bill Andrews of ExaGrid examines the journey from simple tape backups to tiered disk backups that use adaptive deduplication for fast, reliable and affordable backup and restore solutions
CLOUD: YOUR FLEXIBLE FRIEND (page 24) - What is a Cloud Data Warehouse and why is it important? Rob Mellor of WhereScape shares some insights
CASE STUDY: CINESITE (page 26)
POWER PLAY (page 28) - Rainer Kaese of Toshiba shares some insights from a recent experimental project undertaken at the company into the energy consumption of disk drives
PEOPLE: THE WEAKEST LINK (page 32) - Florian Malecki of StorageCraft warns that organisations need to beware 'the vulnerability from within': human error
TECHNOLOGY: SSD (page 33) - Recovering data from failed solid-state drives can be more challenging than with hard disks, explains Philip Bridge, President of Ontrack
RESEARCH: STORAGE STRATEGIES (page 34) - Survey uncovers the limitations imposed by traditional IT infrastructures, exacerbated by remote working during the Covid-19 pandemic



COMMENT

EDITOR: David Tyler (david.tyler@btc.co.uk)
SUB EDITOR: Mark Lyward (mark.lyward@btc.co.uk)
REVIEWS: Dave Mitchell
PRODUCTION MANAGER: Abby Penn (abby.penn@btc.co.uk)
PUBLISHER: John Jageurs (john.jageurs@btc.co.uk)
LAYOUT/DESIGN: Ian Collis (ian.collis@btc.co.uk)
SALES/COMMERCIAL ENQUIRIES: Lyndsey Camplin (lyndsey.camplin@storagemagazine.co.uk), Stuart Leigh (stuart.leigh@btc.co.uk)
MANAGING DIRECTOR: John Jageurs (john.jageurs@btc.co.uk)
DISTRIBUTION/SUBSCRIPTIONS: Christina Willis (christina.willis@btc.co.uk)

PUBLISHED BY: Barrow & Thompkins Connexions Ltd. (BTC), 35 Station Square, Petts Wood, Kent BR5 1LZ, UK
Tel: +44 (0)1689 616 000
Fax: +44 (0)1689 82 66 22

SUBSCRIPTIONS: UK £35/year, £60/two years, £80/three years; Europe: £48/year, £85/two years, £127/three years; Rest of World: £62/year, £115/two years, £168/three years. Single copies can be bought for £8.50 (includes postage & packaging). Published 6 times a year.

No part of this magazine may be reproduced without prior consent, in writing, from the publisher. ©Copyright 2020 Barrow & Thompkins Connexions Ltd.

Articles published reflect the opinions of the authors and are not necessarily those of the publisher or of BTC employees. While every reasonable effort is made to ensure that the contents of articles, editorial and advertising are accurate, no responsibility can be accepted by the publisher or BTC for errors, misrepresentations or any resulting effects.

SOMETHING FOR EVERYONE

BY DAVID TYLER, EDITOR

Welcome to the August issue of Storage magazine, where the usual summer lull doesn't seem to have affected our contributors - in fact, despite the ongoing disruption from the Covid-19 pandemic, we've seen a fairly frantic few weeks in terms of people wanting to be included in our pages. And that's good news for readers, as it means a broad selection of articles covering topics from right across the storage spectrum.

Toshiba's Rainer Kaese reports on a fascinating exercise in measuring the energy usage of hard disks - a key consideration as enterprises and cloud providers alike try to manage the rising costs of their data centres. Can you imagine powering a petabyte of storage using less power than five old 100W light bulbs? See how it can be done on page 28.

Elsewhere DCIG's Jerome Wendt puts the cat amongst the pigeons with his contention that hardware-defined storage is well and truly past its use-by date: "Failing to declare the death of hardware-defined storage serves no good purpose. Enterprises need to wake up to the plethora of features that modern storage systems deliver that make so many of their current tasks obsolete." Wendt argues that most of the tasks that take up the working days of storage administrators could and should be managed automatically by more modern storage arrays.

In a focus on the broadcast media industry we hear from Object Matrix's Nick Pearce-Tomenius, who looks at how proper practices and appropriate storage solutions can help news and reality TV makers protect the integrity of their productions - and perhaps even solve the growing issue of 'Deepfake' videos. He comments: "Good digital content governance, a mix of process and technology, can ensure that content is protected, instantly accessible and proven to be authentic at any time in the future. It can also help organisations to beat Deepfake or disprove manipulated images."

This issue also includes a couple of complementary bylines around cloud-related topics, including a piece on cloud migration - and specifically the shift towards hybrid multi-cloud models - from Gareth John of Q Associates. As he says: "Nowadays organisations are typically deploying all-flash storage systems in on-prem data centres and cold data is not a good fit for this medium. Intelligently archiving cold data to a cloud object store can ensure that hot data enjoys the high performance of flash whilst exploiting a low-cost scalable cloud tier for inactive data."

I'm confident that, even more so than usual, this issue really does contain something for everyone.

David Tyler
david.tyler@btc.co.uk



ANALYSIS: HARDWARE-DEFINED STORAGE

HARDWARE-DEFINED STORAGE IS DEAD

ENTERPRISES SHOULD NOT BE AFRAID TO LOOK PAST THE LIMITATIONS OF BLOCK- AND FILE-BASED STORAGE AND TO THE REVOLUTIONARY POTENTIAL OF MODERN STORAGE SYSTEMS, ARGUES JEROME M. WENDT, PRESIDENT AND FOUNDER OF ANALYST FIRM DCIG

Enterprises, regardless of their size, largely agree they want any storage solutions they deploy to deliver flexibility. They may look for this flexibility in multiple ways, including availability, performance, reliability, replication, scalability, self-healing or self-tuning capabilities, and more. However, as they choose storage solutions that deliver the flexibility they need and want, another truth quickly becomes evident: hardware-defined storage is dead.

A WORKING DEFINITION

Simply speaking, hardware-defined storage arrays present a storage target to a physical or virtual machine. All hardware-defined storage arrays include some type of firmware that virtualises the underlying HDDs or SSDs. That firmware then, in turn, presents this virtualised storage as a volume or a folder to one or more physical or virtual machines.

In this respect, most storage arrays fall under this working definition of hardware-defined storage. Most storage arrays deliver one or both of these storage interfaces quite well. Further, almost any enterprise that acquires a storage array expects it to deliver block-based storage, file-based storage, or both.

Having reached this level of maturity, it is time to declare hardware-defined storage dead. Modern storage arrays and storage solutions offer so many more features. Block- and file-based storage should only serve as a starting point, not an end game. In only using block and/or file storage services on a storage array or solution, enterprises do themselves a disservice.

EVIDENCE OF DEATH

Failing to declare the death of hardware-defined storage serves no good purpose. Enterprises need to wake up to the plethora of features that modern storage systems deliver that make so many of their current tasks obsolete. Consider the following scenarios and see if you answer "Yes" to any of them:

- Are you still contacting support for break/fix issues? My question to you is, "Why has your storage vendor not called you to tell you that the hardware problem was already diagnosed and fixed?" Multiple modern storage systems include features that diagnose the underlying issue and may resolve it before you even know about it.

- Are you still manually troubleshooting performance issues? Again, I ask, "Why are you not allowing the storage system to help diagnose and resolve performance issues?" Granted, you can throw more flash storage at the problem (and many do). However, flash may only mask underlying issues. Using storage arrays that include artificial intelligence can equip enterprises to directly address the root causes behind these performance issues. In so doing, they can help prevent them from recurring.

- Can your applications communicate directly with the storage array and request and return storage as needed? This feature represents an entirely new generation of functionality where enterprises may bypass the need for tasks such as LUN masking, zoning, and setting security permissions. Where is the business value in any of these administrative tasks? (Dirty little secret: there is little or none!) Look for new storage systems that expose their APIs so applications can obtain and rescind storage according to their needs (a simple sketch of what such a call might look like follows this list).

- Are you still guessing at future capacity requirements and tying up capital by purchasing that capacity up front? Multiple storage vendors now deliver their solutions "as a service". The vendors offer flexible capacity that ties cost to actual usage, and they manage the underlying storage array for the enterprise. This frees IT staff to manage the data rather than the infrastructure.

- Are you creating a new silo of storage and storage management headaches when migrating workloads to the cloud? Look for storage vendors that offer their storage solutions as software-defined offerings in the cloud. This extends existing, familiar, data management and protection capabilities to workloads in the cloud.
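To make the API point concrete, here is a minimal sketch of the kind of call an application might make to obtain and hand back capacity. The endpoint, payload fields and token are hypothetical placeholders rather than any particular vendor's interface.

import requests

# Hypothetical storage-array REST API (placeholder host, paths and fields).
ARRAY = "https://array.example.local/api/v1"
HEADERS = {"Authorization": "Bearer <token>", "Content-Type": "application/json"}

def request_volume(name: str, size_gib: int, host: str) -> dict:
    """Ask the array for a new volume and have it mapped to a host."""
    payload = {"name": name, "size_gib": size_gib, "map_to_host": host}
    resp = requests.post(f"{ARRAY}/volumes", json=payload, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

def rescind_volume(volume_id: str) -> None:
    """Hand capacity back when the application no longer needs it."""
    requests.delete(f"{ARRAY}/volumes/{volume_id}", headers=HEADERS).raise_for_status()

The point is not the specific calls but that the application, not an administrator, drives the provisioning lifecycle.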

A WAKE-UP CALL

Do not think for one second that I think enterprises will stop using hardware-defined storage or that vendors will stop shipping it tomorrow. Neither will occur. If anything, I expect both block-based and file-based storage to outlive and outlast me. Hardware-defined storage works, and many applications and operating systems will need it for the foreseeable future.

That said, declaring the death of hardware-defined storage serves as a wake-up call to enterprises. DCIG has just completed and released its 2020-21 Enterprise All-flash Array Buyer's Guide. In evaluating these arrays, DCIG only refers to them as "storage arrays" in the very broadest sense of the term.

These arrays do so much more than provide block- and/or file-based storage targets. Many offer powerful software features that revolutionise how enterprises allocate and manage storage.

By putting a stake in the ground and declaring hardware-defined storage dead, DCIG is not trying to kill hardware-defined storage. Rather, DCIG wants enterprises to take a long, hard look at how the modern storage solutions found in this Guide can enable them to transform their business.

More info: www.dcig.com



CASE STUDY: UNIVERSITY OF READING

THE UNIVERSITY CHALLENGE

THE UNIVERSITY OF READING HAS BEEN ABLE TO BOOST ITS ACADEMIC RESEARCH CAPABILITIES SINCE DEPLOYING A SOFTWARE-DEFINED SCALE-OUT FILE STORAGE SOLUTION

Founded in the 19th century, the University of Reading has become one of the foremost research-led universities in the UK. It has over 50 research centres, many recognised as international centres of excellence, in areas including agriculture, biological and physical sciences and meteorology.

RESEARCH WORKLOADS

While similar in many respects, the IT requirements of university research teams are often far removed from those of commercial workloads. In addition to vastly higher compute and storage demands, for example, research workloads can be a lot harder to predict and liable to change significantly at very short notice, as Ryan Kennedy, Academic Computing Team Manager at the University of Reading, explains.

"IT has become a key research tool and it's not unusual for academics to request access to hundreds of VMs connected to terabytes of storage one day, only to dump them and start over the next," he said. "Delivering that kind of ad-hoc scalability using conventional servers and storage platforms is both complex and time consuming, especially for IT staff employed to support the research, not manage the infrastructure."

Against that background Kennedy and his team were finding it increasingly difficult to deliver the IT resources research users were demanding. Moreover, with virtualisation a key part of the solution, licensing costs were becoming an issue and, while big projects could afford to finance new infrastructure, it was hard to justify spending to meet the needs of those with limited funds. A simpler and more agile solution was clearly required, one which could be shared more equitably and automated to allow for greater hands-off management.

PUBLIC CLOUD OR ON-PREM?

Among several alternatives investigated, the public cloud was an obvious candidate but not necessarily a good fit, as Kennedy outlined: "While the public cloud could deliver the on-demand agility and self-service management we were after, the unpredictable workloads would make it more expensive and, potentially, harder and more time consuming for us to manage. There were also concerns about data protection and compliance, especially given the sensitive nature of the data involved and the need to protect intellectual copyright."

A brief and costly trial using Azure proved the validity of these concerns, at which point Kennedy persuaded the University to instead consolidate its existing infrastructure - then spread across multiple sites - into one on-premise data centre. Moreover, rather than simply upgrading the existing infrastructure, the decision was taken to switch to the Nutanix Enterprise Cloud OS software running on Dell EMC XC series in order to deliver the same on-demand and self-service benefits as the public cloud, but in a more affordable, secure and manageable manner.

The decision was also taken to switch virtualisation platform, from VMware to the AHV hypervisor included as part of the Nutanix Enterprise Cloud software stack. A bold move with the promise of huge cost savings, it has also paid off in terms of an easy migration and simpler, unified management. "Migrating old VMs to the Nutanix hypervisor was trouble free and we have yet to find a workload that AHV can't handle," commented Kennedy. "The AHV hypervisor is also fully integrated and managed from the same Prism console as the rest of the Enterprise Cloud software, making it easy to build the self-service portal we wanted and allow academics to provision their own resources."

Another key reason for choosing the Nutanix Enterprise Cloud Platform was the integrated Prism Self-Service Portal (SSP), which can be used by customers to build a custom web-based interface that empowers users to create and manage both VMs and storage directly - much as they would using a public cloud platform, but in a strictly controlled and supervised manner. To this end administrators create projects to which they assign compute and storage resources, including shared VM templates and software images, for end-user consumption. Fine-grained access controls can also be applied, with additional tools to gather usage statistics and raise alerts when specific thresholds are breached.
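As an illustration of the kind of automation such a portal enables, the snippet below sketches how a project with assigned quotas might be created over a REST API. The endpoint and payload are deliberately simplified assumptions for illustration, not the actual Prism API schema.

import requests

# Hypothetical, simplified self-service portal call (endpoint and payload
# shape are assumptions, not the real Nutanix Prism schema).
PORTAL = "https://prism.example.ac.uk:9440/api"
AUTH = ("svc_account", "********")

project = {
    "name": "meteorology-research",
    "quota": {"vcpus": 128, "memory_gib": 512, "storage_gib": 20480},
    "vm_templates": ["ubuntu-20.04-hpc"],
    "members": ["research-group-12"],
}

resp = requests.post(f"{PORTAL}/projects", json=project, auth=AUTH)
resp.raise_for_status()
print("Project created:", resp.json().get("uuid", "<unknown>"))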

Another important decision was to switch from legacy NAS storage to the integrated Nutanix Files - a software-defined scale-out file storage solution for unstructured data. This would enable Reading University to configure over a petabyte of usable storage using six load-balanced virtual file servers, all in the same rack and managed from the same single pane of management provided by Nutanix Prism. "As well as lower cost, speed and simplicity were seen as the main plus points of Nutanix Files," Kennedy explains. "With our legacy NAS software, for example, new shares had to be set up by the support team using specialist interfaces, but with Nutanix Files anyone can do it and it's easy to automate. It's also a lot quicker, with shares available online in seconds and none of the performance bottlenecks associated with separate server and storage platforms."

MIGRATION IN A WEEKEND

Following an initial proof of concept trial using just five nodes, the scalability of the Nutanix Enterprise Cloud was immediately put to the test when one of the university's legacy IT infrastructure suppliers went out of business. Faced with having no support for key storage appliances, an additional 10 nodes were quickly delivered, enabling Kennedy and his team to migrate fully to the Nutanix infrastructure over a weekend and configure 400TB of storage in just 10 minutes.

"It was a real eye-opener," he said. "With our legacy storage it would have taken weeks to put in new servers and storage, but once the Nutanix nodes were racked we just hit the expand button and, 10 minutes later, it was all done. Why couldn't we have done it this way before?"

As well as simpler scalability and enhanced storage performance, another benefit is much more efficient use of available storage - in the case of Reading University, a 16:1 reduction in physical storage overheads thanks to built-in deduplication, erasure coding and compression technologies.

That doesn't mean that extra nodes haven't been needed: according to Kennedy, uptake of the Reading Research Cloud has been 'massive' and is still growing. Despite that, there have been no availability issues, with the Reading team opting to take advantage of the inherent redundancy of the Nutanix architecture and use the integrated Cloud Connect capability to take snapshots to Microsoft Azure for backup and disaster recovery.

Ryan Kennedy is hugely appreciative and proud of what the Nutanix Enterprise Cloud has allowed the University IT team to achieve, pointing not just to the scalability and ease of use of the platform as key enablers but also to the professionalism and high level of support provided by Nutanix and its partners: "The Nutanix platform really has transformed the way we work," he commented. "Most of the time we don't even have to touch it - it just runs itself!"

More info: www.nutanix.com



STRATEGY: CLOUD

THE EVER-CHANGING IT LANDSCAPE

GARETH JOHN, SOLUTIONS ARCHITECT AT Q ASSOCIATES, EXAMINES THE ISSUES AROUND MIGRATING SYSTEMS TO THE CLOUD, AND THE GROWING SHIFT TOWARDS A HYBRID MULTI-CLOUD MODEL

The IT landscape is changing. It hasn't just evolved from what it was five years ago, or even one year ago; it is in a state of constant flux, mostly due to the cloud aspect of an IT strategy - the Cloud Strategy - and this can change from month to month as organisations adapt to the proliferation of new tools and services on offer.

There is a definite trend of workloads being moved from the on-prem data centre to some sort of cloud, whether that be IaaS, PaaS or SaaS, into a hyper-scaler or by consuming a service from a smaller provider. And there are many good reasons for this trend, especially in relation to the hyper-scalers: near-infinite and instant elasticity where you can scale up or scale back and only pay for what you use, off-loading of hardware maintenance, taking advantage of cloud-based data analytics, utilisation of the substantial and ever-growing compendium of services, and more.

Cloud adoption, however, should not be hurried. Testament to this are the many organisations that adopted an aggressive cloud-first strategy, discovered the resultant increasing costs, and are now trying to reverse out of the public cloud - incurring yet more expense. Just as there are many potential benefits of public cloud, there are also many valid concerns, including connectivity, security, data sovereignty, lock-in, and of course cost.

Organisations need to carefully assess their existing IT estate to ascertain which workloads are appropriate for cloud transition. There will almost certainly be workloads that are unsuitable for the transition, and the ones that are appropriate will suit different cloud models. In this light, most customers that I talk to are looking to adopt a hybrid multi-cloud model (see diagram).

The first step is usually to move previously on-prem applications to SaaS offerings. Microsoft 365 is a prominent example, where people can off-load everything (including hardware maintenance, OS and application versioning, resilience and interoperability) to a full-stack service that includes the application and its data. Note that while the data will reside on resilient infrastructure, it still needs to be backed up to protect against corruption, unintended change or deletion.

RUNNING HOT AND COLD

Cold data (data that is rarely used) is also considered low-hanging fruit for cloud utilisation. Nowadays organisations are typically deploying all-flash storage systems in on-prem data centres, and cold data is not a good fit for this medium. Intelligently archiving cold data to a cloud object store can ensure that hot data enjoys the high performance of flash whilst exploiting a low-cost, scalable cloud tier for inactive data. This cloud object tier is also a good location to store an off-site copy of backup data that can then be utilised as part of a cloud-based DR strategy.
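As a concrete illustration of this kind of policy-driven archiving, the sketch below sets an age-based lifecycle rule on an AWS S3 bucket using boto3. The bucket name, prefix and day thresholds are assumptions for illustration; other object stores, and many storage arrays with built-in cloud tiering, offer equivalent mechanisms.

import boto3

# Sketch: age-based tiering of cold data in an S3 bucket (bucket, prefix and
# day thresholds are illustrative assumptions).
s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="archive-example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-cold-data",
                "Filter": {"Prefix": "projects/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},    # warm tier
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},  # cold tier
                ],
            }
        ]
    },
)

Once the rule is in place, objects age through the tiers automatically and no further data movement scripting is required.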

Connectivity is also an important factor: as organisations move workloads off to various cloud services, connectivity needs to be considered to ensure that bandwidth and latency requirements are met once the workload has been moved. In this arena we're seeing a lot more interest in software-defined WAN (SD-WAN) initiatives aiming to simplify and orchestrate routing over an assortment of disparate WAN connections.

The way in which public cloud services are consumed is fast becoming the de facto standard: users can log on to a portal, select the services they require and have these services instantiated in minutes. This is the reason that organisations should consider transitioning their on-prem infrastructure into a private cloud, so that their resources can be consumed in a much more cloud-like fashion.

It's a lot more complicated than this, but it will involve deploying a framework that provides a service catalogue, automated fulfilment and a billing engine. It will also require mapping SLAs to resource pool utilisation, organisational changes, and procedural standardisation, amongst other things.

Whilst public cloud is great for burstable workloads (due to the inherent elasticity where you only pay for what you use), one mistake that we regularly see is the lift-and-shift of on-prem applications into public IaaS offerings. Having all the VMs that would normally reside on on-prem infrastructure running in the cloud 24/7 could see a significant cost increase.

REARCHITECT FOR SUCCESS

In order to realise the full value of public cloud, applications really need to be rearchitected to utilise things like database services (rather than running full database VMs) and serverless code services (where you only pay for the compute time that you consume). Automatically turning VMs off when they are not being used will also be financially advantageous.
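As a simple illustration of that last point, the sketch below stops running instances that carry an 'office-hours' schedule tag and could be triggered by a scheduler each evening. AWS and boto3 are used purely as an example here, and the tag name is an assumption; other clouds offer equivalent APIs and native schedulers.

import boto3

# Sketch: stop tagged, non-production instances outside working hours.
ec2 = boto3.client("ec2")

reservations = ec2.describe_instances(
    Filters=[
        {"Name": "tag:Schedule", "Values": ["office-hours"]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
)["Reservations"]

instance_ids = [i["InstanceId"] for r in reservations for i in r["Instances"]]
if instance_ids:
    ec2.stop_instances(InstanceIds=instance_ids)
    print(f"Stopped {len(instance_ids)} instances until the next working day")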

Q Associates has been helping customers with all of these schemes for some time, but until recently we have had to rely on partnerships to ensure that we utilise the best specific skills and knowledge in any particular area. With the recently-announced acquisition of Apex Group, we now have premium in-house skills in all of these fields and can provide our customers with a holistic delivery of infrastructure and services, from design and implementation through to support and management. The acquisition will also help us to evolve at speed, with widespread internal hybrid multi-cloud skills and knowledge, to ensure that we stay relevant to our customers in this rapidly shifting environment.

More info: www.qassociates.co.uk



RESEARCH: STORAGE TRENDS

SPECTRA PUBLISHES "DIGITAL DATA STORAGE OUTLOOK 2020"

FIFTH ANNUAL DATA STORAGE REPORT AIDS INDUSTRY IN NAVIGATING THE BUDGETARY AND INFRASTRUCTURE CHALLENGES OF CAPTURING, SHARING AND PRESERVING DATA

Disk manufacturers are closing in on delivery of HAMR and MAMR technologies that will allow them initially to provide disk drives of 20TB, while also enabling a technology roadmap that could achieve 50TB or greater over the next 10 years.

Spectra Logic has published the fifth edition of its "Digital Data Storage Outlook" report. The 2020 report delves into the management, access, use and preservation of the world's ever-expanding volumes of data, capturing the impact of the Covid-19 pandemic on trends and technology during this unprecedented time in history. The report outlines future strategies, technologies, applications, use cases and costs for more accurate evaluation and planning of data management and preservation strategies.

Spectra's Digital Data Storage Outlook 2020 predicts that, while there could be some restrictions in budgets and infrastructure, there is only a small likelihood of a constrained supply of storage to meet the needs of the digital universe through 2030.

Storage device providers will continue to innovate with higher speeds and capacities to meet increasing growth demand, with every data storage category, including flash, persistent memory, disk, tape and cloud, exhibiting technology improvements. This momentum will be dependent upon projected technology advancements, and any slowdown in one category, such as disk, will provide an opportunity for others, such as flash and tape.

Highlights from the 2020 report include:

- Economic concerns will push infrequently accessed data from tier one storage, made up of flash, to a second tier made up of spinning disk, object storage, cloud and tape. This method employs data movers to migrate data for ongoing cost savings.
- 2020 will see a 10% to 40% price increase for flash. After experiencing 18 months of oversupply of flash in the market, resulting in substantial price reductions, 2020 will see reductions in supply versus demand.
- The third generation of 3D XPoint technology will become the latest high-performance standard for database storage.
- The need for tape in the long-term archive market continues to grow. Tape will achieve storage capacities of 100TB or higher on a single cartridge in the next decade.
- Cloud providers will consume, in terms of both volume and revenue, an increasingly larger portion of the storage required to support the digital universe.

"The year 2020 is one like no other due to Covid-19, which makes accurate market forecasting especially challenging in these extraordinary times," said Spectra Logic CEO Nathan Thompson. "That said, as businesses become increasingly data-driven, it is even more crucial that IT professionals understand the factors impacting their organisations, so they can anticipate the trends, technologies and challenges they will face in order to protect their data and derive maximum value from it for the long-term."

The full report can be downloaded from https://spectralogic.com/data-storageoutlook-report/

More info: www.spectralogic.com



CASE STUDY: TORIX

MODERN, FLEXIBLE, HIGH-PERFORMING

NON-PROFIT INTERNET EXCHANGE TORIX LOOKED TO STORMAGIC FOR A HYPERCONVERGED SOLUTION THAT WOULD BE EASY FOR ITS IT TEAM TO MANAGE

In 1998, the Toronto Internet Exchange (TorIX), the first non-profit internet exchange in Toronto, was created to directly connect the internet traffic of Canadian businesses by using local network infrastructure. A group of experts collaborated to establish TorIX with the intention of overcoming the cost and latency issues of having Canadian traffic flow through the United States. Today, TorIX has over 250 organisations connected with access to direct routes from many diverse peering partners.

As a non-profit organisation, TorIX focuses on investing funds into infrastructure, so that its technology can stay up to date and the organisation can remain at the forefront of the Internet Exchange Point (IXP) industry. Previously, TorIX was using a VMware installation with no replication; however, it wanted to avoid the large, hardware-dependent installations associated with vSAN and to find a solution better fitted to its long-term needs.

TorIX began the process of evaluating market options for an infrastructure solution to power its IT operations that was high-performing, simple and easy to manage. More specifically, TorIX was searching for a solution that it could trust with managing all of its critical external services for customers, including its online portal systems, telemetry data, and web and mail applications. At the top of TorIX's priority list was a hyperconverged solution that was easy for its IT team to manage, which is why the company turned to SvSAN.

EASY TO MANAGE & UPGRADE

To power its non-profit internet exchange, TorIX needed a hyperconverged solution between its two data centres with high performance and availability. After evaluating multiple options, TorIX found that StorMagic SvSAN best suited its needs because it was simple for its IT team to manage and easy to upgrade, while still remaining cost effective and modern. Furthermore, SvSAN's stretch/metro cluster capability enabled TorIX to site its two SvSAN nodes 3 kilometres apart with no impact on performance, thanks to SvSAN's low bandwidth requirements.
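A quick back-of-the-envelope calculation shows why a 3 kilometre separation adds negligible latency for storage traffic, assuming light travels at roughly two-thirds of its vacuum speed in optical fibre:

# Propagation delay over a 3 km stretched cluster link (illustrative only).
distance_m = 3_000
speed_in_fibre = 2e8                       # metres per second, approximate
round_trip_ms = 2 * distance_m / speed_in_fibre * 1000
print(f"Added round-trip delay: ~{round_trip_ms:.3f} ms")   # ~0.03 ms

At around 0.03ms, the added round trip is orders of magnitude below typical storage response times, so the distance itself is not the limiting factor.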

MAXIMUM UPTIME

TorIX now has a two-node cluster consisting of Cisco servers with VMware vSphere as the hypervisor. With SvSAN, TorIX can easily manage its IT infrastructure with 100 percent redundancy and high availability. SvSAN powers all of TorIX's critical external services for customers, such as web and mail applications, online portal systems and telemetry data.

TorIX has reported maximum uptime in operations, delivering powerful direct internet routes to peering partners without interruptions. In addition, throughout the pre- and post-implementation process, TorIX found StorMagic's world-class, 24/7 customer support highly responsive and helpful with its technical expertise. TorIX has found SvSAN reliable and simple to manage for its day-to-day operations, and high data availability is critical to TorIX and its loyal customers.

"TorIX is driven to directly connect Canadian businesses' internet traffic through the local network infrastructure, while maintaining strong network performance and low latency," commented Jon Nistor, Board Director, TorIX. "To deliver this to customers, we prioritise investing in modern technology for our IT infrastructure, so that we can remain at the forefront of the industry. This is why we selected StorMagic SvSAN, so that TorIX can now power operations with a modern system that is easy to manage, flexible and high-performing.

"We have been 100% satisfied with StorMagic, which we trust to power all of our critical external services for our customers, and we have the peace of mind that our systems will never fail."

More info: www.stormagic.com



PRODUCT REVIEW

KINGSTON TECHNOLOGY DATA CENTER DC1000M

As data centre applications and workloads demand ever greater storage performance, enterprises are finding that NVMe SSDs are the only way to go. These high-performance devices are perfect for businesses running data-intensive workloads and those that need to replace legacy SATA or SAS SSD server storage and arrays, as they deliver very high throughput and low latency in a familiar form factor.

The Data Center DC1000M series of NVMe U.2 SSDs from Kingston offers a tempting proposition, delivering a finely balanced combination of performance and value. The drive is available in a choice of four capacities ranging from 960GB to 7.68TB; we reviewed Kingston's 1.92TB model, which has a very affordable sub-£400 price.

The DC1000M series clearly shows Kingston's intentions, as it has been moving firmly into the data centre storage space for some time. Combining these drives with its new DC1000B NVMe boot drive plus the DC450R and DC500 series of SATA SSDs allows it to offer one of the most comprehensive ranges of high-performance data centre storage solutions on the market.

The 1.92TB model looks fast on paper, with Kingston quoting sequential read and write speeds of 3,100MB/sec and 2,600MB/sec. Along with low sub-1ms latencies, throughput looks good too, with claimed rates for random read and write operations of 540,000 IOPS and 205,000 IOPS respectively.

These numbers make the DC1000M very versatile and ideal for mixed-use scenarios in the data centre. Typical applications Kingston is targeting range from HPC, OLTP and virtualisation to cloud services, web host caching and HD media capture.

The DC1000M employs the latest 3D TLC (triple level cell) NAND flash technology. This is far superior to older 2D NAND as it allows the cells to be stacked in layers, enabling much higher storage densities with a lower cost per bit and reduced power consumption.

Other key features that will appeal to enterprises are hot-plug support and SMART monitoring for tracking reliability, usage, remaining life, wear levelling and operational temperatures. The DC1000M also incorporates onboard power loss protection (PLP) through capacitors and firmware to avoid potential data loss caused by power failures.

For performance testing, we used the lab's Dell PowerEdge T640 tower server equipped with dual 22-core 2.1GHz Xeon Scalable Gold 6152 CPUs plus 384GB of DDR4 memory, running Windows Server 2019. Our server has an eight-bay PCIe NVMe Gen 3 U.2 cage and we had no problems fitting the DC1000M in the server's hot-plug carrier, where it was correctly recognised by the OS as a new NVMe bus storage device.

We used a range of benchmarking apps, starting with Iometer, which reported raw sequential read and write rates of 3,070MB/sec and 2,663MB/sec. The read rate is slightly below the claimed speed while the write rate is marginally better, and the CrystalDiskMark app agreed closely with these numbers.

For random read and write rates, Iometer returned 2,990MB/sec and 1,600MB/sec. Changing Iometer to small 4K block sizes, we ran our tests for a number of hours until they had achieved a steady state. Once throughput had settled, we recorded random read and write rates of 486,900 IOPS and 225,100 IOPS. As with our sequential tests, read throughput was slightly below the quoted number whereas write rates were a little higher. Overall, these performance results are great, and latency is also very low: during our I/O throughput tests, both Iometer and the AS SSD Benchmark app reported average latencies of less than 1ms.
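For readers who want to relate the small-block IOPS figures back to bandwidth, the quick conversion below (an illustrative calculation only, treating 1MB as 2^20 bytes) shows why 4K random throughput sits well below the large-block sequential rates:

# Converting the steady-state 4K random IOPS recorded above into bandwidth.
block_bytes = 4 * 1024          # 4 KiB per I/O
mb = 2 ** 20
for label, iops in (("random read", 486_900), ("random write", 225_100)):
    print(f"{label}: ~{iops * block_bytes / mb:,.0f} MB/sec")
# random read: ~1,902 MB/sec; random write: ~879 MB/sec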

Product: Data Center DC1000M
Supplier: Kingston Technology
Web site: www.kingston.com
Tel: +44 (0) 1932 738888
Price: 1.92TB - £377 exc VAT

VERDICT: The DC1000M is clearly capable of handling very demanding enterprise workloads and is more than a match for competing NVMe storage products costing substantially more, making it excellent value as well.



MANAGEMENT: DATA PROTECTION

THE 3-2-1 RULE OF DATA PROTECTION

SARAH DOHERTY, PRODUCT MARKETING MANAGER AT ILAND, UNDERLINES THE THREATS TO ORGANISATIONAL DATA AND THE NEED TO FUTURE-PROOF INFRASTRUCTURE WITH RESILIENT DATA PROTECTION STRATEGIES

In today's world, a major challenge for organisations is protecting their data. Whether an organisation is in a regulated industry mandated by law to retain x number of years of data, or one more acutely concerned with employees accidentally deleting files, the first pain point that customers usually have is focused on data protection.

There are several reasons for companies to resort to backing up their data via the cloud. Firstly, with ransomware attacks more frequent than ever before and hardware failure still an issue, organisations traditionally have local backup as their primary means of protecting data. However, local backup is still vulnerable for several reasons, such as SAN failure, double disk fault or power loss.

Secondly, backups are necessary and mandatory, but local backups might not save organisations in certain situations. What if the power in the building goes out? How will they restore their data? If the hardware is broken and it takes four weeks for the hardware to recover, that doesn't help an organisation to get back up and running and continue with 'business as usual'.

Thirdly, IT resilience is the ability to quickly bring organisations back online so they can continue to run their business no matter what the issue. Whatever the situation, organisations need to be able to quickly get IT infrastructure back in operation, no matter what is going on in their data centre.

IT resilience and Disaster Recovery as a Service (DRaaS) have always been a challenge for companies because, in the old days, organisations would have to have a secondary data site, or use old hardware, replicate all data, runbooks and plans, and then have to test it all regularly. It was just absurd, and only the largest enterprise organisations could afford to do it.

With the cloud's model of 'pay for what you use' and 'pay for what you need', companies of any size can replicate their data, infrastructure and entire application stack to the cloud more cost effectively than buying additional data centre space or running on-premise backup and DR.

THE 3-2-1 RULE

The 3-2-1 backup rule is an easy-to-remember shorthand for a common approach to keeping organisations' data safe in almost any failure scenario. The rule is: keep at least three (3) copies of the organisation's data, one being the production environment. Then store two (2) backup copies on different storage media, such as tape, a snapshot or a hard drive. Then keep one (1) of those copies offsite.
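As a minimal illustration of the rule, the sketch below checks a hypothetical backup inventory for at least three copies, at least two media types and at least one offsite copy; the inventory format is an assumption for illustration only.

# Sketch: validating a backup inventory against the 3-2-1 rule.
copies = [
    {"name": "production",   "media": "ssd",    "offsite": False},
    {"name": "local backup", "media": "disk",   "offsite": False},
    {"name": "cloud backup", "media": "object", "offsite": True},
]

def satisfies_3_2_1(copies):
    enough_copies = len(copies) >= 3                       # three copies
    two_media = len({c["media"] for c in copies}) >= 2     # two media types
    one_offsite = any(c["offsite"] for c in copies)        # one offsite
    return enough_copies and two_media and one_offsite

print("3-2-1 satisfied:", satisfies_3_2_1(copies))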

There are several reasons why the last stage is important. Think about ransomware: nowadays it has the ability to find locally attached backups and encrypt them. Or organisations could have a power failure where, if everything is in the same building, they are left with no backup at all.

Historically, a lot of companies would resort to taking copies of their tapes, putting them on a truck and sending them somewhere else. That introduces all sorts of challenges around humidity, transportation of the tape, where it is being stored, whether the same tape type will still be available, and whether the data will still be accessible in two years. Organisations still want to have that air-gapped copy of their data, but cloud introduces a whole new way of addressing that, as it is easily accessible by anyone, anywhere.

HOW TO FUTURE-PROOF INFRASTRUCTURE

Cloud is an elegant solution to address these data protection and business continuity issues, and one that is within the capabilities and budgets of every organisation. By using cloud to follow the 3-2-1 rule of data availability, organisations gain the confidence that they can suffer a failure and still be able to recover their data.

Data centre mobility and cloud enable those business-critical workloads to continue no matter what the scenario: a new norm, a global pandemic and so on. The cloud allows organisations to meet their business needs whilst protecting their data. It allows organisations to spin up VMs and virtual assets, and quickly connect to their infrastructure whether on-premises or in another cloud. It also lets companies continue to work remotely in the middle of a pandemic or other physically disruptive crisis, such as an extreme weather event, at a lower price point.

RETAINING PROTECTION STANDARDS

Organisations can migrate their data to the cloud for cost and continuity purposes, but once data is migrated it is still critical to focus on data protection. The data will be protected with the help of the CSP, but organisations can't stop doing backups or IT resilience testing. By supplementing the production environment with backup and DR in the cloud, the organisation can ensure that they have those multiple copies, and air-gapped backups, that can be failed over to almost instantaneously should an issue occur with the primary infrastructure.

As an increasing number of organisations want to get out of the business of managing their data and just focus on delivering business value with their IT assets, the cloud is providing the answer for both primary and backup infrastructure.

The 3-2-1 backup rule is a good start in building any data protection system - a way to protect an organisation's data from loss or corruption and to control risks in all the aforementioned situations. The cloud offers incredibly effective and resource-efficient ways of achieving this and improving business continuity and resilience, at a time when events are showing us it has never been more important.

More info: www.iland.com



INDUSTRY FOCUS: MEDIA

COULD RUSHES BE KEY TO DISPROVING 'DEEPFAKE' VIDEO?

NICK PEARCE-TOMENIUS OF OBJECT MATRIX LOOKS AT SOME OF THE POTENTIAL COMPLIANCE ISSUES SURROUNDING LONG-TERM STORAGE OF RAW FOOTAGE FOR TV AND MEDIA PRODUCTION COMPANIES

Arecent article in the Guardian raised<br />

the possibility of footage on the Jeremy<br />

Kyle show having been altered in order<br />

to tell the story that the producers wanted to<br />

be told, saying: "The family has concerns that<br />

the footage is polished and edited, and does<br />

not represent the totality of the footage that<br />

would have been recorded on all cameras<br />

on the day."<br />

The lack of retention of 'rushes' in a drama is unlikely to have a negative impact on society in future years but, as the Kyle story highlights, the retention of original footage needs to be taken more seriously where factual content is being edited or manipulated.

Another example where studio footage was key in a criminal prosecution is the "Who Wants To Be A Millionaire" cheating case, as Wikipedia recounts: "In court, Ingram claimed the videotape of his appearance on Millionaire was 'unrepresentative of what I heard', and he continues to assert that it was 'unfairly manipulated'. A video recording, with coughing amplified relative to other sounds including Ingram's and Tarrant's voices, was prepared by Celador's editors for the prosecution and 'for the benefit of the jury' during the trial."

Given its nature, live action content is difficult to manipulate even with a 'broadcast delay' - but not so if the delay runs to minutes, hours, days or months, as is typical for reality-based programming. This raises three questions for those producing factual content, and also presents a real challenge for those organisations in terms of retaining the potentially hundreds of hours of raw footage that go into producing an hour of finished content:

1. How are production companies and broadcasters protecting rushes or footage captured by studio cameras on the day?
2. Can they prove the authenticity of those rushes in the years to come?
3. Is it even possible to retain the original footage and find the clips you need when required?

Protecting rushes/dailies is not a new challenge for highly regulated industries: financial institutions, for example, are typically required to adhere to internal or external regulations, and have to implement platforms and processes that ensure content security, access control and availability of historical data.

Imagine the scenario where an analyst from a global bank gives an interview in which the advice imparted during broadcast differs from the advice given on camera at the time of shooting - advice that might bankrupt individuals, companies or even countries. This manipulation of the message or story can be achieved with subtle editing or, more recently, with the advances in Deepfake technology.

A flip side to Deepfake video, or manipulation in the edit, is that people - politicians in particular - could use the fact that the technology exists to vehemently deny ever having said or done something on camera, as highlighted in a recent article by Daniel Thomas (BBC News): "The first risk is that people are already using the fact Deepfakes exist to discredit genuine video evidence. Even though there's footage of you doing or saying something you can say it was a Deepfake and it's very hard to prove otherwise."

It would appear that being able to prove the authenticity of raw footage has never been more important.

HOW IS IT DONE TODAY?
Production companies who own the IP and rights for shows like "Jeremy Kyle" and "Millionaire" typically rent the studios and pay for the services of post-production companies to get the show made. Those studio and post companies will generally be responsible for protecting the rushes until the show has aired, and many will hold on to them for longer periods of time, until they no longer have the physical space or resources to manage them.

These cases highlight the need to find content from a show aired several years ago - a task that cannot always be done quickly, if at all. The Jeremy Kyle rushes were protected by the post-production company involved, but that is not always the case. Most organisations simply do not have the technology platforms nor the processes in place.

One of the main concerns will always be "What is the business model?" Keeping finished content in an archive requires resources and long-term investment, but there is value in exploiting that content. Doing the same for thousands of hours of raw footage has a less obvious return on investment.

The only way companies will feel compelled to archive rushes forever is via regulation, or as an insurance requirement to assist should any future litigation occur. If such regulations are introduced, companies will be expected to find and produce evidential content within reasonable time frames or face fines.

DIGITAL CONTENT GOVERNANCE CAN HELP
Good Digital Content Governance (DCG), a mix of process and technology, can ensure that content is protected, instantly accessible and proven to be authentic at any time in the future. It can also help organisations to beat Deepfake or disprove manipulated images.

Ensuring content is authentic: DCG platforms make multiple copies of content on ingest, using checksums (digital fingerprints) to ensure its integrity from day one and throughout the lifetime of the content (a minimal checksum sketch follows this list). DCG can place retention policies on the data such that not even administrators can accidentally delete it.

Protecting data: Digital Preservation processes ensure your content is protected at ingest and remains protected throughout its lifetime. However, this requires regular integrity checking, which can be a costly exercise with legacy technology such as LTO. DCG platforms handle all aspects of good digital preservation practice, from continuous content protection and multiple copy protection (on and off-site) to business rules support.

Access: Providing searchable audits of every action during the lifetime of the media is essential, as it means you can track exactly what has happened to that content and who has accessed it. DCG platforms offer native, searchable audits of every action - ingest, moves, deletions, attempted deletions and, most importantly, reads. It has to be said that audit is also possible with public cloud accounts, provided the user logins are granular to the individuals performing the actions.

Search: Find is key. With the increasing volume of data in and out of a facility, metadata management is as important as protecting the content itself. The ability to search for content based on up-to-date and relevant metadata will unlock the value of content for many organisations. Loosely coupled metadata and content will always make Find an inefficient or impossible process. DCG platforms protect the metadata along with the essence for the lifetime of the content. Using APIs enables future-proof, integrated and automated workflows that ensure content can be found even if the media asset management system is not available. DCG platforms can also automate the extraction and indexing of any embedded metadata, which can vastly increase search efficiency.

Business Continuity: Using incumbent platforms that rely on legacy archive and backup practices does not guarantee continuity of business operations. It is a fact that loss of data, or of access to data, can lead to catastrophic loss of revenue for any sized company. DCG platforms provide automated and integrated business continuity functionality, ensuring work can continue despite any outages. Implementing automated, asynchronous replication of metadata, data and user access information ensures that everything that is needed will be available at the DR location. Integration of DCG platforms into the end user ecosystem (i.e. users do not have to learn new skills) also makes this a non-disruptive process.
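As a concrete illustration of the 'digital fingerprint' idea above, the sketch below computes a SHA-256 checksum when a file is ingested and verifies it again later. It is a minimal example of fixity checking in general, not Object Matrix's implementation; the file names and manifest format are assumptions.

```python
# Minimal fixity-checking sketch: fingerprint content at ingest, verify later.
# Illustrative only - paths and the manifest format are assumptions.

import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so large rushes never need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def ingest(path: Path, manifest: Path) -> None:
    """Record the checksum alongside the content at ingest time."""
    entries = json.loads(manifest.read_text()) if manifest.exists() else {}
    entries[str(path)] = sha256_of(path)
    manifest.write_text(json.dumps(entries, indent=2))

def verify(path: Path, manifest: Path) -> bool:
    """Re-hash the file and compare with the fingerprint recorded at ingest."""
    entries = json.loads(manifest.read_text())
    return entries.get(str(path)) == sha256_of(path)

if __name__ == "__main__":
    manifest = Path("fixity_manifest.json")
    clip = Path("cam1_take42.mxf")                   # hypothetical clip name
    if not clip.exists():
        clip.write_bytes(b"demo rushes content")     # stand-in so the sketch runs
    ingest(clip, manifest)
    print("Still authentic:", verify(clip, manifest))
```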

As detailed above, implementing a good DCG platform that is integrated into media workflows will bring value to the organisation and ensure content can be found under any circumstances.

In summary, there are some technical, commercial and cultural issues to address in the creative video community if raw footage and archive content is to be protected in accordance with internal or external regulations. One of the biggest challenges will be the physical resources needed to archive thousands of hours of potentially 4K and 8K raw footage.

One potential option is to create a mezzanine or proxy version of those rushes, in a certified transformation workflow, that takes up much less space than the originals but retains enough quality for video processing to be applied at future dates. Metadata can be captured during the ingest and transformation process, or it can be generated later on using AI platforms.
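A proxy workflow of this kind might look something like the sketch below, which transcodes a clip to a smaller mezzanine file and records provenance (original checksum, command used, timestamp) next to it. The specific ffmpeg settings, paths and sidecar format are illustrative assumptions, not a certified workflow in themselves; it assumes ffmpeg is installed and a source clip exists.

```python
# Sketch of a proxy/mezzanine step that keeps provenance metadata.
# The ffmpeg settings, paths and sidecar format are illustrative assumptions.

import hashlib
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def make_proxy(original: Path, proxy: Path) -> None:
    proxy.parent.mkdir(parents=True, exist_ok=True)
    # Downscale to 1080p H.264: enough quality for review and later processing,
    # at a fraction of the size of camera-original rushes.
    cmd = [
        "ffmpeg", "-y", "-i", str(original),
        "-vf", "scale=-2:1080",
        "-c:v", "libx264", "-crf", "23",
        "-c:a", "aac",
        str(proxy),
    ]
    subprocess.run(cmd, check=True)

    # Sidecar metadata ties the proxy back to the exact original it came from.
    sidecar = {
        "original": str(original),
        "original_sha256": sha256_of(original),
        "proxy_sha256": sha256_of(proxy),
        "command": " ".join(cmd),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }
    Path(str(proxy) + ".provenance.json").write_text(json.dumps(sidecar, indent=2))

if __name__ == "__main__":
    make_proxy(Path("rushes/cam1_take42.mxf"), Path("proxies/cam1_take42.mp4"))
```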

Keeping those rushes on LTO or SAN/NAS platforms will not be sufficient in terms of good Digital Content Governance, nor for the ability to efficiently process the files in automated workflows. These rushes will need to be kept on an object storage or cloud storage platform whose automated technologies ensure that good DCG is followed and that rushes are instantly available and searchable.

More info: www.object-matrix.com



STRATEGY: BACKUP

BACKUP TO THE FUTURE

BILL ANDREWS, PRESIDENT & CEO OF EXAGRID, EXAMINES THE JOURNEY FROM SIMPLE TAPE BACKUPS TO TIERED DISK BACKUPS THAT USE ADAPTIVE DEDUPLICATION FOR FAST, RELIABLE AND AFFORDABLE BACKUP AND RESTORE SOLUTIONS

An organisation cannot function without its data. As a result, data is backed up at least five days per week at virtually every company around the world. Data backup guards against short-term operational and external events, and supports legal, financial and regulatory business requirements:

- Restore files that were deleted or overwritten, or recover versions from before a corruption event
- Recover from a ransomware attack on primary storage
- Keep retention/historical data for legal discovery and for financial and regulatory audits
- Replicate to a second location to guard against disasters at the primary data location, such as earthquake, electrical power grid failure, fire, flood or extreme weather conditions

Due to all of these requirements, backup retention points are kept so that organisations have a copy of the data at various points in time. Most keep a number of weekly, monthly and yearly backups. As an example, if an organisation keeps 12 weekly copies, 24 monthly copies and 5 yearly copies, that amounts to about 40 copies of the data. This means that the backup storage capacity required is roughly 40 times the primary storage amount.
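The arithmetic behind that multiplier is simple, as the short sketch below shows for the article's example retention policy (the policy values are the ones quoted above; the 10 TB primary figure is just an illustrative assumption).

```python
# Retention maths from the example above: how many copies a policy keeps,
# and what that means for backup capacity without deduplication.
# The 10 TB primary size is an illustrative assumption.

weekly, monthly, yearly = 12, 24, 5
copies = weekly + monthly + yearly            # ~40 retention points
primary_tb = 10                               # hypothetical primary storage

print(f"Retention copies kept: {copies}")
print(f"Backup capacity without deduplication: {copies * primary_tb} TB "
      f"for {primary_tb} TB of primary data")
# -> 41 copies, i.e. roughly 40x the primary storage amount
```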

Since backup policies require keeping retention copies, and the storage needed for backup is far greater than the primary storage, the industry has evolved over time to reduce the amount of storage required, and with it the cost of backup storage.

PHASE 1: TAPE
Backups were sent to tape for about 50 years because, if organisations were going to keep 30, 40, 50 or 60 copies of the data (retention points), the only cost-effective way to keep those copies was to use a medium that was very inexpensive per gigabyte. Tape solved the cost problem, as it was inexpensive - but it was also unreliable, being subject to dirt, humidity, heat, wear and so on. Tape also required a lot of management, including storing tapes in cartons and shipping a set of tapes offsite each week to another location or a third-party tape storage facility. Tape backups were great for cost but had many issues.

PHASE 2: LOW-COST DISK
Disk solved all the problems of tape: it was reliable, and it was secure since it sat in a data centre rack behind physical and network security. Organisations could encrypt the data and replicate it to a second data centre (no physical media to ship).

Disk was far too expensive per gigabyte until the year 2000, when enterprise-quality SATA drives were introduced. This dropped the price of backing up to disk dramatically, as SATA was reliable enough for backup storage. However, even at a lower cost, disk was still too expensive when you did the math of keeping dozens of copies.

All of the backup applications added the ability to write to volumes or NAS shares so that disk could be used. Disk was used as a staging area in front of tape, but not for tape elimination. Backup applications would write one or two backups to disk for fast and reliable backups and restores, but still write to tape for longer-term retention due to cost.

PHASE 3: DATA DEDUPLICATION APPLIANCES
Although SATA disk was lower in price than any other enterprise storage medium, it was still too expensive to keep all the retention on disk. In the 2002-2005 time frame a new technology, data deduplication, entered the market. Data deduplication compares one backup to another and only keeps the changes from backup to backup, typically about a 2% change per week. Backups were no longer kept as full copies - only the unique blocks were kept, greatly reducing the storage required.

Data deduplication did not have much impact if there were only two or three copies and, in fact, was not much different from just compressing the data. However, at 18 copies the amount of disk used was around 1/20th that of not using data deduplication: you could store in 1TB of deduplicated form what would normally take 20TB of disk to store without deduplication. The term '20:1 data reduction' was used (assuming 18 copies of retention). If the retention was longer, the data reduction ratio was even greater.
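The sketch below shows the idea behind that ratio: split each backup into fixed-size blocks, hash them, and store a block only the first time it is seen. It is a deliberately simplified model of block-level deduplication (fixed 8 KB blocks, synthetic data with a small weekly change rate), not ExaGrid's or any other vendor's actual algorithm.

```python
# Simplified block-level deduplication model: fixed 8 KB blocks, hash each
# block, store it only the first time it is seen. Synthetic data with ~2%
# weekly change - an illustration of the idea, not a vendor's algorithm.

import hashlib
import os
import random

BLOCK = 8 * 1024
random.seed(0)

def blocks(data: bytes):
    for i in range(0, len(data), BLOCK):
        yield data[i:i + BLOCK]

# Initial full backup: 4 MB of random data (512 blocks) keeps the demo quick.
backup = bytearray(os.urandom(4 * 1024 * 1024))

store = {}            # unique blocks, keyed by SHA-256
logical = 0           # total bytes "backed up" across all retention copies

for week in range(18):                          # 18 retention copies, as in the text
    for blk in blocks(bytes(backup)):
        logical += len(blk)
        store.setdefault(hashlib.sha256(blk).hexdigest(), blk)
    # Mutate roughly 2% of the blocks before next week's backup.
    n_blocks = len(backup) // BLOCK
    for idx in random.sample(range(n_blocks), max(1, n_blocks // 50)):
        backup[idx * BLOCK:(idx + 1) * BLOCK] = os.urandom(BLOCK)

stored = sum(len(b) for b in store.values())
print(f"Logical data: {logical / 1e6:.1f} MB, stored: {stored / 1e6:.1f} MB, "
      f"reduction ~{logical / stored:.0f}:1")
# Real appliances also compress blocks and dedupe within a backup,
# which pushes the ratio toward the 20:1 figure quoted above.
```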

At this point, organisations could eliminate tape, as the amount of disk required was greatly reduced, bringing the cost of backup storage close to that of tape. However, while these appliances added data deduplication in order to reduce storage, they did not factor in the trade-off of the compute impact. These "deduplication appliances" performed the deduplication inline - that is, between the backup application and the disk, as the data is written. Data deduplication compares billions of blocks and is therefore extremely compute-intensive. This compute-intensive inline deduplication process actually slows backups down, to roughly one third of the performance of writing directly to disk. And because everything on the disk is stored in deduplicated form, every restore requires the data to be put back together, a process called rehydration. Rehydration is slow and can take up to 20 times longer than restoring un-deduplicated data from disk.

These appliances use block-level deduplication, which creates a very large hash tracking table that needs to be kept in a single front-end controller. As a result, as data grows, only storage is added to the controller. If the data doubles, triples or quadruples, the amount of deduplication that has to occur also increases, but with a front-end controller the resources (CPU, memory, network ports) are fixed, so the same resources are used for four times the data as were used for one times the data.

As a result, the backup window grows and grows until you are forced to buy a bigger and more powerful front-end controller - a so-called forklift upgrade - which adds cost over time. The front-end controller approach relies on fixed resources and fails to keep up with data growth, so the controllers are continuously being made obsolete in order to add more resources.

Even though inline scale-up appliances (a front-end controller with disk shelves) lower the amount of storage, and hence the storage costs, they greatly slow down backups due to inline deduplication, slow down restores because they only keep deduplicated data (the rehydration process), and don't scale, forcing future forklift upgrades and product obsolescence and adding long-term costs. The net result is that they fix the storage cost problem but add backup and restore performance issues, and they are not architected for data growth (scalability).

PHASE 4: DATA DEDUPLICATION IN BACKUP APPLICATIONS
Customers used, and still use, data deduplication appliances; however, the backup applications went through a phase where they tried to eliminate the data deduplication appliance by integrating the deduplication process into the backup media servers. The idea here was simply to buy low-cost disk and have deduplication as a feature in the backup application. This created many challenges.

The first challenge is that data deduplication is compute-intensive, and the media server already has the task of taking all the backups and writing them to the media, so all compute resources are already being used. By adding deduplication to a media server, the CPU is crushed and the backup jobs slow to a crawl. To solve this, backup applications increase the deduplication block size to do less comparison and use less CPU. Instead of using block sizes of 8KB they use (for example) 128KB. Instead of achieving the 20:1 deduplication ratio of a deduplication appliance, they achieve a rate of about 5:1 or 6:1. They also slow down the media server, and all data is deduplicated on the disk, so restores are still slow.

Lastly, the same scaling issues remain. Some of the backup application companies packaged up the media server with deduplication, a server and disk to create a turnkey appliance, but the challenges still exist: slow backups, slow restores, scalability issues, and a higher cost, since they use a lot more disk than a deduplication appliance because the larger block size gives them a lower deduplication ratio.

WHERE DOES THIS LEAVE US?
There is no doubt that disk is the right medium. It is reliable and lives in a data centre rack with both physical and network security, both onsite and offsite. If data is backed up to disk without data deduplication, backup and restore performance is great, but the cost is high due to the sheer amount of disk required.

Using an inline deduplication appliance, you can reduce the high cost of storage thanks to the 20:1 deduplication ratio. However, all of these appliances are slow for backups due to inline deduplication processing, slow for restores because they only keep deduplicated data that needs to be rehydrated with each request, and they don't scale as data grows, which lengthens the backup window over time and forces costly forklift upgrades and product obsolescence.

If deduplication is used in a backup application, performance is even slower than with a deduplication appliance, as the CPU is shared between the deduplication process and the media server functionality. The backup applications can improve this with incremental-only backups, but there are other trade-offs. In addition, far more disk is required, as the deduplication ratio is more in the range of 5:1 to 10:1 rather than 20:1.

There is no free lunch here, and the different storage methods are just pushing the problem around. Why is that? Because unless you build a solution that includes deduplication and also solves the backup performance, restore performance, storage efficiency and scalability issues, then no matter where the deduplication lives, the solution will still be broken. The answer is a solution that is architected to use disk in the appropriate way for fast backups and restores, uses data deduplication for long-term retention, and scales out all resources as data grows.

PHASE 5: THE FUTURE - TIERED BACKUP STORAGE
Tiered backup storage offers the best of both worlds: disk without deduplication for fast backups and restores, and deduplication to lower the overall storage costs. The first tier is a disk cache (Landing Zone) where backups are written to standard disk in their native format, with no deduplication to slow them down.

This allows for fast backups and fast restores, as there is no deduplication process in between the backup and the disk, and the most recent backups are stored in an un-deduplicated format. As the backups are being written to disk, and in parallel with backups coming in, the data is deduplicated into a second tier for longer-term retention storage. This is called Adaptive Deduplication (it is not inline, and it is not post-process). The system is comprised of individual appliances that each have CPU, memory, networking and storage, and as data grows all resources are added, which keeps the backup window fixed in length and eliminates both forklift upgrades and product obsolescence. (A minimal two-tier sketch follows the list below.)

The net is:

- Backups are as fast as writing to disk, as there is no deduplication process in the way
- Restores are fast, as there is no data rehydration process: the most recent backups are in a non-deduplicated form
- Cost is low upfront, because all long-term retention data is deduplicated in the long-term repository tier
- The backup window stays fixed in length as data grows, as the architecture is scale-out, adding all resources and not just disk as data grows
- Long-term costs are low, as the scale-out architectural approach eliminates forklift upgrades and product obsolescence
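To make the two-tier flow concrete, here is a minimal sketch of a landing zone plus a deduplicated retention tier: recent backups are kept whole for fast restores, while each backup is also folded into a block store for long-term retention. It is a toy model of the tiering concept, not ExaGrid's implementation; the class and parameter names are assumptions.

```python
# Toy model of tiered backup storage: a landing zone holding recent backups
# in native form, plus a deduplicated block store for long-term retention.
# Illustrates the concept only - not ExaGrid's implementation.

import hashlib

BLOCK = 8 * 1024

class TieredBackupStore:
    def __init__(self, landing_zone_slots: int = 2):
        self.landing_zone = {}          # name -> full backup bytes (fast restore)
        self.retention = {}             # name -> ordered list of block hashes
        self.blocks = {}                # hash -> block bytes (deduplicated tier)
        self.slots = landing_zone_slots

    def ingest(self, name: str, data: bytes) -> None:
        """Write to the landing zone at disk speed; dedupe into tier 2 alongside."""
        self.landing_zone[name] = data
        self.retention[name] = []
        for i in range(0, len(data), BLOCK):
            blk = data[i:i + BLOCK]
            h = hashlib.sha256(blk).hexdigest()
            self.blocks.setdefault(h, blk)          # store unique blocks only
            self.retention[name].append(h)
        # Age the oldest backups out of the landing zone once it is full;
        # they remain restorable from the deduplicated tier.
        while len(self.landing_zone) > self.slots:
            oldest = next(iter(self.landing_zone))
            del self.landing_zone[oldest]

    def restore(self, name: str) -> bytes:
        if name in self.landing_zone:               # most recent: no rehydration
            return self.landing_zone[name]
        # Older backups: rehydrate from the block store.
        return b"".join(self.blocks[h] for h in self.retention[name])

if __name__ == "__main__":
    store = TieredBackupStore()
    store.ingest("week1", b"a" * BLOCK * 4)
    store.ingest("week2", b"a" * BLOCK * 3 + b"b" * BLOCK)
    store.ingest("week3", b"a" * BLOCK * 3 + b"c" * BLOCK)   # week1 ages out
    assert store.restore("week1") == b"a" * BLOCK * 4        # rehydrated copy
    print("unique blocks stored:", len(store.blocks))        # deduplication: 3
```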

In summary then, backup storage has taken a long journey and has arrived at tiered backup storage, which provides fast and reliable backups and restores, with a low cost both up front and over time.

More info: www.exagrid.com



TECHNOLOGY: CLOUD DATA WAREHOUSING

CLOUD: YOUR FLEXIBLE FRIEND

WHAT IS A CLOUD DATA WAREHOUSE AND WHY IS IT IMPORTANT? ROB MELLOR, VP AND GM EMEA, WHERESCAPE, SHARES SOME INSIGHTS

We are seeing business expectations for on-demand data explode, with many data warehousing teams beginning to transition their data warehousing efforts to the cloud. With the need to efficiently pull together data from a wide range of ever-evolving data sources and present it in a consumable way to a broadening audience of decision makers, cloud data warehousing is proving invaluable.

In this article, we are going to cover the basics and explore cloud data warehousing: how the cloud data warehouse compares to the traditional data warehouse, and the benefits of a cross-cloud solution.

WHAT IS A CLOUD DATA WAREHOUSE?
A cloud data warehouse is a database service hosted online by a public cloud company. It has the functionality of an on-premises database but is managed by a third party, can be accessed remotely, and its memory and compute power can be shrunk or grown instantly.

TRADITIONAL VS. CLOUD
A traditional data warehouse is an architecture for organising, storing and accessing ordered data, hosted in a data centre on premises owned by the organisation whose data is stored within it. It is of a finite size and power and is owned by that organisation.

A cloud data warehouse is a flexible volume of storage and compute power, which is part of a much bigger public cloud data centre and is accessed and managed online. Storage and compute power is merely rented. Its physical location is largely irrelevant, except for countries and/or industries whose regulations dictate that their data must be stored in the same country.

BENEFITS OF THE CLOUD APPROACH
The benefits of a cloud data warehouse can be summarised in five main points:

1. Access
Rather than having only physical access to databases in data centres, cloud data warehouses can be accessed remotely from anywhere. As well as being convenient for staff who live near the data centre, who can now troubleshoot from home or anywhere out of hours if needed, this access means companies can hire staff based anywhere, which opens up talent pools that were previously unavailable. Cloud data warehousing is self-service, so its provision does not depend on the availability of specialist staff.

2. Cost
Data centres are expensive to buy and maintain. The property that houses them needs to be properly cooled, insured and expertly staffed, and the databases themselves come at a huge cost. Cloud data warehousing allows the same service to be enjoyed, but you only pay for the computing and storage power you need, when you need it. Now, with elastic cloud services such as Snowflake, compute and storage can be bought separately, in different amounts. You really only have to pay for what you are using, and you can instantly close or downsize capabilities you do not need.


3. Performance
Cloud service providers compete to offer use of the most performant hardware for a fraction of the cost that would be incurred to reproduce such power on-premises. Upgrades are performed automatically, so you always have the latest capabilities and do not experience downtime in upgrading to the latest 'version'. Some on-premises databases offer faster performance, but not at the cost and availability of the 'Infrastructure-as-a-Service' that cloud providers offer.

4. Scalability
Opening a cloud data warehouse is as simple as opening an account with a provider such as Microsoft Azure, AWS Redshift, Google BigQuery or Snowflake. The account can be grown, shrunk or even closed instantly. Users are aware of the costs involved before they change the amount of compute or storage they rent. This scalability has led to the coining of the phrase 'elastic cloud'.

5. Agility
Hosting data in a cloud data warehouse means you can switch providers if and when it suits changes in business strategy. Staying database-agnostic means you have the agility to upsize, downsize or switch completely. Metadata-driven automation software allows you to lift and shift entire data infrastructures on and off of a cloud data warehouse if desired, and allows different teams within the same company to work with the database and hybrid cloud structure that best suits their needs.

CHOOSING A SOLUTION
A cost analysis is vital in estimating how much money a cloud data warehouse might save a business. Different cloud providers have different pricing structures that need bearing in mind. More established providers such as Amazon and Microsoft rent nodes and clusters, so your company uses a defined section of the server estate. This makes pricing predictable and constant, but sometimes maintenance to your particular node is needed.

Snowflake and Google offer a 'serverless' system, which means the cluster locations and numbers are not defined and so are irrelevant. Instead, the customer is charged for the exact amount of compute or processing power it consumes. However, in bigger companies it is often difficult to predict the number of users and the size of a process before it occurs. It is possible for queries to be much bigger than was assumed, and so to cost much more than was expected.
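The difference between the two pricing approaches can be sketched with a simple model like the one below. All of the rates and workload figures are hypothetical assumptions chosen for illustration - they are not any provider's actual prices - but they show why a fixed node rental is predictable while a usage-based model is cheap for bursty use and surprising when a query runs much larger than planned.

```python
# Hypothetical cost model contrasting fixed node rental with usage-based,
# "serverless" pricing. All rates and workload numbers are illustrative
# assumptions, not real provider prices.

HOURS_PER_MONTH = 730

def fixed_cluster_cost(nodes: int, rate_per_node_hour: float) -> float:
    """Reserved nodes: you pay for the cluster whether it is busy or idle."""
    return nodes * rate_per_node_hour * HOURS_PER_MONTH

def on_demand_cost(compute_hours_used: float, rate_per_compute_hour: float) -> float:
    """Usage-based: you pay only for the compute actually consumed."""
    return compute_hours_used * rate_per_compute_hour

if __name__ == "__main__":
    fixed = fixed_cluster_cost(nodes=4, rate_per_node_hour=1.00)

    # A bursty analytics team: the warehouse is genuinely busy ~150 hours/month.
    planned = on_demand_cost(150, rate_per_compute_hour=3.00)

    # The surprise case: an unexpectedly large query multiplies usage.
    runaway = on_demand_cost(150 + 400, rate_per_compute_hour=3.00)

    print(f"Fixed cluster:         ${fixed:8.0f}/month (predictable, even when idle)")
    print(f"On-demand, as planned: ${planned:8.0f}/month")
    print(f"On-demand, runaway:    ${runaway:8.0f}/month (hard to predict up front)")
```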

Each cloud provider has its own suite of supporting tools for functions such as data management, visualisation and predictive analytics, so these needs should be factored in when deciding on which provider to use.

Using cloud-based data warehouse platforms, organisations can gather even more data from a multitude of data sources and instantly and elastically scale to support virtually unlimited users and workloads. With automation helping to deliver a return on investment, businesses will be able to manage the influx of big data, automate manual processes and maximise the return on cloud.

More info: www.wherescape.com



CASE STUDY: CINESITE

RENDERING ASSISTANCE

DIGITAL ENTERTAINMENT STUDIO CINESITE IS ABLE TO BRING MOTION PICTURES TO AUDIENCES FASTER WITH CLOUD RENDERING

FRUSTRATIONS DRIVE NEW THINKING
During the recent production of a full-length animated feature film, Cinesite ran into technology issues with the existing cluster it had recently purchased. That vendor's system was causing network slowdowns for unknown reasons - for up to minutes at a time.

Cinesite is a leading digital entertainment studio with credits on animated feature films such as The Addams Family, Extinct and Riverdance, and VFX projects such as Avengers: Endgame, Rocketman, The Witcher and the James Bond franchise. The company employs nearly 1,000 digital artists and staff, who work from offices across London, Montreal, Berlin, Munich and Vancouver.

Cinesite's award-winning visual effects and animation teams help bring filmmakers' visions to life. To support complex and demanding workflows for visual effects, and for conceiving and realising CG-animated films, Qumulo and AWS enabled Cinesite to leverage high-performance storage at scale, helping the studio achieve more than it ever thought possible, including developing scalable 16K video workflows for future applications.

Cinesite's existing infrastructure included a newly installed but older-generation storage technology from another provider that supported approximately 500 render nodes in the Montreal data centre, and a workflow that leveraged AWS for occasional overflow rendering.

Eventually, the slowdowns became full interruptions - freezing the productivity of 465 artists for as long as an hour. The system freezes could happen at any time, and that put production schedules at risk.

JUMPING INTO ACTION
Frustrated with that vendor's system and its inability to solve the problem, Cinesite approached Qumulo for ideas. Qumulo quickly deployed hardware nodes onsite and was able to get Cinesite back up and running in short order.

After the immediate need was solved, Qumulo engineers worked with the Cinesite team to diagnose other issues they were facing with their legacy systems and fully restore network speed. In fact, on one occasion the Cinesite technical team was working on a solution well into the early morning - and reached out to Qumulo's customer success team at that unusual hour. Within sixty minutes, the Qumulo team responded with suggestions for configuration changes that would further increase network performance. Cinesite implemented those suggestions, got the performance it needed, and "from that day forward, we haven't looked back," said Graham Peddie, Chief Operating Officer, Cinesite Montreal.

PLANNED TO EXPAND
Cinesite knows first-hand the challenges of resource planning. "We can't plan for the peaks, so we plan for an average," said Peddie. That is the way planning had worked in the past. It was clear Cinesite would need a modern, cloud-native solution to move to a competitive scale.

With the visual effects (VFX) and feature animation division pipelines at full capacity, and no easy way to burst to the cloud at the scale Cinesite needed for the extraordinary render and storage requirements, the studio again turned to Qumulo for a way out.

To achieve the scale Cinesite was after, the workload had to be moved to the AWS US East (Virginia) region from the smaller region the studio had been using. With the existing solution, this would have been no easy feat. With Qumulo, it was seamless. "The only way we could expand to the new zone was by implementing Qumulo cloud storage," said Peddie. "This approach allowed us to spin up the machines and store data for offsite rendering on AWS US East (Virginia). Without Qumulo, we wouldn't have been able to do this or meet our deadlines."

Qumulo's hybrid file software runs the same enterprise file system in the cloud as on-prem, and data can be natively and seamlessly replicated between instances or across regions. Bursting to 20, 200, or even 2,000 high-quality render nodes on AWS, with Qumulo keeping pace with all that power, is no problem. Instances can be spun up in minutes, and torn down just as quickly.

Spencer Kuziw, Lead Systems Administrator, Cinesite Montreal, explained: "Qumulo is a huge benefit to us. We can spin up as many high quality render nodes as we need, in as many regions as we need, without impacting our local storage. And the Qumulo hybrid cloud software can handle whatever we throw at it. It is an essential part of our cloud deployment strategy."

QUMULO GETS IT
Customer support is another crucial benefit. Cinesite's media and entertainment clients operate within pressure-packed deadlines, and the studio has to be highly proactive to meet their needs. "Qumulo is different," Peddie said. "When it comes to our workflows and deadlines, Qumulo gets it. They know that we're under pressure. They know that solutions can't take weeks and months. We need issues solved quickly. So, for me, Qumulo's responsive and proactive customer support was an important benefit and set the company apart from all the other vendors we've seen."

FINGER ON THE PULSE
Analytics and real-time visibility are also crucial to Cinesite. Qumulo's real-time analytics tools enabled the studio to identify and fix pipeline inefficiencies. During a recent migration, real-time activity and usage analytics made it immediately clear that a script was making multiple copies of a directory, eating up space.

Qumulo analytics show activity in real time, including directory growth, most active network IPs, most active file paths, and so on, making it simple to pinpoint a problem and quickly clean it up. Typically, on other systems, common issues like that go unnoticed and storage capacity simply fills up, leaving admins with the task of running reports, waiting days for them to complete, then conducting forensics.

EYES ON THE FUTURE
Cinesite continues to consider additional cloud options to take advantage of the latest media and entertainment technologies. The team is exploring new and exciting projects like 16K-plus file sizes and unique applications outside of cinema. Peddie said, "We could never have tackled these technological and creative challenges without a cloud solution. Qumulo has enabled us to boost Cinesite's competitive position within the industry."

More info: www.qumulo.com



TECHNOLOGY: ENERGY CONSUMPTION

POWER PLAY

RAINER KAESE, SENIOR MANAGER, STORAGE PRODUCTS DIVISION, TOSHIBA ELECTRONICS EUROPE, SHARES SOME INSIGHTS FROM A RECENT EXPERIMENTAL PROJECT UNDERTAKEN AT THE COMPANY INTO THE ENERGY CONSUMPTION OF DISK DRIVES

Energy efficiency initiatives have driven down energy consumption significantly over the past decades. Today's homes probably consume as much energy for lighting as that required for two or three old 100 W light bulbs. But who would have thought that, using the latest generation of hard disk drives, a petabyte of storage requiring less energy than five of those old light bulbs could be achieved?

With the demand for always-on, online storage capacity for databases seemingly showing no signs of abating, it is vital to develop storage systems that can keep up with this growing flood of data while simultaneously fulfilling certain criteria. Cost per capacity ($/TB) is usually the most important of these, due to the immense quantities of data involved. However, energy consumption is another aspect to consider, as this impacts the long-term operational costs. This energy should also be consumed efficiently, thereby reducing the need for cooling, which also incurs costs.

The physical dimensions of the end solution also need to be considered. Increasing the number of disks requires a housing with increased volume. Ideally, the server housing should be easily accommodated by a standard 19" rack system, fitting into existing infrastructure of 1000 mm long racks. Performance is obviously another factor but, if the key goals are high capacity at low power consumption, it is possible to tolerate lower IOPS or throughput figures.


In an investigation by the research team at Toshiba Electronics Europe GmbH, a project was undertaken to see if it was possible to build 1 PB of data storage into a system consuming less than 500 W of power.

CHOICE OF STORAGE
The requirement for mass capacity is achieved most cost-effectively with HDDs, the top capacity models of which have similar $/TB ratios across the 12 TB, 14 TB and 16 TB models. However, in order to ensure that the final system would fit into a standard 19" rack, it clearly made sense to select the largest 16 TB capacity drives, keeping the physical volume required to an absolute minimum. This choice also aligns well with the power consumption goal, since the power dissipation per unit capacity has successively dropped with the introduction of new HDD models (see Table 1). This is due not only to the new technology implemented, but also to the move to helium-filled drives (see Figure 1).

The 16 TB models of Toshiba's MG08 series are available with both SAS and SATA interfaces. The SAS interface provides two 12 Gbit/s channels that are ideally suited to systems where high availability and throughput are a priority. However, there is a power consumption cost associated with this choice, since SAS drives consume around one to two watts more than their SATA counterparts. Since the goal was to reduce power consumption, the SATA interface model MG08ACA16TE was the chosen candidate for this project. The individual specifications for this particular drive, in terms of power dissipation, are shown in Table 2.

SELECTING AN ENCLOSURE
With the storage defined, the next step was to select a suitable enclosure. Top-loader models are convenient and available as a JBOD in a four-unit-high 19" rack format. A 60-bay model from AIC, the AIC-J4060-02, was selected for this project. The single-expander version was chosen, saving on cost and power dissipation and matching the specification of the one-channel SATA interface. Once filled with 16 TB HDDs, the solution has a gross storage capacity of 960 TB - almost one petabyte. The JBOD is then connected to the host bus adapter (HBA) or RAID controller of the server via one mini-SAS HD cable. With a length of just 810 mm, this JBOD fits into any existing rack.

BASELINE TESTING
An initial power consumption measurement was made without the HDDs, via the 220 V inputs to the twin redundant power supply. With no HDDs inserted, but with both the JBOD and the SAS link up, an initial measurement of 80 W was made. The next step was to measure power consumption with a single drive under different workload conditions.

Write workloads were chosen that simulated archiving, video recording and backup, using 64 kB sequential blocks. Using the same block size, sequential reads were also undertaken, equivalent to a backup recovery and media streaming workload. To provide a further data point, 4 kB random reads and writes were also performed, corresponding to the agile 'hot data' workload of databases. Obviously, these do not fully correlate with the typical workload for this type of system, but they allowed the collection of reference data for comparison purposes.

In addition to these borderline cases, a test with an approximation of a real workload was carried out: a mix of different block sizes was read and written randomly (4 kB: 20%, 64 kB: 50%, 256 kB: 20%, 2 MB: 10%). In order to achieve the maximum possible performance, all synthetic loads were executed with a queue depth (QD) of 16. In addition to these tests, a standard copy process was started on a logical drive under Windows and the power dissipation measured.
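For readers who want to reproduce a similar mixed workload, the sketch below draws block sizes according to the percentages quoted above. It is only an outline of the request mix - the actual I/O engine, queue-depth handling and device access used by Toshiba's team are not described in the article, so issuing the requests is deliberately left out.

```python
# Sketch: generate a random request mix matching the block-size distribution
# quoted above (4 kB 20%, 64 kB 50%, 256 kB 20%, 2 MB 10%). Issuing the I/O
# (queue depth 16, raw device access, etc.) is intentionally not modelled.

import random

random.seed(42)

BLOCK_SIZES = [4 * 1024, 64 * 1024, 256 * 1024, 2 * 1024 * 1024]
WEIGHTS = [0.20, 0.50, 0.20, 0.10]

def request_mix(n_requests: int):
    """Yield (operation, block_size) pairs for a random mixed workload."""
    for _ in range(n_requests):
        size = random.choices(BLOCK_SIZES, weights=WEIGHTS, k=1)[0]
        op = random.choice(["read", "write"])      # read and written randomly
        yield op, size

if __name__ == "__main__":
    sample = list(request_mix(100_000))
    total = sum(size for _, size in sample)
    print(f"Requests: {len(sample)}, data touched: {total / 1e9:.1f} GB")
    for bs in BLOCK_SIZES:
        share = sum(1 for _, s in sample if s == bs) / len(sample)
        print(f"  {bs // 1024:>5} kB blocks: {share:.1%}")
```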

The results for the individual-drive use case consistently show a lower power consumption than that given in the data sheet for the selected drive (see Table 3). Another point to note is that, in contrast to the data sheet, sequential loads result in higher power consumption than random access loads. This can be traced back to the power needs of the JBOD, since the SAS expanders require more power at high bandwidths in sequential operation.


TESTING VARIOUS CONFIGURATIONS
With all the slots of the JBOD filled, the maximum power consumption with the system idling lay at a respectable 420 W. This is slightly higher than expected (80 W + 60 x 4 W = 320 W) and can be traced back to the fact that the controller occasionally addresses the HDDs even in idle mode. On the other hand, the peak start-up power measured lay at just 720 W, significantly lower than the sum of the JBOD plus the spin-up data sheet values for the HDDs (80 W + 60 x 16.85 W = ~1100 W). This can be traced back to the staggered spin-up approach the system employs, applying power to the HDDs one after the other.
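The expected figures quoted in brackets above come straight from adding up the per-drive values, as the short sketch below reproduces; all of the numbers used are the ones given in the article.

```python
# Reproducing the expected-power arithmetic from the article: 80 W for the
# empty JBOD, 60 drive bays, ~4 W idle and 16.85 W spin-up per HDD.

JBOD_BASE_W = 80          # empty JBOD with SAS link up
BAYS = 60
HDD_IDLE_W = 4            # approximate idle power per SATA drive
HDD_SPINUP_W = 16.85      # data sheet spin-up power per drive

expected_idle = JBOD_BASE_W + BAYS * HDD_IDLE_W
expected_spinup = JBOD_BASE_W + BAYS * HDD_SPINUP_W

print(f"Expected idle power:    {expected_idle} W   (measured: 420 W)")
print(f"Expected spin-up power: {expected_spinup:.0f} W  (measured peak: 720 W, "
      "thanks to staggered spin-up)")
```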

The system was re-tested using the same workloads used for single-drive operation. The highest power consumption measured, 500 W, occurred during sequential reads of 64 kB blocks, while the lowest, 445 W, was recorded for both sequential 64 kB and random 4 kB writes (see Figure 2).

Two further configurations were also investigated. The first combined the 60 disks into a local RAID10 with 5 sub-arrays to create 480 TB of net storage. This was then formatted as two 240 TB logical drives under Windows Server 2016. Here, sequential accesses required less power, while random accesses essentially matched the power measured in JBOD mode.

Implementing a software-defined, zettabyte file system (ZFS) using JovianDSS from Open-E also resulted in improvements in power consumption for read tests, but slightly higher measurements when writing. In this configuration, two 800 GB enterprise SSDs were also added as a read cache and a write log buffer, with the resulting 240 TB logical drives made available over iSCSI.

CONCLUSIONS
Toshiba Electronics Europe GmbH estimates the total capacity of enterprise capacity (nearline) HDDs shipped in 2019 at around 500 exabytes (500,000 petabytes). If all of these HDDs were operated as 16 TB models in 60-bay JBODs, this would result in a continuous power consumption of 225 MW (equivalent to an average coal-fired power plant). However, since the majority of HDDs delivered in 2019 had even lower capacities, it can be assumed that the actual power consumption was even higher, and it is clear that there is significant room for improvement in the industry's W/TB power consumption figures.
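That 225 MW estimate can be sanity-checked with the figures from the testing above. The per-JBOD power used in the sketch below (around 430 W) is an assumption consistent with the measured 420-500 W range, rather than a number stated in the article.

```python
# Sanity check of the fleet-level estimate: 500 EB of nearline HDD capacity,
# packed as 16 TB drives in 60-bay JBODs. The ~430 W per loaded JBOD is an
# assumption within the measured 420-500 W range, not a figure from the article.

TOTAL_CAPACITY_TB = 500_000_000      # 500 exabytes
DRIVE_TB = 16
BAYS_PER_JBOD = 60
WATTS_PER_JBOD = 430                 # assumed average under load

drives = TOTAL_CAPACITY_TB / DRIVE_TB
jbods = drives / BAYS_PER_JBOD
total_mw = jbods * WATTS_PER_JBOD / 1e6

print(f"Drives needed:  {drives / 1e6:.1f} million")
print(f"60-bay JBODs:   {jbods:,.0f}")
print(f"Total power:    {total_mw:.0f} MW   (article's estimate: ~225 MW)")
```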

The investigations and testing undertaken by Toshiba show that, thanks to the power efficiency of the latest generation of high-capacity, helium-filled disks, petabyte storage that typically demands less than 500 W of power is indeed achievable. This is a significant milestone for data centres working to grow capacity while keeping both capital expenditure and operating costs down. Additionally, this can be achieved in a range of configurations - from pure JBOD, through RAID, to software-defined - and in a standard-dimension 19" rack format with a commonly available enclosure.

More info: www.toshiba-storage.com



OPINION: DATA PROTECTION

PEOPLE: THE WEAKEST LINK

FLORIAN MALECKI OF STORAGECRAFT WARNS THAT ORGANISATIONS NEED TO BEWARE 'THE VULNERABILITY FROM WITHIN': HUMAN ERROR

While cyber threats continue to be a massive drain on business productivity, there is another, less obvious vulnerability: unintentional employee error. Indeed, a majority of businesses say that simple human error is their leading cause of data loss, according to a survey from StorageCraft.

Among survey respondents, 61% reported that their company had suffered a data loss over the last two years. More striking is that 67% said human error - everyday mistakes made by employees - was the primary reason for data loss and system outages. Human error, for example weak passwords and 'dirty' work environments, can be the pathway to security hacks, and has the potential to wreak havoc far greater than that of a third party with malicious intent.

It can be as simple as an employee misplacing a spreadsheet or spilling coffee on their laptop. It could be someone who accidentally deletes a critical file or an entire database of critical information. Then there are the real-life oddities, such as dropping a laptop! These seemingly small incidents can add up and potentially cripple a business.

A few years ago, software company Gliffy experienced a nightmare scenario when one of its employees pressed the wrong key and deleted the company's entire production database. The same thing happened to GitLab a few years back, resulting in a major service outage.

Perhaps the most famous data-deletion story involved Pixar during the production of Toy Story 2. One of the movie's animators accidentally entered a delete command, resulting in a cascade of errors that erased 90% of the production files. To make matters worse, the data-backup system failed to work properly due to inadequate disk space. For a brief moment, there were fears that the entire production would have to be scrapped. It was only a Herculean effort by the technical crew that saved the film.

The data-loss problem could become even more prevalent in the current and post-COVID world, as millions of people work remotely. Moving employees, their computers and their data from a secure office environment to a less secure home environment presents a range of unintentional data-loss risks.

The reality is that employees will continue to make mistakes - they're only human, after all. Here are three ways that organisations can protect themselves against catastrophic data loss caused by human error:

Promote good data backup habits. With so many employees working remotely, it's harder for organisations to manage backups and store data on the corporate network. Encourage employees to be responsible and back up their data regularly. If they store data on a local flash drive inserted into their laptop, they should back it up to the cloud or another hard drive. If employees store their data primarily in the cloud, they should be sure to have another copy offline (a minimal backup sketch follows these three points).

Encourage stringent cyber hygiene. All employees, especially those working remotely, need to be reminded to update the software on their devices and enable all available security features, such as firewalls and anti-malware. Failing to install updated software and security patches is a well-known employee misstep that creates gaps for malware and ransomware to seize on.

Limit the number of files employees can access. Employees should only be able to access data and folders based on the principle of 'least privilege'. This gives employees enough access to perform their required jobs but prevents them from accidentally deleting or corrupting files they shouldn't have had access to in the first place, meaning the risk caused by human error is significantly reduced (see the permissions-audit sketch below).
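As an illustration of the 'second copy' habit, the minimal sketch below mirrors a working folder to a second drive as a timestamped copy. It is only an illustrative example: the paths are hypothetical placeholders, and in practice an organisation's own backup tooling should do this job rather than an ad-hoc script.

# Minimal sketch of the 'second copy' habit: mirror a working folder to a
# second drive (or a synced cloud folder) as a timestamped copy.
# The paths below are hypothetical placeholders.
import shutil
from datetime import datetime
from pathlib import Path

SOURCE = Path.home() / "Documents" / "work"   # hypothetical working folder
DEST_ROOT = Path("/mnt/backup_drive")         # hypothetical second drive

def backup(source: Path = SOURCE, dest_root: Path = DEST_ROOT) -> Path:
    """Copy the whole folder to a new, timestamped directory on the second drive."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = dest_root / f"{source.name}-{stamp}"
    shutil.copytree(source, target)           # full copy; retention and dedupe omitted
    return target

if __name__ == "__main__":
    print("Backed up to", backup())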
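Similarly, as a rough starting point for the least-privilege principle, the sketch below walks a shared path (again a hypothetical placeholder) and flags files that are writable by group or others - candidates for tightening access. It covers POSIX permissions only; Windows/NTFS ACLs would need a different approach.

# Minimal audit sketch: flag files under a shared path that are writable by
# group or others, i.e. candidates for tightening towards least privilege.
import os
import stat
import sys

def overly_permissive(root: str):
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            if mode & (stat.S_IWGRP | stat.S_IWOTH):
                yield path

if __name__ == "__main__":
    share = sys.argv[1] if len(sys.argv) > 1 else "/srv/shared"  # hypothetical path
    for p in overly_permissive(share):
        print("writable by group/others:", p)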

A business' weakest link may well be the 'danger within', albeit unintentional. With the right strategies and processes in place, businesses can limit data loss when employees inevitably make mistakes.

More info: www.storagecraft.com



TECHNOLOGY: SSD

WHAT HAPPENS WHEN YOUR SSD DIES?

RECOVERING DATA FROM FAILED SOLID-STATE DRIVES CAN BE MORE CHALLENGING THAN WITH HARD DISKS, EXPLAINS PHILIP BRIDGE, PRESIDENT OF ONTRACK

There is no doubt that the use of solid-state drives (SSDs) has gathered pace. The main benefit is that they are much faster than a legacy HDD. This is because a standard HDD consists of many moving parts, as typified by the telltale 'whirring' sound we have all become accustomed to. When data needs to be accessed, the read/write head needs to move to the correct position. SSDs, by contrast, don't have any moving parts. This speed of operation makes them perfect for environments where real-time access and transfer of data is a necessity.

One of the main downsides of SSDs, though, is that they have a limited lifespan. Whilst HDDs can - in theory - last forever, an SSD has a built-in 'time of death' that you can't ignore. This is because data can only be written to the storage cells a finite number of times. After that, the cells 'forget' new data. Because of this - and to prevent certain cells from being used all the time while others aren't - the controller uses wear-levelling algorithms to distribute data evenly across all cells.
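As a toy illustration of the idea - not any vendor's actual controller algorithm, and with a made-up cell count and endurance figure - each write below is steered to the least-worn cell so that wear accumulates evenly rather than exhausting one cell early.

# Toy wear-levelling model: each cell tolerates only a limited number of
# program/erase cycles, so every write goes to the least-worn cell.
ENDURANCE = 3000          # illustrative P/E-cycle limit per cell

class ToyFlash:
    def __init__(self, n_cells: int = 8):
        self.wear = [0] * n_cells          # P/E cycles consumed per cell

    def write(self) -> int:
        cell = min(range(len(self.wear)), key=self.wear.__getitem__)
        if self.wear[cell] >= ENDURANCE:
            raise RuntimeError("all cells worn out")  # the drive's 'time of death'
        self.wear[cell] += 1
        return cell

if __name__ == "__main__":
    flash = ToyFlash()
    for _ in range(20):
        flash.write()
    print("wear per cell after 20 writes:", flash.wear)  # spread evenly across cells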

When it comes to estimating this time of death, manufacturers use something called terabytes written (TBW). The TBW figure tells you, fairly accurately, how much data can be written in total to all cells inside the storage chips. A typical TBW figure for a 250 GB SSD lies between 60 and 150 terabytes written. To put this in perspective, to get over a TBW of 70, a user would have to write 190 GB daily for a year (in other words, fill two-thirds of the SSD with new data every day). While in a consumer environment this is highly unlikely, in a 21st-century business it is highly plausible.

One of the most popular SSDs - the Samsung SSD 850 PRO SATA - is stated to be "built to handle 150 terabytes written (TBW), which equates to a 40 GB daily read/write workload over a ten-year period." Samsung also promises that the product can withstand up to 600 terabytes written (TBW). If we consider that a normal office user writes somewhere between 10 and 35 GB a day, then even raising this amount to 40 GB means they could keep writing for around five years before reaching a 70 TBW limit.
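To make the arithmetic behind these figures easy to check, here is a minimal sketch using the numbers quoted above (decimal units, 1 TB = 1,000 GB; the function names are purely illustrative):

# Rough TBW arithmetic: how much must be written per day to exhaust a TBW
# rating within a given period, and how long a rating lasts at a steady
# daily write volume.

def gb_per_day_to_exhaust(tbw_tb: float, days: float) -> float:
    """GB that must be written each day to reach the TBW rating within `days`."""
    return tbw_tb * 1000 / days

def years_until_exhausted(tbw_tb: float, gb_per_day: float) -> float:
    """Years of service before the TBW rating is reached at `gb_per_day`."""
    return tbw_tb * 1000 / gb_per_day / 365

if __name__ == "__main__":
    print(f"70 TBW in one year needs {gb_per_day_to_exhaust(70, 365):.0f} GB/day")   # ~192 GB/day
    print(f"150 TBW at 40 GB/day lasts {years_until_exhausted(150, 40):.1f} years")  # ~10.3 years
    print(f"70 TBW at 40 GB/day lasts {years_until_exhausted(70, 40):.1f} years")    # ~4.8 years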

These rates have been verified by Google and the University of Toronto who, after testing SSDs over a multi-year period, put the age limit at somewhere between five and ten years depending on usage - around the same lifespan as the average washing machine.
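Because wear accumulates predictably, it is worth checking it before the drive gets to that point. As a rough sketch - assuming the smartmontools package (smartctl) is installed and run with sufficient privileges against a SATA SSD, and noting that the wear-related attribute names vary by vendor and that NVMe drives report differently - a small script can pull the relevant SMART attributes:

# Minimal sketch: read SMART data via smartctl (smartmontools) and report
# attributes commonly used to judge SSD wear on SATA drives. Attribute names
# vary by vendor; the ones listed below are typical examples, not a complete set.
import json
import subprocess
import sys

WEAR_ATTRS = ("Total_LBAs_Written", "Wear_Leveling_Count",
              "Media_Wearout_Indicator", "Percent_Lifetime_Remain")

def smart_report(device: str = "/dev/sda") -> None:
    out = subprocess.run(["smartctl", "-A", "--json", device],
                         capture_output=True, text=True, check=False)
    data = json.loads(out.stdout)
    for attr in data.get("ata_smart_attributes", {}).get("table", []):
        if attr["name"] in WEAR_ATTRS:
            print(f'{attr["name"]}: raw={attr["raw"]["value"]}, normalised={attr["value"]}')

if __name__ == "__main__":
    smart_report(sys.argv[1] if len(sys.argv) > 1 else "/dev/sda")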

WORST CASE SCENARIO

So, what do you do if the worst happens and your SSD does indeed stop working? It is no exaggeration to say that in this era where data is king, not having access to that data could prove to be catastrophic. To mitigate the impact, it is best to contact a professional data recovery service provider where possible.

When it comes to a physical fault, it is not possible for a user to recover or rescue their data themselves, however well-intentioned they may be. In fact, any attempt to recover data could make matters worse and lead to permanent data loss.

Even though the average SSD lifespan is longer than users may expect, using SSDs can still pose a serious threat, as recovering data from failed SSDs is distinctly challenging. Sometimes the only solution is to find an identical functioning controller chip and swap it in to gain access - which is easier said than done.

More info: www.ontrack.com/uk



RESEARCH: STORAGE STRATEGIES

PANDEMIC INCREASES PRESSURES ON I.T.

SURVEY UNCOVERS THE LIMITATIONS IMPOSED BY TRADITIONAL I.T. INFRASTRUCTURES, EXACERBATED BY REMOTE WORKING DURING THE COVID-19 PANDEMIC

Nebulon has released the results of an independent survey completed by IT decision makers at 500 companies in the IT, financial services, manufacturing, retail, distribution and transport industries across the UK, US, Germany and France. Conducted in June of this year, the survey exposes the biggest challenges enterprises face in transforming their on-premises application storage environments - challenges that have only been exacerbated during the Covid-19 era. While IT organisations cite multiple restrictions, the survey reveals limited infrastructure automation and high CAPEX as the most significant challenges for those deploying enterprise storage array technology, forcing them to re-examine IT spending and operations even more than usual amidst the pandemic.

While increasing automation and reducing costs may seem like mainstream initiatives for any large organisation, the pandemic and the resulting workforce restrictions demand significant progress in days or weeks, rather than months or quarters. The results of the survey further reinforce this: respondents also highlighted that their on-premises application storage environments are difficult to maintain, and revealed that they lacked the in-house expertise necessary to manage them. Even more disconcerting, respondents indicated that their traditional external storage arrays are not suited to handling new workloads, including containers and NoSQL databases. This is unsurprising, as modern workloads have been architected for local rather than shared storage resources.

British IT decision makers specifically ranked "expensive" highest, with 57% making this one of their top three challenges, followed by "time consuming to maintain" (50%) and "difficult to automate at scale" (49%). Respondents from smaller organisations (1,000-2,999 employees) were more likely than those from larger organisations (3,000+ employees) to rank "lack of in-house expertise" highly (59% compared to 31%), while the larger companies were more likely to consider cost a top challenge (61% compared to 35%).

"The impact of the pandemic is forcing<br />

CIOs worldwide to reconsider their<br />

operations," said Siamak Nazari, Co-Founder<br />

and CEO of Nebulon, Inc. "Reducing costs<br />

through server-based storage alternatives<br />

without the restrictions of hyperconverged<br />

infrastructure, and reducing operating cost<br />

pressure through cloud-based management<br />

of the application storage infrastructure are<br />

crucial initiatives for IT organisations looking<br />

to survive this new normal."<br />

For companies with a growing class of mission-critical data that cannot or should not move to the public cloud, Cloud-Defined Storage is an alternative to expensive storage arrays, offering enterprises a cloud-managed, server-based approach for mission-critical storage. By combining a cloud-based control plane, called Nebulon ON, with server-based storage powered by the Nebulon Services Processing Unit (SPU), Nebulon enables organisations to reduce the cost of enterprise storage by up to half without compromising on enterprise data services. This is made possible by Nebulon's architecture, which uses commodity SSDs in industry-standard servers and Ethernet in favour of Fibre Channel, and which eliminates operational complexity by moving management to Nebulon ON under an as-a-service model. With the architectural and operational simplicity of Cloud-Defined Storage, application owners gain self-service infrastructure provisioning that is unmatched by existing on-premises storage solutions.

"IT organisations have been seeking a costeffective<br />

alternative to external storage arrays<br />

for years," said Nazari. "With our Cloud-<br />

Defined Storage offering, they have the<br />

opportunity to reduce costs while also<br />

deploying a self-service solution for<br />

application owners that also reduces the<br />

operational burden."<br />

More info: www.nebulon.com

