STORAGE MAGAZINE
The UK's number one in IT Storage
July/August 2020 - Vol 20, Issue 4

YOUR FLEXIBLE FRIEND: The benefits of Cloud Data Warehousing
STRATEGY: Hardware-defined storage is dead
RESEARCH: Covid-19 increases pressures on I.T.
TECHNOLOGY: What happens when your SSD dies?

COMMENT - NEWS - NEWS ANALYSIS - CASE STUDIES - OPINION - PRODUCT REVIEWS
CONTENTS
July/August 2020 - Vol 20, Issue 4
COMMENT ............................................................ 4
Something for everyone

HARDWARE-DEFINED STORAGE IS DEAD ................ 6
Enterprises should not be afraid to look past the limitations of block- and file-based storage and to the revolutionary potential of modern storage systems, argues Jerome M. Wendt of analyst firm DCIG

CASE STUDY: UNIVERSITY OF READING .................. 8

STRATEGY: CLOUD .............................................. 10
Gareth John of Q Associates examines the issues around migrating systems to the cloud, and the growing shift towards a hybrid multi-cloud model

RESEARCH: STORAGE TRENDS ............................ 12

CASE STUDY: TORIX ............................................ 13

REVIEW: KINGSTON TECHNOLOGY DATA CENTER DC1000M ..... 14

MANAGEMENT: DATA PROTECTION ...................... 16
Sarah Doherty of iland underlines the threats to organisational data and the need to future-proof infrastructure with resilient data protection strategies

INDUSTRY FOCUS: MEDIA ................................... 18
Nick Pearce-Tomenius of Object Matrix looks at some of the potential compliance issues surrounding long term storage of raw footage for TV and media production companies

BACKUP TO THE FUTURE ..................................... 20
Bill Andrews of ExaGrid examines the journey from simple tape backups to tiered disk backups that use adaptive deduplication for fast, reliable and affordable backup and restore solutions

CLOUD: YOUR FLEXIBLE FRIEND .......................... 24
What is a Cloud Data Warehouse and why is it important? Rob Mellor of WhereScape shares some insights

CASE STUDY: CINESITE ....................................... 26

POWER PLAY ...................................................... 28
Rainer Kaese of Toshiba shares some insights from a recent experimental project undertaken at the company into the energy consumption of disk drives

PEOPLE: THE WEAKEST LINK ............................... 32
Florian Malecki of StorageCraft warns that organisations need to beware 'the vulnerability from within': human error

TECHNOLOGY: SSD ............................................. 33
Recovering data from failed solid-state drives can be more challenging than with hard disks, explains Philip Bridge, President of Ontrack

RESEARCH: STORAGE STRATEGIES ...................... 34
Survey uncovers the limitations imposed by traditional IT infrastructures, exacerbated by remote working during the Covid-19 pandemic
COMMENT

EDITOR: David Tyler
david.tyler@btc.co.uk
SUB EDITOR: Mark Lyward
mark.lyward@btc.co.uk
REVIEWS: Dave Mitchell
PRODUCTION MANAGER: Abby Penn
abby.penn@btc.co.uk
PUBLISHER: John Jageurs
john.jageurs@btc.co.uk
LAYOUT/DESIGN: Ian Collis
ian.collis@btc.co.uk
SALES/COMMERCIAL ENQUIRIES:
Lyndsey Camplin
lyndsey.camplin@storagemagazine.co.uk
Stuart Leigh
stuart.leigh@btc.co.uk
MANAGING DIRECTOR: John Jageurs
john.jageurs@btc.co.uk
DISTRIBUTION/SUBSCRIPTIONS:
Christina Willis
christina.willis@btc.co.uk
PUBLISHED BY: Barrow & Thompkins Connexions Ltd. (BTC)
35 Station Square, Petts Wood, Kent BR5 1LZ, UK
Tel: +44 (0)1689 616 000
Fax: +44 (0)1689 82 66 22

SUBSCRIPTIONS: UK £35/year, £60/two years, £80/three years; Europe: £48/year, £85/two years, £127/three years; Rest of World: £62/year, £115/two years, £168/three years. Single copies can be bought for £8.50 (includes postage & packaging). Published 6 times a year.

No part of this magazine may be reproduced without prior consent, in writing, from the publisher. ©Copyright 2020 Barrow & Thompkins Connexions Ltd.

Articles published reflect the opinions of the authors and are not necessarily those of the publisher or of BTC employees. While every reasonable effort is made to ensure that the contents of articles, editorial and advertising are accurate, no responsibility can be accepted by the publisher or BTC for errors, misrepresentations or any resulting effects.
SOMETHING FOR EVERYONE
BY DAVID TYLER, EDITOR

Welcome to the August issue of Storage magazine, where the usual summer lull doesn't seem to have affected our contributors - in fact, despite the ongoing disruption from the Covid-19 pandemic, we've seen a fairly frantic few weeks in terms of people wanting to be included in our pages. And that's good news for readers, as it means a broad selection of articles covering topics from right across the storage spectrum.

Toshiba's Rainer Kaese reports on a fascinating exercise in measuring the energy usage of hard disks - a key consideration as enterprises and cloud providers alike try to manage the rising costs of their data centres. Can you imagine powering a petabyte of storage using less power than five old 100W light bulbs? See how it can be done on page 28.

Elsewhere DCIG's Jerome Wendt puts the cat amongst the pigeons with his contention that hardware-defined storage is well and truly past its use-by date: "Failing to declare the death of hardware-defined storage serves no good purpose. Enterprises need to wake up to the plethora of features that modern storage systems deliver that make so many of their current tasks obsolete." Wendt argues that most of the tasks that take up the working days of storage administrators could, and should, be managed automatically by more modern storage arrays.

In a focus on the broadcast media industry we hear from Object Matrix's Nick Pearce-Tomenius, who looks at how proper practices and appropriate storage solutions can help news and reality TV makers protect the integrity of their productions - and perhaps even solve the growing issue of 'Deepfake' videos. He comments: "Good digital content governance, a mix of process and technology, can ensure that content is protected, instantly accessible and proven to be authentic at any time in the future. It can also help organisations to beat Deepfake or disprove manipulated images."

This issue also includes a couple of complementary bylines around cloud-related topics, including a piece on cloud migration - and specifically the shift towards hybrid multi-cloud models - from Gareth John of Q Associates. As he says: "Nowadays organisations are typically deploying all-flash storage systems in on-prem data centres and cold data is not a good fit for this medium. Intelligently archiving cold data to a cloud object store can ensure that hot data enjoys the high performance of flash whilst exploiting a low-cost scalable cloud tier for inactive data."

I'm confident that, even more so than usual, this issue really does contain something for everyone.

David Tyler
david.tyler@btc.co.uk
ANALYSIS: HARDWARE-DEFINED STORAGE

HARDWARE-DEFINED STORAGE IS DEAD

ENTERPRISES SHOULD NOT BE AFRAID TO LOOK PAST THE LIMITATIONS OF BLOCK- AND FILE-BASED STORAGE AND TO THE REVOLUTIONARY POTENTIAL OF MODERN STORAGE SYSTEMS, ARGUES JEROME M. WENDT, PRESIDENT AND FOUNDER OF ANALYST FIRM DCIG
Enterprises, regardless of their size, largely agree that they want any storage solutions they deploy to deliver flexibility. They may look for this flexibility in multiple ways, including availability, performance, reliability, replication, scalability, self-healing or self-tuning capabilities, and more. However, as they choose storage solutions that deliver the flexibility they need and want, another truth quickly becomes evident: hardware-defined storage is dead.

A WORKING DEFINITION

Simply speaking, hardware-defined storage arrays present a storage target to a physical or virtual machine. All hardware-defined storage arrays include some type of firmware that virtualises the underlying HDDs or SSDs. That firmware then, in turn, presents this virtualised storage as a volume or a folder to one or more physical or virtual machines.

In this respect, most storage arrays fall under this working definition of hardware-defined storage. Most storage arrays deliver one or both of these storage interfaces quite well. Further, almost any enterprise that acquires a storage array expects it to deliver block-based storage, file-based storage, or both.

Having reached this level of maturity, it is time to declare hardware-defined storage dead. Modern storage arrays and storage solutions offer so many more features. Block- and file-based storage should only serve as a starting point, not an end game. In only using block and/or file storage services on a storage array or solution, enterprises do themselves a disservice.

EVIDENCE OF DEATH

Failing to declare the death of hardware-defined storage serves no good purpose. Enterprises need to wake up to the plethora of features that modern storage systems deliver that make so many of their current tasks obsolete. Consider the following scenarios and see if you answer "Yes" to any of them:
- Are you still contacting support for break/fix issues? My question to you is, "Why has your storage vendor not called you to tell you that the hardware problem was already diagnosed and fixed?" Multiple modern storage systems include features that diagnose the underlying issue and may resolve it before you even know about it.

- Are you still manually troubleshooting performance issues? Again, I ask, "Why are you not allowing the storage system to help diagnose and resolve performance issues?" Granted, you can throw more flash storage at the problem (and many do). However, flash may only mask underlying issues. Using storage arrays that include artificial intelligence can equip enterprises to directly address the root causes behind these performance issues. In so doing, they can help prevent them from recurring.

- Can your applications communicate directly with the storage array and request and return storage as needed? This feature represents an entirely new generation of functionality where enterprises may bypass the need for tasks such as LUN masking, zoning, and setting security permissions. Where is the business value in any of these administrative tasks? (Dirty little secret: there is little or none!) Look for new storage systems that expose their APIs so applications can obtain and rescind storage according to their needs (see the sketch after this list).

- Are you still guessing at future capacity requirements and tying up capital by purchasing that capacity up front? Multiple storage vendors now deliver their solutions "as a service". The vendors offer flexible capacity that ties cost to actual usage, and they manage the underlying storage array for the enterprise. This frees IT staff to manage the data rather than the infrastructure.

- Are you creating a new silo of storage and storage management headaches when migrating workloads to the cloud? Look for storage vendors that offer their storage solutions as software-defined offerings in the cloud. This extends existing, familiar data management and protection capabilities to workloads in the cloud.
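To make the API-driven provisioning idea above concrete, here is a minimal sketch of how an application might request and later rescind a volume over a REST interface. Every vendor's API is different, so the endpoint, payload fields and token handling below are hypothetical placeholders rather than any particular array's interface.

```python
import requests

API = "https://storage-array.example.com/api/v1"   # hypothetical endpoint
TOKEN = {"Authorization": "Bearer <api-token>"}     # placeholder credential


def request_volume(name: str, size_gib: int) -> str:
    """Ask the array for a new volume and return its identifier."""
    resp = requests.post(
        f"{API}/volumes",
        json={"name": name, "size_gib": size_gib, "thin_provisioned": True},
        headers=TOKEN,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["id"]


def rescind_volume(volume_id: str) -> None:
    """Hand capacity back to the array once the application no longer needs it."""
    requests.delete(f"{API}/volumes/{volume_id}", headers=TOKEN, timeout=30).raise_for_status()


if __name__ == "__main__":
    vol = request_volume("analytics-scratch", size_gib=500)
    print(f"provisioned volume {vol}")
    # ... application uses the capacity, then returns it ...
    rescind_volume(vol)
```

The point is not the specific calls but the workflow: the application, not an administrator, decides when capacity appears and disappears.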
A WAKE-UP CALL

Do not think for one second that I think enterprises will stop using hardware-defined storage, or that vendors will stop shipping it tomorrow. Neither will occur. If anything, I expect both block-based and file-based storage to outlive and outlast me. Hardware-defined storage works, and many applications and operating systems will need it for the foreseeable future.

That said, declaring the death of hardware-defined storage serves as a wake-up call to enterprises. DCIG has just completed and released its 2020-21 Enterprise All-flash Array Buyer's Guide. In evaluating these arrays, DCIG only refers to them as "storage arrays" in the very broadest sense of the term.

These arrays do so much more than provide block- and/or file-based storage targets. Many offer powerful software features that revolutionise how enterprises allocate and manage storage.

By putting a stake in the ground and declaring hardware-defined storage dead, DCIG is not trying to kill hardware-defined storage. Rather, DCIG wants enterprises to take a long, hard look at how the modern storage solutions found in this Guide can enable them to transform their business.

More info: www.dcig.com
CASE STUDY: UNIVERSITY OF READING

THE UNIVERSITY CHALLENGE

THE UNIVERSITY OF READING HAS BEEN ABLE TO BOOST ITS ACADEMIC RESEARCH CAPABILITIES SINCE DEPLOYING A SOFTWARE-DEFINED SCALE-OUT FILE STORAGE SOLUTION
Founded in the 19th century, the University of Reading has become one of the foremost research-led universities in the UK. It has over 50 research centres, many recognised as international centres of excellence, in areas including agriculture, biological and physical sciences and meteorology.

RESEARCH WORKLOADS

While similar in many respects, the IT requirements of university research teams are often far removed from those of commercial workloads. In addition to vastly higher compute and storage demands, for example, research workloads can be a lot harder to predict and liable to change significantly at very short notice, as Ryan Kennedy, Academic Computing Team Manager at the University of Reading, explains.

"IT has become a key research tool and it's not unusual for academics to request access to hundreds of VMs connected to terabytes of storage one day, only to dump them and start over the next," he said. "Delivering that kind of ad-hoc scalability using conventional servers and storage platforms is both complex and time consuming, especially for IT staff employed to support the research, not manage the infrastructure."

Against that background Kennedy and his team were finding it increasingly difficult to deliver the IT resources research users were demanding. Moreover, with virtualisation a key part of the solution, licensing costs were becoming an issue and, while big projects could afford to finance new infrastructure, it was hard to justify spending to meet the needs of those with limited funds. A simpler and more agile solution was clearly required - one which could be shared more equitably and automated to allow for greater hands-off management.

PUBLIC CLOUD OR ON-PREM?

Among the several alternatives investigated, the public cloud was an obvious candidate but not necessarily a good fit, as Kennedy outlined: "While the public cloud could deliver the on-demand agility and self-service management we were after, the unpredictable workloads would make it more expensive and, potentially, harder and more time consuming for us to manage. There were also concerns about data protection and compliance, especially given the sensitive nature of the data involved and the need to protect intellectual copyright."

A brief and costly trial using Azure proved the validity of these concerns, at which point Kennedy persuaded the University to instead consolidate its existing infrastructure - then spread across multiple sites - into one on-premise data centre. Moreover, rather than simply upgrading the existing infrastructure, the decision was taken to switch to the Nutanix Enterprise Cloud OS software running on the Dell EMC XC series in order to deliver the same on-demand and self-service benefits as the public cloud, but in a more affordable, secure and manageable manner.

The decision was also taken to switch virtualisation platform, from VMware to the AHV hypervisor included as part of the Nutanix Enterprise Cloud software stack. A bold move with the promise of huge cost
"As well as lower cost, speed and simplicity were seen as the main plus points of<br />
Nutanix Files. With our legacy NAS software, for example, new shares had to be set<br />
up by the support team using specialist interfaces but with Nutanix Files anyone can<br />
do it and it's easy to automate. It's also a lot quicker with shares available online in<br />
seconds and none of the performance bottlenecks associated with separate server<br />
and storage platforms."<br />
savings, which has also paid off in terms of an easy migration and simpler, unified management. "Migrating old VMs to the Nutanix hypervisor was trouble free and we have yet to find a workload that AHV can't handle," commented Kennedy. "The AHV hypervisor is also fully integrated and managed from the same Prism console as the rest of the Enterprise Cloud software, making it easy to build the self-service portal we wanted and allow academics to provision their own resources."

Another key reason for choosing the Nutanix Enterprise Cloud Platform, the integrated Prism Self-Service Portal (SSP) can be used by customers to build a custom web-based interface that empowers users to create and manage both VMs and storage directly - much as they would using a public cloud platform, but in a strictly controlled and supervised manner. To this end administrators create projects to which they assign compute and storage resources, including shared VM templates and software images, for end-user consumption. Fine-grained access controls can also be applied, with additional tools to gather usage statistics and raise alerts when specific thresholds are breached.

Another important decision was to switch from legacy NAS storage to the integrated Nutanix Files - a software-defined scale-out file storage solution for unstructured data. This would enable Reading University to configure over a petabyte of usable storage using six load-balanced virtual file servers, all in the same rack and managed from the same single pane of management provided by Nutanix Prism. "As well as lower cost, speed and simplicity were seen as the main plus points of Nutanix Files," Kennedy explains. "With our legacy NAS software, for example, new shares had to be set up by the support team using specialist interfaces, but with Nutanix Files anyone can do it and it's easy to automate. It's also a lot quicker, with shares available online in seconds and none of the performance bottlenecks associated with separate server and storage platforms."

MIGRATION IN A WEEKEND

Following an initial proof of concept trial using just five nodes, the scalability of the Nutanix Enterprise Cloud was immediately put to the test when one of the university's legacy IT infrastructure suppliers went out of business. Faced with having no support for key storage appliances, an additional 10 nodes were quickly delivered, enabling Kennedy and his team to migrate fully to the Nutanix infrastructure over a weekend and configure 400TB of storage in just 10 minutes.

"It was a real eye-opener," he said. "With our legacy storage it would have taken weeks to put in new servers and storage, but once the Nutanix nodes were racked we just hit the expand button and, 10 minutes later, it was all done. Why couldn't we have done it this way before?"

As well as simpler scalability and enhanced storage performance, another benefit is much more efficient use of available storage with, in the case of Reading University, a 16:1 reduction in physical storage overheads thanks to built-in deduplication, erasure coding and compression technologies.

That doesn't mean that extra nodes haven't been needed: according to Kennedy, uptake of the Reading Research Cloud has been 'massive' and is still growing. Despite that, there have been no availability issues, with the Reading team opting to take advantage of the inherent redundancy of the Nutanix architecture and use the integrated Cloud Connect capability to take snapshots to Microsoft Azure for backup and disaster recovery.

Ryan Kennedy is hugely appreciative and proud of what the Nutanix Enterprise Cloud has allowed the University IT team to achieve, pointing not just to the scalability and ease of use of the platform as key enablers, but to the professionalism and high level of support provided by Nutanix and its partners: "The Nutanix platform really has transformed the way we work," he commented. "Most of the time we don't even have to touch it - it just runs itself!"

More info: www.nutanix.com
STRATEGY: CLOUD

THE EVER-CHANGING IT LANDSCAPE

GARETH JOHN, SOLUTIONS ARCHITECT AT Q ASSOCIATES, EXAMINES THE ISSUES AROUND MIGRATING SYSTEMS TO THE CLOUD, AND THE GROWING SHIFT TOWARDS A HYBRID MULTI-CLOUD MODEL
The IT landscape is changing. It hasn't just evolved from what it was five years ago, or even one year ago; it is in a state of constant flux, mostly due to the cloud aspect of IT strategy, and this can change from month to month as organisations adapt to the proliferation of new tools and services on offer.

There is a definite trend of workloads being moved from the on-prem data centre to some sort of cloud, whether that be IaaS, PaaS or SaaS, into a hyper-scaler or by consuming a service from a smaller provider. And there are many good reasons for this trend, especially in relation to the hyper-scalers: near infinite and instant elasticity where you can scale up or scale back and only pay for what you use, off-loading of hardware maintenance, taking advantage of cloud-based data analytics, utilisation of the substantial and ever-growing compendium of services, and more.

Cloud adoption, however, should not be hurried. Testament to this are the many organisations that adopted an aggressive cloud-first strategy, discovered the resultant rising costs, and are now trying to reverse out of the public cloud - incurring yet more expense. Just as there are many potential benefits of public cloud, there are also many valid concerns, including connectivity, security, data sovereignty, lock-in and, of course, cost.

Organisations need to carefully assess their existing IT estate to ascertain which workloads are appropriate for cloud transition. There will almost certainly be workloads that are unsuitable for the transition, and the ones that are appropriate will suit different cloud models. In this light, most customers that I talk to are looking to adopt a hybrid multi-cloud model.

The first step is usually to move previously on-prem applications to SaaS offerings; Microsoft 365 is a prominent example of this, where people can off-load everything (including hardware maintenance, OS and application versioning, resilience and interoperability) to a full-stack service that includes the application and its data. Note that while the data will reside on resilient infrastructure, it still needs to be backed up to protect against corruption, unintended change or deletion.

RUNNING HOT AND COLD

Cold data (data that is rarely used) is also considered low-hanging fruit for cloud utilisation. Nowadays organisations are typically deploying all-flash storage systems in on-prem data centres, and cold data is not a good fit for this medium. Intelligently archiving cold data to a cloud object store can ensure that hot data enjoys the high performance of flash whilst exploiting a low-cost scalable cloud tier for inactive data. This cloud object tier
is also a good location to store an off-site copy of backup data that can then be utilised as part of a cloud-based DR strategy.

Connectivity is also an important factor; as organisations move workloads off to various cloud services, connectivity needs to be considered to ensure that bandwidth and latency requirements are met once the workload has been moved. In this arena we're seeing a lot more interest in software-defined WAN (SD-WAN) initiatives aiming to simplify and orchestrate routing over an assortment of disparate WAN connections.
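As a concrete illustration of the cold-data tiering described above, the sketch below uses the AWS S3 lifecycle API via boto3 to transition objects to colder storage classes as they age. The bucket name, prefix and age thresholds are illustrative assumptions only, and other object stores expose similar policies.

```python
import boto3

s3 = boto3.client("s3")

# Illustrative values only: bucket, prefix and thresholds are assumptions.
BUCKET = "research-archive-example"

lifecycle = {
    "Rules": [
        {
            "ID": "tier-cold-data",
            "Status": "Enabled",
            "Filter": {"Prefix": "projects/"},
            "Transitions": [
                # After 90 days, move objects to an infrequent-access class;
                # after a year, to archive storage.
                {"Days": 90, "StorageClass": "STANDARD_IA"},
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

# Apply the policy; the object store then migrates eligible objects automatically.
s3.put_bucket_lifecycle_configuration(Bucket=BUCKET, LifecycleConfiguration=lifecycle)
print(f"Lifecycle policy applied to {BUCKET}")
```

The attraction of this approach is that the tiering decision is declared once as policy rather than handled manually, file by file.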
The way in which public cloud services are consumed is fast becoming the de facto standard: users can log on to a portal, select the services they require and have those services instantiated in minutes. This is the reason that organisations should consider transitioning their on-prem infrastructure into a private cloud, so that their resources can be consumed in a much more cloud-like fashion.

It's a lot more complicated than this, but it will involve deploying a framework that provides a service catalogue, automated fulfilment and a billing engine. It will also require mapping SLAs to resource pool utilisation, organisational changes, and procedural standardisation, amongst other things.

Whilst public cloud is great for burstable workloads (due to the inherent elasticity where you only pay for what you use), one mistake that we regularly see is the lift-and-shift of on-prem applications into public IaaS offerings. Having all the VMs that would normally reside on on-prem infrastructure running in the cloud 24/7 could see a significant cost increase.

REARCHITECT FOR SUCCESS

In order to realise the full value of public cloud, applications really need to be rearchitected to utilise things like database services (rather than running full database VMs) and serverless code services (where you only pay for the compute time that you consume). Automatically turning VMs off when they are not being used will also be financially advantageous.
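To make that last point concrete, here is a minimal sketch - assuming AWS and boto3 purely for illustration - of a scheduled job that stops tagged development VMs outside working hours. The tag name and region are hypothetical placeholders, and the equivalent mechanism on any other cloud would serve just as well.

```python
import boto3

# Illustrative assumptions: region and tag are placeholders, not recommendations.
ec2 = boto3.client("ec2", region_name="eu-west-2")


def stop_idle_dev_instances() -> None:
    """Stop running instances tagged Schedule=office-hours (e.g. run from a nightly cron)."""
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Schedule", "Values": ["office-hours"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]

    instance_ids = [
        inst["InstanceId"]
        for res in reservations
        for inst in res["Instances"]
    ]

    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
        print(f"Stopping {len(instance_ids)} instances: {instance_ids}")
    else:
        print("Nothing to stop")


if __name__ == "__main__":
    stop_idle_dev_instances()
```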
Q Associates has been helping customers with all of these schemes for some time, but until recently we have had to rely on partnerships to ensure that we utilise the best specific skills and knowledge in any particular area. With the recently-announced acquisition of Apex Group, we now have premium in-house skills in all of these fields and can provide our customers with a holistic delivery of infrastructure and services, from design and implementation through to support and management. The acquisition will also help us to evolve at speed, with widespread internal hybrid multi-cloud skills and knowledge, to ensure that we stay relevant to our customers in this rapidly shifting environment.

More info: www.qassociates.co.uk
"Nowadays organisations are typically deploying all-flash<br />
storage systems in on-prem data centres and cold data is<br />
not a good fit for this medium. Intelligently archiving cold<br />
data to a cloud object store can ensure that hot data<br />
enjoys the high performance of flash whilst exploiting a<br />
low-cost scalable cloud tier for inactive data. This cloud<br />
object tier is also a good location to store an off-site<br />
copy of backup data that can then be utilised as part of<br />
a cloud-based DR strategy."<br />
RESEARCH: STORAGE TRENDS

SPECTRA PUBLISHES "DIGITAL DATA STORAGE OUTLOOK 2020"

FIFTH ANNUAL DATA STORAGE REPORT AIDS INDUSTRY IN NAVIGATING THE BUDGETARY AND INFRASTRUCTURE CHALLENGES OF CAPTURING, SHARING AND PRESERVING DATA
Disk manufacturers are closing in on delivery of HAMR and MAMR technologies that will allow them initially to provide disk drives of 20TB, while also enabling a technology roadmap that could achieve 50TB or greater over the next 10 years.

Spectra Logic has published the fifth edition of its "Digital Data Storage Outlook" report. The 2020 report delves into the management, access, use and preservation of the world's ever-expanding volumes of data, capturing the impact of the Covid-19 pandemic on trends and technology during this unprecedented time in history. The report outlines future strategies, technologies, applications, use cases and costs for more accurate evaluation and planning of data management and preservation strategies.

Spectra's Digital Data Storage Outlook 2020 predicts that, while there could be some restrictions in budgets and infrastructure, there is only a small likelihood of a constrained supply of storage to meet the needs of the digital universe through 2030.

Storage device providers will continue to innovate with higher speeds and capacities to meet increasing growth in demand, with every data storage category - including flash, persistent memory, disk, tape and cloud - exhibiting technology improvements. This momentum will be dependent upon projected technology advancements, and any slowdown in one category, such as disk, will provide an opportunity for others, such as flash and tape.

Highlights from the 2020 report include:

- Economic concerns will push infrequently accessed data from tier one storage, made up of flash, to a second tier made up of spinning disk, object storage, cloud and tape. This method employs data movers to migrate data for ongoing cost savings.
- 2020 will see a 10% to 40% price increase for flash. After 18 months of oversupply of flash in the market, resulting in substantial price reductions, 2020 will see reductions in supply versus demand.
- The third generation of 3D XPoint technology will become the latest high-performance standard for database storage.
- The need for tape in the long-term archive market continues to grow. Tape will achieve storage capacities of 100TB or higher on a single cartridge in the next decade.
- Cloud providers will consume, in terms of both volume and revenue, an increasingly larger portion of the storage required to support the digital universe.

"The year 2020 is one like no other due to Covid-19, which makes accurate market forecasting especially challenging in these extraordinary times," said Spectra Logic CEO Nathan Thompson. "That said, as businesses become increasingly data-driven, it is even more crucial that IT professionals understand the factors impacting their organisations, so they can anticipate the trends, technologies and challenges they will face in order to protect their data and derive maximum value from it for the long term."

The full report can be downloaded from https://spectralogic.com/data-storage-outlook-report/

More info: www.spectralogic.com
CASE STUDY: TORIX

MODERN, FLEXIBLE, HIGH-PERFORMING

NON-PROFIT INTERNET EXCHANGE TORIX LOOKED TO STORMAGIC FOR A HYPERCONVERGED SOLUTION THAT WOULD BE EASY FOR ITS I.T. TEAM TO MANAGE
In 1998, the Toronto Internet Exchange (TorIX), the first non-profit internet exchange in Toronto, was created to directly connect the internet traffic of Canadian businesses using local network infrastructure. A group of experts collaborated to establish TorIX with the intention of overcoming the cost and latency issues of having Canadian traffic flow through the United States. Today, TorIX has over 250 organisations connected, with access to direct routes from many diverse peering partners.

As a non-profit organisation, TorIX focuses on investing funds into infrastructure so that its technology can stay up to date and remain at the forefront of the Internet Exchange Point (IXP) industry. Previously, TorIX was using a VMware installation with no replication; however, it wanted to avoid the large, hardware-dependent installations associated with vSAN and to find a solution better fitted to its long-term needs.

TorIX began the process of evaluating market options for an infrastructure solution to power its IT operations that was high-performing, simple and easy to manage. More specifically, TorIX was searching for a solution that it could trust with managing all of its critical external services for customers, including its online portal systems, telemetry data, and web and mail applications. At the top of TorIX's priority list was a hyperconverged solution that was easy for its IT team to manage, which is why the company turned to SvSAN.

EASY TO MANAGE & UPGRADE

To power its non-profit internet exchange, TorIX needed a hyperconverged solution between its two data centres with high performance and availability. After evaluating multiple options, TorIX found that StorMagic SvSAN best suited its needs because it was simple for its IT team to manage and easy to upgrade, while still remaining cost effective and modern. Furthermore, SvSAN's stretch/metro cluster capability enabled TorIX to site its two SvSAN nodes 3 kilometres apart with no impact on performance, thanks to SvSAN's low bandwidth requirements.

MAXIMUM UPTIME

TorIX now has a two-node cluster consisting of Cisco servers and VMware vSphere as the hypervisor. With SvSAN, TorIX can easily manage its IT infrastructure with 100 percent redundancy and high availability. SvSAN powers all of TorIX's critical external services for customers, such as web and mail applications, online portal systems and telemetry data.

TorIX has reported maximum uptime in operations, delivering powerful direct internet routes to peering partners without interruptions. In addition, throughout the pre- and post-implementation process, TorIX found StorMagic's world-class, 24/7 customer support highly responsive and helpful with technical expertise. TorIX found that SvSAN is reliable and simple to manage for its day-to-day operations. High data availability is critical to TorIX and its loyal customers.

"TorIX is driven to directly connect Canadian businesses' internet traffic through the local network infrastructure, while maintaining strong network performance and low latency," commented Jon Nistor, Board Director, TorIX. "To deliver this to customers, we prioritise investing in modern technology for our IT infrastructure, so that we can remain at the forefront of the industry. This is why we selected StorMagic SvSAN, so that TorIX can now power operations with a modern system that is easy to manage, flexible and high-performing.

"We have been 100% satisfied with StorMagic, which we trust to power all of our critical external services for our customers, and have the peace of mind that our systems will never fail."

More info: www.stormagic.com
PRODUCT REVIEW

KINGSTON TECHNOLOGY DATA CENTER DC1000M
As data centre applications and workloads demand ever greater storage performance, enterprises are finding that NVMe SSDs are the only way to go. These high-performance devices are perfect for businesses running data-intensive workloads and those that need to replace legacy SATA or SAS SSD server storage and arrays, as they deliver very high throughput and low latency in a familiar form factor.

The Data Center DC1000M series of NVMe U.2 SSDs from Kingston offers a tempting proposition, delivering a finely balanced combination of performance and value. Available in a choice of four capacities ranging from 960GB to 7.68TB, we reviewed Kingston's 1.92TB model, which has a very affordable sub-£400 price.

The DC1000M series clearly shows Kingston's intentions, as it has been moving firmly into the data centre storage space for some time. Combining these with its new DC1000B NVMe boot drive plus the DC450R and DC500 series of SATA SSDs allows it to offer one of the most comprehensive ranges of high-performance data centre storage solutions on the market.

The 1.92TB model looks fast on paper, with Kingston quoting sequential read and write speeds of 3,100MB/sec and 2,600MB/sec. Along with low sub-1ms latencies, throughput looks good too, with claimed rates for random read and write operations of 540,000 IOPS and 205,000 IOPS respectively.

These numbers make the DC1000M very versatile and ideal for mixed-use scenarios in the data centre. Typical applications Kingston is targeting range from HPC, OLTP and virtualisation to cloud services, web host caching and HD media capture.

The DC1000M employs the latest 3D TLC (triple level cell) NAND flash technology. This is far superior to older 2D NAND as it allows the cells to be stacked in layers, enabling much higher storage densities with a lower cost per bit and reduced power consumption.

Other key features that will appeal to enterprises are hot-plug support and SMART monitoring for tracking reliability, usage, remaining life, wear levelling and operational temperatures. The DC1000M also incorporates onboard power loss protection (PLP) through capacitors and firmware to avoid potential data loss caused by power failures.

For performance testing, we used the lab's Dell PowerEdge T640 tower server equipped with dual 22-core 2.1GHz Xeon Scalable Gold 6152 CPUs plus 384GB of DDR4 memory, running Windows Server 2019. Our server has an eight-bay PCIe NVMe Gen 3 U.2 cage and we had no problems fitting the DC1000M in the server's hot-plug carrier, where it was correctly recognised by the OS as a new NVMe bus storage device.

We used a range of benchmarking apps, starting with Iometer, which reported raw sequential read and write rates of 3,070MB/sec and 2,663MB/sec. The read rate is slightly below the claimed speed while the write rate is marginally better, and the CrystalDiskMark app agreed closely with these numbers.

For random read and write rates, Iometer returned 2,990MB/sec and 1,600MB/sec. Changing Iometer to small 4K block sizes, we ran our tests for a number of hours until they had achieved a steady state.

Once throughput had settled, we recorded random read and write rates of 486,900 IOPS and 225,100 IOPS. As with our sequential tests, read throughput was slightly below the quoted number whereas write rates were a little higher. Overall, these performance results are great, and latency is also very low: during our I/O throughput tests, both Iometer and the AS SSD Benchmark app reported average latencies of less than 1ms.
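The relationship between the IOPS figures and throughput is easy to sanity-check: IOPS multiplied by block size gives bandwidth. A quick illustrative calculation using the 4K steady-state results above shows why small-block random throughput sits well below the large-block sequential numbers:

```python
def iops_to_mb_per_sec(iops: int, block_size_bytes: int = 4096) -> float:
    """Convert an IOPS figure to throughput in MB/s (decimal megabytes)."""
    return iops * block_size_bytes / 1_000_000

# Steady-state 4K results recorded above
print(iops_to_mb_per_sec(486_900))  # ~1994 MB/s of random reads
print(iops_to_mb_per_sec(225_100))  # ~922 MB/s of random writes
```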
Product: Data Center DC1000M
Supplier: Kingston Technology
Web site: www.kingston.com
Tel: +44 (0) 1932 738888
Price: 1.92TB - £377 exc VAT

VERDICT: The DC1000M is clearly capable of handling very demanding enterprise workloads and is more than a match for competing NVMe storage products costing substantially more, making it excellent value as well.
MANAGEMENT: DATA PROTECTION

THE 3-2-1 RULE OF DATA PROTECTION

SARAH DOHERTY, PRODUCT MARKETING MANAGER AT ILAND, UNDERLINES THE THREATS TO ORGANISATIONAL DATA AND THE NEED TO FUTURE-PROOF INFRASTRUCTURE WITH RESILIENT DATA PROTECTION STRATEGIES
In today's world, a major challenge for organisations is protecting their data. Whether an organisation is in a regulated industry mandated by law to retain x number of years of data, or one more acutely concerned with employees accidentally deleting files, the first pain point that customers usually have is focused on data protection.

There are several reasons for companies to resort to backing up their data via the cloud. Firstly, with ransomware attacks more frequent than ever before and hardware failure still an issue, organisations traditionally have local backup as their primary means of protecting data. However, local backup is still vulnerable for several reasons, such as SAN failure, double disk fault or power loss.

Secondly, backups are necessary and mandatory, but local backups might not save organisations in certain situations. What if the power in the building goes out? How will they restore their data? If the hardware is broken and it takes four weeks for the hardware to recover, that doesn't help an organisation to get back up and running to continue with 'business as usual'.

Thirdly, IT resilience is the ability to quickly bring organisations online so they can continue to run their business no matter what the issue. Whatever the situation is, organisations need to be able to quickly get IT infrastructure back in operation, no matter what is going on in their data centre.

IT resilience and Disaster Recovery as a Service (DRaaS) have always been a challenge for companies because, in the old days, organisations would have to have a secondary data site, or use old hardware, replicate all data, runbooks and plans, and then have to test it all. It was just absurd, and only the largest enterprise organisations could afford to do it.

With the cloud's model of 'pay for what you use' and 'pay for what you need', companies of any size can replicate their data, infrastructure and entire application stack to the cloud more cost effectively than buying additional data centre space or running on-premise backup and DR.

THE 3-2-1 RULE

The 3-2-1 backup rule is an easy-to-remember shorthand for a common approach to keeping organisations' data safe in almost any failure scenario. The rule is: keep at least three (3) copies
"IT resilience and Disaster Recovery as a Service has always been a challenge for<br />
companies because, in the old days, organisations?would have to have a<br />
secondary data site, or use old hardware, replicate all data and runbooks and<br />
plans, and have to test it, etc. It was just absurd and only the largest enterprise<br />
organisations could afford to do it. With the cloud's model of 'pay for what you use'<br />
and 'pay for what you need', companies of any size can replicate their data,<br />
infrastructure and entire application stack to the cloud more cost effectively than<br />
buying additional data centre space or running on-premise backup and DR."<br />
of the organisation's data, one being the production environment. Then store two (2) backup copies, usually the initial backup on different storage media such as tape, a snapshot, a hard drive and so on. Then keep one (1) of those copies located offsite.
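The rule can be checked as well as remembered. The short sketch below - a minimal, hypothetical example rather than any vendor's tooling - takes an inventory of where copies of a dataset live and reports whether the 3-2-1 conditions are met.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Copy:
    location: str   # e.g. "primary-san", "backup-nas", "cloud-object-store"
    media: str      # e.g. "flash", "disk", "tape", "object"
    offsite: bool


def satisfies_3_2_1(copies: List[Copy]) -> bool:
    """True when there are >=3 copies, on >=2 media types, with >=1 held offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Example inventory: production data plus two backups, one of them in the cloud
inventory = [
    Copy("primary-san", "flash", offsite=False),         # production environment
    Copy("backup-nas", "disk", offsite=False),           # local backup copy
    Copy("cloud-object-store", "object", offsite=True),  # air-gapped offsite copy
]
print(satisfies_3_2_1(inventory))  # True
```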
There are several reasons why the last step is important. Think about ransomware: nowadays it has the ability to find locally attached backups and encrypt them. Or organisations could have a power failure where, if everything is in the same building, they are left with no backup at all.

Historically, a lot of companies would resort to trading copies of their tapes, putting them on a truck and sending them somewhere else. That introduces all sorts of challenges around humidity, transportation of the tape, where it is being stored, whether they will have the same tape type and whether they will be able to access it in two years. Organisations still want to have that air-gapped copy of their data, but cloud introduces a whole new way of addressing that, as it is easily accessible by anyone, anywhere.
HOW TO FUTURE-PROOF INFRASTRUCTURE

Cloud is an elegant solution to address these data protection and business continuity issues, and one that is within the capabilities and budgets of every organisation. By using cloud to follow the 3-2-1 rule of data availability, organisations gain the confidence that they can have a failure and still be able to recover their data.

Data centre mobility and cloud enable business-critical workloads to continue no matter what the scenario: a new norm, a global pandemic and so on. The cloud allows organisations to meet their business needs whilst protecting their data. It allows organisations to spin up VMs and virtual assets, and quickly connect to their infrastructure whether on-premises or in another cloud. It also lets companies continue to work remotely in the middle of a pandemic or other physically disruptive crisis, such as an extreme weather event, at a lower price point.

RETAINING PROTECTION STANDARDS

Organisations can migrate their data to the cloud for cost and continuity purposes. Once data is migrated, it is still critical to focus on data protection. The data will be protected with the help of the CSP, but organisations can't stop doing backup or IT resilience testing.

By supplementing the production environment with backup and DR in the cloud, organisations can ensure that they have those multiple copies, and air-gapped backups, that can be failed over to almost instantaneously should an issue occur with the primary infrastructure.

As an increasing number of organisations want to get out of the business of managing their data and just focus on delivering business value with their IT assets, the cloud is providing the answer for both primary and backup infrastructure.

The 3-2-1 backup rule is a good start in building any data protection system - a way to protect an organisation's data from loss or corruption and to control risks in all the aforementioned situations. The cloud offers incredibly effective and resource-efficient ways of achieving this and improving business continuity and resilience, at a time when events are showing us it has never been more important.

More info: www.iland.com
INDUSTRY FOCUS: MEDIA

COULD RUSHES BE KEY TO DISPROVING 'DEEPFAKE' VIDEO?

NICK PEARCE-TOMENIUS OF OBJECT MATRIX LOOKS AT SOME OF THE POTENTIAL COMPLIANCE ISSUES SURROUNDING LONG TERM STORAGE OF RAW FOOTAGE FOR TV AND MEDIA PRODUCTION COMPANIES
Arecent article in the Guardian raised<br />
the possibility of footage on the Jeremy<br />
Kyle show having been altered in order<br />
to tell the story that the producers wanted to<br />
be told, saying: "The family has concerns that<br />
the footage is polished and edited, and does<br />
not represent the totality of the footage that<br />
would have been recorded on all cameras<br />
on the day."<br />
The lack of retention of 'rushes' in a drama is<br />
unlikely to have a negative impact on society<br />
in future years but as the Kyle story highlights<br />
the retention of original footage needs to be<br />
taken more seriously where factual content is<br />
being edited or manipulated.<br />
Another example where studio footage was<br />
key in a criminal prosecution is the "Who<br />
Wants To Be A Millionaire" cheating case, as<br />
Wikipedia recounts: "In court, Ingram claimed<br />
the videotape of his appearance on<br />
Millionaire was 'unrepresentative of what I<br />
heard', and he continues to assert that it was<br />
'unfairly manipulated'. A video recording, with<br />
coughing amplified relative to other sounds<br />
including Ingram's and Tarrant's voices, was<br />
prepared by Celador's editors for the<br />
prosecution and 'for the benefit of the jury'<br />
during the trial."<br />
Given its nature, live action content is difficult to manipulate even with a 'broadcast delay', but not so if the delay stretches to minutes, hours, days or months, as is typical for reality-based programming. This raises three questions for those producing factual content, and also presents a real challenge for those organisations in terms of retaining the potentially hundreds of hours of raw footage that go into producing an hour of finished content:
1. How are production companies and<br />
broadcasters protecting rushes or footage<br />
captured by studio cameras on the day?<br />
2. Can they prove authenticity of those rushes<br />
in the years to come?<br />
3. Is it even possible to retain the original<br />
footage and find the clips you need when<br />
required?<br />
Protecting rushes/dailies is nothing new in highly regulated sectors such as financial services, which are typically required to adhere to internal or external regulations and have to implement platforms and processes that ensure content security, access control and availability of historical data.
Imagine the scenario where an analyst from a<br />
global bank gives an interview where the<br />
advice imparted during broadcast differs from<br />
the advice given on camera at the time of<br />
shooting - advice that might bankrupt<br />
individuals, companies or even countries. This<br />
manipulation of the message or story can be<br />
achieved with subtle editing or more recently<br />
the advances in Deepfake technology.<br />
A flip side to Deepfake video or manipulation<br />
in the edit is that people, politicians in<br />
particular, could use the fact that the<br />
technology exists to vehemently deny ever<br />
having said or done something on camera, as<br />
highlighted in a recent article by Daniel<br />
Thomas (BBC News): "The first risk is that<br />
people are already using the fact Deepfakes<br />
exist to discredit genuine video evidence. Even<br />
though there's footage of you doing or saying<br />
something you can say it was a Deepfake and<br />
it's very hard to prove otherwise."<br />
It would appear that being able to prove the<br />
authenticity of raw footage has never been<br />
more important.<br />
HOW IS IT DONE TODAY?<br />
Production companies who own the IP and<br />
rights for shows like "Jeremy Kyle" and<br />
"Millionaire" typically rent the studios and pay<br />
for the services of post-production companies<br />
to get the show made. Those studio and post<br />
companies will generally be responsible for<br />
protecting the rushes until the show has aired<br />
and many will hold on to them for longer<br />
periods of time until they no longer have the<br />
physical space or resources to manage them.<br />
These cases highlight the need to find<br />
content from a show aired several years<br />
ago - a task that cannot always be done<br />
quickly, if at all. The Jeremy Kyle rushes<br />
were protected by the post-production<br />
company involved, but that is not always<br />
the case. Most organisations simply do not<br />
have the technology platforms nor the<br />
processes in place.<br />
One of the main concerns will always be "What is the business model?" Keeping finished content in an archive requires resources and long-term investment, but there is value in exploiting that content. Doing the same for thousands of hours of raw footage has a less obvious return on investment.
The only way companies will feel compelled<br />
to archive rushes forever is via regulation or<br />
as an insurance requirement to assist should<br />
any future litigation occur. If such regulations<br />
are introduced companies will be expected to<br />
find and produce evidential content within<br />
reasonable time frames or get fined.<br />
DIGITAL CONTENT GOVERNANCE<br />
CAN HELP<br />
Good Digital Content Governance (DCG), a mix of process and technology, can ensure that content is protected, instantly accessible and provably authentic at any time in the future. It can also help organisations to counter Deepfakes or disprove manipulated footage.
Ensuring content is authentic: DCG<br />
platforms make multiple copies of content<br />
on ingest using checksums (digital<br />
fingerprints) to ensure its integrity from day<br />
one and throughout the lifetime of the<br />
content. DCG can place retention policies<br />
on the data such that not even<br />
administrators can accidentally delete it.<br />
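As a minimal sketch of the checksum idea (assumed for illustration, not Object Matrix's implementation), a digest recorded at ingest can be re-computed at any later date to prove the content is unchanged; the file name below is hypothetical.

```python
import hashlib

def fingerprint(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# At ingest (hypothetical file name): record the digest alongside the asset.
#   ingest_digest = fingerprint("rushes_cam1_day3.mxf")
# Years later: re-compute and compare; a match demonstrates the content
# has not been altered since ingest.
#   assert fingerprint("rushes_cam1_day3.mxf") == ingest_digest
```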
Protecting data: Digital Preservation<br />
processes ensure your content is protected<br />
at ingest and remains protected throughout<br />
its lifetime. However, this requires regular<br />
integrity checking which can be a costly<br />
exercise with legacy technology such as<br />
LTO. DCG platforms handle all aspects of<br />
good digital preservation practice, from<br />
continuous content protection and multiple<br />
copy protection (on and off-site) to<br />
business rules support.<br />
Access: Providing searchable audits of<br />
every action during the lifetime of the<br />
media is essential, as it means you can<br />
track exactly what has happened to that<br />
content and who has accessed it. DCG<br />
platforms offer native, searchable audits of<br />
every action, from ingest, moves, deletions and attempted deletions to, most importantly, reads. It should be said that auditing is also possible with public cloud accounts if user logins are granular enough to identify the individuals performing the actions.
Search: Find is key. With the increasing<br />
volume of data in and out of a facility,<br />
metadata management is as important as<br />
protecting the content itself. The ability to<br />
search for content based on up-to-date<br />
and relevant metadata will unlock the<br />
value of content for many organisations.<br />
Loosely coupled metadata and content will<br />
always make Find an inefficient or<br />
impossible process. DCG platforms protect<br />
the metadata along with the essence for<br />
the lifetime of the content. Using APIs<br />
enables future-proof, integrated and
automated workflows that ensure content<br />
can be found even if media asset<br />
management is not available. DCG<br />
platforms can also automate the extraction<br />
and indexing of any embedded metadata<br />
which can vastly increase search efficiency.<br />
Business Continuity: Using incumbent<br />
platforms that rely on legacy archive and<br />
backup practices does not guarantee<br />
continuity of business operations. It is a fact that loss of data, or of access to data, can lead to catastrophic loss of revenue for a company of any size. DCG platforms provide
automated and integrated business<br />
continuity functionality ensuring work can<br />
continue despite any outages.<br />
Implementing automated, asynchronous<br />
replication of metadata, data and user<br />
access information ensures that everything<br />
that is needed will be available at the DR<br />
location. Integration of DCG platforms into the end-user ecosystem (i.e. users do not have to learn new skills) also makes this a non-disruptive process.
As detailed above, implementing a good<br />
DCG platform that is integrated into media<br />
workflows will bring value to the organisation<br />
and ensure content can be found under any<br />
circumstances.<br />
In summary there are some technical,<br />
commercial and cultural issues to address in<br />
the creative video community if raw footage<br />
and archive content is to be protected in<br />
accordance with internal or external<br />
regulations. One of the biggest challenges will<br />
be the physical resources needed to archive<br />
thousands of hours of potentially 4K and 8K raw footage.
One potential option is to create mezzanine or proxy versions of those rushes, in a certified transformation workflow, that take up much less space than the originals but retain enough quality for video processing to be applied at future dates. Metadata can be captured during the ingest and transformation process, or generated later using AI platforms.
Keeping those rushes on LTO or SAN/NAS<br />
platforms will not be sufficient in terms of good<br />
Digital Content Governance nor the ability to<br />
efficiently process the files in automated<br />
workflows. These rushes will need to be kept in an object storage or cloud storage platform whose automated technologies ensure that good DCG is followed and that rushes are instantly available and searchable.
More info: www.object-matrix.com<br />
STRATEGY: BACKUP
BACKUP TO THE FUTURE<br />
BILL ANDREWS, PRESIDENT & CEO OF EXAGRID, EXAMINES THE JOURNEY FROM SIMPLE TAPE<br />
BACKUPS TO TIERED DISK BACKUPS THAT USE ADAPTIVE DEDUPLICATION FOR FAST, RELIABLE AND AFFORDABLE BACKUP AND RESTORE SOLUTIONS
An organisation cannot function without<br />
its data. As a result, data is backed up at<br />
least five days per week at virtually every<br />
company around the world. Data backup<br />
guards against short-term operational and external events, as well as meeting legal, financial and regulatory business requirements:
Restoring files that were deleted or overwritten, or recovering versions from before a corruption event
Recovering from a ransomware attack on primary storage
Keeping retention/historical data for legal discovery and for financial and regulatory audits
Replicating to a second location to guard against disasters at the primary data location, such as earthquake, electrical power grid failure, fire, flood or extreme weather conditions
Due to all of these requirements, backup<br />
retention points are kept so that organisations<br />
have a copy of the data at various points in<br />
time. Most keep a number of weekly, monthly,<br />
and yearly backups. As an example, if an<br />
organisation keeps 12 weekly copies, 24<br />
monthly copies, and 5 yearly copies - that<br />
amounts to about 40 copies of the data. This<br />
means that the backup storage capacity<br />
required is 40 times the primary storage<br />
amount.<br />
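A small worked example of that arithmetic (the primary storage figure below is hypothetical):

```python
weekly, monthly, yearly = 12, 24, 5      # retention points kept
copies = weekly + monthly + yearly       # 41 copies - "about 40"

primary_tb = 100                         # hypothetical primary data size, TB
backup_tb = copies * primary_tb          # capacity needed without deduplication

print(copies, backup_tb)                 # 41, 4100 TB - roughly 40x primary
```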
Since backup policies require keeping<br />
retention copies, and the storage needed for<br />
backup is far greater than the primary storage,<br />
the industry has evolved over time to reduce<br />
the amount of storage required in order to<br />
reduce the cost of backup storage.<br />
PHASE 1: TAPE<br />
Backups were sent to tape for about 50 years<br />
because if organisations were going to keep<br />
30, 40, 50, 60 copies of the data (retention<br />
points), then the only cost-effective way to keep<br />
those copies was to use a medium that was very
inexpensive per gigabyte. Tape solved the cost<br />
problem as it was inexpensive - but it was also<br />
unreliable because it was subject to dirt,<br />
humidity, heat, wear, etc. Tape also required a<br />
lot of management, including storing tapes in<br />
cartons and shipping a set of tapes offsite each<br />
week to another location or a third-party tape<br />
storage facility. Tape backups were great for<br />
cost but had many issues.<br />
PHASE 2: LOW-COST DISK
Disk solved all the problems of tape, as it was<br />
reliable, and it was secure since it was in a<br />
data centre rack behind physical and network<br />
security. Organisations could encrypt the data and replicate it to a second data centre (no physical media to ship).
Disk was far too expensive per gigabyte until<br />
the year 2000 when enterprise-quality SATA<br />
drives were introduced. This dropped the price<br />
of backing up to disk dramatically, as SATA was<br />
reliable enough for backup storage. However,<br />
even at a lower cost, disk was still too<br />
expensive when you did the math of keeping<br />
dozens of copies.<br />
All of the backup applications added writing<br />
to volumes or NAS shares to their products so<br />
that disk could be used. Disk was used as disk<br />
staging in front of tape, but not tape<br />
elimination. Backup applications would write<br />
one or two backups to disk for fast and reliable<br />
backups and restores but still write to tape for<br />
longer-term retention due to cost.<br />
PHASE 3: DATA DEDUPLICATION<br />
APPLIANCES<br />
Although SATA disk was lower in price than any<br />
other enterprise storage media, it was still too<br />
expensive to keep all the retention on disk. In<br />
the 2002-2005 time frame a new technology,<br />
data deduplication, entered the market. Data<br />
deduplication compared one backup to<br />
another and only kept the changes from<br />
backup to backup, which typically is about 2%<br />
change per week. The backups were no longer<br />
kept as full backups as only the unique blocks<br />
were kept, greatly reducing the storage.<br />
Data deduplication did not have much<br />
impact if there were only two or three copies<br />
and in fact, was not much different from just<br />
compressing the data. However, at 18 copies<br />
"There is no free lunch here and the different storage methods are just<br />
pushing the problem around. Why is that? Because unless you build a<br />
solution that includes deduplication and also solves the backup<br />
performance, restore performance, storage efficiency, and scalability<br />
issues - then no matter where the deduplication lives, the solution will<br />
still be broken. The answer is a solution that is architected to<br />
use disk in the appropriate way for fast backups and<br />
restores, uses data deduplication for long-term retention<br />
and scale-out all resources as data grows."<br />
the amount of disk used was 1/20th that of not using data deduplication: you could store in 1TB of deduplicated form what would normally take 20TB of disk to store without deduplication. The term 20:1 data reduction was used (assuming 18 copies of retention). If the retention was longer, the data reduction ratio was even greater.
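The sketch below is a minimal, illustrative version of block-level deduplication (not any vendor's implementation): split each backup into fixed-size blocks, keep each unique block once, and record a list of block hashes from which the backup can later be reassembled.

```python
import hashlib, os

BLOCK_SIZE = 8 * 1024   # 8 kB blocks, the size mentioned later in the article

def dedup(stream: bytes, store: dict) -> list:
    """Keep only unique blocks in `store`; return the 'recipe' of hashes."""
    recipe = []
    for i in range(0, len(stream), BLOCK_SIZE):
        block = stream[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)       # new blocks only
        recipe.append(digest)
    return recipe

# Two weekly "full backups" that differ by roughly 2%:
store = {}
week1 = os.urandom(1_000_000)
week2 = bytearray(week1)
week2[:20_000] = os.urandom(20_000)           # ~2% of the data has changed
dedup(week1, store)
dedup(bytes(week2), store)

stored_mb = sum(len(b) for b in store.values()) / 1e6
print(f"logical 2.00 MB, stored {stored_mb:.2f} MB")   # far less than 2 MB
```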
At this point, organisations could eliminate<br />
tape as the amount of disk required was greatly<br />
reduced, bringing the cost of backup storage<br />
close to that of tape. However, while these appliances added data deduplication in order to reduce storage, they did not factor in the trade-off of the compute impact. These "deduplication appliances" performed the deduplication inline - that is, between the backup application and the disk, as the data arrives.
Data deduplication compares billions of blocks<br />
and therefore is extremely compute-intensive.<br />
This compute-intensive inline deduplication process actually slows backups down, to roughly one third of the performance of writing directly to disk. And since deduplication is inline, all the data on the disk is stored deduplicated, which means that each time you restore, the data has to be put back together - a process called rehydration. Rehydration is slow and can take up to 20 times longer than restoring un-deduplicated data from disk.
The deduplication appliance used block-level deduplication, which creates a very large hash tracking table that has to be kept in a single front-end controller. As a result, as data grows, only storage is added behind the controller. If the data doubles, triples or quadruples, the amount of deduplication work also increases, but with a front-end controller the resources (CPU, memory, network ports) are fixed, so the same resources are used for four times the data as were used for one times the data.
As a result, the backup window grows and<br />
grows until you are forced to buy a bigger and<br />
more powerful front-end controller, called a<br />
forklift upgrade, which adds cost over time. The<br />
front-end controller approach relies on fixed<br />
resources and it fails to keep up with data<br />
growth, so the controllers are continuously<br />
being obsoleted to add more resources.<br />
Even though inline scale-up (front-end<br />
controller with disk shelves) appliances lower<br />
the amount of storage resulting in lower<br />
storage costs, they greatly slow down<br />
backups due to inline deduplication, slow<br />
down restores due to only keeping<br />
deduplicated data (rehydration process), and<br />
don't scale, forcing future forklift upgrades<br />
and product obsolescence, adding long term<br />
costs. The net result is that they fix the storage<br />
cost problem but add to backup and restore<br />
performance issues and are not architected<br />
for data growth (scalability).<br />
PHASE 4: DATA DEDUPLICATION IN<br />
BACKUP APPLICATIONS<br />
Customers used - and still use - data deduplication appliances; however, the backup applications went through a phase where they tried to eliminate the data deduplication appliance by integrating the deduplication process into the backup media servers. The idea was to just buy low-cost disk and have deduplication as a feature in the backup application. This created many challenges.
The first challenge is that data deduplication is compute-intensive, and the media server already has the task of taking all the backups and writing them to the media, so all compute resources are already in use. Adding deduplication to a media server crushes the CPU, and the backup jobs slow to a crawl. To solve this, backup applications increased the deduplication block size to do fewer comparisons and use less CPU. Instead of using block sizes of 8kB they used (for example)
128kB. Instead of achieving the 20:1 deduplication ratio of a deduplication appliance, they achieve a ratio of about 5:1 or 6:1. They also slow down the media server, and all the data on disk is deduplicated, so restores are still slow.
Lastly, the same scaling issues remain. Some of the backup application companies packaged up the media server and deduplication software with a server and disk to create a turnkey appliance; however, the challenges still exist: slow backups, slow restores, scalability issues, and higher cost, since they use far more disk than a deduplication appliance because of the lower deduplication ratio that comes with a larger block size.
WHERE DOES THIS LEAVE US?<br />
There is no doubt that disk is the right medium.<br />
It is reliable and lives in a data centre rack with<br />
both physical and network security, both onsite<br />
and offsite. If data is backed up to disk without<br />
data deduplication the backup and restore<br />
performance is great, however the cost is high<br />
due to the sheer amount of disk required.<br />
Using an inline deduplication appliance, you<br />
can reduce the high cost of storage due to the<br />
20:1 deduplication ratio. However, all of these<br />
appliances are slow for backups due to inline<br />
deduplication processing, slow for restores due<br />
to only keeping deduplicated data that needs<br />
to be rehydrated with each request, and they<br />
don't scale as data grows, which extends the backup window over time and forces costly forklift upgrades and product obsolescence.
If deduplication is used in a backup<br />
application the performance is even slower<br />
than a deduplication appliance as the CPU is<br />
being shared between the deduplication<br />
process and media server functionality. The<br />
backup applications can improve this with incremental-only backups, but there are other trade-offs. In addition, far more disk is required
as the deduplication ratio is more in the range<br />
of 5:1 to 10:1 rather than 20:1.<br />
There is no free lunch here and the different<br />
storage methods are just pushing the problem<br />
around. Why is that? Because unless you build<br />
a solution that includes deduplication and also<br />
solves the backup performance, restore<br />
performance, storage efficiency, and scalability<br />
issues - then no matter where the deduplication<br />
lives, the solution will still be broken. The<br />
answer is a solution that is architected to use disk in the appropriate way for fast backups and restores, uses data deduplication for long-term retention, and scales out all resources as data grows.
PHASE 5: THE FUTURE - TIERED BACKUP STORAGE
Tiered backup storage offers the best of both worlds: disk without deduplication for fast backups and restores, and deduplication to lower the overall storage costs. The first tier is a disk-cache (Landing Zone) where backups are written to standard disk in their native format (no deduplication to slow them down).
This allows for fast backups and fast restores, as there is no deduplication process between the backup and the disk, and the most recent backups are stored in an un-deduplicated format. As the backups are being written to disk, and in parallel with backups coming in, the data is deduplicated into a second tier for longer-term retention storage. This is called Adaptive Deduplication (it is not inline, and it is not post-process). The system is composed of individual appliances that each have CPU, memory, networking and storage; as data grows, all resources are added, which keeps the backup window fixed in length and eliminates both forklift upgrades and product obsolescence.
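A minimal sketch of that two-tier flow (assumed for illustration, not ExaGrid's code): backups land on plain disk first, and a background job deduplicates them into the retention tier in parallel.

```python
import hashlib
from collections import deque

landing_zone = {}       # most recent backups, in native (un-deduplicated) form
retention_tier = {}     # unique 8 kB blocks only, for long-term retention
dedup_queue = deque()   # work handed to the background deduplication job

def backup(name: str, data: bytes):
    """Fast path: write straight to the landing zone, queue dedup for later."""
    landing_zone[name] = data
    dedup_queue.append(name)

def restore(name: str) -> bytes:
    """Recent restores come from the landing zone - no rehydration needed."""
    return landing_zone[name]

def dedup_worker():
    """Runs in parallel with incoming backups, filling the retention tier."""
    while dedup_queue:
        data = landing_zone[dedup_queue.popleft()]
        for i in range(0, len(data), 8192):
            block = data[i:i + 8192]
            retention_tier.setdefault(hashlib.sha256(block).hexdigest(), block)

backup("monday_full", b"example backup stream")
dedup_worker()
print(restore("monday_full") == b"example backup stream")   # True
```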
The net is:
Backups are as fast as writing to disk, as there is no deduplication process in the way
Restores are fast, as there is no data rehydration process, because the most recent backups are in a non-deduplicated form
Cost is low upfront, because all long-term retention data is deduplicated in the long-term repository tier
The backup window stays fixed in length as data grows, as the architecture is scale-out, adding all resources and not just disk as data grows
Long-term costs are low, as the scale-out architectural approach eliminates forklift upgrades and product obsolescence
In summary then, backup storage has taken<br />
a long journey and has arrived with tiered<br />
backup storage that provides fast and reliable<br />
backups and restores, with a low cost up front<br />
and over time.<br />
More info: www.exagrid.com<br />
TECHNOLOGY: CLOUD DATA WAREHOUSING
CLOUD: YOUR FLEXIBLE FRIEND<br />
WHAT IS A CLOUD DATA WAREHOUSE AND WHY IS IT IMPORTANT? ROB MELLOR, VP AND GM EMEA,<br />
WHERESCAPE, SHARES SOME INSIGHTS<br />
We are seeing business<br />
expectations for on-demand data<br />
explode, with many data<br />
warehousing teams beginning to transition<br />
their data warehousing efforts to the<br />
cloud. With the need to efficiently pull<br />
together data from a wide range of ever-evolving data sources and present it in a
consumable way to a broadening<br />
audience of decision makers, cloud data<br />
warehousing is proving invaluable.<br />
In this article we are going to cover the basics of cloud data warehousing: how the cloud data warehouse compares to the traditional data warehouse, and the benefits of a cross-cloud solution.
WHAT IS A CLOUD DATA<br />
WAREHOUSE?<br />
A cloud data warehouse is a database<br />
service hosted online by a public cloud<br />
company. It has the functionality of an on-premises database but is managed by a
third party, can be accessed remotely and<br />
its memory and compute power can be<br />
shrunk or grown instantly.<br />
TRADITIONAL VS. CLOUD<br />
A traditional data warehouse is an<br />
architecture for organising, storing and<br />
accessing ordered data, hosted in a data<br />
centre on premises owned by the<br />
organisation whose data is stored within it.<br />
It is of a finite size and power and is<br />
owned by that organisation.<br />
A cloud data warehouse is a flexible<br />
volume of storage and compute power,<br />
which is part of a much bigger public<br />
cloud data centre and is accessed and<br />
managed online. Storage and compute<br />
power is merely rented. Its physical<br />
location is largely irrelevant, except for countries and/or industries whose regulations dictate that data must be stored in the same country.
BENEFITS OF THE CLOUD<br />
APPROACH<br />
The benefits of a Cloud Data Warehouse<br />
can be summarised in five main points:<br />
1. Access<br />
Rather than having only physical access to<br />
databases in data centres, cloud data<br />
warehouses can be accessed remotely<br />
from anywhere. As well as being<br />
convenient for staff who live near the data<br />
centre, who can now troubleshoot from<br />
home or anywhere out of hours if needed,<br />
this access means companies can hire staff<br />
based anywhere, which opens up talent<br />
pools that were previously unavailable.<br />
Cloud data warehousing is self-service<br />
and so its provision does not depend on<br />
the availability of specialist staff.<br />
2. Cost<br />
Data centres are expensive to buy and<br />
maintain. Property to store them in needs<br />
to be properly cooled, insured and<br />
expertly staffed, and the databases<br />
themselves come at a huge cost. Cloud<br />
data warehousing allows the same service<br />
to be enjoyed, but you only pay for the<br />
computing and storage power you need,<br />
when you need it. Now with elastic cloud<br />
services such as Snowflake, compute and<br />
storage can be bought separately, in<br />
different amounts. You really only have to<br />
pay for what you are using, and you can<br />
instantly close or downsize capabilities you<br />
do not need.<br />
"Hosting data in a Cloud data warehouse means you<br />
can switch providers if and when it suits changes in<br />
business strategy. Staying database-agnostic means you<br />
have the agility to upsize, downsize or switch completely.<br />
Metadata-driven automation software allows you to lift<br />
and shift entire data infrastructures on and off of a<br />
Cloud data warehouse if desired, and allows different<br />
teams within the same company to work with the<br />
database and hybrid cloud structure that best suits<br />
their needs."<br />
3. Performance<br />
Cloud service providers compete to offer<br />
use of the most performant hardware for a<br />
fraction of the cost that would be incurred<br />
to reproduce such power on-premises.<br />
Upgrades are performed automatically, so<br />
you always have the latest capabilities and<br />
do not experience downtime in upgrading<br />
to the latest 'version'. Some on-premises databases offer faster performance, but not at the cost and availability of the 'Infrastructure-as-a-Service' that cloud providers offer.
4. Scalability<br />
Opening a Cloud data warehouse is as<br />
simple as opening an account with a<br />
provider such as Microsoft Azure, Amazon Redshift, Google BigQuery or Snowflake.
The account can be grown, shrunk, or even<br />
closed instantly. Users are aware of the<br />
costs involved before they change the<br />
amount of compute or storage they rent.<br />
This scalability has led to the coining of the<br />
phrase 'Elastic Cloud'.<br />
5. Agility<br />
Hosting data in a Cloud data warehouse<br />
means you can switch providers if and when<br />
it suits changes in business strategy. Staying<br />
database-agnostic means you have the<br />
agility to upsize, downsize or switch<br />
completely. Metadata-driven automation<br />
software allows you to lift and shift entire<br />
data infrastructures on and off of a Cloud<br />
data warehouse if desired, and allows<br />
different teams within the same company to<br />
work with the database and hybrid cloud<br />
structure that best suits their needs.<br />
CHOOSING A SOLUTION<br />
A cost analysis is vital in estimating how<br />
much money a Cloud Data Warehouse<br />
might save a business. Different Cloud<br />
providers have different pricing structures<br />
that need bearing in mind. More<br />
established providers such as Amazon and<br />
Microsoft rent nodes and clusters, so your<br />
company uses a defined section of the<br />
server. This makes pricing predictable and<br />
constant, but sometimes maintenance to<br />
your particular node is needed.<br />
Snowflake and Google offer a 'serverless'<br />
system, which means the cluster locations<br />
and numbers are not defined and so are<br />
irrelevant. Instead the customer is charged<br />
for the exact amount of compute or<br />
processing power it consumes. However, in bigger companies it is often difficult to predict the number of users and the size of a process before it occurs. It is possible for queries to be much bigger than was assumed, and so to cost much more than was expected.
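A simple way to see the difference is to model both charging schemes; all rates below are hypothetical placeholders, not any provider's actual pricing.

```python
NODE_PRICE_PER_HOUR = 2.50       # hypothetical cost of one rented node
SERVERLESS_PRICE_PER_TB = 5.00   # hypothetical cost per TB of data scanned

def node_based_monthly_cost(nodes, hours=730):
    """Fixed and predictable: you pay for the nodes whether busy or idle."""
    return nodes * hours * NODE_PRICE_PER_HOUR

def serverless_monthly_cost(tb_scanned):
    """Pay-per-use: cheap when quiet, hard to predict if queries balloon."""
    return tb_scanned * SERVERLESS_PRICE_PER_TB

print(node_based_monthly_cost(4))       # 7300.0 - the same every month
print(serverless_monthly_cost(200))     # 1000.0 - a quiet month
print(serverless_monthly_cost(5000))    # 25000.0 - queries bigger than assumed
```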
Each cloud provider has its own suite of<br />
supporting tools for functions such as data<br />
management, visualisation and predictive<br />
analytics, so these needs should be factored<br />
in when deciding on which provider to use.<br />
Using cloud-based data warehouse<br />
platforms, you can gather even more data<br />
from a multitude of data sources and<br />
instantly and elastically scale to support<br />
virtually unlimited users and workloads.<br />
With automation to aid in providing return on investment, businesses will be able to manage the influx of big data, automate manual processes and maximise the return on cloud.
More info: www.wherescape.com<br />
CASE STUDY: CINESITE
RENDERING ASSISTANCE
DIGITAL ENTERTAINMENT STUDIO CINESITE IS ABLE TO BRING MOTION PICTURES TO AUDIENCES FASTER
WITH CLOUD RENDERING<br />
FRUSTRATIONS DRIVE NEW
THINKING<br />
During the recent production of a full-length animated feature film, Cinesite ran
into technology issues with the existing<br />
cluster it had recently purchased. That<br />
vendor's system was causing network<br />
slowdowns for unknown reasons - for up to<br />
minutes at a time.<br />
Cinesite is a leading digital<br />
entertainment studio with credits on<br />
animated feature films such as The<br />
Addams Family, Extinct and Riverdance<br />
and VFX projects such as Avengers:<br />
Endgame, Rocketman, The Witcher, and<br />
the James Bond franchise. The company<br />
employs nearly 1,000 digital artists and<br />
staff, who work from offices across<br />
London, Montreal, Berlin, Munich, and<br />
Vancouver.<br />
Cinesite's award-winning visual effects<br />
and animation teams help bring<br />
filmmakers' visions to life. To support complex and demanding workflows for visual effects, and for conceiving and realising CG-animated films, Qumulo and AWS enabled Cinesite to leverage high-performance storage at scale, helping the studio achieve more than it ever thought possible, including developing scalable 16K video workflows for future applications.
Cinesite's existing infrastructure included<br />
a newly-installed but older generation<br />
storage technology from another provider<br />
that supported approximately 500 render<br />
nodes in the Montreal data centre, and a<br />
workflow that leveraged AWS for<br />
occasional overflow rendering.<br />
Eventually, the slow-downs became full<br />
interruptions - freezing the productivity of<br />
465 artists for as long as an hour. The<br />
system freezes could happen at any time,<br />
and that put production schedules at risk.<br />
JUMPING INTO ACTION<br />
Frustrated with that vendor's system and its<br />
inability to solve the problem, Cinesite<br />
approached Qumulo for ideas. Qumulo<br />
quickly deployed hardware nodes onsite,<br />
and was able to get Cinesite back up and<br />
running in short order.<br />
After the immediate need was solved,<br />
Qumulo engineers worked with the Cinesite<br />
team to diagnose other issues they were<br />
facing with their legacy systems to fully<br />
restore network speed. In fact, on one<br />
occasion, the Cinesite technical team was<br />
working on a solution well into the early<br />
morning - and reached out to Qumulo's<br />
customer success team at that unusual<br />
hour. Within sixty minutes, the Qumulo<br />
team responded with suggestions for<br />
configuration changes that would further<br />
increase network performance. Cinesite<br />
implemented those suggestions, got the<br />
performance it needed, and "from that day<br />
forward, we haven't looked back," said<br />
Graham Peddie, Chief Operating Officer,<br />
Cinesite Montreal.<br />
PLANNED TO EXPAND<br />
Cinesite knows first-hand the challenges of<br />
resource planning. "We can't plan for the<br />
peaks, so we plan for an average," said<br />
Peddie. That is the way planning had<br />
worked in the past. It was clear Cinesite<br />
would need a modern, cloud-native solution to move to a competitive scale.
With the visual effects (VFX) and feature<br />
animation division pipelines at full capacity,<br />
and no easy way to burst to the cloud at the<br />
scale Cinesite needed for the extraordinary<br />
render and storage requirements, the studio<br />
again turned to Qumulo for a way out.<br />
To achieve the scale Cinesite was after, the workload had to be moved from the smaller region it had been using to the AWS US East (Virginia) region. With the existing solution, this would have been no easy feat. With Qumulo, it was seamless.
"The only way we could expand to the new<br />
zone was by implementing Qumulo cloud<br />
storage," said Peddie. "This approach<br />
allowed us to spin up the machines and<br />
store data for offsite rendering on AWS US<br />
East (Virginia). Without Qumulo, we<br />
wouldn't have been able to do this or meet<br />
our deadlines."<br />
Qumulo's hybrid file software runs the<br />
same enterprise file system in the cloud as<br />
on-prem, and data can be natively and<br />
seamlessly replicated between instances or<br />
across regions. Bursting to 20, 200, or<br />
even 2,000 high-quality render nodes on<br />
AWS with Qumulo to keep pace with all<br />
that power is no problem. Instances can be<br />
spun up in minutes, and torn down just as<br />
quickly.<br />
Spencer Kuziw, Lead Systems<br />
Administrator, Cinesite Montreal, explained:<br />
"Qumulo is a huge benefit to us. We can<br />
spin up as many high quality render nodes<br />
as we need, in as many regions as we<br />
need, without impacting our local storage.<br />
And the Qumulo hybrid cloud software can<br />
handle whatever we throw at it. It is an<br />
essential part of our cloud deployment<br />
strategy."<br />
QUMULO GETS IT<br />
Customer support is another crucial benefit.<br />
Cinesite's media and entertainment clients<br />
operate within pressure-packed deadlines,<br />
and the studio has to be highly proactive to<br />
meet their needs. "Qumulo is different,"<br />
Peddie said. "When it comes to our<br />
workflows and deadlines, Qumulo gets it.<br />
They know that we're under pressure. They<br />
know that solutions can't take weeks and<br />
months. We need issues solved quickly. So,<br />
for me, Qumulo's responsive and proactive<br />
customer support was an important benefit<br />
and set the company apart from all the<br />
other vendors we've seen."<br />
FINGER ON THE PULSE<br />
Analytics and real-time visibility are also<br />
crucial to Cinesite. Qumulo's real-time<br />
analytics tools enabled the studio to identify<br />
and fix pipeline inefficiencies. During a<br />
recent migration, real-time activity and<br />
usage analytics made it immediately clear<br />
that a script was making multiple copies of<br />
a directory, eating up space.<br />
Qumulo analytics show activity in real-time, including directory growth, most
active network IPs, most active file paths,<br />
and so on, making it simple to pinpoint a<br />
problem and quickly clean it up. Typically<br />
on other systems, common issues like that<br />
go unnoticed and storage capacity simply<br />
fills up, leaving admins with the task of<br />
running reports, waiting days for them to<br />
complete, then conducting forensics.<br />
EYES ON THE FUTURE<br />
Cinesite continues to consider additional<br />
cloud options to take advantage of the<br />
latest media and entertainment<br />
technologies. The team is exploring new<br />
and exciting projects like 16K-plus file sizes<br />
and unique applications outside of cinema.<br />
Peddie said, "We could never have tackled<br />
these technological and creative challenges<br />
without a cloud solution. Qumulo has<br />
enabled us to boost Cinesite's competitive<br />
position within the industry."<br />
More info: www.qumulo.com<br />
"Qumulo is a huge<br />
benefit to us. We can<br />
spin up as many high<br />
quality render nodes as<br />
we need, in as many<br />
regions as we need,<br />
without impacting our<br />
local storage. And the<br />
Qumulo hybrid cloud<br />
software can handle<br />
whatever we throw at it. It<br />
is an essential part of our<br />
cloud deployment<br />
strategy."<br />
TECHNOLOGY: ENERGY CONSUMPTION
POWER PLAY<br />
RAINER KAESE, SENIOR MANAGER, STORAGE PRODUCTS DIVISION, TOSHIBA ELECTRONICS
EUROPE, SHARES SOME INSIGHTS FROM A RECENT EXPERIMENTAL PROJECT UNDERTAKEN AT THE<br />
COMPANY INTO THE ENERGY CONSUMPTION OF DISK DRIVES<br />
Energy efficiency initiatives have<br />
driven down energy consumption<br />
significantly over the past decades.<br />
Today's homes probably consume no more energy for lighting than two or three old 100 W light bulbs would have required. But who would have thought that, using the latest generation of hard disk drives, a petabyte of storage could be achieved that requires less energy than five of those old light bulbs?
With the demand for always-on, online<br />
storage capacity for databases seemingly<br />
showing no signs of abating, it is vital to<br />
develop storage systems that can keep up<br />
with this growing flood of data while<br />
simultaneously fulfilling certain criteria.<br />
Cost per capacity ($/TB) is usually the<br />
most important of these, due to the<br />
immense quantities of data involved.<br />
However, energy consumption is another<br />
aspect to consider as this impacts the<br />
long-term operational costs. This energy<br />
should also be consumed efficiently,<br />
thereby reducing the need for cooling<br />
that also incurs costs.<br />
Physical dimensions of the end solution<br />
also need to be considered. Increasing<br />
the number of disks requires a housing<br />
with increased volume. Ideally, the server<br />
housing should easily be accommodated<br />
by a standard 19" rack system, fitting into<br />
existing infrastructure of 1000 mm long<br />
racks. Performance is obviously another factor but, if the key goals are high capacity at low power consumption, it is possible to tolerate lower IOPS or throughput figures.
In an investigation undertaken by the<br />
research team at Toshiba Electronics<br />
Europe GmbH, a project was undertaken<br />
to see if it was possible to build 1 PB of<br />
data storage into a system consuming less<br />
than 500 W of power.<br />
CHOICE OF STORAGE
The requirement for mass capacity is<br />
achieved most cost-effectively with the use<br />
of HDDs, the top capacity models of<br />
which have similar $/TB ratios for 12 TB,<br />
14 TB and 16 TB models. However, in<br />
order to ensure that the final system<br />
would fit into a standard 19" rack, it<br />
clearly made sense to select the largest<br />
16 TB capacity drives to keep the physical<br />
volume required to an absolute minimum.<br />
This choice also aligns well with the<br />
power consumption goal, since the power<br />
dissipation per unit capacity has<br />
successively dropped with the introduction<br />
of new HDD models (see Table 1).<br />
This is due not only to the new technology implemented, but also to the move to helium-filled drives (see Figure 1).
The 16 TB models of Toshiba's MG08<br />
series are available with both SAS and<br />
SATA interfaces. The SAS interface<br />
provides two 12 Gbit/s channels that are
ideally suited to systems where high<br />
availability and throughput are a priority.<br />
However, there is a power consumption<br />
cost associated with this choice since SAS<br />
drives consume around one to two watts<br />
more than their SATA counterparts. Since<br />
the goal was to reduce power<br />
consumption, the SATA interface model<br />
MG08ACA16TE was the chosen<br />
candidate for this project.<br />
The individual specifications for this<br />
particular drive, in terms of power<br />
dissipation, are shown in Table 2.<br />
SELECTING AN ENCLOSURE<br />
With the storage defined, the next step<br />
was to select a suitable enclosure. Top-loader models are convenient and available as a JBOD in a four-unit-high 19" rack format. A 60-bay model from AIC,
the AIC-J4060-02, was selected for this<br />
project. The single expander version was<br />
chosen, saving on cost and power<br />
dissipation, matching with the<br />
specification of the one-channel SATA<br />
interface. Once filled with 16 TB HDDs,<br />
the solution has a gross storage of 960<br />
TB, almost one petabyte. The JBOD is<br />
then connected to the host bus adapter<br />
(HBA) or RAID controller of the server via<br />
one mini SAS-HD cable.<br />
With a length of just 810 mm, this JBOD<br />
fits into any existing rack.<br />
BASELINE TESTING
An initial power consumption<br />
measurement was made without the<br />
HDDs via the 220 V inputs to the twin<br />
redundant power supply. With no HDDs<br />
inserted, but both the JBOD and SAS link<br />
up, an initial measurement of 80 W was<br />
made. The next step was to measure<br />
power consumption with a single drive<br />
under different workload conditions.<br />
Write workloads were chosen that<br />
simulated archiving, video recording and<br />
backup using 64 kB sequential blocks.<br />
Using the same block size, sequential<br />
reads were also undertaken, equivalent to<br />
a backup recovery and media streaming<br />
workload. To provide a further data point,<br />
4 kB random read/writes were also<br />
performed, corresponding to the agile<br />
"hot-data" workload of databases.<br />
Obviously, these do not fully correlate<br />
with the typical workload for this type of<br />
system but allowed the collection of<br />
reference data for comparison purposes.<br />
In addition to these borderline cases a<br />
test with an approximate real workload<br />
was carried out. A mix of different block<br />
sizes was read and written randomly (4kB:<br />
20%, 64kB: 50%, 256kB: 20%, 2MB:<br />
10%). In order to achieve the maximum<br />
possible performance, all synthetic loads<br />
were executed with a queue depth (QD)<br />
of 16. In addition to these tests, a<br />
standard copy process was started on a<br />
logical drive under Windows and the<br />
power dissipation measured.<br />
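For reference, the mixed workload can be expressed as a weighting of block sizes; the sketch below simply checks the weights and computes the average transfer size implied by the mix (an illustration, not Toshiba's test harness).

```python
# Block-size mix used for the "approximate real workload" test, with shares.
mix = {4: 0.20, 64: 0.50, 256: 0.20, 2048: 0.10}   # block size in kB -> share
queue_depth = 16                                    # QD used for synthetic loads

assert abs(sum(mix.values()) - 1.0) < 1e-9          # shares must add up to 100%
avg_block_kb = sum(size * share for size, share in mix.items())
print(avg_block_kb)                                 # 288.8 kB average transfer
```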
The results for the individual drive use<br />
case consistently shows a lower power<br />
consumption than that given in the data<br />
sheet for the selected drive (see Table 3).<br />
Another point to note is that, in contrast to the data sheet, sequential loads result in higher power consumption than random access loads. This can be
traced back to the power needs of the<br />
JBOD, since the SAS expanders require<br />
more power at high bandwidths in<br />
sequential operation.<br />
TESTING VARIOUS
CONFIGURATIONS<br />
With all the slots of the JBOD filled, the<br />
maximum power consumption when the<br />
system was idling lay at a respectable<br />
420 W. This is slightly higher than<br />
expected (80 W + 60 x 4 W = 320 W)<br />
and can be traced back to the fact that<br />
the controller occasionally addresses the<br />
HDDs even in idle mode. On the other<br />
hand, the peak start-up power measured<br />
lay at just 720 W, significantly lower than<br />
the sum of the JBOD plus the spin-up<br />
data sheet values for the HDDs (80 W +<br />
60 x 16.85 W = ~1100 W). This can be<br />
traced back to the staggered approach to<br />
spin-up the system employs, applying<br />
power to the HDDs one after the other.<br />
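The estimates quoted above come straight from the JBOD baseline plus the per-drive datasheet figures; a small worked example:

```python
JBOD_BASE_W = 80            # JBOD with no drives, expanders and fans running
DRIVES = 60
IDLE_W_PER_DRIVE = 4        # datasheet idle figure used in the estimate
SPINUP_W_PER_DRIVE = 16.85  # datasheet spin-up figure

expected_idle = JBOD_BASE_W + DRIVES * IDLE_W_PER_DRIVE       # 320 W
expected_spinup = JBOD_BASE_W + DRIVES * SPINUP_W_PER_DRIVE   # ~1091 W

print(expected_idle, round(expected_spinup))
# Measured instead: ~420 W idle (the controller keeps touching idle drives)
# and ~720 W peak at start-up (drives are spun up one after the other).
```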
The system was re-tested using the<br />
same workloads used for single drive<br />
operation. The highest power<br />
consumption of 500 W measured<br />
occurred during sequential reads of 64kB<br />
blocks, while the lowest of 445 W was for<br />
both sequential 64 kB and random 4 kB<br />
writes (see Figure 2).<br />
Two further configurations were also<br />
investigated. The first combined the 60<br />
disks into a local RAID10 with 5 subarrays<br />
to create 480 TB net storage. This<br />
was then formatted as two 240 TB logical<br />
drives under Windows Server 2016.<br />
Here, sequential accesses required less<br />
power, while random accesses essentially<br />
matched that measured in JBOD mode.<br />
Implementing a software-defined,<br />
zettabyte file system (ZFS) using<br />
JovianDSS from Open-E also resulted in<br />
improvements in power consumption for<br />
read tests, but slightly higher<br />
measurements when writing. In this<br />
configuration two 800 GB enterprise<br />
SSDs were also added as a read cache<br />
and a write log buffer, with the resulting<br />
240 TB logical drives made available<br />
over iSCSI.<br />
CONCLUSIONS<br />
Toshiba Electronics Europe GmbH<br />
estimates the total capacity of enterprise<br />
capacity (Nearline) HDDs shipped in<br />
2019 at around 500 exabytes (500,000<br />
petabytes). If all these HDDs were<br />
operated as 16TB models in 60-bay<br />
JBODs, this would result in a continuous<br />
power consumption of 225MW<br />
(equivalent to an average coal-fired<br />
power plant). However, since the majority<br />
of HDDs delivered in 2019 had even<br />
lower capacities, it can be assumed that<br />
the power consumption was even higher<br />
and it is clear that there is significant<br />
room for improvement to reduce the<br />
industry's W/TB power consumption<br />
figures.<br />
The investigations and testing<br />
undertaken by Toshiba show that, thanks<br />
to the power efficiency of the latest<br />
generation of high-capacity, helium-filled<br />
disks, petabyte storage that typically<br />
demands less than 500 W of power is
indeed achievable. This is a significant<br />
milestone for data centres working to<br />
grow capacity while keeping both capital<br />
expenditure and operating costs down.<br />
Additionally, this can be achieved in a<br />
range of configurations, from pure<br />
JBOD, through RAID, to software-defined, and in a standard-dimension 19"
rack format with a commonly available<br />
enclosure.<br />
More info: www.toshiba-storage.com<br />
OPINION: DATA PROTECTION
PEOPLE: THE WEAKEST LINK
FLORIAN MALECKI OF STORAGECRAFT WARNS THAT
ORGANISATIONS NEED TO BEWARE 'THE VULNERABILITY FROM<br />
WITHIN': HUMAN ERROR<br />
While cyber threats continue to be a<br />
massive drain on business<br />
productivity, there is another, less<br />
obvious vulnerability: unintentional employee<br />
error. Indeed, a majority of businesses say<br />
that simple human error is their leading cause<br />
of data loss, according to a survey from<br />
StorageCraft.<br />
Among survey respondents, 61% reported<br />
that their company had suffered a data loss<br />
over the last two years. More striking is that<br />
67% said human error - everyday mistakes<br />
made by employees - was the primary reason<br />
for data loss and system outages. Human error - weak passwords, for example, or "dirty" work environments - can be the pathway to security hacks, and has the potential to wreak havoc far greater than that of a third party with malicious intent.
It can be as simple as an employee<br />
misplacing a spreadsheet or spilling coffee on<br />
their laptop. It could be someone who<br />
accidentally deletes a critical file or an entire<br />
database of critical information. Then there are<br />
the real-life oddities such as dropping a laptop!<br />
These seemingly small incidents can add up<br />
and potentially cripple a business.<br />
A few years ago, software company Gliffy<br />
experienced a nightmare scenario when one<br />
of its employees pressed the wrong key and<br />
deleted the company's entire production<br />
database. The same thing happened to<br />
GitLab a few years back, resulting in a major<br />
service outage.<br />
Perhaps the most famous data-deletion story<br />
involved Pixar during the production of Toy<br />
Story 2. One of the movie's animators<br />
accidentally entered a delete command,<br />
resulting in a cascade of errors that erased<br />
90% of the production files. To make matters<br />
worse, the data-backup system failed to work<br />
properly due to inadequate disk space. For a<br />
brief moment, there were fears that the entire<br />
production would have to be scrapped. It was<br />
only a Herculean effort by the technical crew<br />
that saved the film.<br />
The data-loss problem could become even<br />
more prevalent in the current and post-COVID<br />
world, as millions of people work remotely.<br />
Moving employees, their computers, and data<br />
from a secure office environment to a less-secure home environment presents a range of
unintentional data-loss risks.<br />
The reality is that employees will continue to<br />
make mistakes - they're only human, after all.
Here are three ways that organisations can<br />
protect themselves against catastrophic data<br />
loss caused by human error:<br />
Promote good data backup habits. With so<br />
many employees working remotely, it's<br />
harder for organisations to manage<br />
backups and store data on the corporate<br />
network. Encourage employees to be<br />
responsible and back up their data<br />
regularly. If they store data on a local flash<br />
drive inserted into their laptop, they should<br />
back it up to the cloud or another hard<br />
drive. If employees store their data primarily<br />
in the cloud, they should be sure to have<br />
another copy offline.<br />
Encourage stringent cyber hygiene. All<br />
employees, especially those working<br />
remotely, need to be reminded to update<br />
the software on their devices and enable all<br />
available security features, such as firewalls<br />
and anti-malware. Failing to install updated<br />
software and security patches is a well-known employee misstep that creates gaps
for malware and ransomware to seize on.<br />
Limit the number of files employees can<br />
access. Employees should only be able to<br />
access data and folders based on the<br />
principle of 'least privilege'. This gives<br />
employees enough access to perform their<br />
required jobs but prevents them from<br />
accidentally deleting or corrupting files they<br />
shouldn't have had access to in the first<br />
place, meaning the risk caused by human<br />
error is significantly reduced.<br />
A business' weakest link may well be the<br />
'danger within', albeit unintentional. With the<br />
right strategies and processes in place,<br />
businesses can limit data loss when employees<br />
inevitably make mistakes.<br />
More info: www.storagecraft.com<br />
TECHNOLOGY: SSD
WHAT HAPPENS WHEN YOUR<br />
SSD DIES?<br />
RECOVERING DATA FROM FAILED SOLID-STATE DRIVES CAN BE MORE
CHALLENGING THAN WITH HARD DISKS, EXPLAINS PHILIP BRIDGE,<br />
PRESIDENT OF ONTRACK<br />
There is no doubt that the use of solid-state drives (SSDs) has gathered pace.
The main benefit is that they are much<br />
faster than a legacy HDD. This is because a<br />
standard HDD consists of many moving<br />
parts, as typified by the telltale 'whirring'<br />
sound we have all become accustomed to.<br />
When data needs to be accessed, the<br />
read/write head needs to move to the<br />
correct position. SSDs, by contrast, don't<br />
have any moving parts. This speed of<br />
operation makes them perfect for<br />
environments where real-time access and<br />
transfer of data is a necessity.<br />
One of the main downsides of SSDs<br />
though is that they have a limited life span.<br />
Whilst HDDs can - in theory - last forever, an<br />
SSD has a built-in 'time of death' that you<br />
can't ignore. This is because data can only<br />
be written on the storage cells a finite<br />
number of times. After that, the cells 'forget'<br />
new data. Because of this - and to prevent<br />
certain cells from getting used all the time<br />
while others aren't - manufacturers build wear-levelling<br />
algorithms into the controller to distribute<br />
writes evenly over all cells.<br />
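A purely illustrative sketch of the idea (not any vendor's actual firmware) is to send each new write to whichever block has seen the fewest program/erase cycles:<br />

```python
# Illustrative wear-levelling sketch - not any vendor's actual firmware.
# Each new write goes to the erase block with the fewest program/erase
# cycles so that no single block wears out before the others.

erase_counts = [0] * 8  # hypothetical device with 8 erase blocks

def pick_block() -> int:
    """Choose the least-worn block for the next write."""
    return min(range(len(erase_counts)), key=lambda i: erase_counts[i])

def write_block(data: bytes) -> int:
    block = pick_block()
    erase_counts[block] += 1  # each write costs this block one cycle
    return block

for _ in range(100):
    write_block(b"payload")

print(erase_counts)  # wear is spread evenly: every block shows 12 or 13 cycles
```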
When it comes to estimating this time of<br />
death, manufacturers use something called<br />
terabytes written (TBW). The TBW figure can<br />
rather accurately tell you how much data<br />
can be written in total on all cells inside the<br />
storage chips. A typical TBW figure for a<br />
250 GB SSD lies between 60 and 150<br />
terabytes written. To put this in perspective,<br />
to get over a TBW of 70, a user would have<br />
to write 190 GB daily over one year (in<br />
other words, to fill two-thirds of the SSD with<br />
new data every day). While this is highly<br />
unlikely in a consumer environment, in a<br />
21st-century business it is highly plausible.<br />
One of the most popular SSDs - the<br />
Samsung SSD 850 PRO SATA - is stated to<br />
be "built to handle 150 terabytes written<br />
(TBW), which equates to a 40 GB daily<br />
read/write workload over a ten-year period."<br />
Samsung also promises that the product can<br />
withstand "up to 600 terabytes written<br />
(TBW)". If we assume a normal office user<br />
writes somewhere between 10 and 35 GB<br />
a day, then even at 40 GB a day they<br />
could keep writing for close to five years<br />
before reaching a 70 TBW limit.<br />
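The arithmetic behind these estimates is simple enough to check; the short sketch below reproduces the figures quoted above (an assumed endurance rating of 70 or 150 TBW set against a daily write volume):<br />

```python
# Rough SSD lifespan estimate from an endurance rating (TBW) and a
# daily write volume; reproduces the example figures quoted above.

def years_until_worn_out(tbw: float, gb_written_per_day: float) -> float:
    """Years of writing before the rated terabytes-written figure is exhausted."""
    days = (tbw * 1000) / gb_written_per_day  # 1 TB taken as 1000 GB
    return days / 365

print(f"{years_until_worn_out(70, 40):.1f} years")   # ~4.8 years at 40 GB/day against 70 TBW
print(f"{years_until_worn_out(150, 40):.1f} years")  # ~10.3 years - the 850 PRO claim
print(f"{years_until_worn_out(70, 190):.1f} years")  # ~1.0 year at 190 GB/day
```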
These rates have been verified by Google<br />
and the University of Toronto who - after<br />
testing SSDs over a multi-year period - put<br />
the age limit as somewhere between five and<br />
ten years depending on usage - roughly the<br />
same lifespan as the average washing machine.<br />
WOR<strong>ST</strong> CASE SCENARIO<br />
So, what do you do if the worst happens and<br />
your SSD does indeed stop working? It is no<br />
exaggeration to say that in this era where<br />
data is king, not having access to that data<br />
could prove to be catastrophic. To mitigate<br />
the impact, it is best to contact a<br />
professional data recovery service provider<br />
where possible.<br />
When it comes to a physical fault, it is not<br />
possible for a user to recover or rescue their<br />
data themselves, however well-intentioned<br />
they may be. In fact, any attempt to recover<br />
data could make matters worse and lead to<br />
permanent data loss.<br />
Even though the average SSD lifespan is<br />
longer than users may expect, using SSDs<br />
can still pose a serious threat, as recovering<br />
data from failed SSDs is distinctly<br />
challenging. Sometimes the only solution is<br />
to find an identical functioning controller<br />
chip and swap it in to gain access - which is<br />
easier said than done.<br />
More info: www.ontrack.com/uk<br />
RESEARCH: <strong>ST</strong>ORAGE <strong>ST</strong>RATEGIES<br />
PANDEMIC INCREASES PRESSURES ON I.T.<br />
SURVEY UNCOVERS THE LIMITATIONS IMPOSED BY TRADITIONAL I.T. INFRA<strong>ST</strong>RUCTURES, EXACERBATED<br />
BY REMOTE WORKING DURING COVID-19 PANDEMIC<br />
Nebulon has released the results of an<br />
independent survey completed by IT<br />
decision makers at 500 companies in<br />
the IT, financial services, manufacturing, retail,<br />
distribution and transport industries across the<br />
UK, US, Germany and France. Conducted in<br />
June of this year, the survey exposes the biggest<br />
challenges enterprises face in transforming<br />
their on-premises application storage<br />
environments, which have only been<br />
exacerbated during this Covid-19 era. While IT<br />
organisations cite multiple restrictions, the<br />
survey reveals limited infrastructure automation<br />
and high CAPEX as the most significant<br />
challenges for those deploying enterprise<br />
storage array technology, forcing them to re-examine<br />
IT spending and operations even<br />
more so than usual amidst the pandemic.<br />
While increasing automation and reducing<br />
costs may seem like mainstream initiatives for<br />
any large organisation, the pandemic and<br />
resulting workforce restrictions mandate<br />
significant progress in days or weeks, versus<br />
months or quarters. The results of the survey<br />
further reinforce this: respondents also<br />
highlighted that their on-premises application<br />
storage environments are difficult to maintain<br />
and that they lack the in-house<br />
expertise necessary to manage them. Even<br />
more disconcerting, respondents indicate that<br />
their traditional external storage arrays are not<br />
suited to handle new workloads, including<br />
containers and NoSQL databases. This is<br />
unsurprising as modern workloads have been<br />
architected for local rather than shared storage<br />
resources.<br />
British IT decision makers specifically ranked<br />
"expensive" highest, with 57% making this one<br />
of their top three challenges, followed by "time<br />
consuming to maintain" (50%) and "difficult to<br />
automate at scale" (49%). Respondents from<br />
smaller organisations (1,000-2,999<br />
employees) were more likely to mark "lack of in-house<br />
expertise" highly compared to larger<br />
organisations (3,000+ employees) (59%<br />
compared to 31%), while these larger<br />
companies were more likely to consider cost a<br />
top challenge (61% compared to 35%).<br />
"The impact of the pandemic is forcing<br />
CIOs worldwide to reconsider their<br />
operations," said Siamak Nazari, Co-Founder<br />
and CEO of Nebulon, Inc. "Reducing costs<br />
through server-based storage alternatives<br />
without the restrictions of hyperconverged<br />
infrastructure, and reducing operating cost<br />
pressure through cloud-based management<br />
of the application storage infrastructure are<br />
crucial initiatives for IT organisations looking<br />
to survive this new normal."<br />
For companies with a growing class of<br />
mission-critical data that cannot or should not<br />
move to the public cloud, Cloud-Defined<br />
Storage is an alternative to expensive storage<br />
arrays, offering enterprises a cloud-managed,<br />
server-based approach for mission-critical<br />
storage. By combining a cloud-based control<br />
plane, called Nebulon ON, with server-based<br />
storage that is powered by the Nebulon<br />
Services Processing Unit (SPU), Nebulon<br />
enables organisations to reduce cost for<br />
enterprise storage by up to half without<br />
compromising on enterprise data services.<br />
This is made possible by Nebulon's unique<br />
architecture, which uses commodity SSDs<br />
in industry-standard servers and Ethernet rather<br />
than Fibre Channel, and which eliminates<br />
operational complexity by moving<br />
management to Nebulon ON with an as-a-service<br />
model. With the architectural and<br />
operational simplicity of Cloud-Defined<br />
Storage, application owners gain self-service<br />
infrastructure provisioning that is unmatched<br />
by existing on-premises storage solutions.<br />
"IT organisations have been seeking a costeffective<br />
alternative to external storage arrays<br />
for years," said Nazari. "With our Cloud-<br />
Defined Storage offering, they have the<br />
opportunity to reduce costs while also<br />
deploying a self-service solution for<br />
application owners that reduces the<br />
operational burden."<br />
More info: www.nebulon.com<br />