24.07.2016 Views

www.allitebooks.com

Learning Data Mining with Python


Working with Big Data

Governments are increasingly using big data too, to track populations, businesses, and other aspects of their country. Tracking millions of people and billions of interactions (such as business transactions or health spending) has led to a need for big data analytics in many government organizations.

Traffic management is a particular focus of many governments around the world, which are tracking traffic using millions of sensors to determine which roads are most congested and to predict the impact of new roads on traffic levels.

Large retail organizations are using big data to improve the customer experience and reduce costs. This involves predicting customer demand in order to hold the correct level of inventory, upselling customers with products they may like to purchase, and tracking transactions to look for trends, patterns, and potential fraud.

Other large businesses are also leveraging big data to automate aspects of their business and improve their offerings. This includes leveraging analytics to predict future trends in their sector and track external competitors. Large businesses also use analytics to manage their own employees, tracking them to look for signs that an employee may leave the company, in order to intervene before they do.

The information security sector is also leveraging big data to look for malware infections in large networks by monitoring network traffic. This can include looking for odd traffic patterns, evidence of malware spreading, and other oddities. Advanced Persistent Threats (APTs) are another problem, where a motivated attacker hides their code within a large network to steal information or cause damage over a long period of time. Finding APTs is often a case of forensically examining many computers, a task that simply takes too long for a human to perform effectively. Analytics helps automate the analysis of these forensic images to find infections.

Big data is being used in an increasing number of sectors and applications, and this trend is likely to continue.

MapReduce

There are a number of approaches for performing data mining and general computation on big data. One of the most popular is the MapReduce model, which can be used for general computation on arbitrarily large datasets.
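To make the idea concrete, the following is a minimal, single-machine sketch of the MapReduce pattern applied to word counting. The function names (map_function, shuffle, reduce_function, word_count) and the sample documents are assumptions made for illustration, not part of any framework. A real MapReduce system would distribute the map and reduce steps across many machines, which this sketch does not attempt.

```python
from collections import defaultdict


def map_function(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_function(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on an iterable of documents."""
    pairs = (pair for doc in documents for pair in map_function(doc))
    grouped = shuffle(pairs)
    return dict(reduce_function(key, values) for key, values in grouped.items())


# Hypothetical sample data, used only to exercise the sketch.
documents = ["big data is big", "data mining with big data"]
counts = word_count(documents)
print(counts)
```

Because each call to map_function depends only on its own document, and each call to reduce_function depends only on the values for its own key, both steps could in principle run in parallel on separate machines. That independence is what lets the model scale to very large datasets.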
