Microsoft Azure and Big Data
Danut Rusu, Vlad Frasineanu, Andrei Badoi
“Our vision is to create innovative technology that is accessible to everyone and that adapts to each person’s needs. Accessible technology eliminates barriers for people with disabilities and it enables individuals to take full advantage of their capabilities.” – Bill Gates
Big Data and Cloud Computing
- What is Big Data?
- What is Cloud Computing?
- How are these two connected to each other?
- How can they help us in our day-to-day lives?
These are just a few questions that need to be answered before we begin talking about our chosen subject.
Let’s start with Big Data. Is it really just data which is “big” (gigabytes, terabytes)? Well, it is not. It is about how to organize the data, how to label its different parts, and which technologies are used to store and retrieve it. It is a model for collecting, storing, handling and extracting information from different kinds of data.
Over the last few years, “cloud” has been one of the most used words in tech, and 2016 is no exception – for good reason. Nearly three-fourths (70 percent) of IT professionals report their organizations use public cloud solutions, and nine in 10 (92 percent) say their companies have services that should be running in the public cloud, but aren’t currently (source: IT Pro Cloud Survey). As organizations embrace the cloud globally, we’ve seen digital transformation of entire industries powered by the cloud – from automotive builders creating connected cars to new retail customers leveraging cloud-based data and advanced analytics to personally tailor customer experiences.
Cloud Computing represents the model of computing on the fly. Everything is dematerialized and there is no need for a specific setup or a certain computer in order to manage it. For instance, we can think about Dropbox or Google Drive. One can upload, download and access files from anywhere in the world with just a simple device and an internet connection.
Now that we know the meaning of each concept, can we describe them with just one word?
We could try and say that Cloud Computing represents the infrastructure and Big Data the content. Are they really connected? Yes! Without cloud computing, we would not be able to process Big Data, because ordinary machines cannot handle it on their own.
Why Microsoft?
“Microsoft’s mission is to enable people and businesses throughout the world to realize their full potential. We deliver by striving to create technology that is accessible to everyone—of all ages and abilities. Microsoft is one of the industry leaders in accessibility innovation and in building products that are safer and easier to use.” This is why we chose to write about Microsoft Azure.
What is Microsoft Azure?
Microsoft Azure is a cloud computing platform created for building, deploying and maintaining different applications and services – analytics, computing, database, mobile, networking, storage and web – through a global network of Microsoft’s data centers. The software supports the broadest selection of operating systems, programming languages, frameworks, tools, databases and devices. Any developer or IT professional can be productive with Azure. The integrated tools and pre-built templates and services make it easier to manage applications with technologies that you already know. For example, one can build apps for iOS, Android and Windows devices in any language.
Microsoft and Big Data
Nowadays, a large amount of data is generated every second. One might call this “Big Data”, but that is only half of the truth. Besides sheer size, there is another important aspect of working with this kind of data: running analytics on it. A huge amount of data kept safe and untouched on some machine is useless, so both individuals and companies run different kinds of analysis on it for different purposes. For example, a web developer may want to know what the visitors of a website think about its content, and a businessperson may want to know what customers think of the products. More importantly, you might need to discover that your latest promotional campaign had the biggest effect on people aged between 40 and 50 living in Bremen, DE, and why.
Being able to answer these kinds of questions is a vital resource in today’s competitive environment. The problem is that the source data containing the information may be very difficult to analyze: it might be distributed across many different databases or files, or be in a format that is hard to process.
To resolve these issues, data analysts are adopting techniques that were commonly at the core of data processing in the past, but have been sidelined in the rush to modern relational database systems and structured data storage. The new buzzword is “big data”, and the associated solutions encompass a range of technologies and techniques that allow you to extract real and useful information from very large quantities of data. This is where Microsoft comes into play, with its applications dedicated to Azure.
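To make the campaign example above more concrete, here is a minimal sketch of that kind of analysis as a grouped aggregation. The data, the field layout and the function names are our own assumptions for illustration; a real Big Data solution would run the same idea across many machines rather than over a small in-memory list.

```python
from collections import defaultdict

# Hypothetical campaign responses: (city, age, converted)
responses = [
    ("Bremen", 44, True),
    ("Bremen", 47, True),
    ("Bremen", 23, False),
    ("Hamburg", 45, False),
    ("Hamburg", 31, True),
    ("Hamburg", 35, False),
    ("Bremen", 52, False),
]

def age_band(age):
    """Map an age to a ten-year band such as '40-49'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def conversion_by_segment(rows):
    """Return {(city, age band): conversion rate} for the given rows."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for city, age, converted in rows:
        key = (city, age_band(age))
        totals[key] += 1
        if converted:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}

rates = conversion_by_segment(responses)
# The segment with the highest conversion rate in this sample.
best_segment = max(rates, key=rates.get)
```

Note that this only answers the "who" part of the question. The "why" usually requires joining in further data sources, which is exactly where the difficulties described above begin.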
The Three V’s problem
Volume: Big Data solutions typically store and query hundreds of terabytes of data, and the total volume grows day by day. Storage has to handle this volume and work efficiently by scaling out across multiple machines.
Note: We can think of a terabyte of data as roughly 1,500 CD-ROMs.
Variety: New data may not match any existing data schema. It can be unstructured, for example. Applying a schema to the data before or during storage is therefore impractical.
Velocity: Data is being collected at an increasing rate from new types of applications and devices. The design and the implementation must be able to manage this efficiently, and results have to be returned within an acceptable timeframe.
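The Variety point can be illustrated with a small sketch of what is often called schema-on-read: raw records are stored as they arrive, and a schema is applied only when the data is queried. The record shapes and the helper name `read_with_schema` are assumptions made up for this example, not part of any Azure API.

```python
import json

# Hypothetical raw store: records are kept exactly as they arrived,
# with no schema enforced at write time.
raw_store = [
    '{"user": "anna", "page": "/home", "ms": 120}',
    '{"device_id": 7, "temperature": 21.5}',
    '{"user": "ben", "page": "/cart"}',
]

def read_with_schema(lines, required, optional=()):
    """Apply a schema at read time.

    Records missing a required field are skipped; missing optional
    fields are filled with None.
    """
    rows = []
    for line in lines:
        record = json.loads(line)
        if all(name in record for name in required):
            row = {name: record[name] for name in required}
            row.update({name: record.get(name) for name in optional})
            rows.append(row)
    return rows

# Two different "views" over the same raw data.
page_views = read_with_schema(raw_store, required=["user", "page"], optional=["ms"])
sensor_readings = read_with_schema(raw_store, required=["device_id", "temperature"])
```

The same raw store serves both queries, so a new kind of record can be added later without migrating anything that was stored before.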
“Deliver better experiences and make better decisions by analyzing massive amounts of data in real time. Get the insight you need to deliver intelligent actions that improve customer engagement, increase revenue, and lower costs.”
Microsoft Azure brings together large volumes of data from different areas, while taking care of the three V’s problem. This data is used by companies in different industries, enabling them to monitor their performance and results, optimize and adapt their applications and products according to users’ preferences, and even predict the maintenance or solutions needed in different situations. Azure makes the manipulation of data more efficient in increasingly complex and distinct environments.
Nowadays the world is taking a serious turn towards staying connected. This is where the term Internet of Things (IoT) comes into use. As of 2016, the vision of the Internet of Things has evolved due to a convergence of multiple technologies, including ubiquitous wireless communication, real-time analytics, machine learning, commodity sensors, and embedded systems.
The big tech leaders have also had their say on this new concept. Microsoft has rolled out an IoT Suite for its Azure public cloud. To ease customers’ transition to IoT, Microsoft offers a handful of preconfigured IoT templates, including one for customers to remotely monitor IoT devices, and another for rolling out predictive maintenance.
If an application needs a starting point for an IoT solution, or follows common patterns in IoT solution design and development, the Azure IoT Suite comes with preconfigured solutions. These are implementations of common IoT solution patterns that you can deploy to Azure using your subscription. For example, the following diagram illustrates the key elements of the remote monitoring solution. The sections below provide more information about these elements.
Malware is a leading cause of identity compromise, with its ability to run in the background and collect information such as usernames and passwords, and transmit them back to the attacker. With these credentials, an attacker can access, modify, or destroy your valuable data. If the compromised account has administrative privileges, the attacker can change system or account settings and do much more damage. Thus, an important element in keeping user identities secure is protecting them from the effects of malicious software.
Microsoft cloud services help you protect against malware threats in multiple ways. Microsoft Antimalware is built for the cloud, and additional anti-malware protections are provided in specific services.
It is well documented that malware and targeted attacks are on the rise, and that most organizations are struggling to effectively combat these threats. Conventional countermeasures continue to do their jobs, often admirably, but are simply not up to the challenge of server-side polymorphism, customized attacks, and other advanced tactics. Big Data can help in this case by generating substantially better models of normal behavior and revealing deviations from these patterns to more thoroughly and accurately identify malicious activity.
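As a rough illustration of "modeling normal behavior and revealing deviations", the sketch below builds a baseline from past observations and flags new values that lie far from it. This is a deliberately simple z-score rule chosen by us for the example; the login counts are made up, and real security analytics would use far richer models.

```python
from statistics import mean, stdev

def find_deviations(baseline, observations, threshold=3.0):
    """Flag observations more than `threshold` standard deviations
    away from the mean of the baseline."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    return [x for x in observations if abs(x - mu) > threshold * sigma]

# Hypothetical logins per hour for one account during normal operation.
normal_logins = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6]

# New measurements to check against the baseline.
new_logins = [5, 6, 40, 4]

suspicious = find_deviations(normal_logins, new_logins)
```

The more data is available for the baseline, the better the picture of "normal" becomes, which is the sense in which Big Data helps here.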
Microsoft Antimalware was previously developed for and used on Windows XP, Windows Vista and Windows 7. It served as an antivirus product that fights malware (malicious software), including computer viruses, spyware, Trojan horses and rootkits. Built upon the same virus definitions and scanning engine as other Microsoft antivirus products, the Microsoft Antimalware service provides real-time protection: it constantly monitors activity on the computer, scans new files as they are downloaded or created, and disables detected threats.
Big Data faces a challenge when it comes to security. The majority of systems were developed to protect the limited scope of information stored on the hard disk, but Big Data goes beyond hard disks and isolated systems. Almost all data security issues are caused by the lack of effective measures provided by antivirus software and firewalls.
Microsoft Antimalware can also be installed on Azure systems, where it serves the same purpose: a free real-time protection capability that helps identify and remove viruses, spyware, and other malicious software, with configurable alerts when known malicious or unwanted software attempts to install itself or run on your Azure systems.
The program includes advanced features such as real-time protection; scheduled scanning; malware remediation (automatically taking action on detected malware, such as deleting or quarantining malicious files and cleaning up malicious registry entries); automatic updates to the latest protection signatures (virus definitions) at a predetermined frequency, so that protection stays up to date; and automatic updates of the Microsoft Antimalware engine and platform.
Real World Case – Jet.com and Microsoft Azure
“To be one of the best e-commerce destinations in the US, we will have to handle millions of customers…. That requires a top-class e-commerce system built on a flexible, open cloud platform. That is exactly what we got with Azure.” – Mike Hanrahan, CTO, Jet.com
In 2010, Marc Lore sold his company Quidsi to Amazon for $550 million. Four years later, Marc was competing against Amazon with the creation of a new online marketplace called Jet.com.
Jet uses an innovative pricing engine which reduces or even eliminates costs in the e-commerce value chain, especially fulfillment costs and marketplace commissions.
“Our pricing engine will continually work out the most cost-effective way to fulfill an order from merchant locations closest to the consumer,” explains Lore, Co-Founder and CEO of Jet. “The engine will also figure out which merchants can fulfill most cheaply by putting multiple requested items into one shipment. And so we can cut probably 10 percent of a cost of a typical e-commerce transaction just by being smarter about fulfillment.”
That said, the company is looking forward to exponential growth. “We want to be one of the leading e-commerce destinations in a very short period of time – 18 to 36 months,” says Mike Hanrahan, CTO at Jet.
Problem: Jet had to find the right cloud partner to support its ambitious growth plan.
Solution and Benefits
Jet decided to partner with Microsoft and to solve this problem with Azure.
Working with Microsoft Azure cloud services has provided Jet with a level of flexibility and scalability that has been critical to its aggressive development schedule.
- From code to production in minutes
Jet has been able to dramatically streamline its development process. It designed, developed, deployed and scaled web apps more rapidly.
- Scaling automatically to meet customer demand
Jet required extremely rapid and flexible scaling based on ever-changing customer traffic. It was able to scale its servers based on load or on a schedule.
- Accommodating rapidly growing storage requirements
As the company grew, Azure provided a wide range of storage options to handle virtually any amount of data.
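To give an idea of what scaling "based on load or on a schedule" can mean in practice, here is a minimal sketch of such a rule. It is not Azure's autoscale API and not Jet's actual configuration; the function name, the CPU thresholds, the evening peak window and the instance limits are all assumptions chosen for illustration.

```python
from datetime import time

def desired_instances(cpu_percent, now, current,
                      min_instances=2, max_instances=10):
    """Return the instance count a simple load- or schedule-based rule would pick."""
    # Schedule rule (assumed): keep at least 6 instances during an evening peak.
    floor = 6 if time(18, 0) <= now < time(22, 0) else min_instances

    # Load rule (assumed): add an instance above 70% CPU, remove one below 30%.
    if cpu_percent > 70:
        target = current + 1
    elif cpu_percent < 30:
        target = current - 1
    else:
        target = current

    # Never go below the scheduled floor or above the configured maximum.
    return max(floor, min(max_instances, target))

# High load in the morning: scale out by one instance.
morning = desired_instances(85, time(10, 0), current=3)

# Low load during the evening peak: the schedule still keeps 6 instances.
evening = desired_instances(20, time(19, 0), current=3)
```

Combining both rules lets the schedule guarantee capacity for predictable peaks, while the load rule reacts to traffic nobody planned for.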
With Azure, Jet has created a cloud infrastructure that’s ready to meet the company’s most ambitious growth plans. “To be one of the best e-commerce destinations in the US, we will have to handle millions of customers, placing tens of thousands of orders a day. That requires a top-class e-commerce system built on a flexible, open cloud platform. That is exactly what we got with Azure,” says Hanrahan.
References and bibliography