
Top 10 Reasons to Run Hadoop in the Public Cloud!


Running the Hadoop ecosystem in the public cloud means running Hadoop clusters on hardware offered by a cloud service provider. This practice is normally compared with running Hadoop clusters on our own hardware, called on-premises clusters or “on-prem”. Installing a Hadoop cluster on a public cloud service is not as straightforward as it may appear; it is a little tricky because Hadoop is distributed and parallel in nature. Even so, there are good reasons to do it:

  1. Lack of space – Our organization or clients may need Hadoop clusters, but we don’t have anywhere to keep racks of physical servers, along with the necessary power and cooling.
  2. Flexibility – Without physical servers to rack up or cables to run, it is much easier to reorganize instances, or expand or contract our footprint, for changing business needs. Everything is controlled through cloud provider APIs and web consoles. Changes can be scripted and put into effect manually or even automatically and dynamically based on current conditions.
  3. New usage patterns – The flexibility of making changes in the cloud leads to new usage patterns that are otherwise impractical. For example, individuals can have their own instances, clusters, and even networks, without much managerial overhead. The overall budget for CPU cores in our cloud provider account can be concentrated in a set of large instances, a larger set of smaller instances, or some mixture, and can even change over time.
  4. Speed of change – It is much faster to launch new cloud instances or allocate new databases than to purchase, unpack, rack, and configure physical computers. Similarly, unused resources in the cloud can be torn down swiftly, whereas unused hardware tends to linger wastefully.
  5. Lower risk – How much on-prem hardware should we buy? If we don’t have enough, the entire business slows down. If we buy too much, we’ve wasted money and have idle hardware that continues to waste money. In the cloud, we can quickly and easily change how many resources we use, so there is little risk of undercommitment or overcommitment. What’s more, if some resource malfunctions, we don’t need to fix it; we can discard it and allocate a new one.
  6. Focus – An organization that rents resources from a cloud provider, instead of spending time and effort on the logistics of purchasing and maintaining its own physical hardware and networks, is free to focus on its core competencies, like using Hadoop clusters to carry out its business. This is a compelling advantage for tech startups and for small and medium-sized organizations.
  7. Worldwide availability – The largest cloud providers have data centers around the world, ready for us from the start. We can use resources close to where we work, or close to where our customers are, for the best performance. We can set up redundant clusters, or even entire computing environments, in multiple data centers, so that if local problems occur in one data center, we can shift to working elsewhere.
  8. Data storage requirements – If we have data that is required by law to be stored within specific geographic areas, we can keep it in clusters that are hosted in data centers in those areas.
  9. Cloud provider features – Each major cloud provider offers an ecosystem of features to support the core functions of computing, networking, and storage. To use those features most effectively, our clusters should run in the cloud provider as well.
  10. Capacity – Few customers tax the infrastructure of a major cloud provider. We can establish large systems in the cloud that are not nearly as easy to put together, not to mention maintain, on-prem.
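Reason 2 mentions that changes can be scripted and applied automatically based on current conditions. As a rough illustration only, the sketch below shows the kind of decision such a script might make. The function names, the backlog metric, and the limits are all hypothetical; a real script would pass the result to whatever API the chosen cloud provider offers.

```python
import math


def desired_workers(pending_tasks, tasks_per_worker, min_workers=1, max_workers=20):
    """Return how many worker instances to run for the current backlog.

    All parameters are illustrative assumptions, not values from any provider.
    """
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    # Round up so that a partial batch of tasks still gets a worker.
    needed = math.ceil(pending_tasks / tasks_per_worker)
    # Clamp to the footprint we are willing to pay for.
    return max(min_workers, min(max_workers, needed))


def scaling_action(current_workers, target_workers):
    """Describe the change needed to move from the current to the target size."""
    delta = target_workers - current_workers
    if delta > 0:
        return f"add {delta} instance(s)"
    if delta < 0:
        return f"remove {-delta} instance(s)"
    return "no change"


if __name__ == "__main__":
    target = desired_workers(pending_tasks=95, tasks_per_worker=10)
    print(scaling_action(current_workers=4, target_workers=target))
```

The upper limit is what keeps an automatic policy like this from overcommitting (reason 5), while the lower limit keeps a minimal cluster alive when the backlog is empty.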

Ref. Moving Hadoop to the Cloud by Bill Havanki (Early Release; waiting to grab the hard copy on the first day).

Interested? Questions? Feedback? Let us have coffee@dataottam.com!

 

Please subscribe to www.dataottam.com to keep yourself trendy on ABCD of Data (Analytics, Big Data, Cloud Computing, and Digital).

