Well, it’s basic algebra of course:
CDP = CDH + HDP
CDP (Cloudera Data Platform) is the unified build, from the DNA of Cloudera (CDH) and Hortonworks (HDP). Cloudera did muddy the waters a little by also giving their enterprise cloud offering the name CDP. The idea here (I postulate) was that you can transition from CDP on premise to CDP in the cloud. I think marketing named them the same so they can sell either one. I think they also wanted to strengthen the idea that it’s a straightforward migration because they’re “the same”. In my experience, even though they likely share a lot of DNA, they will not be “the same”, and there will be edge cases that hurt if you hit them. Don’t get me wrong, I like the idea, but practically I don’t like that the name alone doesn’t tell you whether you are buying cloud or on premise, so we always have to add “cloud” or “on premise” at the end of the sentence. The notion that they aren’t the same is strengthened by the fact that they didn’t release both at the same time, which suggests they needed to make further changes to the “on premise” version.
That, however, is the limit of my complaints. What I really like about using CDP in the cloud is the simplicity of the actual cluster installation. I like that they came up with some pre-canned solutions to make onboarding easier. They have a data warehouse solution and a data science solution. These will make great starter packs to let people explore and try out features. I feel that seasoned cluster users may wish for their own recipes, but it’s great to have something to start with.
Creating a cluster is finally easy. Some of the DNA of CDP public cloud came from Cloudbreak. Early versions of Cloudbreak were prone to error because of the amount of human typing required (blueprints/cluster templates). CDP has removed the opportunity for you to make typos unless you really want to. They have automated the generation to the point that you don’t need to get in and touch it, but they kept the ability to edit the blueprint if you really need to. All I can say is “thank you, Cloudera” for really making this into a usable system. Spinning up clusters is pretty much push button now. I cannot see why anyone would want to build their own if they are in the cloud.
As of March 2020, Cloudera supports Azure and AWS, and has promised to also support GCP in the near future.
Resources:
https://www.cloudera.com/content/dam/www/marketing/resources/webinars/get-an-exclusive-first-look-at-cloudera-data-platform.png.landing.html