What you’ll learn
- Understand how MapReduce can be used to analyze big data sets
- Write your own MapReduce jobs using Python and MRJob
- Run MapReduce jobs on Hadoop clusters using Amazon Elastic MapReduce
- Chain MapReduce jobs together to analyze more complex problems
- Analyze social network data using MapReduce
- Analyze movie ratings data using MapReduce and produce movie recommendations with it
- Understand other Hadoop-based technologies, including Hive, Pig, and Spark
- Understand what Hadoop is for, and how it works
Requirements
- You’ll need a Windows system, and we’ll walk you through downloading and installing a Python development environment and the tools you need as part of the course. If you’re on Linux and already have a Python development environment in place that you’re familiar with, that’s OK too. Be sure you have at least some programming or scripting experience under your belt. You won’t need to be a Python expert to succeed in this course, but you’ll need the fundamental concepts of programming in order to pick up what we’re doing.
“Big data” analysis is a hot and highly valuable skill, and this course will quickly teach you two technologies fundamental to big data: MapReduce and Hadoop. Ever wonder how Google manages to analyze the entire Internet on a continual basis? You’ll learn those same techniques, using your own Windows system right at home.
Learn and master the art of framing data analysis problems as MapReduce problems through over 10 hands-on examples, and then scale them up to run on cloud computing services. You’ll be learning from a former engineer and senior manager at Amazon and IMDb.
- Learn the concepts of MapReduce
- Run MapReduce jobs quickly using Python and MRJob
- Translate complex analysis problems into multi-stage MapReduce jobs
- Scale up to larger data sets using Amazon’s Elastic MapReduce service
- Understand how Hadoop distributes MapReduce across computing clusters
- Learn about other Hadoop technologies, like Hive, Pig, and Spark
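To give a feel for what a multi-stage job looks like, here is a minimal sketch in plain Python. It does not use MRJob or Hadoop; the `run_stage` helper, the sample ratings, and the `user,movie,rating` line format are all assumptions made up for illustration. Stage 1 counts ratings per movie, and stage 2 re-keys that output by count so the results come out ordered by popularity.

```python
from collections import defaultdict


def run_stage(records, mapper, reducer):
    """Simulate one MapReduce stage in memory: map, group by key, reduce.

    This is a hypothetical helper for illustration, not an MRJob or Hadoop API.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle": collect values per key
    output = []
    for key in sorted(groups):  # keys reach the reducer in sorted order
        output.extend(reducer(key, groups[key]))
    return output


# Stage 1: count how many ratings each movie received.
def map_movie(line):
    user_id, movie_id, rating = line.split(",")
    yield movie_id, 1


def reduce_count(movie_id, ones):
    yield movie_id, sum(ones)


# Stage 2: re-key by count so the output is ordered by number of ratings.
def map_by_count(pair):
    movie_id, count = pair
    yield count, movie_id


def reduce_passthrough(count, movie_ids):
    for movie_id in sorted(movie_ids):
        yield count, movie_id


# Made-up sample data in an assumed "user,movie,rating" format.
ratings = [
    "u1,m1,5",
    "u2,m1,3",
    "u1,m2,4",
    "u3,m1,4",
    "u2,m3,2",
    "u3,m2,5",
]

stage1 = run_stage(ratings, map_movie, reduce_count)
stage2 = run_stage(stage1, map_by_count, reduce_passthrough)
print(stage2)
```

The output of one stage is simply the input of the next, which is the core idea behind chaining jobs; a real cluster does the same thing with the data spread across many machines.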
By the end of this course, you’ll be running code that analyzes gigabytes’ worth of information, in the cloud, in a matter of minutes.
We’ll have some fun along the way. You’ll get warmed up with some simple examples of using MapReduce to analyze movie ratings data and text in a book. Once you’ve got the basics under your belt, we’ll move to some more complex and interesting tasks. We’ll use a million movie ratings to find movies that are similar to each other, and you might even discover some new movies you might like in the process! We’ll analyze a social graph of superheroes, learn who the most “popular” superhero is, and develop a system to find “degrees of separation” between superheroes. Are all Marvel superheroes within a few degrees of being connected to The Incredible Hulk? You’ll find the answer.
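As a taste of the warm-up exercises, counting words in a book can be sketched as a mapper and a reducer in plain Python. This is only an illustration under stated assumptions: it runs in memory on three made-up lines, the `word_count` driver is a hypothetical stand-in for the sort-and-group step that a framework would perform, and it is not the course’s own code.

```python
import re
from itertools import groupby
from operator import itemgetter

WORD_RE = re.compile(r"[a-z']+")


def mapper(line):
    """Emit a (word, 1) pair for every word on the line."""
    for word in WORD_RE.findall(line.lower()):
        yield word, 1


def reducer(word, counts):
    """Add up all the 1s emitted for a single word."""
    yield word, sum(counts)


def word_count(lines):
    """Hypothetical in-memory driver: map, sort by key, group, reduce."""
    mapped = [pair for line in lines for pair in mapper(line)]
    mapped.sort(key=itemgetter(0))  # stand-in for the shuffle/sort phase
    results = {}
    for word, group in groupby(mapped, key=itemgetter(0)):
        for key, total in reducer(word, (count for _, count in group)):
            results[key] = total
    return results


# Made-up sample text standing in for a book.
book = [
    "The quick brown fox",
    "jumps over the lazy dog",
    "The dog barks",
]

counts = word_count(book)
print(counts["the"], counts["dog"])
```

The mapper only ever sees one line and the reducer only ever sees one word, which is what lets a cluster split the same work across many machines.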
This course is very hands-on; you’ll spend most of your time following along with the instructor as we write, analyze, and run real code together, both on your own system and in the cloud using Amazon’s Elastic MapReduce service. Over 5 hours of video content is included, with over 10 real examples of increasing complexity you can build, run, and study yourself. Move through them at your own pace, on your own schedule. The course wraps up with an overview of other Hadoop-based technologies, including Hive, Pig, and the very hot Spark framework, complete with a working example in Spark.