On a project for a recent client I needed to apply some common Natural Language Processing (NLP) techniques to surveys they had gathered, but one of … (from "Importing a Word Document into RapidMiner")
I'm loving Seahorse, a GUI frontend for Spark by deepsense.io. The interface is simple, elegant, and beautiful, and has the potential to significantly … (from "Using Seahorse for Spark on a Cloudera HA Cluster")
You'll probably see a lot of CSV files in the workplace, or generate them from the vast ocean of spreadsheets that are floating around the average … (from "Installing MySQL")
Learn to use Machine Learning to solve real-world business problems.
In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. Updated for Spark 2.1, this edition acts as an introduction to these techniques and other best practices in Spark programming.
If you want to learn Data Science, there's no escaping the need for hands-on experience with the tools you'll be using on a daily basis. This email course takes you through installing the free and open source tools that top professionals use on a regular basis. Start getting your feet wet today.
You'll also receive new tutorials and announcements as soon as they are posted.
What Students Are Saying
Super. Straight to the key points and simple examples to each component. In my opinion, anyone looking for a good start should start here. Loved it.
This is an introductory course to this amazing technology but compared with other similar courses I had in the past, it is clearly described from a practical point of view and not only from a technical point of view. Of course this course is a starting point but the major advantage is that you will hear about a concrete experience from a skilled professional.
Short and Crisp. This is probably the 7th or 8th Hadoop course I went through. This one tops all.
Mr. King’s explanations are (like most of the material used in the course) clear and fresh, describing all the concepts, general usage of the tools and even showing common mistakes, so we can avoid them in the future.