Getting Started with Data Science


Data Science has gained a lot of market strength in recent years: a field that until recently was mostly academic now draws the attention of large companies and startups. Who here does not know someone on LinkedIn who was a Software Engineer or Developer and has now become a Data Analyst or Data Scientist? It is a new wave of professionals with experience in programming languages such as Python, R, Scala and Java, along with experience or academic training in areas of mathematics such as statistics.

After attending talks at Python community events, and motivated by good friends who already work in the area, I decided to invest in this field, and in October last year I started some courses on my own.

The first course I took was Introduction to Big Data on Coursera, offered by the University of California, San Diego. My experience was extremely positive, which motivated me to create a kind of study plan to follow in the coming months.

This article presents my study plan, which can serve as a guide for anyone interested in investing in this new world of information. If you have never heard of Data Science or Machine Learning, I strongly recommend first reading the articles Machine Learning is fun and Learn Data Science with Python from Scratch.

Moving forward: all materials listed here are in English and subtitled, and they are not too scary for anyone who is not fluent; with a little effort you can follow along. It is worth highlighting that most of the courses shown here are free!

Introductory courses

For beginners, Udacity has a large collection of free, excellent-quality material. These courses last about 3 weeks on average, and I recommend taking some of them in parallel. Some are so short that they can be completed in a single weekend.

In addition to the first list, there is the math section, with Khan Academy lessons on algebra, probability and statistics, and Udemy's Big Data Basics: Hadoop, MapReduce, Hive, Pig & Spark.

It sounds like a lot of material for beginners, right? Correct, but this is an overview of what Data Science covers. If you like the statistical topics, you will probably focus more on Machine Learning; if you prefer visualization and data analysis, then perhaps the Big Data topics will be more interesting.

Intermediate level

The intermediate and advanced level courses are in the study plan, but not necessarily in that order. In this case the choice is more on demand, depending on the needs of a project or of a specific study.

Advanced level

This was the list of free courses that interested me. As you can see, Udacity provides a good structure for those who have time to invest. With good discipline and good personal management, in a few months we can evolve a lot without spending a penny.

But I would also like to add some paid courses to this study plan, and I would include Frank Kane's on that list. His courses are relatively cheap, span a number of weeks, and offer very practical material with real examples. They are:

Each of these courses costs $25 on average, so the cost-benefit is huge compared to specializations at Coursera or nanodegrees at Udacity. I explain what these are below.

Specializations

The Coursera specializations are longer programs that usually last 4 to 6 months and require a greater commitment than the free courses I mentioned above. A specialization is a pack of grouped courses that you can also purchase individually; the last course is a final project, called the Capstone Project, which must be completed to earn the certification.

Generally a full specialization has at least 5 courses, and some have as many as 10. Each course costs $60 on average.
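To make the comparison concrete, here is a small Python sketch that uses only the average figures quoted in this article: $25 per Frank Kane course, $60 per Coursera course, and 5 to 10 courses per specialization. The constant and function names are just illustrative, and real prices vary from course to course.

```python
# Average prices quoted in this article, in US dollars.
# These are rough averages, not official price lists.
FRANK_KANE_COURSE_PRICE = 25
COURSERA_COURSE_PRICE = 60

# A full specialization usually has between 5 and 10 courses.
MIN_COURSES = 5
MAX_COURSES = 10


def total_cost(n_courses: int, price_per_course: int) -> int:
    """Return the total cost of taking n_courses at a flat average price."""
    return n_courses * price_per_course


# Cost range of one full specialization.
spec_min = total_cost(MIN_COURSES, COURSERA_COURSE_PRICE)
spec_max = total_cost(MAX_COURSES, COURSERA_COURSE_PRICE)

# Cost of the same number of standalone paid courses, for comparison.
standalone_min = total_cost(MIN_COURSES, FRANK_KANE_COURSE_PRICE)
standalone_max = total_cost(MAX_COURSES, FRANK_KANE_COURSE_PRICE)

print(f"Specialization: ${spec_min} to ${spec_max}")
print(f"Same number of standalone courses: ${standalone_min} to ${standalone_max}")
```

The comparison is only about price: a specialization also includes the Capstone Project and a certification, which the standalone courses do not replace.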

The Introduction to Big Data course that I started in October is the first course of the Big Data Specialization, which I will complete this year.

These two specializations are among my goals for 2016. Note that Coursera offers other specializations in Data Science; the list is huge.

Udacity's nanodegrees are not in my plans for now, not because of their quality but for reasons of time and cost. Nanodegrees are huge courses lasting from 9 to 12 months, with a monthly cost of $200, something a little out of my Brazilian reality with the dollar so high today. But they are recommended for those who want to take on this mini-college: they promise a guaranteed job and a teacher available to you. The course curricula are very good, developed in partnership with companies like Google and Facebook:

Conclusion

This article, besides being a guide for beginners in Data Science, also serves as a personal goal for my career: studying these materials and continuing with my specializations.

What do you think of this guide? Was it helpful for you? Comment below and see you soon!

