Saai talks Data Science #1
Date: 2021-03-08
Ever looked at the spam folder in your e-mail and thought, “Whoa! How does this thing know whether a mail is spam or not?”
The answer is Natural Language Processing or, from a broader perspective, Machine Learning and Deep Learning. These terms are often treated as if they mean the same thing as Artificial Intelligence, but they are really subfields of it.
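To make the spam folder a little less magical, here is a minimal sketch of one classic idea, a naive Bayes word-count classifier, in plain Python. The training messages and the function names `train` and `classify` are made up for illustration; real spam filters use far more data and far richer features, so treat this as a toy under those assumptions and not as how any actual mail provider works.

```python
import math
from collections import Counter

# Tiny made-up training set (hypothetical data, for illustration only).
TRAINING = [
    ("win free money now", "spam"),
    ("claim your free prize now", "spam"),
    ("cheap loans win cash", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch with the team tomorrow", "ham"),
    ("project report attached", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-score."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total)
        n_words = sum(counts.values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(TRAINING)
print(classify("claim a free prize", word_counts, label_counts))     # spam
print(classify("team meeting tomorrow", word_counts, label_counts))  # ham
```

The classifier simply asks which label makes the words in a message more likely, given the words it has already seen under each label. That is the "mostly math" part in miniature.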
“Data Science”: I am 99% sure you have heard this term before. The other 1% is for people who have been living in a cave all these years. If you have been living in a cave, the upcoming series of posts should lay out the basics for you and get you started in this wonderful intersection of Statistics and Computer Science.
Yeah, you heard it right! STATISTICS and also PROBABILITY, i.e. MATHEMATICS. To be a good Data Scientist, your Statistics skills should be top-notch. Note that this is a required skill, but not the only one. That is why Statisticians find it easier to transition into Data Science jobs.
Apart from Mathematics, you might also have to learn:
- Machine Learning: mostly math
- Deep Learning: math again
- Big Data concepts (Apache Hadoop and Apache Spark): an intermediate level of knowledge will suffice as of the time I am writing this post.
- DBMS: Database Management Systems (so, if you have been running away from SQL, FACE IT!)
- Data Visualization tools: there are many wonderful tools like D3.js, Plotly, Matplotlib, and Seaborn.
- Storytelling skills: if you can talk about a bar chart for at least 15 minutes non-stop, you are good to go.
- Communication skills: so that you don’t just stare at your audience, in this case your clients or your employer.
- Ability to work in a team: I shouldn’t have to explain why this is important. You can’t manage an entire Hadoop cluster, keep it secure, analyze the data, design your neural network, and communicate with your clients all at the same time.
- Most important skill: curiosity. Don’t worry, humans are born with it; you will awaken your inner “Cat” soon enough.
- I almost forgot (literally): an expert understanding of either Python or R. You might find it difficult with other languages due to the lack of community resources. In my posts, I will use Python.
Data Science might sound easy, but the subject is vast. With the ten points above, I am just scratching the surface. The learning never really ends, so by the time you retire you will have a huge skill set. So,
if you are in it for the money and do not really like the subject,
“Let me warn you, it’s not going to be a lot of fun. The Bandwagon won’t take you too far, unless you are super determined.”
If you are genuinely interested in the subject, though, carry on happily; you’ll definitely love this field.
One more tip I would like to give you: use open-source tools as much as possible. The knowledge you gain from open-source tools carries over easily to enterprise tools.
Why Open-Source?
Coz it is the best thing that ever happened to humanity; the word shouts “Unity”. Take “Wikipedia”, the ultimate encyclopedia. It couldn’t have been made by one organization, let alone one person. Its contributors come from various countries, various backgrounds, and various skill sets, and yet its marvel stuns all!
So every time someone asks you why open-source is important, tell them,
“What one man achieves in a year, a community achieves in a minute”
IT’S FREE! Do you need a better reason?
Here are some excellent websites every student of Data Science might want to know about:
- YouTube: DeepLearning.ai. Take a moment to thank Andrew Ng and his wonderful team for making such top-notch content free to watch. There are Coursera courses too, if you want to get certifications.
Some Python libraries that are essential:
- NumPy
- pandas
- SciPy
- scikit-learn
- Matplotlib
- Seaborn
- TensorFlow (or) PyTorch: they are the leaders as of the time I am writing this post.
I have listed a few of the best resources and libraries I know; if you have any others in mind, do drop them in the comments.
Hope you liked this post. Thank you! Drop any feedback you have in the comments, and I’ll make sure to fix any issues you find.
Thank you…see ya!