A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.
– Josh Wills, Cloudera
Data science is easily one of the most sought-after skills in the market today. According to a CNBC article, ‘data scientist’ is among the top 10 professions of 2017 with respect to factors such as job growth, median salary, physical effort required, stress levels, and so on. With the median salary marked at $111,267 and an expected job growth of 16%, a career in data science can be lucrative indeed.
Having said that, it takes a great deal to be a data scientist. The job requires a lot of imagination and sound technical skills, particularly skill with numbers. One must have the ability to gather the right data, shape the data for analysis, devise creative approaches to visualize the data, and answer specific, pertinent questions with the insights from the data.
For this very reason, an ace data scientist often receives the rockstar treatment at a technology firm. Life as a data geek is no walk in the park though. With new tools and problem-solving techniques coming out every other day, data scientists need to be perennial learners to keep their knowledge and skills up to date and remain valuable assets to their organizations.
Below are 12 situations that capture what it is like to be in the shoes of a data scientist.
When somebody asks you, “What is big data?”
We live in a highly digitized world, and big data is everywhere. We constantly produce an incredible amount of data — think social media, online banking, mobile shopping, GPS, and so on. In fact, we reportedly produce about 2.5 quintillion bytes of data in a day!
Big data has changed the way we communicate with people and manage our lives. The insights from big data are what help retail sites send you product recommendations that align with your preferences, government authorities understand and predict crimes, transport agencies manage and control traffic, and medical practitioners identify people at risk.
The applications of big data are indeed endless and have improved the quality of our lives to a great degree. So, everybody ought to be familiar with the term ‘big data’.
When your R code works for the first time
Budding data scientists will relate to this GIF. R programming is among the most in-demand skills in data science. According to a KDnuggets article, R was the most popular software for analytics and data science in 2016, followed by Python.
Given how in-demand the software is in the market, when your R code works the way it was intended to, you can’t help but visualize yourself as the biggest data nerd of them all.
When you have to handle unstructured streaming data
The realm of unstructured data analysis is often referred to as ‘dark analytics’. That sounds intimidating indeed, and rightly so.
Handling unstructured streaming data can be a handful for even the most skilled data scientists. Be it data from social media, videos, customer log files, or geospatial services, the analysis has to be sequential and incremental, processing records one after another as they are created. Time is also of the essence when dealing with such data.
So, when you are knee-deep in dark analytics, you certainly feel akin to a space scientist trying to unravel the mysteries of the cosmos.
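To make "sequential and incremental" a little more concrete, here is a minimal Python sketch. It assumes the stream is simply an iterable of numeric readings; the `RunningStats` class, the `fake_stream` generator, and the sample values are all hypothetical and stand in for whatever feed you actually have. The idea is that each record updates the summary once and is then discarded, so nothing needs to be held in memory.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class RunningStats:
    """Mean and variance updated one record at a time (Welford's method)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        # Fold a single new record into the summary.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; with fewer than two records we return 0.0.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def fake_stream() -> Iterator[float]:
    """Hypothetical stand-in for a real feed (logs, sensors, social media)."""
    yield from [12.0, 15.0, 11.0, 14.0, 13.0]


stats = RunningStats()
for reading in fake_stream():
    stats.update(reading)  # each record is seen once, then dropped

print(stats.count, stats.mean, stats.variance)
```

Real streaming work adds plenty on top of this (windowing, late records, unstructured payloads), but the record-by-record update is the core habit.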
When your model forecasts with more than 90% accuracy
This is a big deal. Data scientists have to spend a significant amount of time studying, understanding, preparing, and manipulating data for analysis. This process takes a great deal of patience and effort. However, the pay-off is enormous when the model you build delivers insights with over 90% accuracy.
And when praise and appreciation pour in from your client, manager, and coworkers, there is only one thing on your mind as you swell with pride: a weekend of unbridled revelry!
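For anyone wondering what "over 90% accuracy" means in practice, here is a small Python sketch. It assumes a classification setting, where accuracy is the share of predictions that match the true labels; the `accuracy` helper and the label lists are made up for illustration and do not come from any particular project.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: 1 = customer churned, 0 = customer stayed.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0] * 2
y_pred = list(y_true)
y_pred[3] = 0  # the model gets exactly one of the 20 cases wrong

score = accuracy(y_true, y_pred)
print(f"accuracy = {score:.0%}")
```

A high score is worth a second look before the celebration starts: when one class dominates the data, a model can score well on accuracy while missing most of the rare cases.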
When you try to find an issue in your model
Trying to find errors in hundreds or thousands of lines of code is not very different from attempting to find a needle in a haystack while you have a terrible hangover.
But let’s face it; it’s all part of the game.
When your manager asks you about the bug fixes
Oftentimes, managers in a data science organization do not understand the nitty-gritty of a technical task, be it debugging or tweaking machine learning models. Managers are usually more concerned about the project management aspects (read: deadlines).
In such a situation, as a data scientist, all you can do is stall for some more time or, in rare cases, disappear mysteriously.
When your friend who knows nothing about the model fixes the issue
You have been staring at your code for hours in vain, and you are quite close to giving up. And a friend casually looks at what you have and points out the error in seconds.
Though this can be one of the most annoying things to happen, those initial feelings of embarrassment and petulance quickly turn into relief because you now have one less thing to worry about.
It’s okay. Sometimes, a fresh pair of eyes is what you need to get the job done.
When your SQL query takes forever to execute
Slow server, terrible Internet, or whatever the reason might be, having to wait for an SQL query to execute is like watching paint dry, except you are too paranoid to walk away from your system.
It’s a boring, exhausting test of patience. We’ve all been there though.
When you are done with the modeling and your client changes the data
After making mind-numbing efforts to analyze the data and identify innumerable trends or patterns, the last thing you want to hear is that the data you used wasn’t the ‘right’ data.
And when you get a completely different data set to work with, it means you have to make significant changes to the model itself. So, just like that, you end up working from scratch…yet again. The agony!
When you have a client presentation in 20 minutes and have yet to finalize it
You just finished preparing a presentation for a client. And you’re all set to rock and roll, right? In most cases, not.
A data scientist has to run the presentation by his or her peers and the manager. And this often means one thing — tons of last-minute changes. Making such changes can be extremely stressful because aside from your personal reputation, there’s a lot at stake: a potential sale, the company’s image, or even your career progression.
But ultimately, when you receive a positive response from the client, you realize that some of those changes were critical and, in fact, made your case stronger.
As the legendary Steve Jobs once said:
“Great things in business are never done by one person; they are done by a team of people.”
When the client finally agrees with your model output
This is the ‘boss’ moment that a data scientist fantasizes about – when you know you’re contributing greatly to your organization’s business.
A client can be extremely demanding and fastidious, so you work tirelessly to make your model as accurate and effective as it can possibly be. You have to play by the client’s whims and fancies, which isn’t always fun. However, in the end, when the client gives you a definite thumbs up, it is the ultimate victory.
As Dr. Kirk Borne rightly said:
“The customer may not always be right, but the customer is always the customer.”
When a new big data technology comes to the market
The analytics industry is evolving rapidly, and so are its tools and technologies. Data scientists are seeing a steady stream of new big data, analytics, and deep learning tools.
Being constant learners by nature, data geeks are open to these new developments as they get an opportunity to broaden their knowledge and skill set.
[GIF source: https://giphy.com]