Demand for data scientists has grown 30-40% in the last several years, according to LinkedIn analytics. So how do you join this booming industry?
What is a data scientist, actually?
First, you have to recognize that “data scientist” is an incredibly broad term, and different organizations use it to mean different things. It’s helpful to understand what exactly YOU want to do, and then determine which jobs actually entail that work, regardless of what they are called.
“Data Science” can encompass data management, business analytics, artificial intelligence and machine learning, programming, database development, statistics, and more. While database administrator (DBA), business analyst, and AI engineer are in fact separate jobs, you’ll still find their tasks lumped into data scientist job descriptions. That’s why you have to decide where along the analytic pipeline you want to work. Do you want to collect, clean, prep, and store data? Do you want to analyze data with statistics? Do you want to build predictive models from the data? Do you want to incorporate data into applications or products? Do you want to visualize data for end-users?
What skills does a data scientist need?
Once you have a better sense of the specific tasks you’d like to do or roles you’d like to play, you can determine the skills you need. Many companies will require some kind of related degree (although I strongly disagree with this). An undergraduate degree with significant experience may be enough, though many companies will want a master’s. Acceptable programs include computer science, statistics, business analytics, data science, or database administration, among others.
Many data scientists and similar workers also know multiple coding languages, such as Python and SQL. It’s certainly possible to do an incredible amount of data work without these skills (with the right software, like Alteryx), but companies often view proficiency in these languages as a prerequisite. In general data science, there’s a raging debate about which language is the best to learn, so you may wish to look at a few of the jobs you’re interested in, see whether they share a common requirement, and go with that. If you want to work with data in applications, Python is typically best, whereas database administrators and managers typically use SQL or other data query languages.
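If you haven’t seen the two languages side by side, here is a minimal sketch of how they often divide the work. It uses only Python’s built-in sqlite3 module. The `orders` table, its rows, and the function names are all hypothetical, invented purely for illustration. The same question (total sales per region) is answered once with a SQL query and once with plain Python.

```python
import sqlite3

# Hypothetical sample rows, invented purely for illustration.
ROWS = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 60.0),
    ("south", 40.0),
]


def load_orders(conn):
    """Create a small in-memory table and fill it with the sample rows."""
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", ROWS)


def totals_with_sql(conn):
    """Let the database do the grouping and summing."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(cur.fetchall())


def totals_with_python(conn):
    """Pull the raw rows and do the same aggregation in Python."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def demo():
    """Run both approaches against the same in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        load_orders(conn)
        return totals_with_sql(conn), totals_with_python(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    sql_totals, py_totals = demo()
    print(sql_totals)
    print(py_totals)
```

Both functions return the same totals. Which approach a given job expects is exactly the kind of thing a job description will tell you.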
You’ll also need to develop proficiency with some of the common data tools. Again, specific companies will require different software, but there are some common threads across most industries. Data visualization typically uses Tableau or Power BI, with Qlik, Looker, and other platforms trailing far behind. Data science work often relies on Python-based programs, ranging from DataRobot to custom-built applications. I’m a huge fan of Alteryx myself for pretty much anything you might need to do with data, and other companies are starting to agree. If you want to be a database administrator, you’ll need to understand at least one of the various SQL databases, such as Microsoft SQL Server.
Prove your skills with data projects you can share
The last step in preparing for a move into a data science career is to develop some example projects. See if you can find a way in your current job to create a few analyses, visualizations, or reports that benefit your company and also showcase your growing data skills. If you’re still a student, consider joining a research team where you can help with data processing or analytics. You can always seek out data challenges from places like Kaggle, which offer close-to-real-world projects to tackle. And never underestimate the value of asking a question that interests you and seeing if you can answer it with publicly available data. These sample projects will show your next prospective employer what you can do.
Working with data can be an incredibly rewarding, challenging, and engaging career. I wish you the best, and feel free to post a comment or contact me if you have further questions about how you can become a data scientist.