

I (infrequently) blog about data science, business intelligence, big data, web technologies and free software.


You probably know that most companies hire people with programming skills for data science jobs, and that experience with machine learning algorithms and a strong statistics background may get you an interview. Unfortunately, in most cases that’s not enough to get the job and keep it.

I’ve been on both sides of the table. I have interviewed for some jobs over the last year and I’ve also been in charge of hiring for startups. After that experience, I have tried to put some thoughts together with the goal of helping those of you that may be seeking a job.

Here are six overlooked skills you should try to cultivate if you want to start climbing your career ladder. I hope you find them useful!

Solve business problems

Gilberto Titericz, #1 Kaggler in the world, recently said in an interview that understanding the problem and the features you can use is one of the most critical parts of solving a Kaggle challenge. Samuel Noriega listed “knowledge of a particular business domain” as one of the three basic skills that a data scientist needs in order to be successful. I totally agree with them, and I keep seeing people underrate this knowledge.

Fortunately, big companies heavily test a candidate’s ability to adapt to business problems they are not familiar with.

Being able to understand a completely new business language and domain well, and knowing how to measure the success of any kind of project, is vital.

SQL, or being able to query the company data

Companies clearly prefer employees who can provide immediate results. To do that, a data scientist needs to be able to retrieve data from any source and wrangle it so that it can be integrated into the company’s systems.

I listed several resources that helped me become a better data scientist and quoted Greg Reda saying that “World of tech is built on SQL”. That’s true, but lean organisations are increasingly adopting non-relational databases, and big companies have been orchestrating their clusters for years now.

In an early-stage startup, you will find more collaboration and communication between departments, which will certainly improve the productivity of most data scientists. In exchange, the organisation will benefit from defining realistic and relevant data science projects from the beginning.

Since data science is an interdisciplinary field by definition, you should be comfortable with whatever data storage solutions the company has chosen if you want to become the well-rounded data scientist that every company wants to keep.
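To make the “retrieve and wrangle” step a bit more concrete, here is a minimal sketch for the relational case. It uses Python’s built-in sqlite3 module with an in-memory database. The `orders` table, its columns and the rows are made up for illustration, and a real company schema will look different.

```python
import sqlite3


def revenue_per_customer(conn):
    """Return (customer, total) rows, highest total first."""
    query = """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        ORDER BY total DESC, customer ASC
    """
    return conn.execute(query).fetchall()


# In-memory database with invented rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 12.5)],
)

rows = revenue_per_customer(conn)
print(rows)
```

Pushing the aggregation into the query, instead of pulling every row into memory first, is usually the quicker route to an answer when the data already lives in a database.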

Design good experiments

This is a tricky one to test during the hiring process. Many interviewers ask questions regarding statistical power, statistical significance and splitting your dataset into training and test sets.

That’s usually not enough to know whether someone is able to design and perform good experiments or not. For instance, a good candidate would know that testing does not only depend on splitting the data into training and test sets, but also on how carefully you do it. It’s common to find data scientists who understand the concept behind cross-validation but don’t apply it to their experiments.
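As a rough illustration of what applying cross-validation looks like, here is a minimal sketch in plain Python. The helper names (`k_fold_indices`, `cross_val_score`, `fit_mean`, `mean_abs_error`) and the toy data are my own assumptions for the example. In practice you would most likely use a library implementation rather than rolling your own.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so folds do not follow the data order
    # and the experiment can be reproduced.
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_val_score(xs, ys, fit, score, k=5, seed=0):
    """Fit on each training split and score on the held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k, seed):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(
            score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])
        )
    return scores


# Toy baseline (hypothetical): always predict the mean of the training targets.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def mean_abs_error(model, xs, ys):
    return sum(abs(y - model) for y in ys) / len(ys)


xs = list(range(10))
ys = [2.0 * x for x in xs]
scores = cross_val_score(xs, ys, fit_mean, mean_abs_error, k=5)
print(len(scores), "fold scores")
```

Every sample ends up in a test fold exactly once and never in the training split of that same fold, which is the property a single hand-made split does not give you.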

Software quality

It’s no secret that most companies need reliable source code, but data scientists often overlook this. Reliable code means programs that are highly available, bulletproofed against errors and easy to maintain. Experienced managers know that hiring a data scientist who doesn’t know basic QA and software architecture concepts might cost more money and resources than expected at first.

The ideal candidate would also have experience maintaining legacy code and solving most of the problems that arise when working with a big version-controlled code repository.
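To give a small idea of what “bulletproofed against errors” can mean in day-to-day analysis code, here is a sketch of a metric function that validates its inputs and ships with a basic test. The `conversion_rate` function and its rules, such as reporting zero traffic as a 0.0 rate, are assumptions for this example and not a standard.

```python
def conversion_rate(conversions, visitors):
    """Return conversions / visitors, rejecting inputs that make no sense."""
    if conversions < 0 or visitors < 0:
        raise ValueError("counts must be non-negative")
    if conversions > visitors:
        raise ValueError("conversions cannot exceed visitors")
    if visitors == 0:
        return 0.0  # assumption: no traffic is reported as a 0% rate
    return conversions / visitors


def test_conversion_rate():
    # Normal case and the zero-traffic edge case.
    assert conversion_rate(5, 20) == 0.25
    assert conversion_rate(0, 0) == 0.0
    # Invalid inputs should fail loudly instead of returning a wrong number.
    for bad_args in [(-1, 10), (11, 10)]:
        try:
            conversion_rate(*bad_args)
        except ValueError:
            pass
        else:
            raise AssertionError(f"expected ValueError for {bad_args}")


test_conversion_rate()
print("all checks passed")
```

Failing loudly on bad input is cheaper than discovering weeks later that a dashboard showed a conversion rate above 100%.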