Data science 101: What a data scientist does, and how to become one

Written by Julianne Tveten
Published on Sep. 07, 2014

If you’ve got a penchant for the intricacies of statistics, an insatiable desire to code, or a fervor for solving mathematical problems, perhaps working as a data scientist has crossed your mind.

And if it hasn’t yet, maybe it should.

The term “data science” might strike some as abstruse or overused. It’s been dismissed as a buzzword, a glorified synonym for “statistics.” However, the field has a distinct, complex identity. Data science, which is defined as “the study of the generalizable extraction of knowledge from data,” entails processing a collection of unstructured raw data (from multiple sources, including text, images, and video) and analyzing that data with creative problem-solving skills informed by various disciplines, from sociology to econometrics.

What most distinguishes data science from traditional database methods is its goal of knowledge discovery--more specifically, finding patterns. According to the Association for Computing Machinery, traditional database methods, such as those on which search queries operate, are designed to give the user a quick summary of data. (Think of a Google search results page.) Unlike database querying, which looks for data that satisfies a pattern (the query), data science looks for an unexpected, predictive pattern in the data itself.
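To make that contrast concrete, here is a minimal Python sketch. It is only an illustration, not a method described by anyone quoted in this article: the data points are invented, and the function names `query` and `fit_line` are hypothetical. The first function returns the records that match a condition stated up front, the way a database query does. The second fits a straight line to the same records by ordinary least squares, a pattern that is not written into any single record and that can be used to predict a value that has not been observed yet.

```python
# Invented (x, y) observations, used purely for illustration.
records = [(1, 12), (2, 19), (3, 31), (4, 38), (5, 52)]


def query(rows, min_x):
    """Database-style query: return the rows that satisfy a given condition."""
    return [row for row in rows if row[0] >= min_x]


def fit_line(rows):
    """Pattern discovery: least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The query only hands back data that is already there.
matching = query(records, 4)

# The fitted line summarizes a trend and extrapolates to an unseen x.
slope, intercept = fit_line(records)
predicted = slope * 6 + intercept

print(matching)
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

A straight line is about the simplest predictive pattern there is. Real projects typically involve far messier data and richer models, but the division of labor is the same: the query retrieves, the model generalizes.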

Data science is generally considered to combine three areas of study: statistics, computer science, and mathematics. Most data scientists specialize in one or more areas and maintain proficiency in the other(s).

“Data scientists typically have a strong background in math and computer science, both of which are necessary to complete our tasks,” said Mark Eberstein, a data scientist at Convertro, a Santa Monica-based company that provides marketing tools based on users’ behavioral data. “In addition, as a data scientist, we require a strong understanding of the industry in which we’re operating. This is what enables us to design, build, and deploy solutions that allow us to provide the most value to our clients.”

Data scientists also benefit from business savvy, Eberstein says. One of his most notable projects was a model that analyzed the relationship between TV advertising and online marketing performance.

“The first step was to figure out how to measure TV's effect on online marketing,” he said. “Then we built a scalable solution which models this relationship. It is very satisfying to take a project from the idea phase all the way to production, and to see clients get excited about a new technology which helps them grow their businesses.”

Though it’s a relatively new--and admittedly somewhat obscure--profession (the term “data science” has only been around for about 30 years), nearly anyone with enough interest can pursue it. Eberstein, who earned a BS in physics, is completing a master’s program in computer science, but he credits much of his computer science knowledge to “independent learning and on-the-job training.” He programs primarily in Python.

Emily Dunkel, a data scientist at the Santa Monica-based automotive marketplace platform TrueCar, took a less university-dependent approach. "I transitioned to data science by taking an online UCLA extension statistics course, and doing some self study," said Dunkel.

Even more encouraging for aspiring data scientists is that the field is ever-expanding, potentially breeding more job openings. Eberstein asserts that the future of data science is “bigger data...This requires technologies which will allow us to ingest and operate on larger data sets and also perform numerical methods on that data.” Similarly, Dunkel predicts the interdisciplinary profession will gain recognition in more industries, including healthcare, academia, and, as Eberstein has demonstrated, marketing.

As Dunkel’s colleague Pan Wu explains, “Being a data scientist requires a hybrid of statistics, programming, and business sense. It is a ‘scientist’ who can explore insights (via programming & statistics) from the data for company’s business needs.”

Given its combination of statistics, computer science, problem-solving, pattern recognition, and business skills, the field of data science is at once layered, comprehensive, and ripe for exploration.
