Unlocking the Artistry of Data Engineering

By blending numerous areas of expertise, data engineering provides ample opportunities to innovate and experiment with cutting-edge technologies.

Written by Jeff Kirshman
Published on Feb. 28, 2022

Some people cherish gold. Others worship NFTs. But for Aashish Dugar, data is the most valuable commodity.

“The ability to use, build and transform data into something even more insightful is an art to me,” the senior data analyst at fintech company Pennymac said. “I’m not exactly a very good singer, nor can I paint a pretty picture, so I guess this is my way of becoming an artist.”

Dugar’s chosen field offers the combination of freedom and creativity he craves in his work. Data engineering occupies a middle ground between traditional software engineering and data science, blending numerous areas of expertise to expand the possibilities within computer programming.

It is this opportunity for exploration that data engineers find so rewarding. For those looking for an alternate approach to a traditional career in software development, data engineering provides ample opportunities to innovate and experiment with cutting-edge technologies.

“If you can imagine it, you can build it,” said David Reilly, who combined his interest in computer science with a degree in mathematics to become a senior data engineer at advertising technology company Edmunds.

Built In LA connected with Dugar, Reilly and local professionals from LeaseLock and DISQO to learn more about the skills and knowledge they’ve acquired to pursue a creatively fulfilling career in data engineering.

 

David Reilly
Senior Data Engineer • Edmunds

 

What led you to a career as a data engineer?

My path toward data engineering began with a challenging computer science course in college. Despite what felt like a constant uphill battle in a quest to get good grades, I was hooked by the creative freedom granted to anyone familiar with computer programming. 

Weaving that computer science series into my mathematics major gave me the foundation to explore — and eventually pursue — a career as a data scientist here at Edmunds. Yet while I genuinely enjoyed the work of data science, I found myself heavily favoring the data munging component of being a data scientist. 

After this realization, I began taking on more projects that had me working closely with the data engineering team, and I was eventually able to fully transfer into a new role as an engineer on the data warehouse team. 

 

How does data engineering differ from more traditional software development? What key technical skills do you use most often during your workday? 

Personally, I don’t see data engineering as much different from software development. Rather, I see data engineering as more of a specialization within software development. As a data engineer, I still have to write readable, maintainable and well-tested code. I still have to be familiar with build tools, version control and many other technical skills and tools common to all software development.

 

However, just as a front-end engineer needs to be proficient with a technology stack that will help them build great front ends, so too does a data engineer have to be proficient with a different technology stack that will help them build and maintain great data pipelines. 

As for my current role, that technology stack and technical skill set consist of Java, Scala, Python and SQL as the language foundation and the Apache Spark ecosystem as the large-scale data processing engine. Add a bit of Airflow for scheduling and orchestrating pipelines and Databricks as the data processing platform on which to run those pipelines, and you have the foundation of the technical skills and tools I use most often during my workday.

 

Describe a project you’re working on right now, including the goal, your approach to the problem and the challenges you’ve encountered.

Along with a few other team members, I’m currently working on a huge undertaking to fully rewrite and migrate a massive legacy system. The project is a part of a larger migration initiative with the goal of completely removing a dependency on a legacy database. From collecting requirements to parsing out legacy logic to designing the new system, the project has been a rewarding challenge. I’ve personally really enjoyed the chance to make an impact on such a complex, large-scale piece of software, and I’m excited to see the finished product.

 

 

Breanne Arase
Senior Data Analyst • LeaseLock

 

What led you to a career as a data engineer?

Math and critical thinking have always come naturally, but after graduating with a mathematics degree, a career in data was not the obvious choice for me. I’ve been fortunate to be hired in analytical positions at great companies across different industries — manufacturing, finance, adtech and insurtech — where I had the opportunity to develop my SQL and Python skills through practical work experience, all while learning from driven and thoughtful leaders. I discovered my passion for data after my first taste of SQL, and I still find the challenge of providing scalable data solutions for business-critical analytics at LeaseLock extremely rewarding.

 

How does data engineering differ from more traditional software development? What key technical skills do you use most often during your workday? 

It is essential to have a strong understanding of database fundamentals, data modeling and SQL. I spend most of my time in PostgreSQL and in our business intelligence platform, Looker, where I create connections to our database to provide self-service reporting and insights to stakeholders throughout the company. Software developers, on the other hand, go much deeper into coding. The languages they use can vary from Java to Python to C++ to JavaScript. They typically support new features of our product by building a scalable back end and a successful program.

 

Describe a project you’re working on right now, including the goal, your approach to the problem and the challenges you’ve encountered.

I worked on creating a Looker dashboard to surface an exceptions report for one of our largest customers. The goal was to collaborate with the customer and marry our understanding of their data with their interpretation of it, for a stronger partnership. The challenging part was using SQL to analyze the data ingested through their API integration with little to no documentation, then creating views with human-readable strings to explain our findings. I’m always excited to uncover new information, and I welcome the opportunity to work cross-functionally and learn from different team members.

 

 

Mikayel Ayvazyan
Data Engineer • DISQO

 

What led you to a career as a data engineer?

The majority of my computer science courses in college were taught in Java. After I graduated, I came across an opening for a big data developer. They were looking for someone proficient in Java who was open to learning big data technologies. I decided to apply, since I’m good with numbers and always up for a challenge. I also knew data engineering is not only about building pipelines, but also ensuring clients receive clean and good-quality data, which is something I’m passionate about. 

I landed the job, and it was the beginning of my data engineering career, where I learned additional technologies such as Scala, Spark and Hadoop. Ironically, I did not use Java! Looking back, I’m so glad I chose this route, because there were — and continue to be — so many opportunities to learn new tools and technologies that I apply in my current role as data engineer working on DISQO’s data pipelines.

 

How does data engineering differ from more traditional software development? What key technical skills do you use most often during your workday? 

In my opinion, the main difference between a data engineer and software engineer is that data engineers are tasked with building data pipelines and systems that ingest, process, store and retrieve data. Software engineers are responsible for creating applications and software for a specific purpose, such as a web application. In most companies, the two branches coexist and one needs the other to develop and continually enhance a complete product. As a data engineer, I might use an API that a software engineer has created to pull in and store data. A software engineer can then, for example, use the output of my processed data to display on a web page.
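To make that flow concrete, here is a minimal Python sketch of the ingest, process, store and retrieve stages. It is an illustration only, not DISQO's code: the sample payload, the field names and the `visits` table are all invented, and an in-memory SQLite database stands in for a real warehouse.

```python
import json
import sqlite3

# Hypothetical payload standing in for an API response; field names are invented.
RAW_RESPONSE = json.dumps([
    {"user_id": 1, "page": "/home", "seconds": "12"},
    {"user_id": 2, "page": "/pricing", "seconds": None},  # incomplete record
    {"user_id": 1, "page": "/pricing", "seconds": "30"},
])


def ingest(raw):
    """Ingest: parse the raw JSON text into a list of dicts."""
    return json.loads(raw)


def process(records):
    """Process: drop incomplete records and cast seconds to int."""
    return [
        (r["user_id"], r["page"], int(r["seconds"]))
        for r in records
        if r.get("seconds") is not None
    ]


def store(conn, rows):
    """Store: write the cleaned rows to a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visits "
        "(user_id INTEGER, page TEXT, seconds INTEGER)"
    )
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    conn.commit()


def retrieve(conn):
    """Retrieve: total seconds per page, the kind of output a web page might display."""
    cursor = conn.execute(
        "SELECT page, SUM(seconds) FROM visits GROUP BY page ORDER BY page"
    )
    return cursor.fetchall()


def run_pipeline(raw):
    """Run all four stages against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        store(conn, process(ingest(raw)))
        return retrieve(conn)
    finally:
        conn.close()
```

Calling `run_pipeline(RAW_RESPONSE)` returns one total per page, with the incomplete record filtered out before it reaches storage.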

The key technical skills I use on a daily basis are Airflow to orchestrate jobs, Scala and Spark to ingest and process data, and various other cloud technologies such as Amazon S3 and Redshift.

 

Describe a project you’re working on right now, including the goal, your approach to the problem and the challenges you’ve encountered.

Currently, I’m working on a project for DISQO’s ad measurement service, which allows our clients to understand the impact of their advertising by connecting attitudes with online behaviors. I’m responsible for automating the manual processes that deliver data to clients. I’m also looking into ways to improve the execution of these processes through various data engineering concepts. 

The goal of this project is to deliver high-quality, clean data as fast as possible to our clients so they understand whether or not their ads are moving the needle. Some of the challenges I face are trying to understand the manual process and increasing efficiency and quality. The part I enjoy most about this project, and my role in general, is working on something new and trying to solve a different problem each day. This makes the job interesting and fun, and it motivates me to continue growing as an engineer. The best part is seeing my changes live in production and how they actually make an impact on DISQO and its customers.

 

 

Aashish Dugar
Senior Data Analyst, Enterprise Data • Pennymac

 

What led you to a career as a data engineer?

I have an innate interest in software engineering, and I thrive on being an analytical person. I didn’t think there was a way to marry these two qualities until I found data engineering. In this field there are clear-cut results, truth to the models and an opportunity for data analysis. Whether I’m cleaning the data, manipulating it into a certain format in a Pandas DataFrame or using a sentiment analysis algorithm based on Bayes classifiers and MapReduce, the work always leads to measurable results. Those results reflect the quality and usability of the models and can serve as a universal answer to the problem regardless of the approach. This thrills me!

The true power of data engineering to me is the fact that data is literally everywhere — in every field and in everyday decisions, from museums to healthcare to home purchasing. That’s a lot of opportunity for enhanced understanding and change, which is the reason why I chose this career. 

 

How does data engineering differ from more traditional software development? What key technical skills do you use most often during your workday? 

In the generic sense, data engineers build the systems and pipelines that make possible the data storage and retrieval that systems and applications require. Data engineering is often considered a subset of the software engineering profession, since data engineers handle specialized tasks and are trained slightly differently from conventional software developers. Data engineers pull data from various sources, be it APIs, a data warehouse, data streams or other places like documents, PDFs, news articles or online forums. They then make this data accurate and available to end users, such as executives, data scientists or analysts, enabling them to make crucial decisions. Buzzwords you’d often hear at Pennymac are Python, Pandas, SQL, Hadoop, Snowflake, Tableau and AWS.

Software engineers, on the other hand, work more on coding and collaborating with designers, programmers and developers to build applications and systems for an end user. The traditional tasks of a software engineer include the development of operating systems, software design, front- and back-end development, UI/UX design and mobile app creation. Some of the tools software engineers use at Pennymac are integrated development environments like those from JetBrains or Eclipse, along with tools and technologies like GitHub, Docker, AngularJS, HTML and CSS.

A day in my life revolves around the edges of the data lifecycle. This includes monitoring the quality of the data, storing it efficiently, building pipelines to validate its consistency as it moves through the organization, and building visualizations to better understand the assets and check for anomalies or fluctuations.
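One common way to validate consistency as data moves between systems is to compare a row count and a content fingerprint on each side of the move. The Python sketch below shows that idea under stated assumptions: the function names are hypothetical, the rows are plain tuples, and nothing here reflects Pennymac's actual tooling.

```python
import hashlib


def fingerprint(rows):
    """Return (row_count, digest) for a set of rows, ignoring row order."""
    # Hash each row on its own, then sort so ordering differences do not matter.
    row_digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256("".join(row_digests).encode("utf-8")).hexdigest()
    return len(row_digests), combined


def validate_consistency(source_rows, target_rows):
    """Compare source and target; return a list of problems (empty if consistent)."""
    source_count, source_digest = fingerprint(source_rows)
    target_count, target_digest = fingerprint(target_rows)

    problems = []
    if source_count != target_count:
        problems.append(f"row count mismatch: {source_count} vs {target_count}")
    if source_digest != target_digest:
        problems.append("content mismatch")
    return problems
```

An empty list means the two sides agree. A dropped or altered row shows up as a count mismatch, a content mismatch, or both, which gives a pipeline a simple signal to alert on.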

My move to Pennymac was influenced by the organization’s willingness to provide the freedom to operate and experiment using leading-edge tools. When taking a cloud-based approach to data solutions, we use many professional tools to store data and trigger automated scripts. We generate business insights that provide users with an improved understanding of the data, and we’re on the right path.

 

Describe a project you’re working on right now, including the goal, your approach to the problem and the challenges you’ve encountered.

We are working on enhancing the data governance platform at Pennymac. This is an AI-based platform that will enable the organization to gauge the relationships in its data, track its movement, efficiently catalog and monitor it, and draw out patterns and insights, allowing business users to plan quick and impactful changes.

We are also training our models to streamline the data flow throughout the organization, making it more efficient and intelligent. The best part about being in the early stages is that any problem we encounter allows us to be very creative in our solution. This is a structure that is being built from the ground up, adding a layer of variety to our tech stack and making it even more enjoyable to collaborate as a team.

Believe me when I say that no two days are alike. This is a good thing because it has greatly improved my skill set and trajectory as an engineer, and it allows me to be a little more daring as I try out new solutions to problems. This has undoubtedly brought out the best in me, and I am excited to see what the future holds. 

 

 

Responses have been edited for length and clarity. Images via listed companies and Shutterstock.
