Data Scientist at Inspire
As a Data Scientist on Inspire’s Data Platforms and Services team, you will develop and deploy statistical modeling and machine learning services to drive and optimize business decisions, objectives, and operations. This is an opportunity to play a central role in a high-growth company committed to expanding our renewable energy footprint and building a sustainable energy future.
THE DATA SCIENTIST HAS THREE MAIN RESPONSIBILITIES
- Iteratively and quantitatively improve our energy forecasting and existing modeling services
- Expand the capabilities of Inspire’s core modeling platform to support new and improved applications and product features
- Partner with stakeholders to translate business goals into models, services, and metrics
SOME EXPECTED 2020 DELIVERABLES
- Measurably improve our energy forecasting and existing modeling services
- Refactor legacy modeling code into testable and deployable modeling services
- Partner with Analytics to systematize and scale high-integrity and value-oriented outcome analysis
- Partner with Product stakeholders to define how machine learning and algorithmic optimization can improve customer experiences or business outcomes
- Partner with other engineering teams to guide integrations into existing applications and data infrastructure to improve performance and features
- Cultivated familiarity with Inspire’s frameworks and operating model
- Improvements in modeling performance
- Increased transparency: facilitated substantive discussions about theory, implementation, available data, and observed trends
- Technical competency - comfort on a command line, a good grasp of the fundamentals of programming, familiarity with Git/source control, in-depth knowledge of domain-specific tools and frameworks
- Statistical competency - able to navigate and apply statistical frameworks for measuring confidence and predictive power. Aware of how assumptions may be violated or models overfit, and able to caveat findings with technical limitations and common sense.
- Results-orientation - resists the urge to get caught up in a great idea; emphasizes testing and respects measured outcomes over theoretical benefits.
- Problem-solving mentality - gets excited about digging into complexity, wants to ask questions and learn more, and isn’t put off by problems they’ve never been explicitly told how to solve. Especially strong at troubleshooting: able to break down a chain of steps to narrow down and locate a problem.
- Big-picture awareness - understands the importance of context, and is able and willing to understand the business problem in addition to the technical one. Focuses on people and impact. Identifies shortcuts and justifies an appropriate level of effort. Pre-emptively identifies potential issues downstream.
MUST HAVE
- Strong knowledge of mathematics, statistics, and experimental design
- 1 year of experience with time-series forecasting methods and measurements
- 1 year of experience using SQL to query large datasets in cloud-based warehouses
- Demonstrable proficiency in developing models in Python
- Fluency with data visualization to communicate complex topics in approachable ways
NICE TO HAVE
- Experience with key frameworks: Airflow, dbt, MLflow, DVC, AWS, Docker, Kubernetes, Spark
- Software development lifecycle experience in GitHub (e.g. environment management, testing, deployment)
- Experience with survival analysis
- Strong understanding of data structures, algorithms, and system design
- Experience at a similar scale of data processing (multi-TB/billions of rows)
- Experience working with real-time event stream data
- Relevant work experience in the energy industry
- Experience with a visualization tool (e.g. Tableau)