This job is not active.

Sr. Data Engineer - JR0023922_IL

at Yahoo! Inc. in Champaign, Illinois, United States

Job Description

Yahoo Mail is the ultimate consumer inbox with over 220 million users. It's the best way to access your email and stay organized from a computer, phone, or tablet. With its beautiful design and lightning-fast speed, Yahoo Mail makes reading, organizing, and sending emails easier than ever.

A Little About Us

Yahoo makes the world's daily habits inspiring and entertaining. By creating highly personalized experiences for our users, we keep people connected to what matters most to them, across devices and around the world. Yahoo's vast businesses span Search, Communications, Media, and many other verticals.

Yahoo generates a huge amount of data every day, and it is critical to collect, manage, and process that data at petabyte scale to provide timely and accurate insights to executives, sales, product managers, and product developers on all aspects of user interaction.

The Mail Analytics Engineering team at Yahoo is responsible for building mission-critical data systems, pipelines, warehouses, analytics systems, and Machine Learning/AI/data mining programs for the Communications business. We are constantly pushing the envelope of data platforms because of the insane amount of data we need to harness.

As part of the Mail Analytics Engineering team, you will work on data engineering pipelines and next-generation Machine Learning- and AI-based data infrastructure, support new functionality on existing platforms, and mine data for analytics insights and product features.

Our Big Data footprint is among the largest in the world, at double-digit petabyte scale. Developing this infrastructure presents many technical challenges in the areas of efficient query processing, large-scale stream processing, machine learning and modeling, as well as satisfying complex business rules.

If you are passionate about harnessing data at insane scale and enjoy working with new technologies, setting up petabyte-scale data infrastructure, and implementing new machine learning solutions and metrics systems, we want to hear from you!

Responsibilities:

Improve our existing data infrastructure for machine learning and deep learning using your core expertise
Work with other engineers to implement algorithms and systems efficiently
Take end-to-end ownership of Machine Learning-based distributed data systems, from data pipelines and training to real-time prediction engines
Develop complex queries, very large volume data pipelines, and analytics applications
Develop complex queries and software programs to solve analytics and data mining problems
Interact with data analysts, data scientists, product managers, and software engineers to understand business problems and technical requirements and to deliver data solutions
Prototype new metrics or data systems
Lead investigations to troubleshoot data issues that arise along the data pipelines
Maintain and improve released systems
Provide engineering consulting on large and complex warehouse data

A Lot About You:

BS/MS/PhD in Computer Science, Electrical Engineering, or a related engineering discipline, ideally with a specialization in Data Engineering or Machine Learning
Strong fundamentals: algorithms, distributed computing, data structures, databases
Fluency with: Python/Java/Scala/SQL
5+ years of industry experience in developing very large-scale analytics or ML systems
2+ years of experience with Google Cloud Platform (BigQuery, Dataproc, Composer, Dataflow, Bigtable, etc.)
2+ years of experience in Hadoop technologies...
Equal Opportunity Employer - minorities/females/veterans/individuals with disabilities/sexual orientation/gender identity

Job Posting: 11826455

Posted On: Apr 12, 2024

Updated On: May 12, 2024