Sr. Systems Engineer - Observability

at Marriott in Springfield, Illinois, United States

Job Description

Job Number 24084988

Job Category Information Technology

Location Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States

Schedule Full-Time

Located Remotely? Y

Relocation? N

Position Type Management

JOB SUMMARY

The Sr. Systems Engineer – Observability (SSE) role will define and implement infrastructure and application logging and set up governance, optimization, monitoring and control, including a forecasting and chargeback model for the observability platform. The role will work with engineering, application and enterprise/solution architects to develop and implement services, monitoring, reporting and automation where applicable. This role serves as a subject matter expert in a complex array of full stack solutions, performing research, analysis, design, creation, and implementation to meet current and future requirements.

CANDIDATE PROFILE

Education and Experience

Required:

+ Undergraduate degree in an engineering or computer science discipline and/or equivalent experience/certification

+ 7+ years’ experience in information technology with hands-on technical/engineering roles including:

+ 5+ years’ admin experience with Dynatrace/Grail, Splunk Cloud, Cribl, etc.

+ 3+ years’ experience with log ingestion solutions on AWS cloud platforms

+ 3+ years’ data onboarding experience within a large-scale enterprise environment

+ Experience in implementing and maintaining Dynatrace/Grail or other observability solutions

+ Experience in Dynatrace Query Language (DQL) and/or Splunk Processing Language (SPL) including building dashboards, reports and alerts to meet customer requirements.

+ Experience in integrating observability tools with other ITOps solutions (Harness, ReadyAPI, ServiceNow, BigPanda, etc.)

Additional Preferred Experiences:

+ Splunk Certified Admin and/or Dynatrace Certified Admin

+ Scripting experience in at least one of the following: PowerShell, Regex, Python, JavaScript, Ansible or Terraform

+ Strong knowledge of emerging tools, software, applications, and AI solutions for attaining best-in-class IT technology across the enterprise.

+ Experience in researching emerging technologies and trends, standards, and products

+ Experience in establishing and implementing Observability best practices to standardize, monitor and control usage/performance of solutions.

+ Excellent verbal and written communication skills for a wide range of audiences including executives, business stakeholders and IT teams

+ Project planning and management experience.

+ Experience operating in Scaled Agile Framework

+ Demonstrated experience delivering technology solutions in a fast-paced, deadline-driven enterprise environment

+ Demonstrated experience learning and applying new technologies to solve business needs

+ Excellent problem-solving skills, both working independently and leading outcomes for cross-functional teams

+ Excellent understanding of change management, testing requirements, techniques, and tools to ensure high availability of systems

+ Strong attention to detail with an ability to operate effectively across multiple priorities

CORE WORK ACTIVITIES

+ Design, implement, and maintain high-performance and scalable observability solutions (Kubernetes – EKS/ACK, ROSA, DocumentDB and other data sources) in a complex enterprise environment.

+ Collaborate with cross-functional teams to gather requirements, architect solutions, and deploy logging and monitoring environments that align with business needs.

+ Leverage in-depth knowledge of AWS, Azure and Alibaba Cloud technologies, including IaaS, PaaS, and SaaS, to architect and manage logging and monitoring tools’ deployments.

+ Demonstrate proficiency in scripting and automation, enabling streamlined operational processes and efficient management of the Dynatrace and Splunk infrastructure.

+ Lead optimization efforts for the observability platform and explore alternative solutions using other automation technologies such as Cribl.

+ Onboard data sources from various IT infrastructure and application components into observability tools (Dynatrace/Grail, Splunk, SignalFx, Cribl).

+ Provides technical leadership, oversight, governance and direction for services related to Marriott solution delivery

+ Provides technical expertise to project team for successful project and change implementations

+ Determines customer requirements and works with sourced resources to develop solutions

+ Provides and presents status, analysis and reporting to internal stakeholders, Executive Management and Senior Leadership

+ Leads analysis of current environment for deficiencies and provides solutions

+ Identifies opportunities to enhance the service delivery, operations and continual service improvement processes

+ Responsible for project inception including requirements gathering and architecting, costs and chargeback modeling, infrastructure-as-code development and configuration management

Delivering Technology

+ Create and enhance administrative, operational and technical policies and procedures, adopting best practice guidelines, standards and procedures for employees, contractors and vendor engagements

+ Manages daily infrastructure operations to ensure the availability SLA is met for storage services

+ Interfaces with stakeholders to establish requirements and formulate priorities for infrastructure projects

+ Leads/assists in configuration management

+ Works in a concerted effort with application development and engineering teams to resolve complex issues

+ Provides oversight, collaboration, provisioning, management and maintenance of technology products and service alternatives that improve the production services environment

+ Responsible for the establishment and continuous development of monitoring and alerting for all production environments

+ Develops internal processes and training to ensure team members have the needed skills and tools to support the production environments and deliver on project commitments

+ Performs complex quantitative and qualitative analyses for operational availability to promote a zero-defect environment

+ Leads/assists operational teams in system updates & upgrades

+ Provides consultation for routine and complex systems development

+ Maintains a proper balance between business and operational risk

+ Facilitates achievement of expected deliverables and obligations of Service Providers

+ Ensures early warning to the business stakeholder executives regarding degraded or missed service levels

+ Coordinates with Product and Architecture & Development teams for deployment and production support activities

Managing Work, Projects, and Policies

+ Manages and implements work and projects as assigned.

+ Generates and provides accurate and timely results in the form of reports, presentations, etc.

+ Analyzes information and evaluates results to choose the best solution and solve problems.

+ Provides timely, accurate, and detailed status reports as requested.

Delivering on the Needs of Key Stakeholders

+ Understands and meets the needs of key stakeholders.

+ Develops specific goals and plans to prioritize, organize, and accomplish work.

+ Determines priorities, schedules, plans and necessary resources to ensure completion of any projects on schedule.

+ Collaborates with internal partners and stakeholders to support business/initiative strategies

+ Communicates concepts in a clear and persuasive manner that is easy to understand.

+ Generates and provides accurate and timely results in the form of reports, presentations, etc.

+ Demonstrates an understanding of business priorities

Additional Responsibilities

+ Manages time effectively and conducts activities in an organized manner

Job Posting: JC259965595

Posted On: May 17, 2024

Updated On: Jul 12, 2024
