Job Details

We are Hiring - Site Reliability Engineer

2026-01-15 | InstantServe LLC | All cities, AK
Description:

Job Title: Site Reliability Engineer

Remote Role

Job Description:

We are seeking a highly skilled, experienced Site Reliability Engineer to join our team. In this role, you will help ensure the reliability, performance, and scalability of our systems. The ideal candidate is proficient with monitoring and observability tools such as Dynatrace, Splunk, and Grafana.

Responsibilities:

System Monitoring and Analysis:

Implement and manage monitoring solutions, including Dynatrace, Splunk, and Grafana, to ensure optimal performance and reliability.

Conduct in-depth analysis of system behavior and performance metrics to proactively identify and address potential issues.

Incident Response and Troubleshooting:

Respond to and resolve incidents in a timely and efficient manner, minimizing downtime and impact on operations.

Collaborate with cross-functional teams to troubleshoot and resolve complex technical issues.

Performance Optimization:

Identify opportunities for performance improvements and implement solutions to enhance system efficiency.

Work closely with development teams to optimize applications for performance and reliability.

Automation:

Develop and implement automation scripts and tools to streamline operational tasks and improve efficiency.

Collaborate with DevOps teams to integrate automated processes into the continuous integration/continuous deployment (CI/CD) pipeline.

Documentation:

Maintain comprehensive documentation related to system configurations, procedures, and troubleshooting guides.

Contribute to the knowledge base and share insights with the broader team.

Qualifications:

Bachelor's degree in Computer Science, Information Technology, or a related field.

Proven experience as a Site Reliability Engineer or in a similar role.

Strong expertise with monitoring tools such as Dynatrace, Splunk, and Grafana.

Experience with incident response, troubleshooting, and performance optimization.

Proficiency in scripting languages (e.g., Python, Bash) for automation tasks.

Familiarity with CI/CD pipelines and their integration with automation tools.

Excellent communication and collaboration skills.


