Production Support Engineer
Own production support for business-critical applications — incident management, root-cause analysis, and observability with Splunk, Dynatrace and AWS.
Job Summary
We are looking for a proactive and technically strong Production Support Engineer with experience in application support, incident management, monitoring, and troubleshooting in production environments. The ideal candidate has hands-on experience with monitoring tools such as Splunk and Dynatrace, strong SQL skills, exposure to AWS cloud services, and a good understanding of ITSM processes.
The role requires excellent analytical, communication, and problem-solving skills along with the ability to work in a fast-paced support environment.
Key Responsibilities
- Provide production support for business-critical applications and services.
- Monitor application health, system performance, and alerts using monitoring tools.
- Perform incident management, problem management, and root cause analysis (RCA).
- Troubleshoot production issues and coordinate with development, infrastructure, and business teams for resolution.
- Handle tickets and service management activities in line with ITSM processes.
- Analyze logs, system metrics, and application behavior using monitoring tools such as Splunk and Dynatrace.
- Write and run SQL queries for data validation, issue investigation, and troubleshooting.
- Support deployments, release activities, and post-release validations.
- Ensure adherence to SLAs and support timelines.
- Participate in on-call support and production maintenance activities.
- Prepare incident reports, RCA documents, and support documentation.
- Collaborate with cross-functional teams to improve system stability, performance, and operational efficiency.
- Identify recurring issues and recommend automation or permanent fixes.
- Support cloud-based applications and services hosted on AWS.
Required Skills & Qualifications
- Strong experience in Production Support roles.
- Hands-on experience with monitoring and observability tools including Splunk and Dynatrace.
- Good understanding of ITSM processes.
- Strong proficiency in SQL for data analysis and troubleshooting.
- Experience supporting applications hosted on Amazon Web Services.
- Experience analyzing logs, alerts, and application performance metrics.
- Knowledge of Linux/Unix commands and basic scripting.
- Experience with ticketing tools such as ServiceNow or Jira.
- Good understanding of SDLC and production support best practices.
- Strong analytical, troubleshooting, and debugging skills.
- Excellent verbal and written communication skills.
Good to Have
- Exposure to DevOps and CI/CD pipelines.
- Exposure to Kubernetes or containerized environments.
- Experience working in Agile environments.
- Understanding of cloud monitoring and observability concepts.
Preferred Candidate Profile
- Ability to work under pressure and manage multiple priorities.
- Quick learner with strong problem-solving capabilities.
- Excellent stakeholder management and coordination skills.
- Flexibility to work rotational shifts or an on-call support model if required.