DevOps Engineer - Remote
- Job Title: DevOps Engineer - Remote
- Job ID: 27699039
- Location: Tampa, FL
- Description:
Our History:
Since our start in 2009, Conexess has established itself in 3 markets and employs about 200 individuals nationwide. We operate in over 15 states, and our client base ranges from Fortune 500/1000 companies to small and mid-sized companies. The majority of our small and mid-sized clients use us as their exclusive staffing provider because of our outstanding staffing track record.
Who We Are:
Conexess is a full-service staffing firm offering contract, contract-to-hire, and direct placements. We have a wide range of recruiting capabilities extending from help desk technicians to CIOs. We are also capable of offering project-based work.
This is a fully remote position; the candidate can reside anywhere in the US. He/she may need to come to an office for conferences, events, or in-person meetings when needed.
The candidate will work mainly with Dynatrace and other APM tools to onboard applications of all types and define their monitoring metrics. Being an APM expert is a must-have skill for this role. The role also calls for experience working on an SRE team focused on application-level monitoring, code-level debugging, self-healing via automation, and root cause analysis.
Daily tasks will include the following:
• Collaborate with software development groups to ensure operational needs are adequately considered and built into the APM tool as well as new software releases.
• Define monitoring & alerting thresholds. Set up clear and accurate SLOs/SLIs for efficient service monitoring.
• Develop infrastructure (network, compute, storage) & application capacity models.
• Drive toward automated deployments & modern approaches to configuration management.
• Focus on application reliability. Ensure application and infrastructure designs avoid the pitfalls of large-scale SaaS offerings (bottlenecks, single points of failure, etc.).
• Full-service handling (analysis, debugging, response, and resolution) of application-level issues using Dynatrace or another application monitoring tool.
• Focus on service availability. Reduce MTTR by assisting the incident command, operations, and application development teams in diagnosing & resolving service outages (incidents with significant customer impact).
Key skills:
- New Relic
- Dynatrace
- Grafana
- SRE
- SLO
- SLI
- Automation
- Configuration management
- Cloud architecture (Azure preferred, or AWS)