Mercedes-Benz Research and Development India Private Limited
Site Reliability Engineer with extensive experience in scaling and operating cloud-native platforms. You'll work with other SRE team members operating dedicated Kubernetes environments.
- You will work closely with the development team to gain in-depth knowledge of how each application server works in terms of business logic, security aspects, deployment environments, and the relevant software processes.
- As an engineer, you are a self-starter with a knack for debugging complex issues. There are opportunities to write code that helps automate application setup/configuration, monitoring, deployment, and sanity testing.
- You will identify performance bottlenecks and anomalous system behavior, and determine the root cause of incidents.
- Occasional travel may be required to provide on-site support.
- Solid hands-on experience building, maintaining, and scaling PaaS and container-hosting platforms
- Software programming experience in one or more programming languages (Python, Golang, Java)
- A proven track record with Docker containers and a deep understanding of the current container ecosystem
- Proven experience with running containers (Docker) in a production environment (Kubernetes, Docker Swarm)
- Deep understanding of Kubernetes fundamentals, including scaling for production workloads
- Expert skills with Linux (network, OS, process level), networking (network layers, DNS, load balancing), storage, and virtualization
- Experience with running multi-cluster environments and a strong understanding of multi-tenancy and security implications
- Experience with build automation and configuration management systems (e.g. Jenkins, Ansible)
- Experience in setting up and using monitoring tools such as AppDynamics and Dynatrace
- Knowledge of continuous integration (CI) and continuous delivery (CD) pipelines
- Previous experience in supporting large-scale production environments
- Ability to analyze and debug complex software and infrastructure issues, and develop tools/systems for task automation
- Experience working in an agile development environment
- Strong analytical and problem-solving skills
- Strong communication and collaboration skills