ALTO Network is a leading payment infrastructure provider and a pioneer in payment solutions. It brings innovative, impactful technology that connects merchants and financial institutions with their customers, helping them grow their businesses nationwide and beyond.
DESIGNATION: Site Reliability Engineer Automation Associate Manager
RESPONSIBILITIES
- Drive the development and adoption of automation tools and frameworks to improve efficiency and reduce toil. Identify opportunities to automate manual tasks.
- Design and implement comprehensive monitoring and alerting systems.
- Analyze performance data to identify bottlenecks and optimize system performance.
- Promote SRE best practices and principles within the organization. Influence architectural decisions to improve system reliability.
- Contribute to the development of the SRE roadmap and strategy. Identify and prioritize key initiatives.
- Stay up to date with the latest SRE trends and technologies. Research and evaluate new tools and techniques.
QUALIFICATIONS
At least 4 years of proven experience as a DevOps, IT Automation, Application Support, or System Engineer, or in a similar position.
Bachelor's degree in Information Technology
Strong knowledge and understanding of Linux, Kubernetes, Docker Swarm and/or other microservices technologies, and GCP / AWS cloud technologies.
Experience with automation tools such as Terraform and Ansible.
Experience with AI / LLMs (Large Language Models)
Technical:
Linux administration skills
Ansible / Terraform skills
Bash / Python / other automation code skills
AWS / GCP skills
Docker / Kubernetes skills
Troubleshooting skills
CI / CD
Elasticsearch, Grafana or other observability skills
Knowledge:
Cloud Technology
REST API
Networking
Monitoring Tools
Automation / Scripting
Postman / API Testing
TCP/IP Knowledge
Microservices
AI Model Knowledge