Senior DevOps Engineer
Posted today
$105,000 - $115,000
Location: Brisbane City, QLD, 4000
Work Type: Full Time
Work Setting: Mixed (Both In Office + Remote)
Job Description
About QCIF
QCIF Digital Research is a non-profit organisation that provides cutting-edge digital infrastructure capabilities for research and innovation across Queensland and Australia. QCIF draws investment from its Members, the Queensland Government, and the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) program. We are an NCRIS node for the Australian BioCommons (Bioplatforms Australia) and for the Australian Research Data Commons (ARDC) and its Nectar Research Cloud.
About this role
In this role, you will work hands-on with cloud services (OpenStack, AWS), container orchestration (Kubernetes), automation tools (Terraform, Ansible), continuous integration and deployment tools (Jenkins, GitHub Actions), and observability tools (Grafana) to support production scientific platforms used across Australia. You will work within a team to manage day-to-day operations, implement infrastructure improvements, and collaborate with platform developers, data scientists, and partner organisations. You will also contribute to open-source projects such as Galaxy, a web-based platform for data-intensive life science research.
This role is ideal for someone with solid practical experience in cloud operations and DevOps tooling who is ready to take ownership of infrastructure components, learn rapidly, and contribute to national-scale research services.
Key responsibilities
Cloud infrastructure and DevOps
- Operate and maintain production software deployments across commercial and research clouds (such as AWS, OpenStack, Azure, and Google Cloud Platform), managing compute, object storage, identity, networking, and Linux and Windows virtual machines
- Deploy and manage infrastructure as code (IaC) using tools such as Terraform, AWS CDK, and Ansible, ensuring detailed documentation and procedural guidelines are maintained
- Support container orchestration and deployments across cloud environments, using Kubernetes, Docker, and Helm-based workflows
- Implement and oversee system guardrails and monitoring, including logging and alerting systems, to ensure the ongoing stability and security of all platforms
- Contribute to recommendations for capacity planning and resource management approaches in compute and storage
Automation and Software Development
- Develop operational automation and integration scripts using languages like Python, TypeScript/Node.js, Linux shell, or PowerShell to streamline administrative tasks and system operations
- Maintain robust CI/CD pipelines to automate build, test, and deploy workflows
- Contribute to the Galaxy Project codebase, including infrastructure optimisation, to meet the needs of Australian researchers
- Manage data needs and implement solutions by deploying, administering, and interrogating relational and NoSQL databases such as PostgreSQL, MS SQL Server, and MongoDB
Collaboration & Delivery
- Collaborate within a distributed multidisciplinary team using agile, iterative delivery methods, actively participating in architectural discussions, sprint planning, and backlog reviews
- Engage with diverse local, national, and international stakeholders, including partner institutions and research communities, to prioritise technical improvements and support platform deployment
- Translate and communicate technical decisions and requirements clearly, ensuring that operational impacts and architectures are understood by both technical and non-specialist colleagues
- Create and maintain comprehensive documentation and procedural guidelines for all supported systems, automation scripts, and technical actions
- Participate in and manage the resolution of technical problems reported via service desks and during critical incidents and emergencies
About you
Essential criteria
Education & Experience
- A degree in a relevant discipline (such as Computer Science, Software Engineering, IT, or Bioinformatics), combined with at least 5 years of industry experience in a relevant discipline, or an equivalent combination of education and training
Systems, Cloud & Infrastructure
- Practical understanding of systems administration in Linux environments
- Hands-on experience managing production software deployed in cloud environments (such as OpenStack, AWS, Azure, GCP), including compute, volume and object storage, networking services and related technologies
- Practical understanding of container orchestration in cloud environments (e.g., Kubernetes, EKS) and Docker-based deployment workflows
- Competency with Infrastructure-as-Code (IaC) using tools like Terraform or AWS CDK, alongside advanced knowledge of DevOps and DevSecOps principles
- Familiarity with monitoring and logging tools such as Prometheus, Grafana, CloudWatch, or OpenSearch to track system utilisation and forecast capacity requirements
Software Development & Automation
- Strong scripting, programming, and automation proficiency using languages such as Python, TypeScript/JavaScript/Node.js, Unix shell, PowerShell, or Perl
- Demonstrated experience with database deployment, programming (SQL), and administration across relational and NoSQL databases, such as PostgreSQL, MS SQL Server, and MongoDB
- Solid understanding of modern software engineering practices, including experience with version control (Git) and CI/CD systems (e.g. Jenkins, GitHub Actions) to develop robust, testable, and scalable systems
Collaboration & Work Style
- Good communication and interpersonal skills, with the proven ability to translate and communicate technical requirements to diverse stakeholders and multidisciplinary teams
- Strong problem-solving and creative thinking capabilities
- Ability to work in a team and independently, prioritise heavy workloads, meet deadlines, and adapt to changing directions
- Commitment to organisational values, demonstrating openness, respect, and integrity
- Use safe manual handling techniques and practise safe work habits in line with QCIF Policies
- Wear protective clothing provided where necessary and take a consultative role in assisting and maintaining a clean, tidy work area and a healthy and safe working environment
- Report any health or safety hazards, faults, repairs, broken or damaged company property, cleaning needs and accidents immediately
- Ensure all equipment is kept in good working order and used only for the purpose for which it was intended
- Consult with employees on health and safety matters that impact them
- Be fully conversant with emergency procedures
- Acquire and maintain proficiency with Microsoft Office Suite
- Development and delivery of training materials and events to QCIF’s Skills Development client base may be required, as overseen by the Skills Development Coordinator
Desirable criteria
- Practical understanding of systems administration in Windows environments
- Experience in data science or bioinformatics, including an understanding of data governance considerations in biomedical and scientific contexts
- Experience supporting data-intensive or High-Performance Computing (HPC) workloads, including an understanding of hardware requirements for life sciences analyses and experience managing large files and datasets
- Demonstrated skills in cybersecurity and information security management frameworks, such as ISO/IEC 27001, the NIST Cybersecurity Framework, or the Australian Information Security Manual
- Experience working with tool packaging systems, specifically Conda and Singularity
- A track record of contributions to open-source software or collaborative development environments
- Experience developing and delivering training materials and events, including answering participants’ questions in post-training forums and engaging in one-to-one consultations
Reporting Relationship
This position reports to the Head of Data and Software.
QCIF values diversity and inclusion and actively encourages applications from those who bring diversity to QCIF.
Flexible working arrangements may be negotiated.