On top of this, there are various microservices, developed in-house, that handle provisioning of these services, entitlement, and upgrades. These are written primarily in Golang, with a couple in Python.
The core infrastructure team is responsible for this infrastructure, which spans 10 production deployments around the globe and runs 24/7 with four nines (99.99%) of uptime. Our infrastructure is managed using Terraform (for IaC) and GitLab CI, and monitored using Prometheus and Datadog.
We're looking for you if:
- You are a strong infrastructure engineer with a specialty in networking and site reliability.
- You have strong networking fundamentals (DNS, subnets, VPN, VPCs, security groups, NATs, Transit Gateway, etc.).
- You have extensive and deep experience (~4 years) with IaaS Cloud Providers. AWS is ideal, but GCP/Azure would be fine too.
- You have experience with running cloud orchestration technologies like Kubernetes and/or Cloud Foundry, and designing highly resilient architectures for these.
- You have strong knowledge of Unix/Linux fundamentals.
- You have experience with infrastructure as code tools. Ideally Terraform or OpenTofu, but CloudFormation or Pulumi are fine too.
- You have experience designing cross-cloud/on-prem connectivity and observability.
- You have a DevOps mindset: you build it, you run it.
- You care about code quality, and know how to lead by example: from a clean Git history, to well thought-out unit and integration tests.
Even better (but not essential!) if you have experience with:
- Monitoring tools that we use, such as Datadog and Prometheus
- CI/CD tooling such as GitLab CI
- Programming in (ideally) Golang or Python
- Using your technical expertise to mentor, train, and lead other engineers
You'll help drive digital innovation by:
- Continually improving our security and operational excellence.
- Working directly with customers to set up connectivity between the Mendix Cloud platform and customers' backend infrastructure.
- Rapidly scaling our infrastructure to match our fast-growing customer base.
- Continuously improving the observability of our platform, so that we can fix problems before they occur.
- Improving our automation and surrounding tooling to further streamline deployments and platform upgrades.
- Improving the way we use AWS resources, and defining cost optimization strategies.
Here are many of the tools we use:
- Amazon Web Services (EC2, Fargate, RDS, S3, ELB, VPC, CloudWatch, Lambda, IAM, and more!)
- PaaS: (Open Source) Kubernetes, Docker, Open Service Broker API
- Eventing: AWS MSK and Confluent Warpstream BYOK
- Monitoring: Prometheus, InfluxDB, Grafana, Datadog
- CI/CD: GitLab CI, ArgoCD
- Automation: Terraform, Helm
- Programming languages: mostly Golang and Python, with a sprinkling of Ruby and Lua
- Scripting: Bash, Python
- Version Control: Git + GitLab
- Database: PostgreSQL
Employment Type: Full Time, Permanent