ALL CVS/RESUMES MUST BE SUBMITTED IN ENGLISH; OTHERWISE, YOUR APPLICATION WILL BE DISCARDED.
Senior DevOps & Linux Infrastructure Engineer
About the Role
We are seeking a Senior DevOps & Linux Infrastructure Engineer to support, modernize, and improve the reliability of our production infrastructure.
Our environment includes business-critical web applications, legacy PHP systems, newer Node.js/TypeScript services, PostgreSQL/MariaDB databases, Redis, background workers, Docker-based deployments, S3-compatible object storage, and Linux servers.
This role requires strong hands-on experience with Linux systems, networking, containerization, CI/CD, PHP runtime environments, database operations, automation, and production troubleshooting.
The ideal candidate is comfortable working with existing infrastructure while helping us gradually improve deployment safety, observability, automation, security, and operational reliability.
Key Responsibilities
- Manage, maintain, and improve production Linux infrastructure.
- Support Docker-based production services.
- Work with Docker Swarm and/or Kubernetes environments.
- Build, maintain, and improve CI/CD pipelines, particularly with GitLab CI.
- Support modernization of legacy PHP applications and deployment workflows.
- Manage PHP production environments, including PHP-FPM, Nginx, Apache, worker processes, configuration, and performance tuning.
- Improve release processes, including versioned deployments, rollback procedures, deployment history, staging environments, and canary workflows.
- Manage PostgreSQL and MariaDB/MySQL databases in production.
- Support database operations, including backups, restores, replication, migrations, upgrades, monitoring, and performance troubleshooting.
- Analyze and optimize SQL queries, indexes, locks, deadlocks, slow queries, connection limits, and database resource usage.
- Manage and troubleshoot Redis, background workers, job queues, retries, and failure handling.
- Support S3-compatible object storage infrastructure, preferably Ceph.
- Assist with object storage operations, including bucket management, lifecycle policies, replication, capacity planning, monitoring, and troubleshooting.
- Use Ansible or similar tools to automate server provisioning, configuration, and operational tasks.
- Write scripts in Bash, Python, or similar languages to automate infrastructure workflows.
- Troubleshoot production issues involving Linux, networking, containers, storage, databases, PHP, queues, and web servers.
- Improve logging, monitoring, alerting, dashboards, and incident visibility.
- Improve infrastructure security, secrets management, access control, and least-privilege production access.
- Create and maintain documentation, runbooks, and operational procedures.
Required Qualifications
- 5+ years of experience in DevOps, infrastructure engineering, SRE, platform engineering, or Linux systems engineering.
- Strong Linux systems administration experience.
- Strong understanding of Linux fundamentals, including processes, filesystems, permissions, services, logs, networking, and resource usage.
- Production experience with Docker.
- Experience with Kubernetes and/or Docker Swarm.
- Experience building and maintaining CI/CD pipelines, preferably with GitLab CI.
- Strong scripting skills using Bash, Python, or a similar language.
- Strong networking knowledge, including DNS, TCP/IP, HTTP/HTTPS, TLS, firewalls, routing, load balancing, reverse proxies, and network troubleshooting.
- Production experience with Nginx and/or Apache.
- Production experience managing PHP applications, especially PHP-FPM.
- Production experience with PostgreSQL and/or MariaDB/MySQL.
- Strong SQL knowledge.
- Experience with database backups, restores, replication, monitoring, and performance troubleshooting.
- Experience troubleshooting slow queries, indexing issues, locks, deadlocks, connection limits, and transaction behavior.
- Understanding of safe schema migration practices in production environments.
- Production experience with Redis.
- Experience operating background workers, job queues, retries, and failure handling.
- Experience with Ansible or similar configuration management tools.
- Experience modernizing legacy applications or improving legacy deployment workflows.
- Ability to troubleshoot production systems from first principles.
- Strong documentation and communication skills.
Preferred Qualifications
- Hands-on experience operating Ceph or another S3-compatible object storage platform.
- Experience with S3-compatible storage operations, including replication, lifecycle policies, bucket policies, capacity planning, monitoring, performance tuning, and disk failure handling.
- Experience with MinIO, SeaweedFS, or similar self-managed object storage systems.
- Experience with k3s, Helm, cert-manager, ingress controllers, and ArgoCD.
- Experience with GitOps workflows.
- Experience with Harbor or other private container registries.
- Experience with Prometheus, Grafana, Loki, ELK/OpenSearch, Sentry, or similar observability tools.
- Experience with NFS, shared storage systems, and migrations away from fragile shared filesystem patterns.
- Experience with PCI-sensitive, payment-related, or security-sensitive infrastructure.
- Experience designing backup, restore, disaster recovery, and high-availability procedures.
- Experience with Terraform or other infrastructure-as-code tools.
Technical Environment
- Linux / Ubuntu
- Docker
- Docker Swarm and/or Kubernetes
- GitLab CI
- Nginx / Apache
- PHP-FPM
- Legacy PHP applications
- Node.js / TypeScript services
- PostgreSQL
- MariaDB / MySQL
- Redis
- Background workers / job queues
- Ceph or S3-compatible object storage
- Ansible
- Bash / Python
- Sentry / Prometheus / Grafana or similar observability tools
Ideal Candidate Profile
The ideal candidate is a senior, hands-on infrastructure engineer who has operated real production systems and understands how to improve reliability without unnecessary disruption.
You should be comfortable working with legacy environments, identifying operational risks, and implementing practical improvements over time. You should be able to troubleshoot issues across the infrastructure stack, including Linux, networking, containers, web servers, PHP-FPM, databases, Redis, queues, storage, CI/CD, and observability.
We value someone who can resolve immediate production issues while also building automation, documentation, monitoring, and repeatable processes to prevent similar issues in the future.
Example Initiatives
- Improve GitLab CI pipelines for PHP and Node.js applications.
- Containerize legacy PHP applications.
- Improve PHP-FPM and Nginx production configuration.
- Reduce manual deployment steps through automation.
- Improve Docker Swarm or Kubernetes deployment practices.
- Support future Kubernetes, k3s, or ArgoCD-based deployment workflows.
- Improve PostgreSQL and MariaDB backup, restore, migration, monitoring, and performance processes.
- Troubleshoot slow SQL queries, locking issues, deadlocks, and connection exhaustion.
- Support Ceph or other S3-compatible object storage systems.
- Improve Redis and job queue reliability.
- Add monitoring and alerting for servers, databases, queues, storage, containers, and application health.
- Define rollback and incident response procedures.
- Automate server configuration using Ansible.
- Improve secrets management and production access controls.
- Create runbooks for common production incidents.
Interview Areas
The interview process will focus on practical production experience in areas such as:
- Linux systems troubleshooting
- Docker, Kubernetes, or Docker Swarm
- Networking fundamentals
- Nginx, Apache, and PHP-FPM
- GitLab CI/CD
- PostgreSQL and MariaDB operations
- SQL troubleshooting and optimization
- Redis and background job queues
- S3-compatible storage and Ceph
- Ansible and scripting
- Legacy application modernization
- Production incident response
Salary: $27,000.00 per month
Work location: On-site