Google Cloud DevOps / Site Reliability Engineer (SRE) Job at Purple Drive, Alpharetta, GA

  • Purple Drive
  • Alpharetta, GA

Job Description

Role: Google Cloud DevOps / Site Reliability Engineer (SRE)

Location: Alpharetta, GA
Experience: 8-12 Years (Senior Level)

Job Summary

We are seeking an experienced Google Cloud DevOps Engineer / SRE to design, build, and operate highly reliable, scalable, and secure cloud infrastructure on Google Cloud Platform (GCP). The ideal candidate will bring deep Linux expertise, strong cloud networking and security knowledge, and hands-on experience with automation, CI/CD, and Kubernetes-based deployments. This role is critical to ensuring system reliability, performance, and operational excellence across large-scale distributed systems.

Key Responsibilities

Cloud Infrastructure & Platform Engineering

  • Design, deploy, and manage cloud infrastructure using Google Cloud Platform services including Compute Engine, GKE, VPC, IAM, Cloud Storage, and Cloud SQL.

  • Architect and support highly available, scalable, and fault-tolerant systems on GCP.

  • Implement and manage Shared VPCs, VPC peering, firewall rules, load balancers, DNS, and VPN tunnels.

DevOps & Automation

  • Build and maintain CI/CD pipelines using Jenkins (Declarative & Scripted) and GitHub Actions.

  • Automate infrastructure provisioning and configuration using Terraform, including module development, remote state management, dependency handling, and DRY principles.

  • Implement modern deployment strategies such as Canary releases and Blue/Green deployments.

  • Manage container artifacts using Docker and Helm.

Site Reliability & Operations

  • Ensure high availability, performance, and reliability of production systems.

  • Troubleshoot complex system issues, including CPU, memory, and disk I/O bottlenecks, kernel issues, and system boot failures.

  • Analyze logs and metrics to proactively identify and resolve performance and stability issues.

  • Support incident response, root cause analysis, and post-incident reviews.

Linux Systems Engineering (Must Have)

  • Demonstrate deep hands-on expertise with Linux systems (RHEL, Ubuntu, CentOS).

  • Perform kernel tuning, system optimization, storage management (LVM), and systemd administration.

  • Maintain OS-level security, patching, and performance best practices.

Security & Identity Management

  • Implement and troubleshoot Cloud IAM, service accounts, and Workload Identity Federation.

  • Enforce least privilege access and security best practices across environments.

  • Partner with security teams to maintain compliance and secure cloud operations.

Collaboration & Process

  • Work closely with application teams, architects, and security stakeholders.

  • Participate in on-call rotations and incident management processes.

  • Contribute to operational documentation, runbooks, and best practices.

Required Skills & Qualifications

Must-Have Skills

  • Strong hands-on experience with Google Cloud Platform (GCP).

  • Deep expertise in Linux systems engineering (RHEL, Ubuntu, CentOS).

  • Proficiency in at least one programming language: Python, Go (Golang), or Java.

  • Strong troubleshooting and debugging skills across infrastructure and application layers.

  • Hands-on experience with Terraform for infrastructure as code.

  • Experience with CI/CD pipelines using Jenkins and/or GitHub Actions.

  • Kubernetes experience with GKE, Docker, and Helm.

Preferred Qualifications

  • GCP Certifications:

    • Google Professional Cloud DevOps Engineer

    • Google Professional Cloud Architect

  • CKA (Certified Kubernetes Administrator).

  • Experience supporting large-scale distributed systems and microservices architectures.

  • Familiarity with ITIL processes, Change Advisory Board (CAB) workflows, and incident management.

Soft Skills

  • Strong analytical and problem-solving abilities.

  • Excellent communication skills with the ability to collaborate across teams.

  • Ownership mindset with a focus on reliability and continuous improvement.

  • Ability to work in fast-paced, production-critical environments.

Job Tags

Remote work
