VP, SRE Engineer (Splunk), Technology Group

Location: Singapore, SG

Job Function:  Technology Group
Job Type:  Permanent
Req ID:  17102

GIC is one of the world’s largest sovereign wealth funds. With over 2,000 employees across 11 locations around the world, we invest in more than 40 countries globally across asset classes and businesses. Working at GIC gives you exposure to an extraordinary network of the world’s industry leaders. As a leading global long-term investor, we Work at the Point of Impact for Singapore’s financial future, and the communities we invest in worldwide.

 

Technology Group
We experiment, design, and lead a 24×7 global business where we support core capabilities in asset management, trading, investment operations, and risk management. We deliver secure, reliable, and integrated solutions, and provide insights on new and emerging technologies.

 

Infrastructure & Cybersecurity Resilience (ICR)
We design, build, and secure the technology foundations that power GIC’s global investment operations. We aim to deliver resilient, scalable, and secure infrastructure that empowers our people and businesses to perform securely, efficiently, and effectively.

 

What impact will you make in this role?

As a VP, SRE Engineer (Splunk), you will lead the design, implementation, and optimization of enterprise observability and reliability solutions across infrastructure and application domains. You will leverage Splunk and complementary monitoring platforms to drive proactive alerting, automation, and resilience, ensuring operational stability, risk control, and compliance with financial regulatory standards.
You will define and implement advanced SRE frameworks, establish Site Reliability metrics (SLOs, SLIs, Error Budgets), and build automation pipelines that enhance system reliability, reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR), and strengthen overall operational resilience across the organization.

 

What will you do as a VP, SRE Engineer (Splunk)?

Observability Engineering & Governance

  • Architect and maintain enterprise SIEM solutions aligned with operational resilience mandates (e.g., MAS TRM, DORA, APRA CPS 230).
  • Lead deployment, configuration, and optimization of Splunk for full-stack visibility across infrastructure, applications, networks, and user experience.
  • Define and enforce telemetry data governance standards—metrics, logs, and traces—ensuring consistency, retention compliance, and security.
  • Integrate Splunk with incident management, ITSM, and AIOps systems to enable predictive alerting and anomaly detection.
  • Act as the SIEM/Splunk subject matter expert (SME) for architecture reviews, platform upgrades, and performance tuning.

 

Reliability Engineering & Automation

  • Implement and champion SRE frameworks and reliability practices for mission-critical systems.
  • Design and automate runbooks, alerts, and self-healing workflows using Python, Ansible, and Terraform.
  • Collaborate with Application, Infrastructure, and Cyber teams to embed reliability principles into the delivery lifecycle.
  • Conduct resilience, chaos, and capacity testing aligned with business continuity and disaster recovery standards.
  • Define and track error budgets, reliability scorecards, and service health indicators for production workloads.

 

Cloud & Platform Integration

  • Engineer SIEM for cloud-native workloads in AWS and Azure, ensuring visibility across compute, storage, and network layers.
  • Integrate Splunk and cloud observability tools into CI/CD pipelines and landing zones to ensure continuous compliance.
  • Implement infrastructure-as-code (IaC) models using Terraform and Ansible for consistent, auditable provisioning.
  • Collaborate with Cloud, DevOps, and Security teams to ensure telemetry aligns with audit, compliance, and operational risk requirements.

 

Operational Excellence & Collaboration 

  • Drive reduction in incident recurrence, MTTR, and manual intervention through observability-led automation.
  • Partner with Service Delivery, Cyber, and Application teams to enable predictive incident prevention and root cause transparency.
  • Develop and maintain executive dashboards and reports showcasing availability, reliability KPIs, and operational risk indicators.
  • Provide technical leadership during major incidents, post-incident reviews, and audits, ensuring lessons learned are codified into automation and process improvements.

 

What makes you a successful candidate?

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related discipline.
  • 10+ years’ experience in Infrastructure, Cloud, or SRE roles, with at least 5 years specializing in SIEM/Splunk engineering or observability in financial or regulated environments.
  • Proven hands-on expertise in:
    • SIEM Platforms: Splunk (must-have), ELK/Elastic
    • Automation / IaC: Terraform, Ansible, Python, CI/CD tools
    • Cloud Platforms: AWS (CloudWatch, X-Ray, CloudTrail), Azure (Monitor, Log Analytics, App Insights), Datadog, ServiceNow 
  • Deep understanding of SRE principles, service health modelling, error budgets, and auto-remediation design.
  • Strong analytical and troubleshooting skills, with the ability to perform deep-dive investigations and develop long-term preventive solutions.
  • Familiarity with financial sector operational resilience frameworks, regulatory compliance, and incident governance.

 

Preferred Certifications:

  • Splunk Certified Power User / Splunk Certified Admin / Splunk Certified Architect
  • Terraform / Ansible / Python Certified Expert
  • AWS Certified DevOps Engineer / Azure DevOps Expert
  • SRE Foundation / Practitioner (DevOps Institute)
  • ITIL v4 Managing Professional

 

Work at the Point of Impact
We need to be forward-looking to attract the right people to help us become the Leading Global Long-term Investor. Join our ambitious, agile, and diverse teams: be empowered to push boundaries and pursue innovative ideas, share your views, and be heard. Be anchored on our PRIME Values: Prudence, Respect, Integrity, Merit and Excellence, which guide us in how we make our day-to-day decisions. We strive to inspire. To make an impact.

 

GIC is a Great Place to Work

At GIC, our offices are vibrant hubs for ideation, professional growth, and interpersonal connection. At the same time, we believe that flexibility allows us to do our best work and be our best selves. Thus, our teams come into the office four days per week to harness the benefits of in-person collaboration, but have the flexibility to choose which days they work from home and adjust this arrangement as situational needs arise.

 

GIC is an equal opportunity employer 
As an employer, we passionately believe that every individual brings a unique diversity of thought and perspective that meaningfully enriches GIC teams and drives competitive performance. An inclusive environment yields exceptional contribution.

 

Learn more about our Technology Group here: 
https://gic.careers/group/technology-group/

 

Our PRIME Values

GIC is a values-driven organization. GIC’s PRIME Values act as our compass, enabling us to fulfil our fundamental purpose and objectives. They are the foundational bedrock that governs our behaviors, our decision making, and our focus. They inform both our long-term strategy as a firm and the way we relate to our Client, business partners, and employees. PRIME stands for Prudence, Respect, Integrity, Merit and Excellence.