Site Reliability Engineer - Remote

Posted 2 Days Ago
5 Locations
Senior level
Financial Services
The Role
As a Site Reliability Engineer, you will ensure the reliability of services and systems, focusing on optimizing existing systems and automating processes. You'll design and maintain monitoring tools, manage incidents, and collaborate with engineering teams to enhance system performance and scalability.

Description

At EFG (ESL FACEIT Group) we create worlds beyond gameplay where players and fans become community. We take pride in our corporate social responsibility commitment: “IT’S NOT GG (Good Game), UNTIL IT’S GG FOR ALL”. We are passionate about the culture we foster, which ultimately helps to create and shape the world of esports, gaming tournaments, leagues, events and holistic ecosystems staged for our millions of players, fans and heroes.

The Team:

As a Site Reliability Engineer at EFG, you will be designing, analyzing, and troubleshooting large-scale distributed systems. You will demonstrate a systematic problem-solving approach and the ability to debug and optimize code and automate routine tasks. You will ensure that EFG’s services and systems are reliable, have uptime appropriate to users' needs and improve at a fast rate.

Apart from monitoring our systems' capacity and performance, you will also focus on optimizing existing systems, building infrastructure and eliminating work through automation. You will work collaboratively with the software engineering teams to deploy and operate our systems, and you will help to automate and streamline our operations and processes. Within this role, you will be given real responsibilities, and you will have the opportunity to drive change and have a big impact on our products and platform.

What you will do:

  • Maintaining and improving the monitoring and observability tools (Grafana/Prometheus/Thanos/Jaeger);
  • Working closely with your team and with other cross-functional teams to help design, maintain and operate systems at scale;
  • Developing and driving adoption of SRE best practices across the company;
  • Leading the incident management process and its adoption;
  • Using your troubleshooting skills to help identify and fix operational issues;
  • Working with Cloud Native technologies such as Kubernetes, Envoy, Istio, Prometheus and Helm;
  • Working with the “Hashi Stack” (Terraform, Packer, Vault);
  • Experimenting with and introducing cutting-edge technologies.
Requirements
  • Proven experience as a Site Reliability Engineer, DevXP Engineer or Software Engineer, focusing on building and maintaining scalable infrastructures;
  • Excellent working knowledge of at least one of the major cloud providers (GCP/AWS/Azure);
  • You have experience with cluster management systems (Kubernetes);
  • Knowledge of incident management: ability to investigate, troubleshoot, recover and prevent the recurrence of incidents that interfere with the normal delivery of IT services;
  • Proficiency in Go and some level of proficiency in at least one other language: Java, Python, Rust…;
  • You have knowledge of GitOps practices;
  • You have production-scale experience with one of the following: MongoDB, Redis, MySQL;
  • Experience contributing to open source technologies would be an added bonus.

Top Skills

Go
Java
Python
Rust
The Company
HQ: Miami, FL
140 Employees
On-site Workplace

What We Do

EFG Capital International Corp. provides a wide variety of investment advisory, portfolio management and brokerage services, designed to provide a comprehensive array of investment alternatives tailored to the investor's specific needs and risk profiles.

We are also part of the strong and long-standing regional network of EFG International ("EFGI"). Our team of dedicated investment professionals consists of client relationship officers, portfolio managers, chartered financial analysts and investment adviser representatives who understand investor requirements and market characteristics. Just as importantly, our advice is global in scope. As such, we are able to provide regional services while tapping market opportunities in Europe, North America, Asia and other key financial regions.
