Employment Type
Full-time
Application Dates
Open
Closing
Opportunity Overview
Type
Graduate Jobs
Salary
Competitive
Location
Singapore
Required Level of Study
Bachelor degree
Areas of Work
Engineering
IT and Technology
Degrees Accepted
Engineering, Maths, IT & Computer Sciences
Computer science & IT
Information science

Site Reliability Engineer (Campus Hire 2023)

The Engineering and Technology team is at the core of Shopee's platform development. The team is made up of passionate engineers from all over the world, striving to build the best systems with the most suitable technologies. Our engineers do not merely solve the problems at hand; we build foundations for a long-lasting future. We don't limit ourselves on what we can or can't do; we take matters into our own hands, even if it means drilling down to the bottom layer of the computing platform. Shopee's hyper-growing business scale has transformed the most "innocent" problems into huge technical challenges, and there is no better place to experience this first-hand if you love technology as much as we do. Browse our Engineering and Technology team openings to see how you can make an impact with us.

About the Team:

The mission of the SRE (Site Reliability Engineering) team is to keep Shopee operating efficiently and sustainably 24x7, and to build and maintain large-scale, highly available, high-performance distributed systems, with system availability and performance as the guiding criteria. The team combines traditional software engineering with technical operations. SRE engineers need to dive deep into Shopee's development lines to ensure that systems remain highly scalable as they evolve rapidly. From the perspective of stability and performance, the scope covers business application design, basic platform components (middleware, container scheduling, caching, object storage, etc.), OS optimisation, and data center and network optimisation. We use engineering and service-based approaches to streamline the inefficient and complicated procedures of the traditional operations and maintenance model, and are committed to building a sound monitoring system to improve the efficiency of incident handling.

Job Description:

  • Dive deep into development lines, learn and understand the mechanism of every application component, and promote product scalability, stability and performance
  • Set up, manage and maintain Shopee product, middleware and big-data applications and services
  • Perform regular and ad-hoc server-side deployments, performance fine-tuning and troubleshooting
  • Design and develop an automated technical operations platform
  • Manage capacity and resources
  • Run full-chain stress tests to improve application performance and remove redundancy
  • Prepare routine operation documentation

Requirements:

  • Bachelor's or higher degree in Computer Science, Engineering, Information Systems or a related field, graduating in 2023
  • Candidates with less than 1 year of experience are welcome
  • Extensive, hands-on knowledge of Linux operating systems (Ubuntu, CentOS, etc.)
  • Knowledge of computer networks (TCP/IP, DNS, etc.), computer organisation and operating systems
  • Hands-on experience with at least one of the following programming languages: Bash, Python, Go
  • Strong analytical and problem-solving skills with the ability to thrive in a dynamic work environment
  • Passionate, with a strong sense of responsibility for work
  • Fast learner and good team player
  • Detail-oriented, cautious and prudent
  • Passionate about technical operations of internet products, Linux OS and open source

Skills below are optional but preferred:

  • Experience with automation tools like Ansible, SaltStack
  • Experience with monitoring tools like Prometheus, Zabbix, Grafana, etc.
  • Experience with load balancing tools like LVS, Nginx, OpenResty or HAProxy
  • Experience with container technology such as Docker, Kubernetes
  • Experience with High Availability system design and Server Deployment Process
  • Experience with SRE
  • Experience with an Ops PaaS platform or Ops automation platform (e.g. CMDB)