Services | Tags | PagerDuty

How to Ace Your Services with PagerDuty by Débora Cambé https://www.pagerduty.com/blog/how-to-ace-your-services-with-pagerduty/ Wed, 06 Sep 2023 12:00:58 +0000 https://www.pagerduty.com/?p=83923

The post How to Ace Your Services with PagerDuty appeared first on PagerDuty.

It’s finals week for the US Open, one of the most celebrated sports events in the world. Tennis is my favorite sport to watch, as I’m fascinated by the strength, composure and endurance each player displays while standing by themselves on the court, sometimes during incredibly long matches – the current record is 11 hours and 5 minutes.

Tennis players are fully accountable for the outcome of their matches at every single stage. Their performance directly impacts whether they win or lose. If this sounds familiar, that’s because it is. Service Ownership follows the same approach: “you build it, you own it”. In the context of DevOps, you’re not working alone. But there are definitely lessons to learn from tennis when it comes to building healthy, resilient services. 

The parallel began to take shape while I was interviewing Leeor Engel, Director of Engineering for the Incident Response product line. Keep reading to find out his take on how to ace services and how the PagerDuty team used PagerDuty’s own Service Standards functionality to improve the overall maturity of their services.

What is Service Standards?

When pivoting to a Service Ownership model, organizations often struggle to get clear visibility into their many services and to standardize how those services are configured. Launched a year ago for all PagerDuty plans, Service Standards guides teams toward better-configured services while helping managers and administrators scale these standards across the organization.

With Service Standards, PagerDuty provides nine standards that each service should fulfill to have the depth and context required to be considered well-configured. Each standard can be toggled on or off.

PagerDuty’s Customer Zero: PagerDuty

After the launch of Service Standards, PagerDuty was its own customer zero. Leeor walks us through the motivation behind this effort: “You wanna get adoption and figure out what the gaps are, get feedback, figure out ways to improve [the product]. Then there was an organizational goal. We talk a lot about what makes a service well configured and what does good look like. So we did a big push to get PagerDuty to be customer zero for that feature. We basically got every team to review all their services. And we actually found that many services did not meet the standards.”

Services varied considerably in their compliance with the standards, and “under 50%” were fully compliant. Approximately four months later, the goal of reaching 100% compliance was achieved. But it’s a constant work in progress to keep it that way: “It can be very difficult, depending on the type of service, to get 10 out of 10 [standards]. So our goal was to get 100% of services to be at least 80% compliant. We got there. But then there’s an ongoing effort to maintain that because new services are created all the time, and it’s easy to forget this. And so our continuous process is what catches those stragglers and gets them compliant.”

If you also want to ace your services, here are four lessons you can draw from tennis dynamics to get there:

Warm-Up

You might have identified the need to standardize your services to play in the best practices court. But maybe your organization has dozens, even hundreds, of services, and that feels overwhelming. Where and how should you start?

Lesson #1: Start with the baseline

In tennis, the baseline is where each game begins. It’s where players serve and it’s the foundation for their positioning and strategy. Without a well-developed baseline play, there’s no chance of winning. But it needs to be built gradually.

Similarly, standards work as a service’s baseline level of quality, consistency, and functionality. It’s not about achieving perfection from the outset but rather about having a structured foundation to build upon. Take it from Leeor: “You want to focus on systemic things and define any standard as a starting point. Don’t worry about it being perfect. Just get it in place and have a continuous monitoring regime. And that’s gonna move the needle the most, because that’s going to expose all these other problems you might have in your processes that you need to improve, whatever it might be. It’ll be sort of the gateway to exposing those things and then addressing them, continuously improving.”

Lesson #2: Adapt to the surface

Every tennis player has their own style of play, but they must adapt to the surface they’re playing on, each enabling different dynamics. On grass, for example, rallies are usually shorter, as the ball bounces low and players need to get to it faster – playing the net successfully and mastering the volley is key to success.

In the context of services, recognizing each team’s unique circumstances is a crucial first step when determining which standards that team’s service should follow. As Leeor explains, “teams can have pretty different needs in terms of their services. Sometimes their integration setup is a little bit different. Sometimes they’re not monitoring things that are directly based on code deployments. For example, one of our Service Standards is having at least one change integration – we may have services that don’t. They may be triage services that have email integrations or things like that. Those services still provide value and they need a standard, but they need a slightly different one. There isn’t a one-size-fits-all that works for everyone.”
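
One way to picture "a slightly different standard" is a small set of profiles keyed by service type. The sketch below is purely illustrative: the standard identifiers, the profile names, and the `applicable_standards` helper are hypothetical and are not part of PagerDuty's product or API.

```python
# Hypothetical standard identifiers; not PagerDuty's actual standard names.
BASE_STANDARDS = frozenset({
    "has_description",
    "has_escalation_policy",
    "has_change_integration",
})

# Standards that do not apply to a given kind of service (assumed profiles).
PROFILE_EXCLUSIONS = {
    "default": frozenset(),
    # Triage services fed by email may have no code deployments to track.
    "triage": frozenset({"has_change_integration"}),
}


def applicable_standards(service_type: str) -> frozenset:
    """Return the standards a service of this type is expected to meet."""
    # Unknown service types fall back to the full baseline.
    exclusions = PROFILE_EXCLUSIONS.get(service_type, PROFILE_EXCLUSIONS["default"])
    return BASE_STANDARDS - exclusions
```

Keeping one shared baseline and expressing each profile as a list of exclusions means every service type still starts from the same foundation, and the differences between teams stay explicit and easy to review.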

Win the game

The foundations are set: you have defined your service’s boundaries and standards according to the needs of the team that owns it. Now you need to ensure those standards are complied with. How?

Lesson #3: Avoid unforced errors

An unforced error happens when a player loses a point through their own mistake on a shot that was completely within their control, i.e. not forced by the opponent.

Teams are responsible for keeping their service standards in check, but in the fast-paced DevOps world that can be tough; services change or new ones might be created depending on business needs. Leeor highlights three essential steps to successfully maintain the balance of your service standards and avoid the unforced error trap:

  • Monitor: With the new PagerDuty Service Standards API you can pull your service standards on a regular basis. This allows you to confirm whether the standards are in line with the service needs, whether they need to change, or whether it makes sense to create exemptions.
  • Report: Create a reporting regime where you define a regular cadence to assess the state of all the services. With PagerDuty Service Standards it’s easy to do so: admins can export the service performance data out of PagerDuty and share it as needed to drive accountability and show progress. Admins also have the option to make standards publicly available for the rest of the organization to view.
  • Educate and be educated: Leeor explains how talking directly and frequently with team owners can raise awareness and educate on the importance of complying with service standards: “For example, business services were not uniformly used across all teams and it’s actually pretty useful. Even just to have a parent business service for your area. Then you can leverage capabilities like the Service Graph or Business Impact features. A system where you can see all your services at a bird’s eye view.” It can also help surface different use cases: “Over time, we developed this process where we could have some exemptions. An example would be testing a service that isn’t in production yet, and it doesn’t yet have the escalation policy. So we set up an exemption process – which ideally was temporary – and we set up some exclusions around specific standards.” 
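
The monitor, report, and exemption steps above can be sketched as a small compliance check. This is a minimal illustration under stated assumptions, not PagerDuty's implementation: the `ServiceReport` shape, the service and standard names, and the `stragglers` helper are all hypothetical, and fetching real results from the Service Standards API is deliberately left out. The 80% threshold mirrors the goal Leeor describes.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceReport:
    """Pass/fail results for one service (hypothetical shape, not an API response)."""
    name: str
    results: dict  # standard name -> True if the service passes it
    exempt: set = field(default_factory=set)  # standards excluded for this service

    def compliance(self) -> float:
        """Share of non-exempt standards the service passes, from 0.0 to 1.0."""
        scored = [ok for std, ok in self.results.items() if std not in self.exempt]
        return sum(scored) / len(scored) if scored else 1.0


def stragglers(reports, threshold=0.8):
    """Names of services whose compliance falls below the threshold."""
    return sorted(r.name for r in reports if r.compliance() < threshold)


# Illustrative data only; service and standard names are made up.
reports = [
    ServiceReport("checkout", {"a": True, "b": True, "c": True, "d": True, "e": False}),
    ServiceReport("search", {"a": True, "b": True, "c": True, "d": False, "e": False}),
    ServiceReport(
        "staging-api",
        {"a": True, "b": True, "c": True, "d": True, "escalation_policy": False},
        exempt={"escalation_policy"},  # not in production yet, so temporarily exempt
    ),
]

print(stragglers(reports))  # prints ['search']
```

Run on a regular cadence, a check like this gives the report its content: "checkout" sits exactly at 80% and passes, "staging-api" passes because its missing escalation policy is exempt, and "search" is flagged for follow-up.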

Win the match

Lesson #4: Continuously improve

The beauty of tennis is that the course of a match can change instantly. There is no time limit to a game or even a set, and players don’t depend only on variables they can control: there’s the opponent’s focus and physical condition, the weather, and even the audience. Are they cheering you on?

Tennis is a game of continuous improvement, and the same goes for services. Well-configured services help scale Service Ownership best practices which, in turn, drive the organization’s operational maturity level.

Here’s Leeor’s number one piece of advice to get there: “The key thing is reporting. Of course you need to establish what your standard is and that may look a little different depending on the business. But really the critical thing is the continuous monitoring and reporting. Mistakes happen, things get missed, humans are humans, right? So you need some process that catches the things that fall through the cracks. Define a standard and continuously monitor it, like you would do with any other process. You’re trying to continuously improve. You need to monitor it.”

Start Acing Your Services

Put all these lessons into practice with the PagerDuty Operations Cloud, the essential platform to get your services in shape and manage all unplanned, time-sensitive, critical work across the enterprise. Learn more here and try our free 14-day trial.

Resilience for Retailers Ahead of Peak Traffic Seasons by Nisha Prajapati https://www.pagerduty.com/resources/webinar/resilience-for-retailers/ Wed, 01 Sep 2021 20:00:41 +0000 https://www.pagerduty.com/?post_type=resource&p=71162 The post Resilience for Retailers Ahead of Peak Traffic Seasons appeared first on PagerDuty.

Services Like a Boss: Best Practices for Implementing and Maintaining Services Architecture by Nisha Prajapati https://www.pagerduty.com/resources/webinar/services-like-a-boss-best-practices/ Wed, 11 Aug 2021 18:55:07 +0000 https://www.pagerduty.com/?post_type=resource&p=70878 The post Services Like a Boss: Best Practices for Implementing and Maintaining Services Architecture appeared first on PagerDuty.

PagerDuty For Pros by Nisha Prajapati https://www.pagerduty.com/resources/webinar/pagerduty-for-pros/ Tue, 27 Jul 2021 14:19:58 +0000 https://www.pagerduty.com/?post_type=resource&p=70549 The post PagerDuty For Pros appeared first on PagerDuty.

Fuelling Always-On Digital Services in the Financial Sector by Abhijit Pendyal https://www.pagerduty.com/blog/financial-sector-digital-services-au-2020/ Wed, 09 Dec 2020 14:00:52 +0000 https://www.pagerduty.com/?p=66550

The post Fuelling Always-On Digital Services in the Financial Sector appeared first on PagerDuty.

The financial services sector in Australia has undergone seismic change recently with the rise of neo disruptors and a cashless society driven by the pandemic. Australia is quickly becoming one of the more mature markets to embrace digital transformation, with the federal government announcing it has committed $800 million to a digital infrastructure upgrade.

As we move closer to 2021, the financial services sector will continue to see accelerated change and a greater reliance on digital technology. As the reliance on technology grows, so does the pressure on digital services as community expectations rise. Being online with seamless reliability is more important than ever.

In my role as the Director of Solutions Consulting at PagerDuty, I am seeing a greater focus from our financial services customers on full-service ownership. Technology teams now view the service ownership model as a way to bring value to their business when every second of the customer experience counts. One example comes to mind: Recently, Bendigo and Adelaide Bank injected $52.4 million AUD into its ongoing transformation program to reduce complexity and build digital capability through streamlined service offerings. The accelerated transformation targets the bank’s cost base through automation initiatives and new capabilities to improve operational efficiency.

During the pandemic, organisations have found that to effectively reduce costly downtime, they need to adopt development and operations best practices as well as cloud technology. In certain cases, process and cultural changes are even more important than tools and have the biggest impact on the operational maturity of the organisation.

In a recent PagerDuty survey of 700 DevOps and IT practitioners across the globe, we found that more than 80% of organisations have experienced a significant increase in pressure on digital services since the start of COVID-19. These same companies cited a 47% increase in the number of daily incidents, resulting in responders spending more than 10 extra hours per week resolving incidents, compared to 9 months ago.

To minimise the chances of future incidents, financial businesses need to have visibility into their entire tech stack and the interdependencies within it. They also need the ability to act on the machine and human response data collected, while broadening observability for developers and responders so they can resolve issues more effectively.

Here are some opportunities for financial services to embrace:

  • Tooling. Adopting best-of-breed, decentralised cloud tools with API-first design methodologies that can easily integrate with users’ platform of choice so they can access automation, extensibility, flexibility, and auditability.
  • Talent. Empowering talent to “work where they want” while also ensuring centralised visibility of relevant information. By democratising uniform operational practices like on-call rotations, escalations, and incident triage, responder burnout and fatigue can be dramatically reduced. It also leads to better engineering productivity and higher talent retention.
  • Culture. The philosophy of real-time communication and collaboration that the DevOps model encourages can help foster a leaner system where bottlenecks are solved in a manner that not only fixes the problem, but also improves the process.
  • Process. Significantly improving alert and incident processes through iterative efficiencies such as dynamically executing runbooks and implementing incident bridge automation, proactive and relevant stakeholder updates, and blameless postmortems. Apart from high team and service performance, process improvements can lead to rigor and accuracy in adherence to compliance and regulatory requirements.

Today, the unit of focus is not the bug but the incident. That’s because an incident typically not only impacts the technical responders responsible for maintaining systems, but also impacts the customer experience. And since incident response is an inherently people-centric activity, informing and mobilising the right people with the right tools and access to the right data is key. With customers today expecting instant gratification, moments like an app or a website not working have far-reaching consequences.

PagerDuty can help improve your service uptime and reliability to increase revenue, customer loyalty, and brand recognition. Our 13,000+ customers in 80 different countries are proof of that. To see for yourself, contact your account manager and try a 14-day free trial today.

Operations as Code: Mass Configurations Using Terraform by Bianca Wood https://www.pagerduty.com/resources/webinar/operations-as-code/ Thu, 08 Oct 2020 21:27:12 +0000 https://www.pagerduty.com/?post_type=resource&p=65144 The post Operations as Code: Mass Configurations Using Terraform appeared first on PagerDuty.

How Scalable Service Ownership Leads to Faster Incident Triage by Bianca Wood https://www.pagerduty.com/resources/webinar/faster-incident-triage/ Thu, 08 Oct 2020 21:21:04 +0000 https://www.pagerduty.com/?post_type=resource&p=64963 The post How Scalable Service Ownership Leads to Faster Incident Triage appeared first on PagerDuty.

PagerDuty Like a Pro by Bianca Wood https://www.pagerduty.com/resources/webinar/like-a-pro/ Thu, 08 Oct 2020 21:20:48 +0000 https://www.pagerduty.com/?post_type=resource&p=64957 The post PagerDuty Like a Pro appeared first on PagerDuty.

The Shared Irresponsibility Security Model With PagerDuty and Lacework by Bianca Wood https://www.pagerduty.com/resources/webinar/irresponsibility-model-lacework/ Thu, 08 Oct 2020 21:14:34 +0000 https://www.pagerduty.com/?post_type=resource&p=65087 The post The Shared Irresponsibility Security Model With PagerDuty and Lacework appeared first on PagerDuty.

PagerDuty Deep Dive: How to Optimise Your Digital Ops Platform by Bianca Wood https://www.pagerduty.com/resources/webinar/deep-dive/ Thu, 23 Jul 2020 16:44:13 +0000 https://www.pagerduty.com/?post_type=resource&p=62913 The post PagerDuty Deep Dive: How to Optimise Your Digital Ops Platform appeared first on PagerDuty.
