Features | Categories | PagerDuty https://www.pagerduty.com/blog/category/features/ Build It | Ship It | Own It Mon, 17 Apr 2023 20:13:25 +0000 en-US hourly 1 https://wordpress.org/?v=6.3.1 Reduce MTTR and Take Automation to a New Level with PagerDuty Global Event Orchestration by Hannah Culver https://www.pagerduty.com/blog/global-event-orchestration-generally-available/ Tue, 18 Apr 2023 12:00:58 +0000 https://www.pagerduty.com/?p=81923 PagerDuty’s Global Event Orchestration is now generally available. Global Event Orchestration’s powerful decision engine enriches events, controls their routing, and triggers self-healing actions based on...

The post Reduce MTTR and Take Automation to a New Level with PagerDuty Global Event Orchestration appeared first on PagerDuty.

PagerDuty’s Global Event Orchestration is now generally available. Global Event Orchestration’s powerful decision engine enriches events, controls their routing, and triggers self-healing actions based on event data. Teams can use this functionality across any or all services within PagerDuty. This feature is a continued investment in Event Orchestration, demonstrating PagerDuty’s commitment to providing customers with best-in-class automation capabilities.

Customers in our early access program are already seeing value in Global Event Orchestration, touting reduced MTTR and better standardization of incident response at scale. As Kiril Yurovnik, Technical Lead at Riskified, said, “With a growing number of events, minimizing noise and toil is imperative, especially as organizations aim to optimize their IT processes amid the current economic environment. We’ve been using PagerDuty’s Global Event Orchestration as part of the early availability program, and the results have been strong. Riskified has been able to scale noise reduction, especially from non-production environments, saving our team valuable time to spend time innovating on what’s next.” 

What is Global Event Orchestration?

Global Event Orchestration is like Service Event Orchestration in that it allows users to define complex rules that determine what happens to an event as it is processed. The difference is that Global Event Orchestration enriches events at ingest. Then, once the data is normalized, the event is routed to a service based on various criteria. This ensures that responders have the best event data possible to begin the response process.
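
To make the enrich-then-route flow concrete, here is a minimal Python sketch. It is not PagerDuty's implementation or API: the `Rule` shape, the field names (`env`, `severity`, `source`), and the service names are all hypothetical and exist only to illustrate the idea of normalizing an event at ingest before choosing a service.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Event = dict  # an event is modeled as a plain dictionary of fields


@dataclass
class Rule:
    """Hypothetical rule: when `condition` matches, enrich and optionally route."""
    condition: Callable[[Event], bool]
    enrich: Optional[Callable[[Event], None]] = None
    route_to: Optional[str] = None  # hypothetical service name


def process(event: Event, rules: list, default_service: str):
    """Apply every matching rule to a copy of the event, then pick a service."""
    enriched = dict(event)  # leave the caller's event untouched
    service = None
    for rule in rules:
        if rule.condition(enriched):
            if rule.enrich is not None:
                rule.enrich(enriched)
            # The first matching rule that names a service decides the routing.
            if service is None and rule.route_to is not None:
                service = rule.route_to
    return enriched, service or default_service


def set_unknown_env(event: Event) -> None:
    event["env"] = "unknown"


def downgrade_severity(event: Event) -> None:
    event["severity"] = "info"


rules = [
    # Normalize: every event gets an `env` field.
    Rule(condition=lambda e: "env" not in e, enrich=set_unknown_env),
    # Enrich: non-production events are downgraded.
    Rule(condition=lambda e: e.get("env") == "staging", enrich=downgrade_severity),
    # Route: database events go to a (hypothetical) database service.
    Rule(condition=lambda e: e.get("source") == "db-monitor", route_to="database-service"),
]

raw = {"source": "db-monitor", "env": "staging", "severity": "critical"}
enriched, service = process(raw, rules, default_service="catch-all")
print(enriched["severity"], service)
```

Because enrichment runs before routing in this sketch, later rules can match on fields that earlier rules added or normalized, which is the ordering the paragraph above describes.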

Global Event Orchestration has three key components that make it successful for scaling incident response. 

Global Orchestration Rules allow users to apply actions to events across services. Teams can create rules which process event data across services and use the processed data to improve event routing. This empowers organizations to establish and improve on auto-remediation. This means that a human doesn’t need to be involved in an incident to resolve it. This also reduces the blast radius of an incident via more intelligent routing.

Enhanced integration key management reduces the workload of managing integration keys for different monitoring tools. This allows users to combine integration keys into one event orchestration. Even better, enhanced integration key management is now available for all PagerDuty plans.

Additional APIs allow for management at scale. Teams can use REST APIs for event source and Global Orchestration Rule management. Both of these APIs have Terraform support. These APIs are in addition to the REST APIs for Event Orchestration/Service Orchestration management.

“Leveraging PagerDuty’s Global Event Orchestration has been critical to ensure that our event routing processes are efficient and scalable to optimize IT operations and spend,” said Brian Long, Cloud Infrastructure Engineer at Hyland. “With Global Event Orchestration, our organization is able to detect the “resolved” condition from our notifications to execute as a resolve and reduce the number of places these conditions need to be configured by at least a factor of three. This frees up our time to focus on innovation, not configuration.”

How can Global Event Orchestration help my team?

With Global Event Orchestration, teams will see:

  • Codified incident response processes: democratize and distribute well-understood incident responses across distributed teams
  • Fewer incidents: use contextual event data from all services within your ecosystem to improve suppression accuracy
  • Faster resolution: apply automation across teams and enable automated diagnostics at scale with standardized enrichment and data normalization

How teams use Global Event Orchestration may vary based on organizational structure. Capabilities align with two groups: operations teams (ITOps, SRE, and NOC) and developer teams.

ITOps teams will be able to capitalize on the event normalization capabilities, ensuring that all events look the same as they come in.

SRE teams can create and extend automation across any or all services within a technical ecosystem. This makes scaling and standardizing automation across an organization easier than ever.

For L1 response teams such as NOCs, Global Event Orchestration helps them handle the massive incoming wave of events. Events can be routed to the NOC if they meet certain criteria. And, as the event passes through levels of rules and nested rules, automation can deliver diagnostics to the L1 responder. If the fix for an incident is well-known, organizations can create auto-remediation.

Developer teams will see fewer incidents and faster resolution. With auto-remediation, incidents can be resolved before they even hit the services that the developer teams are on call for. And, with in-depth routing criteria, incidents don’t bounce from team to team. If automation or the NOC or L1 responders can’t resolve it, the incident will go to the subject matter expert (SME). And, by the time the SME begins to work on the incident, diagnostic information is already available, reducing resolution time.

How can I get started today?

Global Event Orchestration is generally available for all PagerDuty AIOps customers. To see it in action, join us on Twitch Friday, April 14. 

PagerDuty AIOps helps teams experience fewer incidents, faster resolution, and greater productivity without long implementations or heavy ongoing maintenance. To try PagerDuty AIOps, you can request a trial here or take our product tour. If you want to talk to sales, contact us through this form.

To learn more about Global Event Orchestration, register for this webinar. If you’re a PagerDuty AIOps customer looking to create your first Global Event Orchestration, this knowledge base article can show you how to get started.

Introducing PagerDuty AIOps: Harnessing the Power of AI to Transform Modern Operations for the Enterprise by Hannah Culver https://www.pagerduty.com/blog/introducing-pagerduty-aiops/ Tue, 11 Apr 2023 12:00:40 +0000 https://www.pagerduty.com/?p=81930 Today, PagerDuty launched a new AIOps solution to leverage the power of AI, provide built-in automation and build on the company’s foundation data model to...

The post Introducing PagerDuty AIOps: Harnessing the Power of AI to Transform Modern Operations for the Enterprise appeared first on PagerDuty.

Today, PagerDuty launched a new AIOps solution to leverage the power of AI, provide built-in automation and build on the company’s foundation data model to transform modern operations for the enterprise. PagerDuty has long suppressed noise to help distributed development teams focus. Now, PagerDuty AIOps addresses the large-scale event correlation, compression, and automation needs of ITOps, Command Centers, NOCs, and SRE teams with Global Event Orchestration (now generally available), and Global Alert Grouping (EA in H2 2023). If you’re interested in being a part of the early access program for Global Alert Grouping, sign up here. Going beyond event management, PagerDuty AIOps helps organizations work more efficiently, including giving them the ability to execute end-to-end, event-driven automation.

Our early access customers are already seeing results with PagerDuty AIOps, including 87% average noise reduction, automated incident response deployed 9x faster than with existing solutions, and 14% faster MTTR.

As Kiril Yurovnik, Technical Lead at Riskified, said, “With a growing number of events, minimizing noise and toil is imperative, especially as organizations aim to optimize their IT processes amid the current economic environment. We’ve been using PagerDuty’s Global Event Orchestration as part of the early availability program, and the results have been strong. Riskified has been able to scale noise reduction, especially from non-production environments, saving our team valuable time to spend time innovating on what’s next.”

You can see PagerDuty AIOps in action by taking our product tour.

What is PagerDuty AIOps?

According to PagerDuty platform data, event volumes have grown by 70% YoY. As a result, businesses suffer from too much noise and too much toil while their response teams slog through chaotic, manual response processes.  

And when ITOps and SRE teams who act as first responders for incidents lack access to crucial context and visibility system-wide, they can’t take the next best action. This operational inefficiency has a compounding effect. It increases the cost of operations, reduces productivity across the technical organization, and takes away from value-add work.

In a resource-constrained environment, teams can’t wait for year-long implementations; they need help now. Organizations are looking for a solution that has fast time to value, integrates with their existing systems, and provides fast ROI.

PagerDuty AIOps helps teams reduce noise, triage efficiently to drive the right actions towards resolution, and remove manual, repetitive work from the incident response process. PagerDuty AIOps works out of the box without requiring long implementations or heavy, ongoing maintenance. Organizations continue to see best-in-class results. Built-in noise reduction, powered by ML models that learn and adapt based on user behavior, means teams see fewer incidents overall. And end-to-end, event-driven automation ensures that resolution is faster and requires less input from humans, who are needed for value-add work.

“Leveraging PagerDuty’s Global Event Orchestration has been critical to ensure that our event routing processes are efficient and scalable to optimize IT operations and spend,” said Brian Long, Cloud Infrastructure Engineer at Hyland. “With Global Event Orchestration, our organization is able to detect the “resolved” condition from our notifications to execute as a resolve and reduce the number of places these conditions need to be configured by at least a factor of three. This frees up our time to focus on innovation, not configuration.”

Here’s what PagerDuty AIOps includes: 

  • Event correlation, noise compression, and triage context functionality, moving site reliability engineers and information technology teams from managing multiple vendors and manual processes to a single, powerful solution that drives quickly to resolution.
  • End-to-end automation, from event ingestion through auto-remediation, to help teams shift from reactive to proactive by capturing and actioning critical events before they become value-destroying incidents.  
  • Advanced noise reduction features (available in our early access program) that group alerts across services and allow customers to leverage both defined rules and machine learning to only surface the incidents that matter.
  • A visibility console that gives operations teams a single source of truth to monitor and quickly manage all incidents before major incidents occur with far-ranging business, IT, and financial impacts. 
  • Global Event Orchestration, a powerful decision engine to enrich and control routing or trigger self-healing actions.
  • With more than 700 integrations on the PagerDuty Operations Cloud platform, teams can trust our automation-led, people-centric AIOps solution to help save time and money.

How does PagerDuty AIOps work?

PagerDuty AIOps has sets of capabilities that help organizations standardize and scale incident best practices across all teams and services. And, it comes with new features custom-built to serve ITOps, Command Centers, NOCs, and SRE teams.

Reduce noisy incidents: reduce incident noise with the click of a button, either within a service or across services with Global Alert Grouping. Use built-in ML models, or create your own logic. And combine intelligent ML and rule-based alert grouping methods for customizable grouping capabilities. Group alerts by content, time, or other criteria for noise reduction that fits your organization’s needs.
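
As a rough illustration of rule-based grouping by content and time, here is a minimal Python sketch. It covers only the rule-based side (no ML), it is not PagerDuty's grouping algorithm, and the alert fields (`service`, `summary`, `time`) and the five-minute window are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def group_alerts(alerts, key_fields, window):
    """Group alerts that share the same key fields and arrive within
    `window` of the previous alert in the same group."""
    groups = []      # each group is a list of alerts, in creation order
    open_group = {}  # key -> most recent group created for that key
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = tuple(alert.get(field) for field in key_fields)
        group = open_group.get(key)
        if group is not None and alert["time"] - group[-1]["time"] <= window:
            group.append(alert)
        else:
            # No group yet for this key, or the last one is too old: start anew.
            group = [alert]
            groups.append(group)
            open_group[key] = group
    return groups


t0 = datetime(2023, 4, 11, 12, 0)
alerts = [
    {"service": "checkout", "summary": "High latency", "time": t0},
    {"service": "search", "summary": "Disk full", "time": t0 + timedelta(minutes=1)},
    {"service": "checkout", "summary": "High latency", "time": t0 + timedelta(minutes=2)},
    {"service": "checkout", "summary": "High latency", "time": t0 + timedelta(minutes=30)},
]

groups = group_alerts(alerts, key_fields=("service", "summary"), window=timedelta(minutes=5))
print([len(group) for group in groups])
```

In this sketch, the two checkout alerts that arrive two minutes apart collapse into one group, while the alert 30 minutes later starts a new one because it falls outside the window.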

Screen recording of PagerDuty noise reduction via alert grouping.

Accelerate triage time and drive action: Leverage ML to surface the most important information for responders immediately. When an incident occurs, responders can quickly discover the probable origin of the incident, if the incident has previously occurred, and if a change was the likely cause.

Screen recording of PagerDuty triage features including past incidents and probable origin.

Automate the redundant: Leverage event orchestration’s powerful decision engine to enrich and control routing or trigger self-healing actions based on event conditions across any or all services within PagerDuty with Global Event Orchestration.

Screenshot of PagerDuty Global Event Orchestration rule builder.

Visualize what matters: Create a custom dashboard that provides a comprehensive view of your operations posture across services. Additionally, you’ll get full visibility into your event data so that you can prioritize what gets ingested and processed and have total transparency into your event usage.

Screen recording of PagerDuty Visibility Console where users can visualize all their event data.

How can I get started with PagerDuty AIOps today?

Current PagerDuty customers on Professional or Business plans can purchase PagerDuty AIOps on a self-serve basis from the account subscriptions menu.

For Event Intelligence customers, contact your account team about migration options to get access to new features available in PagerDuty AIOps. For more details, please see our knowledge base article.

Whether you’re a current PagerDuty customer or looking to get started, you can see PagerDuty AIOps in action by requesting a trial or taking our product tour. If you have questions and want to speak with our sales team, you can reach out here.

Say Goodbye to the ‘Executive Swoop and Poop’ with Status Update Notification Templates by Hannah Culver https://www.pagerduty.com/blog/status-update-notification-templates-now-generally-available/ Wed, 25 Jan 2023 14:00:52 +0000 https://www.pagerduty.com/?p=80995 Incidents are unpredictable, but how you share updates with stakeholders doesn’t have to be. Status Update Notifications Templates help teams streamline communication with internal stakeholders...

The post Say Goodbye to the ‘Executive Swoop and Poop’ with Status Update Notification Templates appeared first on PagerDuty.

Incidents are unpredictable, but how you share updates with stakeholders doesn’t have to be. Status Update Notification Templates help teams streamline communication with internal stakeholders during a major incident. We are excited to announce that this feature has added new capabilities. Now teams can not only customize their communications; they can also create and standardize reusable templates using dynamic variable insertion representing criteria such as impact, service areas, and more.

What are Status Update Notifications?

You’re in the middle of a high-priority incident. You’re working to bring the problem to a close, but you can’t concentrate because you have a dozen (or two) stakeholders pinging you for updates. You’re copying and pasting your response across multiple internal communication channels, but that doesn’t stop the direct messages from popping up. Sound familiar? At PagerDuty, we call this the ‘executive swoop and poop.’ Basically, a responder is so inundated with update requests that they can’t do their main job: resolve the incident.

During an incident, it’s key to keep stakeholders in the loop. This helps the business respond as one to an incident, reducing resolution times and preserving customers’ trust. But stakeholders expect communications to look a certain way and contain the context that’s important to them. Formatting and writing these communications might require a responder’s full attention if the communication is done ad-hoc. Status Update Notifications allow teams to standardize communication expectations and reduce the toil of sharing updates.

This feature includes a rich text editor so teams can format the text to company communication standards, including adding company logos. With our drag-and-drop variables, responders can easily include incident details and populate key information as needed.

Status update notifications setting up template screenshot: configuring template with variables

How do the templates help my team?

During an incident, a responder has to remember so much about the system and services that they’re responsible for. It takes concentration and critical thinking. Status update notifications are a quick way to communicate with stakeholders, reducing the time and energy responders spend on sharing updates. But sometimes you need to send the same notifications frequently as similar incidents occur. Or you want your P3 and P0 communications to have different information and you don’t want to build the notification from scratch each time.

You could write all this in a playbook and store it in a wiki. But those wikis are hard to find and rarely updated. It’s not ready when and where responders need it. That’s why we built templates. Now, teams can customize and standardize reusable communication templates based on impact, business areas, and more. This functionality will also be available via API, so teams are able to customize and leverage status update notification templates to fit their needs in any context.
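
To show the general idea of reusable templates with variable insertion, here is a minimal Python sketch using the standard library's `string.Template`. It does not use PagerDuty's template syntax or API: the variable names (`priority`, `title`, `impact`, `next_update_minutes`) and the per-priority templates are hypothetical.

```python
from string import Template

# Hypothetical templates keyed by incident priority.
TEMPLATES = {
    "P0": Template(
        "[$priority] MAJOR INCIDENT: $title\n"
        "Impact: $impact\n"
        "Next update in $next_update_minutes minutes."
    ),
    "P3": Template(
        "[$priority] $title\n"
        "Impact: $impact"
    ),
}


def render_status_update(incident: dict) -> str:
    """Pick the template for the incident's priority and fill in its variables."""
    template = TEMPLATES[incident["priority"]]
    # substitute() raises KeyError if a variable the template needs is missing,
    # which surfaces incomplete updates before they are sent.
    return template.substitute(incident)


p0_update = render_status_update({
    "priority": "P0",
    "title": "Checkout is failing",
    "impact": "Customers cannot complete purchases",
    "next_update_minutes": 30,
})
p3_update = render_status_update({
    "priority": "P3",
    "title": "Slow search",
    "impact": "Search results are delayed",
})
print(p0_update)
print(p3_update)
```

Keeping one template per priority means a P0 update always carries the extra detail stakeholders expect, while a P3 update stays short, without anyone writing either from scratch.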

Status update notifications setting up template screenshot: creating new template details

With templates, your incident communications are as easy as:

  1. Ensure stakeholders are subscribed to the incident.
  2. Click “status update” and choose your template.
  3. Edit (if necessary) and preview your template.
  4. Send your status update notification.

How can I get started today?

Status Update Notification Templates are now generally available for Business and Digital Operations customers. With Status Update Notification Templates, your teams can communicate better and with fewer ‘swoop and poops.’ Your communications will match company branding and standards, and the reusable template format means any communication is ready for you at a moment’s notice.

If you want to learn more, read our knowledge base article here or watch our demo:

 

If you’re ready to see Status Update Notification Templates in action, try PagerDuty for free for 14 days.

PagerDuty Service Standards helps organizations better configure services at scale by Hannah Culver https://www.pagerduty.com/blog/introducing-service-standards/ Tue, 23 Aug 2022 13:00:19 +0000 https://www.pagerduty.com/?p=77871 Service ownership, a DevOps best practice, is a method that many companies are pivoting towards. The benefits of service ownership are varied and include boons...

The post PagerDuty Service Standards helps organizations better configure services at scale appeared first on PagerDuty.

Service ownership, a DevOps best practice, is a method that many companies are pivoting towards. The benefits of service ownership are varied and include boons such as bringing development teams much closer to their customers, the business, and the value being delivered. The “build it, own it” model has tangible effects on customer experience, as developers are incentivized to innovate and drive customer-facing features that delight.

But the pivot to service ownership is difficult, especially for large companies with hundreds or even thousands of services. Everything from defining a service and its boundaries to deciding who owns it can be a behemoth undertaking. And ensuring that services are configured in a way that allows the organization to scale quickly is next to impossible across the entire technology ecosystem. However, gaining this level of visibility is crucial for better business outcomes.

Screenshot of service directory

 

To address this problem, PagerDuty is excited to announce the general availability of Service Standards for all plans. PagerDuty’s emphasis on service ownership through our service-based architecture has traditionally allowed individual teams to determine how to configure services. Now, with Service Standards, teams across an organization have both the visibility to understand what best practice looks like and the flexibility to standardize that knowledge across teams new to service ownership in a way that’s beneficial for both the team and the organization.

Service Standards help all teams ensure that their service configurations are adhering to service ownership best practices. This means that services are informative, integrated with the right tools, and supported by the right people. Service Standards provide both the visibility and the means to institute standards across teams to not only embrace service ownership, but also to scale it across the organization.

Introducing Service Standards

When configuring services, teams throughout an organization will have different methods. Some services may have the information that all teams need to act quickly during an incident and others may not. This lack of uniformity can cause problems across the ecosystem, with information that’s lost or locked up as purely tribal knowledge. And, it’s next to impossible for managers and administrators to know whether the services they are responsible for are in good shape or not.

Service Standards can help individual engineers understand how to configure better services, while providing a guide for managers and administrators to scale these standards across an organization.

Set up better services with guidelines for success

With the shift to cloud, the number of services for any organization has grown exponentially, and a central governing and creation team isn’t often able to handle the load. To make things more complicated, service owners configure services in markedly different ways. From naming conventions, to descriptions, to whether they have the right people on call, services vary in the depth of information they provide.

Too often this results in a lot of rework. Imagine this scenario: a team spins up new services only to be blocked before they can enter production. They’re told to make a variety of changes and fixes before they can ship. And, since these requirements are often not codified or widely known, this is a mistake that the team might make multiple times, adding pain and toil to the service creation process.

We hear this from customers all the time. In fact, one of the top questions we get asked is “what does ‘good’ look like?” The truth is, it often depends, but it’s always the case that ‘good’ is unique to each team’s particular way of working. 

With Service Standards, teams can standardize on what good looks like according to company policy. PagerDuty has provided nine standards that each service should fulfill to have the depth and context required to be considered well-configured. Each standard can be toggled on or off.
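
The pass-or-fail model with toggles can be sketched in a few lines of Python. This is purely illustrative: the three checks below are hypothetical stand-ins inspired by the examples mentioned earlier (names, descriptions, on-call coverage), not PagerDuty's nine standards, and the service fields are assumptions.

```python
# Hypothetical standards: each maps a name to a check that returns True on pass.
STANDARDS = {
    "has_description": lambda s: bool(s.get("description", "").strip()),
    "has_escalation_policy": lambda s: bool(s.get("escalation_policy")),
    "name_is_short": lambda s: 0 < len(s.get("name", "")) <= 50,
}


def evaluate(service: dict, enabled: dict) -> dict:
    """Run every enabled standard against a service and report pass/fail.

    Standards missing from `enabled` are treated as switched on.
    """
    return {
        name: check(service)
        for name, check in STANDARDS.items()
        if enabled.get(name, True)
    }


service = {"name": "checkout-api", "description": "", "escalation_policy": "ep-1"}
results = evaluate(service, enabled={"name_is_short": False})
print(results)
```

Here the naming standard is toggled off, so only the remaining two standards are reported for the service.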

Screenshot of Service Standards pass or fail

 

Audit services for accountability

Service Standards also give managers and administrators the level of control they need to ensure that configuration requirements are met at scale. Administrators can determine visibility and decide whether to make these standards publicly available for the rest of the organization to view. They can also toggle on or off all nine standards depending on what the company needs. On a more granular level, administrators can apply these standards to only a subset of services for more flexibility. And, the service performance data can be exported out of PagerDuty and shared as needed to drive accountability and show progress.

Screenshot of Service Standards settings

 

Ready to try yourself?

Service Standards are here to help all organizations scale service ownership best practices. This feature gives engineers an understanding of what is ready for production and reduces the toil required to ship new services. For administrators and managers, Service Standards help drive accountability throughout the technology ecosystem and provide a way to assess progress. Over time, this improves incident response for first responders looking for quick context, and helps drive operational maturity at the organization level.

If you want to learn more, check out our recent webinar, “How to Standardize Service Ownership at Scale for Improved Incident Response” or read our knowledge base article here.

If you’re ready to see Service Standards in action, try PagerDuty for free for 14 days.

What’s New: Updates to Incident Response, PagerDuty® Process Automation, Integrations, and More! by Vera Chan https://www.pagerduty.com/blog/whats-new-product-update-2022-07-28/ Thu, 28 Jul 2022 13:00:25 +0000 https://www.pagerduty.com/?p=77455 We’re excited to announce a new set of updates and enhancements to the PagerDuty Operations Cloud. Following another successful PagerDuty Summit, development continues across several...

The post What’s New: Updates to Incident Response, PagerDuty® Process Automation, Integrations, and More! appeared first on PagerDuty.

We’re excited to announce a new set of updates and enhancements to the PagerDuty Operations Cloud. Following another successful PagerDuty Summit, development continues across several areas of the product. Recent updates from the product team include Incident Response, PagerDuty® Process Automation and PagerDuty® Runbook Automation, Partner Integrations & Ecosystem, as well as Community & Advocacy Events updates. These updates help automate incident response and reduce the number of issues escalated to other teams.

Incident Response

Service Standards

Now available to all customers, Service Standards can help teams improve operational maturity and provide a better customer experience by defining criteria that standardize what ‘good’ looks like across teams. Configure services more consistently according to best practices for improved predictability during incident response. Scale service ownership across the entire organization. Enhance your analytics by improving the accuracy of your service configurations. Admins and account owners can view Service Standards by default, but permissions can be adjusted to allow all users to also view them.

  • Enhanced search functionality
  • Toggling between all schedules and team schedules
  • Collapsing or expanding information based on the level of detail needed
  • Easier schedule comparison with time windows displayed
  • Streamlined view of on-call responders with shift times

Learn more in the knowledge base or watch the webinar on demand

View it above or watch it in action later

PagerDuty® Process Automation Software and PagerDuty® Runbook Automation

What's New in PagerDuty : PagerDuty + AWS

New AWS Plugins for Automated Diagnostics

Do your apps and services reside in AWS? If so, we now have AWS plugins for automated diagnostics that can help you and your teams triage incidents on AWS faster, and with more efficiency. These new plugins join our existing library.

Updates include:

  • CloudWatch Logs plugin retrieves diagnostic data from AWS infrastructure and applications. Now users can more easily run automated diagnostics for AWS across multiple accounts and products.

What's New in PagerDuty and PagerDuty Product Updates : CloudWatch Logs Plugin

  • Systems Manager plugin allows for faster execution and accuracy for tasks such as configuration management, patching, and deploying monitoring and security tooling agents. Now operators can view and manage their global EC2 footprint in a single interface with security best-practices.
  • ECS Remote Command plugin provides a mechanism to execute commands on containers. This enables developers and operators to retrieve diagnostic data from their running applications in real-time before redeploying their services.

Read the blog or contact us to learn more!

PagerDuty® Process Automation Software and PagerDuty® Runbook Automation Release 4.4.0

Commercial product users can now enjoy new AWS Job Step Plugins for Lambda and ECS (Fargate). 

Learn more

With the Lambda Custom Code Workflow step, you can create, execute, and optionally delete a new Lambda function with the custom code provided in a Job step as its input. Most advantageous to Runbook Automation users is the ability to execute custom scripts as steps in jobs without having to install any software!

We’ve also made additional enhancements and fixes to our popular Ansible plugin:

  • Ansible: Inline inventory Fix
  • Ansible: Update Gradle to 7.2
  • Ansible: Normalize line separators to LF (unix)
  • Ansible: Add a field to set the path to the Ansible binaries directory

Learn more 

You can also learn more about various other commercial updates and core product updates in the release notes.

Integrations & Partner Ecosystem

What's New in PagerDuty and PagerDuty Product Updates : PagerDuty + Zendesk

PagerDuty App for Zendesk – Automation Actions

Customer service agents can now run Automation Actions directly within the PagerDuty App for Zendesk. This automation improves efficiency, lightens agents’ burgeoning workload, and reduces the chance of mistakes when agents have to run manual tasks while responding to high-pressure customer cases, improving their working lives without adding more toil. It empowers customer service agents to automatically validate problems and instantly capture the critical information that response teams need to diagnose and resolve them. The added context from running an Automation Action is available to response teams in an instant, which reduces resolution times and eases the load on backend teams. It also helps reduce the number of issues that are escalated to engineering teams, enabling them to work more efficiently with fewer interruptions from issues that, for instance, have lower urgency or are less customer-impacting.

What's New in PagerDuty : PagerDuty App for Zendesk

What's New in PagerDuty and PagerDuty Product Updates : PagerDuty + Slack

PagerDuty App for Slack Dedicated Incident Channel Improvements

The Next Generation Slack V2 dedicated incident channel improvements are in Early Access and ready for customer use! These improvements allow collaborative teams across an organization to do the following from within an incident’s dedicated channel:

  • View all incident details, history, and updates
  • Perform all incident actions
  • Add all responders to a channel (requires each responder to have previously linked their Slack account to PagerDuty)
  • Post or create a dedicated Zoom conference bridge

Read the blog

Webhooks V3 Update

Are you a Restricted Access User, or do you know one, who wants to manage Webhooks V3 integrations for one or more teams? If so, you can now be assigned the Manager Team Role! This empowers you and members of operations teams by removing the dependency on Global Admins and Account Owners for day-to-day operations.

Product Deprecations

Please take note and keep your teams informed of our upcoming product deprecations:

V1/V2 Webhooks

If you are currently using V1/V2 webhook extensions in your PagerDuty environment, you need to migrate them to V3 webhook subscriptions to maintain functionality.

Please follow our migration guide

Important Dates:

  • V1 Webhooks – V1 webhook extensions have been unsupported (no new features or bug fixes) since November 13, 2021 and will stop working in October 2022.
  • V2 Webhooks – V2 webhook extensions will become unsupported in October 2022 and will stop working in March 2023.

Required Permissions:

  • Admins or Account Owners can migrate an entire account.
  • Team Managers can only migrate webhooks for their assigned Teams.

What are Webhooks? Webhooks allow you to receive HTTP callbacks when significant events happen in your PagerDuty account, for example, when an incident triggers, escalates, or resolves. Details about the event are sent to your specified URL, such as Slack or your own custom PagerDuty webhook processor.
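
As a rough illustration of what receiving such a callback involves, the sketch below parses a webhook request body and picks out the event type and incident ID. The payload shape (a top-level `event` object with `event_type` and `data` fields), the function name, and the sample values are assumptions made for illustration; consult the webhook documentation for the actual schema.

```python
import json


def summarize_webhook(body: bytes) -> str:
    """Return a one-line summary of a webhook request body.

    Assumes (hypothetically) a JSON payload shaped like
    {"event": {"event_type": "...", "data": {"id": "..."}}}.
    """
    payload = json.loads(body)
    event = payload.get("event", {})
    event_type = event.get("event_type", "unknown")
    incident_id = event.get("data", {}).get("id", "unknown")
    return f"{event_type} for {incident_id}"


# Made-up sample body standing in for what an endpoint might receive.
sample = json.dumps(
    {"event": {"event_type": "incident.triggered", "data": {"id": "P123ABC"}}}
).encode()
print(summarize_webhook(sample))
```

In a real endpoint, a function like this would sit behind whatever HTTP framework receives the POST request at your specified URL.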

If you have any questions, please reach out to your PagerDuty contact or our support team at support@pagerduty.com.

Learn more about webhooks

Webinars & Events

Join us for the following webinars and events to learn more about PagerDuty’s recent product updates and how they benefit customers. These are just a few of many:

Events

PagerDuty Summit 2022 (On Demand)

Missed us at PagerDuty Summit this year? Summit talks are now available on demand so you can catch our newest demos, technical sessions, and keynotes from executives and industry leaders anytime!

Watch Summit 2022 on demand

Common Diagnostics for Common Components
Tuesday, August 16th, 2022 10am PDT / 1pm EDT / 18:00 BST

Join Justyn Roberts from PagerDuty in an interactive event where you can learn how to automate manual tasks for recurring issues that bog down productivity and increase toil. You’ll learn how you can embrace automation to operate faster, reduce the number of escalations to specialists (saving time and money and reducing burnout), and optimize resolution times.

Learn More

Quarterly Terraform Roundtable
Tuesday, August 9th, 2022 10am PDT / 1pm EDT / 18:00 BST

Join a discussion moderated by Scott McAllister from PagerDuty and hear from industry peers as well as share your experiences regarding best practices, learnings, and what to consider when deciding to migrate to Infrastructure as Code with Terraform and PagerDuty.

Learn more and reserve your spot

Live Intro to PagerDuty Process Automation
Monday, Aug 1, 2022 — 6:00 PM PDT
Tuesday, Aug 2, 2022 — 5:00 AM PDT
Tuesday, Aug 2, 2022 — 10:00 AM PDT

Join Craig Hobbs and Martin Van Son from PagerDuty as they walk you through common use cases and challenges that Process Automation solves, show you a product demo, and conduct a live Q&A!

Learn More

Spring Launch Webinar: New Product Releases for Automating IT Processes in the Cloud and at Higher Scale

Join Peko Karanayev, Greg Chase, and Madeline Zemer from PagerDuty as they discuss how to employ automation to optimize security and compliance.

Learn More

How to Standardize Service Automation at Scale for Improved Incident Response

Join Hannah Culver, Davis Godbout, and Karen Myers from PagerDuty as they discuss how teams can improve incident response by leveraging PagerDuty to configure their services at scale to meet organization requirements and to improve visibility into the state of services across the technology ecosystem.

Learn More

Supercharge your AWS Cloud Platform with Self-Service Cloud Ops

Join Mandi Walls from PagerDuty and Mark Kriaf from AWS as they share how Cloud Ops teams can further eliminate toil and escalations via PagerDuty® Process Automation On Prem and PagerDuty® Runbook Automation–embracing the ability to standardize and automate operational procedures and then safely delegate them as self-service requests to other stakeholders.

Learn More

Register for upcoming events in August here!

PagerDuty Community Twitch Stream

Join us on our Twitch channels, PagerDuty Twitch Stream and PagerDuty Community Twitch Stream, to catch up on one of our latest streams led by our Developer Advocates! Catch our past streams via the YouTube Twitch Streams Channel.

PagerDuty Community Twitch Stream

If your team could benefit from any of these enhancements, be sure to contact your account manager and sign up for a 14-day free trial.

The post What’s New: Updates to Incident Response, PagerDuty® Process Automation, Integrations, and More! appeared first on PagerDuty.

]]>
Equitably distribute on-call responsibility and streamline incident response with Round Robin Scheduling by Hannah Culver https://www.pagerduty.com/blog/introducing-round-robin-scheduling/ Tue, 11 Jan 2022 14:00:01 +0000 https://www.pagerduty.com/?p=73381 PagerDuty is excited to introduce Round Robin Scheduling. Round Robin Scheduling allows teams to equitably distribute on-call shift responsibilities amongst team members. Automatically assigning new...

The post Equitably distribute on-call responsibility and streamline incident response with Round Robin Scheduling appeared first on PagerDuty.

]]>
PagerDuty is excited to introduce Round Robin Scheduling. Round Robin Scheduling allows teams to equitably distribute on-call shift responsibilities amongst team members. Automatically assigning new incidents across different users or on-call schedules on an escalation level ensures that teams are resolving incidents as efficiently as possible. And, by balancing the workload across multiple users, there’s less risk of burnout.

Seamlessly resolve multiple incidents occurring on the same service

When a service experiences an incident, a single responder receives the alert and begins to triage. If only one incident occurs, this is manageable. But, for services that have a higher volume of alerts, this can cause confusion during incident response as the responder is pulled in multiple directions to attend to multiple incidents. Imagine a service receives 5 distinct yet simultaneous alerts that must be addressed within 30 minutes. A single on-call engineer can’t handle them all, and that’s where Round Robin Scheduling can help.

With Round Robin Scheduling, users can easily set up a rotation by creating a new escalation policy or editing an existing escalation policy and checking the box that says, “Users are assigned via round robin on the escalation level.”

In a case like the above example, each person on the round robin would be assigned one of the 5 alerts to triage. This streamlines incident response and results in less downtime and better customer experience.
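
The rotation described above can be sketched in a few lines. This is only an illustration of round robin assignment in general, not PagerDuty's implementation; the function name, responder names, and alert labels are made up.

```python
from itertools import cycle


def assign_round_robin(incidents, responders):
    """Assign each incident to the next responder in the rotation."""
    if not responders:
        raise ValueError("at least one responder is required")
    rotation = cycle(responders)  # wraps around after the last responder
    return {incident: next(rotation) for incident in incidents}


# Five simultaneous alerts and a five-person escalation level (hypothetical).
incidents = [f"alert-{n}" for n in range(1, 6)]
responders = ["ana", "ben", "chi", "dev", "eli"]
assignments = assign_round_robin(incidents, responders)

for incident, responder in assignments.items():
    print(f"{incident} -> {responder}")
```

With five responders and five alerts, each person receives exactly one alert. With fewer responders than alerts, the rotation wraps around and starts again from the first person.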

Without Round Robin Scheduling | With Round Robin Scheduling
All incidents assigned to one person and the rest of the team is idle as they’re not on the schedule | Incidents are assigned fairly amongst a team of people who each share the load
MTTA and MTTR increase as a single responder attempts to handle multiple alerts | MTTA and MTTR decrease as each responder is able to give the alert their full attention
When the responder is overwhelmed, they are forced to escalate | Escalations are less frequent as there are alternative responders who can jump on incoming issues

Additionally, identifying who is next on the rotation is simple. When users view their escalation policy, a green arrow indicates who is next up in the Round Robin rotation, so there are no surprises about who will be alerted when an issue arises.

Distribute work and escalate as needed to reduce burnout

With on-call teams that receive a high volume of requests, burnout is always top of mind. One teammate may be expected to handle multiple issues at the same time while the rest of the team waits idle. Any particular on-call shift like this can result in alert fatigue, slow responses, and decreased cognitive capacity. Even if the on-call shift occurs only once per month, it can be enough pressure to increase attrition.

Round Robin Scheduling ensures that each new incident is assigned to the next person in line, including managers as well as users, so teams are able to better balance responsibilities. This helps keep rotations fair and predictable for everyone on the on-call schedule, including upper levels of escalation such as directors who may need to step in during a high-priority incident.

Use Round Robin Scheduling today

If your team is looking for a way to manage on-call volume, streamline incident response, and equitably distribute the workload, Round Robin Scheduling is now available for all Business and Digital Operations plans. If you’re a current customer and you’d like to upgrade to unlock access to this feature, reach out to your PagerDuty account team. If you’re not a customer yet, you can try this feature for free for 14 days.

To learn more about Round Robin Scheduling, you can read our support documentation here or watch a short YouTube video here.

]]>
Visualize and manage all of your services in one place with Dynamic Service Graph by Hannah Culver https://www.pagerduty.com/blog/introducing-dynamic-service-graph/ Mon, 08 Nov 2021 14:00:22 +0000 https://www.pagerduty.com/?p=72369 In this digital era, technology systems are becoming increasingly complex. No longer can a single SME (subject matter expert) understand every facet of the system...

The post Visualize and manage all of your services in one place with Dynamic Service Graph appeared first on PagerDuty.

]]>
In this digital era, technology systems are becoming increasingly complex. No longer can a single SME (subject matter expert) understand every facet of the system they run. Instead, much of this knowledge is siloed and exists as tribal knowledge within certain teams. Additionally, the rate of change is faster than ever, with code deploying and new services shipping at a rate unimaginable a few years ago.

The majority of our customers tell us that they’re dealing with this enormous rise in complexity by adopting a service ownership approach to create a holistic view of a system and democratize knowledge between all technical and business teams. Adopting this model takes an organization-wide effort, and, like all cultural changes, will never truly be “done.” Thankfully, PagerDuty can help teams on their service ownership journey, whatever that stage might be.

We are excited to announce the general availability of PagerDuty’s Dynamic Service Graph. This feature enables organizations to have a holistic view of all their technical and business services and dependencies, bringing the entire service topology into a single view to be used by all teams in real-time.

Introducing Dynamic Service Graph

Dynamic Service Graph democratizes knowledge across the entire organization about how services work together to deliver customer-facing capabilities, and visually represents the health of those services. This helps teams better understand the health of the overall system, not just the components they’re individually responsible for. It also improves incident response as teams can better determine where an incident is occurring as well as the potential upstream and downstream effects of the issue.

Understand how services connect to deliver business capabilities

Technology ecosystems have hundreds or thousands of moving pieces. Other line-of-business stakeholders might find it difficult to understand how these interconnected pieces make up the business services they’re responsible for. Yet, it’s critical that all teams are able to communicate effectively with each other to resolve problems and bridge any gaps in understanding.

With Dynamic Service Graph, it’s easier than ever to visualize how each service is connected to another, and how they build up into larger business services that have direct impact on critical business capabilities. This interconnectedness removes silos and encourages knowledge sharing.

Understand affected services during an incident

Technical teams often receive alerts that their service is experiencing an issue. Yet, sometimes their service isn’t the true culprit. If the service is dependent on another service which is experiencing an issue, there might be little they can do to fix the problem themselves. But, if teams are in the dark as to what services are affected, they could be wasting precious time investigating the source of the problem.

On the flip side, when an incident occurs, a team facing an issue with their service might lack the information to triage accurately. If the team doesn’t understand the downstream effects of the issue with their service, it could cause outsized negative impact.

Dynamic Service Graph allows teams to examine services and identify what’s affected by failure and why. From both the incident details page and the service graph, teams can quickly see a service’s impact on the rest of the system.
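
To make the idea of "what's affected by failure" concrete, here is a minimal sketch of walking a dependency map to find every service that relies, directly or indirectly, on a failing one. The service names and the `DEPENDS_ON` structure are hypothetical and say nothing about how Dynamic Service Graph stores or computes this internally.

```python
from collections import deque

# Hypothetical dependency map: each service lists the services it depends on.
DEPENDS_ON = {
    "checkout": ["payments", "cart"],
    "payments": ["database"],
    "cart": ["database"],
    "search": ["index"],
    "database": [],
    "index": [],
}


def impacted_services(failed, depends_on):
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the map so we can walk from a dependency to its dependents.
    dependents = {name: [] for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    # Breadth-first walk outward from the failed service.
    seen, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in seen:
                seen.add(service)
                queue.append(service)
    return seen


print(sorted(impacted_services("database", DEPENDS_ON)))
```

In this made-up topology, a database failure reaches `payments` and `cart` directly and `checkout` through them, while `search` is unaffected.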

Identify services missing dependencies to understand gaps

Whether an organization is well into its service ownership journey or just getting started, it’s likely that some information on the system structure is incomplete. This is due to increased silos as teams grow, and tribal knowledge that’s passed on through teams but rarely documented or shared. And, it’s exacerbated by the rapid pace of change within the systems, with code pushing multiple times per day and new services changing underlying relationships.

Dynamic Service Graph helps encourage teams to maintain up-to-date dependencies because that control is in the hands of the people operating the services, making the on-call experience easier for everyone carrying the metaphorical pager. When mapping services, PagerDuty will offer suggestions as to which services currently lack dependencies. From there, you can drag and drop relevant services into your graph, creating a more comprehensive view of the system with each click.

Ready to try for yourself?

With PagerDuty’s Dynamic Service Graph, your technical teams will have the ability to view and dynamically map clearly defined services and streamline incident response to reduce downtime and customer impact. This will help teams make the most of their service ownership model and spend less time firefighting and more time innovating for their customers.

“Service Graph has driven us to shift focus from a technical service dependency strategy to a business service based strategy to more effectively identify impact to our end users. Using the new service graph functionality, our support teams are able to visually assess impact at a glance, which enables them to more quickly and accurately determine root cause. The feature has given our service owners the visibility into their dependencies which were previously only in documentation.” —Senior Manager, Gaming Industry

If you want to learn more, check out our recent webinar, Services Like a Boss: Best Practices for Implementing and Maintaining Services Architecture, read our knowledge base article here, or view our documentation here.

If you’re ready to see Dynamic Service Graph in action, try PagerDuty for free for 14 days.

]]>
What’s New: Updates to Event Intelligence, Integrations, and More! by Vera Chan https://www.pagerduty.com/blog/whats-new-product-update-2021-07/ Thu, 22 Jul 2021 13:00:10 +0000 https://www.pagerduty.com/?p=70280 If you thought that the product announcements from PagerDuty’s largest event of the year, PagerDuty Summit 2021, was all we had in store for you,...

The post What’s New: Updates to Event Intelligence, Integrations, and More! appeared first on PagerDuty.

]]>
If you thought that the product announcements from PagerDuty’s largest event of the year, PagerDuty Summit 2021, were all we had in store for you, think again! We’re excited to announce that the July Release comes with a new set of updates and enhancements to the PagerDuty platform!

This month’s release includes updates that will help our customers:

  • Improve response time through automation
  • Streamline communication and collaboration during incident response via enhanced ChatOps integrations
  • Reduce cognitive load and gain richer context through machine learning and AIOps

You can learn about our latest capabilities via the Q1 PagerDuty Pulse or read below for the highlights.

Runbook Automation

PagerDuty Runbook Automation helps reduce toil and escalations and improve response times through automating repetitive or manual tasks.

The Rundeck 3.4.0 and Rundeck 3.4.1 updates include a redesigned user interface and new Rundeck Enterprise capabilities. These capabilities are all designed to improve user experience, streamline configuration management, and provide tighter, more granular security controls.

To learn more about the release, watch the “What’s new in Rundeck 3.4” session from PagerDuty Summit 2021. You can also read about all the features in the release blog or the 3.4.0 and 3.4.1 release notes.

Event Intelligence & AIOps

PagerDuty’s Event Intelligence enables better noise and complexity management. The following new capabilities help save time by leveraging AIOps and machine learning to deliver centralized context. This allows teams to act rapidly when issues arise by accelerating root cause and contributing factor identification.

Outlier Incident: You can now use historical incident and alert data to identify incidents that are anomalies, rare occurrences, or frequent occurrences. See these patterns more easily than before without having to manually compile information in one-off dashboards or sort through operations spreadsheets when every second matters.

View the demo or learn more.

Change Correlation: A majority of incidents are caused by changes to code or infrastructure, but it can be difficult to identify which changes could have potentially caused an incident. Change Correlation surfaces changes that most likely caused an issue based on time of occurrence, related service, or machine learning analysis of similar incidents.
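
Two of the signals named above, time of occurrence and related service, are simple enough to sketch. The example below is an assumption-laden illustration, not PagerDuty's algorithm: the field names, the one-hour window, and the sample data are invented, and the machine learning signal is left out entirely.

```python
from datetime import datetime, timedelta


def candidate_changes(incident, changes, window=timedelta(hours=1)):
    """Return changes on the incident's service made shortly before it began,
    most recent first."""
    start = incident["created_at"]
    matches = [
        change
        for change in changes
        if change["service"] == incident["service"]
        and timedelta(0) <= start - change["timestamp"] <= window
    ]
    return sorted(matches, key=lambda change: change["timestamp"], reverse=True)


# Hypothetical incident and change events.
incident = {"service": "checkout", "created_at": datetime(2021, 7, 22, 13, 0)}
changes = [
    {"summary": "Deploy checkout v42", "service": "checkout",
     "timestamp": datetime(2021, 7, 22, 12, 50)},
    {"summary": "Rotate search index", "service": "search",
     "timestamp": datetime(2021, 7, 22, 12, 55)},
    {"summary": "Update checkout config", "service": "checkout",
     "timestamp": datetime(2021, 7, 21, 9, 0)},
]

for change in candidate_changes(incident, changes):
    print(change["summary"])
```

Only the deploy made ten minutes before the incident survives both filters; the other two changes are either on a different service or too old.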

View the demo or learn more.

Custom Change Event Transformer: The custom change event transformer capability enables you to send change events from any system to PagerDuty. Convert an event in any format into a PagerDuty change event via JavaScript and view recent changes both in PagerDuty’s Recent Changes feature and as additional context in the incident’s details.

View the demo or learn more from the knowledge base!

New Service Create Flow: Enjoy the simple and intuitive guided flow to set up a well-configured service. New users can feel confident that service setup is completed according to their needs and that their service is ready to go.

View the demo or learn more from the knowledge base!

New Service Profile Updates: The new service profile will fully replace the service details page on August 31, 2021. Users with access to the previous service profile page can now view the new one. Updates include the ability to view recent change events, add and view dependencies, and add runbooks or a communication channel to streamline incident response.

Learn more here.

Partner Ecosystem & Integrations

This release includes updated ChatOps integrations with Slack and Microsoft Teams. These capabilities help teams collaborate and communicate more efficiently, perform incident actions at the click-of-a-button, and enhance context to triage and remediate incidents faster. Responders can stay in their preferred apps without context-switching while the entire organization benefits from enhanced visibility into current business-impacting issues.

Stakeholder Updates in ChatOps: New to the PagerDuty Slack integration and PagerDuty Microsoft Teams integration is the ability to communicate and provide status updates to business stakeholders across the organization directly within Slack and Microsoft Teams. This involves less context-switching and results in fewer interruptions.

Pictured above: PagerDuty Slack Integration (Stakeholder Updates)
Pictured above: PagerDuty Microsoft Teams (Stakeholder Updates)

Incident Context and Actions in Microsoft Teams: With incident context in Microsoft Teams, responders are equipped with clear, actionable data to help them quickly triage and coordinate effective incident response.

After an incident notification has been posted in a channel, channel members can perform the following incident actions: Acknowledge, Resolve, Add Note, Add Status Update, and Create a Meeting!

Webhooks V3

Improved webhooks delivery now provides standardized, flexible, and reliable webhook payloads, even for larger amounts of data. A Webhooks V3 subscription will send a webhook to an endpoint every time an event occurs during an incident within a desired scope, such as a service or team. This update supports secure webhook signing for better authorization as well as additional event types including: priority_updated, responder.added, and responder.accept/decline events.
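
Webhook signing generally means the receiver recomputes a keyed hash of the request body and compares it with the signature sent alongside the request. The sketch below shows that check using HMAC-SHA256. The signature format (`v1=` followed by a hex digest, with several signatures separated by commas) and the function name are assumptions for illustration; confirm the header name and format in the Webhooks V3 documentation before relying on them.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, header: str) -> bool:
    """Check a raw request body against a signature header value.

    Assumes (hypothetically) a header value such as "v1=<hex digest>",
    possibly containing several comma-separated signatures.
    """
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = f"v1={digest}".encode()
    candidates = [part.strip().encode() for part in header.split(",")]
    # compare_digest avoids leaking information through timing differences.
    return any(hmac.compare_digest(expected, candidate) for candidate in candidates)


# Made-up secret and body; the header is computed the way a sender would.
secret = "example-signing-secret"
body = b'{"event": {"event_type": "incident.priority_updated"}}'
header = "v1=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

print(signature_is_valid(secret, body, header))
print(signature_is_valid(secret, body + b"tampered", header))
```

The check must run against the raw bytes of the request body, since re-serializing parsed JSON can change whitespace or key order and alter the digest.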

Webhooks V1 and V2 are being deprecated on November 13, 2021 and March 31, 2022 respectively.

View the demo or learn more in the knowledge base!

PagerDuty Where You Are

The following mobile app and PagerDuty Analytics updates embody the benefits of working anywhere while collaborating and communicating with team members or stakeholders across the organization.

Past Incidents on Mobile: View past resolved incidents in the PagerDuty mobile app that have similar metadata and were generated on the same service as the current active incident. This additional context facilitates accurate triage and also reduces resolution time. Responders can see whether they, or someone on their team, have been involved in a similar previous incident and dive into details to discover what remediation steps were taken.

Slack Insights Previews

PagerDuty Analytics offers pre-built metrics and prescriptive dashboards. Users can paste a curated Insights URL from PagerDuty Analytics into Slack and see the link preview directly within Slack (in both #channels and direct messages). Allowing the link to unfurl a visual directly in Slack makes report sharing easier and empowers more data-driven decision making.

View the demo and learn more here.

We’ve consolidated all of the latest product demos into a playlist that you can watch on demand in the Q1 PagerDuty Pulse.

If your team could benefit from any of these enhancements, be sure to contact your account manager and sign up for a 14-day free trial.

]]>
PagerDuty: The Year in Review by Rachel Obstler https://www.pagerduty.com/blog/products-launched-year-in-review/ Thu, 18 Oct 2018 13:00:52 +0000 https://www.pagerduty.com/?p=50247 We just held our annual conference, PagerDuty Summit 2018, where we shared new product announcements and demoed new capabilities. But while we always have big...

The post PagerDuty: The Year in Review appeared first on PagerDuty.

]]>
We just held our annual conference, PagerDuty Summit 2018, where we shared new product announcements and demoed new capabilities. But while we always have big things that our engineering teams have produced to announce at Summit, there’s also a lot of work that happens throughout the year across our platform—and we just didn’t have time to demo it all.

But we did string together many of the features and capabilities we launched in the past year in a short-form verse, which we shared at Summit. For those of you who couldn’t make it to the conference, here it is (and we hope to see you next year)!

Our Event Intelligence product
Helps responders to focus.
It’s applied machine learning,
Not some hocus pocus.

It makes use of your actions
From your past response.
So if you’re an existing customer
You’ll see benefit at once.

And if you’re trying to route
Alerts to the right teams,
Our new alert rules engine
Will automate those streams.

And these rules can run response plays
That automate response actions.
They add responders and send updates
With no human interaction.

Still sometimes you get awoken
By alerts that are a-flapping,
Our new threshold alerting
Means we won’t disrupt your napping.

Integrations: our count
Is now over 15 score.
But there are always new tools
So we’ll keep adding more.

Like with Azure’s new metrics,
We’re adding alert types to the group.
And with VSTS, (I mean Azure DevOps)
We close the DevOps loop.

And with other oft-used tools,
We’re going deeper, too.
With updates for Splunk
And a Jira Version 2

And ServiceNow’s on V5
With priority syncing for starters.
We sync so well together
We’re now a Gold partner.

ServiceNow security and change items
Are also now supported,
And with our new one-click setup
Teams and schedules are ported.

The breadth and depth of our ecosystem
Insures your investment tenfold
Like with AWS Cloudwatch and Marketplace
Where we are now sold.

With our mobile app, now
You can make priority switches
To declare an incident major,
Add responders, and join bridges.

So if something goes wrong
When you aren’t at home,
You can incident command
All from your phone.

And if you are in your chat tool
Working fast while the world’s ablaze,
Our new APIs let you send
Stakeholder updates and run plays.

And as your growth company
Adds employees apace,
Our team hierarchy feature
Will keep them organized in place.

Now before we close,
We would like to thank
All who gave product feedback
Even when it was quite frank.

If you have more, please do share
We at PagerDuty are all ears
Thank you for being our customers,
Now on to innovating for the next year!

]]>
Automate Cross-Functional Team Responses For Any Situation by Paul Rechsteiner https://www.pagerduty.com/blog/response-plays/ Tue, 12 Dec 2017 12:00:42 +0000 https://www.pagerduty.com/?p=39815 Continuing our ongoing effort to make incident response best practices easy to adopt, PagerDuty is pleased to announce that response plays are now available! Response...

The post Automate Cross-Functional Team Responses For Any Situation appeared first on PagerDuty.

]]>
Continuing our ongoing effort to make incident response best practices easy to adopt, PagerDuty is pleased to announce that response plays are now available! Response plays let you automate precise cross-functional team responses for any situation, so that organizations can plan their major incident responses during peacetime and mobilize instantly when it’s wartime. While improving mobilization time and driving down total time-to-resolution, response plays also eliminate the perpetual need to keep “who to page” response documentation up to date.

How do response plays work?

Each response play lets you configure the following response actions in advance:

  • the on-call teams and conference information needed for a coordinated response
  • the cross-functional stakeholders to subscribe to incidents on a service
  • an optional status update for subscribers

Running a play on an incident is as easy as selecting the appropriate play from the list of available plays, and can be done with a couple of clicks in the PagerDuty web app:

[Image: "Run a Play" action on the incident page in the PagerDuty web app, showing the list of available plays]

Plays are also available in the mobile app, enabling response automation for on-call responders. This means that responders can mobilize a coordinated response or communicate to stakeholders with just a couple of taps:

[Image: PagerDuty mobile app showing a response play selected from a list and being run]

You can configure a response play to mobilize a complete major incident response team, including incident commander, communications liaison, and technical on-calls for the involved systems, which looks like this:

[Image: Response play entitled "Mobilize P1 response" has three escalation policies as responders and a custom message]

Having all of these response actions packaged up as a play lets any responder easily engage a major incident response during triage, without having to look up the corresponding documentation and then identify the appropriate individuals and teams to involve. This kind of automation saves time and fits into nearly any triage workflow, including DevOps, NOC, and customer support.

Go directly from signal to action

If you have monitoring that directly measures customer or business impact, you can skip manual triage altogether and go directly from monitoring tool alert to mobilizing a coordinated response. For example, external monitoring can determine when your retail site is unreachable, and in that situation, there’s no need for someone to investigate before engaging the team; instead, an incident commander and the relevant subject matter experts should be automatically mobilized when this happens.

This is easy to set up with response plays. Any response play can be attached to a service, and run on each new incident created on the service. This approach minimizes the time from incident detection until the response team is mobilized and reduces opportunities for user error during mobilization.
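
One way to picture a play attached to a service is as a small bundle of preconfigured actions that is applied whenever the service opens a new incident. The model below is purely illustrative: the class names, fields, and sample values are hypothetical and do not reflect PagerDuty's API or data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResponsePlay:
    """Hypothetical bundle of response actions configured in advance."""
    name: str
    responders: List[str] = field(default_factory=list)
    subscribers: List[str] = field(default_factory=list)
    status_update: str = ""


@dataclass
class Service:
    name: str
    play: Optional[ResponsePlay] = None  # play to run on every new incident


def run_play(play: ResponsePlay, incident: dict) -> None:
    """Apply the play's preconfigured actions to an incident."""
    incident["responders"].extend(play.responders)
    incident["subscribers"].extend(play.subscribers)
    if play.status_update:
        incident["status_updates"].append(play.status_update)


def open_incident(service: Service, title: str) -> dict:
    """Create an incident and run the service's attached play, if any."""
    incident = {"title": title, "service": service.name,
                "responders": [], "subscribers": [], "status_updates": []}
    if service.play is not None:
        run_play(service.play, incident)
    return incident


play = ResponsePlay(
    name="Mobilize P1 response",
    responders=["incident-commander", "storefront-on-call"],
    subscribers=["vp-engineering"],
    status_update="Investigating: retail site unreachable.",
)
storefront = Service(name="storefront", play=play)
incident = open_incident(storefront, "Retail site unreachable")
print(incident["responders"])
```

Because the play runs at incident creation, the responders are engaged without anyone having to triage first, which is the "signal to action" path described above.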

Automate incident stakeholders

A response play can also subscribe stakeholders to an incident. This can be done on demand during an incident response, where the incident commander or communications liaison can determine when stakeholders need to be involved. Alternatively, stakeholders can be immediately subscribed to new incidents by attaching the corresponding response play to a service. If your response process calls for it, you can even combine mobilization and stakeholder subscription in a single response play, like this:

[Image: Response play entitled "Mobilize P1 response" will simultaneously notify 3 responders with a custom message and subscribe 5 stakeholders to the incident]

Improve your operational effectiveness with response plays

Response plays are a valuable automation capability for improving your organization’s incident response practices. Whether your goal is improving response precision, reducing mobilization time, efficiently engaging stakeholders, or saving responder time during an incident, response plays will help you achieve it.

To learn more about incident response best practices, take a look at PagerDuty’s Incident Response Guide. You can also check out our mobilizing coordinated responses and effective stakeholder communication guides for an in-depth explanation of how to best use PagerDuty capabilities for these needs.

Response plays are available immediately to all PagerDuty customers on Standard and Enterprise plans at no extra cost. Please contact our support team for additional information about response plays, or our sales team if you would like to upgrade to a plan with this feature.

]]>