This blog is an excerpt from DevOps, DBAs, and DBaaS by Mike Cuppet.
Start with Optimizing Application Performance with Change Management Improvements - Part 1
Yes, There Really Is a Problem
It is not that we do not believe user-reported information; it is just that experience tells us other factors can be in play, which makes it necessary to get the full picture of the problem. One user would complain several times a week about application slowness, which was causing the person's performance metrics to drop. An investigation with a packet capture tool determined that live video streaming to the user's computer was causing the application slowness. This person was advised to stop the streaming and given a heads-up that the company could "see" everything. Nothing illegal was happening, but complaining about self-inflicted performance problems caused by news/entertainment traffic does not boost careers if that information is shared.
Continuing with our hypothetical problem: the user-side investigations recorded slowness consistently in the 5–17 seconds range, with very few outliers, which narrows the actual slowness impact significantly. If you are lucky, the captures you already have point to a single call that represents the majority of the slowness, allowing immediate focus on what is likely the root cause.
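One way to check whether a single call dominates is to total the time each call contributes across the captured transactions. The following is a minimal sketch under stated assumptions: the `(call_name, seconds)` record layout, the function name `dominant_call`, and the sample values are all hypothetical, not output from any particular capture tool.

```python
from collections import defaultdict


def dominant_call(samples):
    """Return (call_name, share) for the call with the largest share of total time.

    `samples` is an iterable of (call_name, seconds) pairs. This layout is a
    hypothetical stand-in for whatever the capture tool actually exports.
    """
    totals = defaultdict(float)
    for name, seconds in samples:
        totals[name] += seconds

    grand_total = sum(totals.values())
    if grand_total == 0:
        return None, 0.0

    name = max(totals, key=totals.get)
    return name, totals[name] / grand_total


# Made-up sample data, for illustration only.
captures = [
    ("login", 0.5),
    ("load_account", 9.0),
    ("load_account", 6.0),
    ("render_page", 0.5),
]

call, share = dominant_call(captures)
print(f"{call} accounts for {share:.0%} of captured time")
```

If one call holds most of the total, it is a reasonable first place to look; if the time is spread evenly, the cause is more likely environmental than a single code path.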
As a member of a DevOps IT shop, you know that software releases occur nightly. Unfortunately, the users did not report the problem immediately, making it difficult to establish when the problem was introduced. All that is known is that everything seemed to be good a few weeks ago and that the problem occurs at different times of the day; otherwise, performance is acceptable. The release report shows at least five changes that may have impacted this functionality: four were implemented successfully, and one had to be rolled back with no root cause documented. Here, the binary release check has failed the organization. Release success or failure alone does not communicate the information needed by the business or IT. Code that is successfully deployed with functionality validated by a tester does not tell the entire story (for example, whether performance degradation was introduced). DevOps testing is purposely designed to give more comprehensive answers. Excessive testing vets the software thoroughly and automatically, making it feasible to include tests designed to measure performance. Such a test gives the green light only when performance matches or beats a predefined value or the timing of the previous code version.
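A performance check of this kind can be sketched in a few lines. This is an illustration rather than a prescribed implementation: the function names, the `tolerance` parameter, the sample workload, and the baseline value are all assumptions made for the example.

```python
import statistics
import time


def median_runtime(func, runs=5):
    """Call `func` several times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def performance_gate(measured, baseline, tolerance=0.0):
    """Return True only if `measured` is no slower than `baseline`.

    `baseline` may be a predefined value or the timing of the previous
    code version. `tolerance` is a hypothetical allowance for timing
    noise (0.10 means up to 10% slower still passes).
    """
    return measured <= baseline * (1 + tolerance)


def sample_workload():
    # Placeholder for the code path under test.
    sum(range(10_000))


BASELINE_SECONDS = 1.0  # hypothetical predefined value

measured = median_runtime(sample_workload)
if performance_gate(measured, BASELINE_SECONDS):
    print("Green light: performance matches or beats the baseline")
else:
    print("Blocked: performance regressed against the baseline")
```

Using the median of several runs rather than a single timing is a design choice that reduces the chance of one noisy run blocking or approving a release.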
As DevOps teams "shift left" and work in conjunction with business leaders acting as product managers, IT (now DevOps) truly becomes a partner with the business. The "IT alignment to the business" goal included in the annual IT strategy deck for the last decade becomes obsolete. The perceived (or actual) misalignment arose not only because the business teams did not understand what IT really did, other than spending offensively huge chunks of money to drive business operations. IT also wholly failed to come to the table as a business partner, instead remaining aloof and detached from everything but technology.
Forty years ago, IT, MIS, or data processing (whatever the name) was given the mission of finding ways to complete work faster than teams of people could by having computers do mundane, repeatable tasks. Ironically, DevOps in many ways reaches back those 40 years to repeat the tactical execution of having computers do mundane tasks: repetitive code testing, deployments, infrastructure as code, and more. Between then and now, far too many manual steps were added to processes that now need to be remediated. Back then, computer work likely resulted in teams of people losing their jobs, but DevOps does not have the same mandate as in the data processing years. Instead, highly skilled engineers and programmers are freed from repetitive tasks and allowed to partner with the business to generate and implement game-changing technologies and applications.
DevOps wants and needs to shift talented, intelligent, experienced staff into roles that deliver measurable benefits for the company. Repeatable tasks can be done much faster by computers, but computers do not generate ideas. Computers running data analytics programs churn through data millions of times faster than humans, but computers still do not have the capability to find answers in the data, interpret the data, or act on the data like people do. People assimilate varying data points to produce value in new ways. DevOps needs people to create opportunities to help the business leapfrog competitors.
DevOps is not intended to get rid of people; instead, it aims to make people more effective and focused on executing business strategies, not hampered by mundane tasks. Accomplishments have moved from "Designed a new algorithm for . . ." to "Improved customer experience . . . reduced costs . . . implemented a new revenue channel . . ."
DBAs and DevOps teams should take a positive stance toward the goals of Agile and DevOps, knowing that each person can have a tremendous impact on the organization by creating better customer experiences and software products and by continually improving business processes, all with prospective top- and bottom-line impacts.
Change management analysis in DevOps extends beyond binary conclusions to business impact statements. Reporting shifts from bare success or failure statuses to informative, customer-centric statuses such as the following:
• "Change 123 implementing function A successfully reduced execution time 40%; now averaging 7 milliseconds per call."
• "The change to reorganize table ABC successfully reduced report execution time, allowing the business to meet contractual requirements."
• "Change 456 failed and was rolled over successfully with change 512. Testing for change 456 did not include a critical data test, which allowed the failure to advance; the gap was found later and tested for in change 512. Teams had rectified, tested, and implemented the needed test earlier this week, so change 512 was already in the pipeline. The 512 push completed successfully within the change window, eliminating the risk."
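Statuses like the first one above can be generated from measurements instead of being written by hand. The sketch below is only an illustration: the function name and parameters are hypothetical, and the "before" timing of 11.67 ms is back-calculated from the 40% and 7 millisecond figures in the example rather than taken from any real measurement.

```python
def impact_statement(change_id, description, before_ms, after_ms):
    """Build a business impact status from before/after average call timings.

    All names here are hypothetical; timings are average milliseconds per call.
    """
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")

    change_pct = (before_ms - after_ms) / before_ms * 100
    direction = "reduced" if change_pct >= 0 else "increased"
    return (
        f"Change {change_id} {description} {direction} execution time "
        f"{abs(change_pct):.0f}%; now averaging {after_ms:g} milliseconds per call."
    )


# Assumed "before" value, back-calculated for illustration.
print(impact_statement(123, "implementing function A", 11.67, 7))
```

Because the same function reports an increase when the new timing is slower, a regression shows up in the status message rather than hiding behind a "successful" deployment.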
DevOps' fail-fast edict can really benefit the company by advancing software products continuously, without laborious rollbacks, rework, retests, and reimplementation. In the third scenario above, the DevOps team knows that a communication was missed, because change 456 should never have made it to the release stage, let alone production.
So as change management communications pivot from mundane status updates to business impact updates, opportunities to improve application performance become more apparent. Moving from a message that the code was implemented successfully to a message that the code decreased customer query time by 67% tells a better story. There is a large chasm between code that works and code that works, executes as fast as expected, and generates an audit trail. Adding a new feature that performs poorly is not really a feature; it is a bug and a frustration for customers. Adding a feature that is expected to increase mobile app usage 400% without increasing infrastructure resources is not a feature, but a colossal failure. The DevOps movement provides the needed tactical response with infrastructure as code. When traffic is expected to spike, adding resources to existing virtual hosts or spinning up additional hosts with a button click or two simplifies infrastructure readiness and resiliency.
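A back-of-the-envelope estimate shows why the 400% case cannot be ignored. This sketch assumes, purely for illustration, that capacity scales linearly with usage and that a 400% increase means five times the current load; the function name and the starting host count are hypothetical.

```python
import math


def hosts_needed(current_hosts, usage_increase_pct):
    """Estimate hosts required after a usage increase.

    Assumes load scales linearly with usage, which real systems
    rarely do exactly; treat the result as a starting point only.
    """
    growth = 1 + usage_increase_pct / 100  # a 400% increase means 5x load
    return math.ceil(current_hosts * growth)


# Hypothetical fleet of 4 hosts facing a 400% usage increase.
print(hosts_needed(4, 400))
```

An estimate like this can feed the infrastructure-as-code definitions so the additional hosts are provisioned before the feature ships, not after customers notice.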
Read Optimizing Application Performance with Change Management Improvements - Part 3