Centralpoint Error Reporting & Monitoring

Error reporting is managed in two locations: within the Client Console of each individual Centralpoint portal, and in the Uber Console, which is managed by Oxcyon.

Error Reporting within Client Console
Here, Centralpoint's Custom Error Messages tool equips the web site's web.config and Global.asax files to handle custom errors. When an error occurs within the application, Global.asax determines whether the error is an HTTP error and then manages it accordingly. This feature requires custom errors to be turned on in the web.config file. For example, if the current error is a 404 error and a 404 Redirect record exists with an Original URL value that matches the URL on which the error originated, the user wi... This tool can be found under 'Utilities' in the Development section of the Client Console. Log in to the Client Console, click Development, and search for the tool on that page. You can also ask your production manager to turn on the 'Tools' section of your Client Console. Once this is done, each tool appears individually within the console for easy access.

Error Reporting (Monitoring) by Oxcyon (Across all environments)
A large part of the Evergreen Updates involves listening for and responding to anomalies in any Centralpoint installation, so that Oxcyon Development can improve the technology. This means error reporting (or listening) across all Centralpoint installations. Recognizing each new anomaly as it occurs allows Oxcyon to remedy issues (often before clients even know about them) and to study patterns over time, improving our updates.

First, not all exceptions are errors. Many exceptions are common and unavoidable. For example, view state errors occur when a web site changes between postbacks. This could be triggered by something as simple as a Client Console submission. In other cases, clients may have developed custom code in a site and declined to address the issues that were reported to them. These errors relate to unavoidable circumstances and therefore do not require review. Once such an error has been identified, the next sync delivers code that prevents it from being reported to Uber in the future. If an error has occurred in the past and has been deemed a correct response to the problem, it is also excluded from reporting.

Once an error has been determined to be of value for review, it is categorized by the web site itself. Many errors fall into common categories such as connection issues, email send failures, or timeout exceptions. The contents of the error are parsed to determine whether it fits into one of these categories, and it is grouped accordingly. If the error does not fall into one of these categories, the system generates a category based on the type of error and the stack trace. The highest-level Centralpoint function involved in the error is combined with the data type of the error to generate a unique category for that error. Each time the same error is sent to Uber it carries the same category, regardless of which site returned it. Within Centralpoint this category is referred to as an Error Code Key (ECK). This allows Oxcyon to study similar exceptions across different installations and establish patterns.
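The categorization described above can be sketched roughly as follows. This is an illustrative sketch only, not Centralpoint's actual implementation: the keyword rules, the "Centralpoint." namespace prefix, the key format, and the function name error_code_key are all assumptions made for the example.

```python
import re

# Hypothetical keyword rules for the common categories named in the text.
COMMON_CATEGORIES = [
    ("connection", re.compile(r"connection", re.I)),
    ("email-send-failure", re.compile(r"smtp|send mail|email", re.I)),
    ("timeout", re.compile(r"time(d)?\s?out", re.I)),
]

# Assumed prefix that marks a Centralpoint frame in a stack trace.
CENTRALPOINT_PREFIX = "Centralpoint."


def error_code_key(error_type, message, stack_frames):
    """Return a category key for an error.

    stack_frames is a list of function names, innermost frame first,
    so the highest-level Centralpoint function is the LAST matching frame.
    """
    # 1. Try the common categories first, in the order listed above.
    for name, pattern in COMMON_CATEGORIES:
        if pattern.search(message):
            return name

    # 2. Otherwise combine the error type with the highest-level
    #    Centralpoint function found in the stack trace.
    cp_frames = [f for f in stack_frames if f.startswith(CENTRALPOINT_PREFIX)]
    top_function = cp_frames[-1] if cp_frames else "unknown"
    return f"{error_type}@{top_function}"
```

Because the key depends only on the error type, the message, and the stack trace, two different sites reporting the same failure would produce the same key under this scheme.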

The Errors module in Oxcyon’s Uber Console is used to maintain and manage all errors from all masters and web sites as they occur. These errors are categorized using the ECK. In some cases, these errors trigger emails or requests for immediate action. The module also allows Oxcyon’s engineers to regularly review the activity occurring on web sites and look for patterns. If a large number of errors with the same ECK is reported within a short period of time, an email notification is triggered and sent to Oxcyon’s engineers and the administrator of that web site.
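The notification rule described above (many reports of the same ECK in a short period) can be illustrated with a sliding-window counter. This is a minimal sketch under stated assumptions: the class name BurstDetector, the threshold, and the window length are hypothetical, and the actual thresholds used by the Uber Console are not documented here.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flags an ECK when it is reported too often within a time window."""

    def __init__(self, threshold=5, window_seconds=60):
        self.threshold = threshold      # assumed value, for illustration
        self.window = window_seconds    # assumed value, for illustration
        self._seen = defaultdict(deque)  # ECK -> recent report timestamps

    def record(self, eck, timestamp):
        """Record one report; return True if a notification is warranted."""
        times = self._seen[eck]
        times.append(timestamp)
        # Drop reports that have fallen out of the window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold


detector = BurstDetector(threshold=3, window_seconds=10)
for t in (0, 5, 9):
    should_notify = detector.record("timeout", t)
```

In this sketch record() keeps returning True while the burst continues, so a real system would likely also suppress repeat notifications for the same ECK.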

All errors are reviewed on a daily basis. In many cases, these errors are reviewed only to determine patterns. If numerous exceptions occur on a server or site at repeating times or intervals, a ticket is automatically sent to the (network) administrator of that web site to determine whether any long running processes may be affecting performance at that time. At other times these errors include information indicating that unwanted ‘bot’ traffic may be spidering a site and making invalid requests, and that it should perhaps be blocked. Often, these errors are used to identify attempts to fish for vulnerabilities and to prevent issues like cross site scripting (XSS) or SQL injection.
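One simple way to spot exceptions that recur at repeating times is to group them by hour of day and count the distinct dates on which each hour was affected. The sketch below is only an illustration of that idea; the function name recurring_hours and the min_days threshold are assumptions, and the source does not describe how Oxcyon actually detects these intervals.

```python
from collections import defaultdict
from datetime import datetime


def recurring_hours(timestamps, min_days=3):
    """Return the hours of the day (0-23) in which errors occurred
    on at least min_days distinct dates."""
    days_by_hour = defaultdict(set)
    for ts in timestamps:
        days_by_hour[ts.hour].add(ts.date())
    return sorted(h for h, days in days_by_hour.items() if len(days) >= min_days)


errors = [
    datetime(2018, 1, 1, 2, 5),
    datetime(2018, 1, 2, 2, 10),
    datetime(2018, 1, 3, 2, 30),
    datetime(2018, 1, 3, 14, 0),
]
suspect_hours = recurring_hours(errors)  # hours worth checking against scheduled jobs
```

An hour that shows up repeatedly is a hint to look in the Process Log for a scheduled job running at that time.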

These errors also often point out minor issues in the system. An unusual sequence of events may lead to an error that did not come up during testing. The errors themselves include a detailed description of the request that led to the error. Often these details can be used to reproduce a problem, and a patch is developed and deployed in the next sync. Clients may find issues and report them, but this module finds the errors that nobody reports. Oxcyon’s engineers frequently discover errors that are infrequent and have gone unreported for months or years. Each time they occur they are reviewed and notes are added to the system. Eventually, a solution is found and a patch is deployed in the next sync. Rigorously executing this process for over 10 years has led to a very clean, trustworthy, and secure architecture, and it is improved with every sync.

Another tool that is often used for site reviews is the Development > Process Log module. This exists in all master and client consoles. When issues on a server require additional investigation, the Process Log is used to review all of the unattended long running processes that executed on any web site on that server. When a web site starts a long running process, it executes in the master instead of the web site itself, to take some of the stress off the web site. The master is on the same server, so the server itself still experiences some stress. This module can be used to determine what is running and when, how long it ran, and whether any errors occurred during execution. Long running processes are typically scheduled and can be things like data transfers, email broadcasts, or site synchronization.

Oxcyon’s engineers are also working on a new master console feature called the Health Monitor, which will take this process to the next level (expected to be released Q2/2018). The Health Monitor is being developed to attach itself to running processes and identify problems in pages and applications that are not returning errors. For example, an administrator may have a report that is generated on a regular basis. The Health Monitor will attach itself to each process and create a report of the real-time activity occurring in each web site. It will provide information such as the longest running pages, average request length, and total number of requests, and sometimes even which methods are running the longest on a page. Combined with error reporting, this tool will further improve issue avoidance and resolution.
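To give a sense of the kind of figures listed above, the following sketch aggregates a list of request timings into total requests, average request length, and the longest running pages. It is purely illustrative and not the Health Monitor itself: the function name summarize_requests, the input shape, and the reading of "request length" as duration in seconds are all assumptions.

```python
from collections import defaultdict


def summarize_requests(requests, top_n=3):
    """Summarize (page, duration_seconds) pairs.

    Returns total request count, average duration across all requests,
    and the pages with the highest average duration, slowest first.
    """
    per_page = defaultdict(lambda: [0, 0.0])  # page -> [count, total seconds]
    for page, duration in requests:
        per_page[page][0] += 1
        per_page[page][1] += duration

    total_requests = sum(count for count, _ in per_page.values())
    total_seconds = sum(seconds for _, seconds in per_page.values())
    average_by_page = {p: s / c for p, (c, s) in per_page.items()}
    slowest = sorted(average_by_page, key=average_by_page.get, reverse=True)[:top_n]

    return {
        "total_requests": total_requests,
        "average_seconds": total_seconds / total_requests if total_requests else 0.0,
        "slowest_pages": slowest,
    }


sample = [("/home", 0.2), ("/report", 4.0), ("/home", 0.4), ("/search", 1.0)]
summary = summarize_requests(sample)
```

Ranking pages by average rather than total duration keeps a busy but fast page from crowding out a rarely used but slow one.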