Releasing software is a frequent and time-consuming IT support task, and making releases unattended helps control costs and reduce faults.
The key processes in an unattended release are release planning, validation feedback, data collection, comparative analysis, algorithm training, and playback verification.
To implement unattended release, you need to monitor application health indicators: basic environment indicators such as CPU, memory, and I/O; service indicators such as PV (page views), UV (unique visitors), and service traffic; middleware indicators; and exception logs.
Recognizing anomalies in these indicators poses several challenges: collecting data quickly while the release is in progress, eliminating the influence of interfering data, and selecting an appropriate detection method for indicators with different characteristics.
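As an illustration of one possible detection method, the sketch below flags a value observed during a release when it deviates strongly from a pre-release baseline. The z-score test, the function name `is_anomalous`, and the default threshold of 3.0 are assumptions made for this example and are not taken from the source; indicators with other characteristics (for example, strongly periodic traffic) would likely need a different method.

```python
from statistics import mean, pstdev


def is_anomalous(baseline, observed, z_threshold=3.0):
    """Return True if `observed` deviates from `baseline` by more than
    `z_threshold` standard deviations (hypothetical helper)."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > z_threshold
```

For example, with a CPU baseline of `[50, 52, 48, 51, 49]` percent, an observed value of 51 stays within the threshold while a value of 80 is flagged.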
The next step is data preprocessing, which includes data aggregation, data merging, and filling in incomplete data. Aggregation runs as a series of preprocessing steps along the IP dimension and the time dimension, producing the data to be analyzed.
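The aggregation and completion steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions: raw samples arrive as `(ip, timestamp, value)` tuples, aggregation means averaging per host and then across hosts within fixed time buckets, and missing buckets are filled by carrying the last known value forward. The function names, the 60-second bucket, and the fill strategy are all hypothetical; the source does not specify them, and merging of multiple indicators is left out.

```python
from collections import defaultdict
from statistics import mean


def aggregate(samples, bucket_seconds=60):
    """Aggregate (ip, timestamp, value) samples into one value per time bucket."""
    # IP dimension: collect each host's samples within a time bucket.
    per_host = defaultdict(list)
    for ip, ts, value in samples:
        bucket = ts - ts % bucket_seconds
        per_host[(bucket, ip)].append(value)

    # Time dimension: average each host, then average across hosts per bucket.
    per_bucket = defaultdict(list)
    for (bucket, _ip), values in per_host.items():
        per_bucket[bucket].append(mean(values))
    return {bucket: mean(values) for bucket, values in per_bucket.items()}


def fill_gaps(series, start, end, bucket_seconds=60):
    """Fill empty buckets between start and end with the last known value."""
    filled = {}
    last = None  # stays None until the first bucket with data is seen
    for bucket in range(start, end + 1, bucket_seconds):
        if bucket in series:
            last = series[bucket]
        filled[bucket] = last
    return filled
```

Averaging per host first keeps a host that reports more often from dominating the bucket; whether that is the right choice depends on the indicator.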
Another problem is achieving adequate precision and recall. This requires fine-tuning the algorithms through continuous analysis of false positives and false negatives (missed faults), which in turn requires unattended fault playback.
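The two rates can be computed from replayed releases as in the sketch below, where "accuracy" is read in the sense of precision (the share of raised alarms that were real faults) and recall is the share of real faults that raised an alarm. The function name and the boolean-list input format are assumptions for illustration.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of booleans.

    predicted[i] is True if the detector raised an alarm for release i;
    actual[i] is True if release i really was faulty.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)      # correctly flagged faults
    fp = sum(1 for p, a in pairs if p and not a)  # false positives
    fn = sum(1 for p, a in pairs if not p and a)  # missed faults
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Tracking both numbers matters because tuning that removes false positives often increases missed faults, and the reverse.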
The detection data must be analyzed continuously and the algorithm adjusted accordingly. If anomalies cannot be identified clearly, or if false positives persist, machine learning is needed to improve detection accuracy.
The elements of machine learning are sample data for learning, the machine learning algorithms, and the combination of learning results with real-time detection. Acquiring the learning data requires attention to the following points.
By classifying and analyzing the anomalies, the machine learning algorithm evaluates whether the thresholds are set appropriately and whether the relevant indicators are configured reasonably, and it continually adjusts and optimizes them.
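One simple way to evaluate a threshold setting against labeled playback data is to sweep candidate thresholds and keep the one with the best F1 score. This is only a sketch under assumptions: the source does not say how the evaluation is done, and the function name, the use of F1 as the criterion, and the "score above threshold means alarm" rule are all hypothetical.

```python
def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest F1 on labeled data.

    scores[i] is the anomaly score of replayed release i;
    labels[i] is True if that release really was faulty.
    """
    def f1(threshold):
        pairs = list(zip(scores, labels))
        tp = sum(1 for s, y in pairs if s > threshold and y)
        fp = sum(1 for s, y in pairs if s > threshold and not y)
        fn = sum(1 for s, y in pairs if s <= threshold and y)
        return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

    # max() keeps the first candidate when several share the top score.
    return max(candidates, key=f1)
```

Rerunning the sweep whenever new false positives or missed faults are labeled is one way to keep the setting adjusted over time.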