

Algorithms & Models


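Adtributor [1] was the first to systematically apply root cause analysis to changes in the revenue metrics of advertising systems. It rests on a fairly strong assumption: the root cause lies in a single dimension.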


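iDice [2] relaxes Adtributor's [1] assumption that the root cause lies in a single dimension. In iDice, the root cause may be a combination of multiple dimensions.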
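To illustrate what allowing combinations of dimensions means for the candidate space, here is a minimal sketch that enumerates candidate attribute combinations. It is not iDice's search procedure; the function name `enumerate_candidates` and the dimensions `country`, `browser`, and `device` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def enumerate_candidates(dimensions, max_order=None):
    """Yield candidate root causes as {dimension: value} dicts.

    `dimensions` maps each dimension name to its list of values.
    `max_order` caps how many dimensions a candidate may combine;
    max_order=1 corresponds to the single-dimension assumption.
    """
    names = sorted(dimensions)
    max_order = max_order or len(names)
    for order in range(1, max_order + 1):
        for dims in combinations(names, order):
            for values in product(*(dimensions[d] for d in dims)):
                yield dict(zip(dims, values))


# Hypothetical example data.
dimensions = {
    "country": ["US", "CN", "IN"],
    "browser": ["Chrome", "Firefox"],
    "device": ["phone", "desktop"],
}

single_dimension = list(enumerate_candidates(dimensions, max_order=1))
all_combinations = list(enumerate_candidates(dimensions))
print(len(single_dimension), len(all_combinations))
```

With these three small dimensions, the single-dimension setting has 7 candidates, while allowing combinations gives 35. The count grows multiplicatively with the number of dimensions and values, which is why later methods focus on pruning and efficient search.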


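HotSpot [3] identifies two difficulties in multi-dimensional root cause analysis. First, a single anomaly propagates, so the same metric appears anomalous at several levels of aggregation. Second, the search space is very large, so an efficient search algorithm is required. The paper addresses each difficulty in turn: for anomaly propagation, it introduces the ripple effect, which is used to compute a score for each candidate; for the search space, it uses Monte Carlo Tree Search (MCTS) together with hierarchical pruning to make the search more efficient.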
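The ripple-effect idea can be sketched as follows. This is a simplified reading of the scoring step, not the paper's implementation: if a candidate set is the root cause, its total change is distributed to its leaf elements in proportion to their forecast values, and the candidate is scored by how well these deduced values explain the actual ones. The function names, the toy data, and the handling of a zero forecast are assumptions made for this sketch.

```python
import math


def deduced_values(forecast, actual, candidate_leaves):
    """Leaf values expected if `candidate_leaves` were the root cause.

    The candidate's total change is spread over its leaves in
    proportion to their forecast values; other leaves keep their forecast.
    """
    f_s = sum(forecast[k] for k in candidate_leaves)
    v_s = sum(actual[k] for k in candidate_leaves)
    deduced = dict(forecast)
    if f_s == 0:
        # Assumption of this sketch: leave forecasts unchanged when
        # a proportional split is undefined.
        return deduced
    for k in candidate_leaves:
        deduced[k] = forecast[k] - (f_s - v_s) * forecast[k] / f_s
    return deduced


def potential_score(forecast, actual, candidate_leaves):
    """Score in [0, 1]; higher means the candidate explains the anomaly better."""
    deduced = deduced_values(forecast, actual, candidate_leaves)
    keys = sorted(forecast)
    v = [actual[k] for k in keys]
    f = [forecast[k] for k in keys]
    a = [deduced[k] for k in keys]
    d_vf = math.dist(v, f)
    if d_vf == 0:
        return 0.0  # nothing deviates from the forecast
    return max(1.0 - math.dist(v, a) / d_vf, 0.0)


# Hypothetical leaf-level data keyed by (province, ISP).
forecast = {("Beijing", "Mobile"): 20, ("Beijing", "Unicom"): 10,
            ("Shanghai", "Mobile"): 30, ("Shanghai", "Unicom"): 40}
actual = {("Beijing", "Mobile"): 10, ("Beijing", "Unicom"): 5,
          ("Shanghai", "Mobile"): 30, ("Shanghai", "Unicom"): 40}

beijing = [k for k in forecast if k[0] == "Beijing"]
mobile = [k for k in forecast if k[1] == "Mobile"]
score_beijing = potential_score(forecast, actual, beijing)
score_mobile = potential_score(forecast, actual, mobile)
print(score_beijing, score_mobile)
```

In this toy data, traffic in Beijing halves across both ISPs, so the Beijing candidate reproduces the actual values and scores 1.0, while the Mobile candidate scores about 0.12. A search procedure such as MCTS would use a score of this kind to decide which candidates to expand.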


Squeeze [4] proposes the generalized ripple effect and the generalized potential score, and strikes a better balance between search efficiency and accuracy.


AutoRoot [5] uses adaptive density clustering to improve accuracy, along with an efficient filtering mechanism to speed up the search.


RiskLoc [6] defines a weighted risk score and uses it to identify root causes.


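CMMD [7] consists of two main parts: relationship modeling, which uses a graph neural network (GNN) to learn the relationships between metrics from historical data; and root cause localization, which uses a genetic algorithm to locate root causes efficiently and accurately.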

  1. R. Bhagwan et al., “Adtributor: Revenue debugging in advertising systems,” in 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), 2014, pp. 43–55.

  2. Q. Lin, J.-G. Lou, H. Zhang, and D. Zhang, “iDice: Problem identification for emerging issues,” in Proceedings of the 38th International Conference on Software Engineering, 2016, pp. 214–224.

  3. Y. Sun et al., “HotSpot: Anomaly localization for additive KPIs with multi-dimensional attributes,” IEEE Access, vol. 6, pp. 10909–10923, 2018.

  4. Z. Li et al., “Generic and robust localization of multi-dimensional root causes,” in 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE), IEEE, 2019, pp. 47–57.

  5. P. Jing, Y. Han, J. Sun, T. Lin, and Y. Hu, “AutoRoot: A novel fault localization schema of multi-dimensional root causes,” in 2021 IEEE Wireless Communications and Networking Conference (WCNC), IEEE, 2021, pp. 1–7.

  6. M. Kalander, “RiskLoc: Localization of multi-dimensional root causes by weighted risk,” arXiv preprint arXiv:2205.10004, 2022. 

  7. S. Yan et al., “CMMD: Cross-metric multi-dimensional root cause analysis,” arXiv preprint arXiv:2203.16280, 2022.