Statistically, my main issue is cherry picking.
They've found several instances of apparent lepton flavor universality violation (LFV) and they combine those to get their significance in sigma. They ignore the many, many other results that are consistent with lepton flavor universality (LFU), and their justification for excluding those results is non-obvious.
For example, lepton universality violations are not found in tau lepton decays or pion decays, and are not found in anti-B meson and D* meson decays or in Z boson decays. There is no evidence of LFV in Higgs boson decays either.
As one paper notes: "Many new physics models that explain the intriguing anomalies in the b-quark flavour sector are severely constrained by Bs-mixing, for which the Standard Model prediction and experiment agreed well until recently." Luca Di Luzio, Matthew Kirk and Alexander Lenz, "One constraint to kill them all?" (December 18, 2017).
Similarly, see Martin Jung, David M. Straub, "Constraining new physics in b→cℓν transitions" (January 3, 2018) ("We perform a comprehensive model-independent analysis of new physics in b→cℓν, considering vector, scalar, and tensor interactions, including for the first time differential distributions of B→D∗ℓν angular observables. We show that these are valuable in constraining non-standard interactions.")
An anomaly disappeared between Run-1 and Run-2, as documented in Mick Mulder, for the LHCb Collaboration, "The branching fraction and effective lifetime of B0(s)→μ+μ− at LHCb with Run 1 and Run 2 data" (9 May 2017), and was weak in the Belle Collaboration paper, "Lepton-Flavor-Dependent Angular Analysis of B→K∗ℓ+ℓ−" (December 15, 2016).
When you are looking at deviations from a prediction, you should include all experiments that implicate that prediction.
In a SM-centric view, all leptonic or semi-leptonic W boson decays arising when a quark decays to another kind of quark should be interchangeable parts (subject to mass-energy caps on final states determined from the initial state), since all leptonic or semi-leptonic W boson decays (either at tree level or removed one step at the one-loop level) are, deep down, the same process. See, e.g., Simone Bifani, et al., "Review of Lepton Universality tests in B decays" (September 17, 2018). So, you should be lumping them all together to determine the significance of the evidence for LFV.
Their justification for not pooling the anomalous results with the non-anomalous ones is weak and largely not stated expressly. At a minimum, the decision about where to draw the line between the LFV bunch of results used to get the 3.1 sigma and the LFU bunch of results that isn't used to moderate the 3.1 sigma in any way is highly BSM model dependent, and the importance of that observation is understated in the analysis (and basically just ignored).
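To make the pooling point concrete, here is a rough sketch using an unweighted Stouffer combination of z-scores. This is not the method the paper uses. It assumes independent, equally weighted measurements, with each z-score signed in the direction of the claimed effect. The input numbers are invented for illustration and are not taken from any experiment.

```python
import math

def stouffer_z(z_scores):
    """Combine independent z-scores with the unweighted Stouffer method.

    Assumption: every measurement is independent and equally weighted,
    and each z-score is signed in the direction of the claimed effect.
    """
    if not z_scores:
        raise ValueError("need at least one z-score")
    return sum(z_scores) / math.sqrt(len(z_scores))

# Hypothetical inputs, invented for illustration only.
anomalous = [2.0, 2.0, 2.0]   # three measurements that each deviate by 2 sigma
consistent = [0.0] * 9        # nine measurements that agree with the prediction

print(f"anomalous only: {stouffer_z(anomalous):.2f} sigma")
print(f"all pooled:     {stouffer_z(anomalous + consistent):.2f} sigma")
```

With these made-up inputs, the combined significance of the three anomalous results is cut in half once the nine consistent results are pooled in, which is why the choice of what goes into the pool matters so much.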
The cherry picking also gives rise to look-elsewhere effect issues. If you've made eight measurements in all, divided among three different processes, the look-elsewhere effect is small. If the relevant universe is all leptonic and semi-leptonic W and Z boson decays, in contrast, there are hundreds of measurements out there, and even after you prune the measurements limited by mass-energy conservation, you still have huge look-elsewhere effects that trim one or two sigma from the significance of your results.
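The size of the look-elsewhere effect can be roughed out with a simple trials-factor correction. The sketch below assumes N fully independent measurements and a one-sided Gaussian significance. Real measurements are correlated, so the effective N is smaller than the raw count, and the values of N used here are illustrative choices, not an actual census of measurements.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sigma 1

def global_sigma(local_sigma, n_trials):
    """Convert a local significance into a global one for n_trials looks.

    Assumption: the trials are independent and the significance is
    one-sided, so p_global = 1 - (1 - p_local) ** n_trials.
    """
    p_local = _STD_NORMAL.cdf(-local_sigma)
    # expm1/log1p keep precision when p_local is tiny
    p_global = -math.expm1(n_trials * math.log1p(-p_local))
    return _STD_NORMAL.inv_cdf(1.0 - p_global)

# Illustrative trial counts (assumptions, not a count of real measurements).
for n in (1, 8, 100):
    print(f"N = {n:3d}: 3.1 sigma local -> {global_sigma(3.1, n):.2f} sigma global")
```

Under these assumptions, eight looks cost well under one sigma, while a hundred or more independent looks take off something in the neighborhood of two sigma.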