Caveats of “they perform the same”
As you can see in the previous two articles (part1 and part2), distributed mode and bundled mode sometimes perform the same.
However, most of these “same performance” scenarios rest on one assumption: for a single service, if the computer hardware is twice as good, then the performance is twice as good.
That can mean:
- RT (response time) is halved
- You can now serve 2 requests in the time that 1 request took before
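The two readings above can be written out as simple arithmetic. This is only a sketch of the linear assumption itself, not a measurement: the function names and the 100 ms baseline are hypothetical, chosen for illustration.

```python
def scaled_response_time(base_rt_ms: float, hardware_factor: float) -> float:
    """RT if all of the extra hardware capacity speeds up a single request."""
    return base_rt_ms / hardware_factor


def scaled_requests_per_window(base_requests: int, hardware_factor: float) -> float:
    """Requests finished in the same time window if the extra capacity
    goes into serving more requests instead of faster ones."""
    return base_requests * hardware_factor


base_rt_ms = 100.0  # hypothetical baseline: one request takes 100 ms

# Hardware that is "twice as good" under the linear assumption:
print(scaled_response_time(base_rt_ms, 2))   # 50.0 (RT is halved)
print(scaled_requests_per_window(1, 2))      # 2 (two requests in the old time of one)
```

Both functions describe the same assumption from two sides: performance is taken to be directly proportional to hardware capacity.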
This is an assumption of “linear scaling”. In real life, it is hard to rely on:
- What defines a computer that is “twice as good” as another? A 2-core CPU versus a 1-core CPU? Double the memory? There is no convincing standard for it.
- Even with a standard, performance rarely changes linearly. Doubling the hardware may double the performance, but will doubling it again bring 4 times the original performance?
But this is much less of a problem for distributed mode. If you add a second machine with the same hardware, you can say the compute resources have doubled and expect QPS to double as well, because identical hardware is serving the requests. So with distributed mode, you get good certainty about how performance scales.
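As a rough sketch of that reasoning, total QPS in distributed mode can be estimated by multiplying the QPS of one machine by the number of identical machines. The function name and the 500 QPS figure are hypothetical, and the sketch assumes the machines serve requests independently, with no shared bottleneck in front of or behind them.

```python
def cluster_qps(per_machine_qps: float, machine_count: int) -> float:
    """Estimated total QPS for identical machines serving independently.

    Assumption: nothing shared between the machines limits throughput.
    """
    if machine_count < 1:
        raise ValueError("machine_count must be at least 1")
    return per_machine_qps * machine_count


per_machine_qps = 500.0  # hypothetical QPS measured on a single machine

for machine_count in (1, 2, 4):
    print(machine_count, cluster_qps(per_machine_qps, machine_count))
```

The estimate only needs one measured number, the QPS of a single machine, which is why scaling by adding machines is easier to predict than scaling by upgrading hardware.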