Comparing and ranking the world's fastest computers requires benchmarks that evaluate their performance. The most widely recognized ranking, the TOP500, is based on a single execution of HPL. Recently, two benchmarks that are arguably more representative of typical modern workloads have been proposed: HPCG and HPGMG. Currently, all three benchmarks use the highest observed performance from a single problem size for ranking. In this paper we report benchmarking results for all three benchmarks over a wide range of problem sizes on six distinct hardware architectures, covering the full range of machines present on the TOP500 list. We find that the resulting data hold significantly more information about the performance of the underlying hardware than the maximum observed performance alone. We therefore argue that an aggregate value derived from a whole range of problem sizes can significantly improve the sensitivity of a given benchmark to relevant hardware properties and thus be more representative. However, we refrain from proposing a specific way to compose such an aggregate and invite the community to open a discussion on the topic.