Numbers Everyone Should Know
Partly inspired by Jeff Dean’s famous Stanford 2009-2010 Keynote “Software Engineering Advice from Building Large-Scale Distributed Systems”, which, putting aside my bias, is still one of my favourite Keynotes.
In fact, estimating is still something I am not strong at. Well, it’s not a habit. Yet.
Jeff makes a good point here. Before deep diving into design, architecture, component libraries, dependencies, security and all of the layering of the system - think up front about performance, as performance often dictates future courses of direction. Something too slow? Already in production? Well, now you have to refactor and take things apart again. And again. Then probably again.
Gradually, this Keynote has become a cultural artefact for many an engineer, designer, builder and tinkerer of distributed systems.
Here is the infamous “Numbers Everyone Should Know” slide:
In homage, I am going to try to capture numbers that I find interesting. I see this post as more of a personal/internal dump of numbers, something I hope to recall and reuse long term. Take from it what you will. I will add more over time.
Energy
Time Units:
Time Unit | Equivalent in Seconds (s) |
---|---|
1 nanosecond (ns) | \(10^{-9}\) s |
1 microsecond (µs) | \(10^{-6}\) s |
1 millisecond (ms) | \(10^{-3}\) s |
1 kilosecond (ks) | \(10^{3}\) s |
1 megasecond (Ms) | \(10^{6}\) s |
1 gigasecond (Gs) | \(10^{9}\) s |
1 terasecond (Ts) | \(10^{12}\) s |
1 petasecond (Ps) | \(10^{15}\) s |
1 exasecond (Es) | \(10^{18}\) s |
SI Prefixes:
Factor | Name | Symbol |
---|---|---|
\(10^{9}\) | giga | G |
\(10^{6}\) | mega | M |
\(10^{3}\) | kilo | k |
\(10^{-1}\) | deci | d |
\(10^{-2}\) | centi | c |
\(10^{-3}\) | milli | m |
\(10^{-6}\) | micro | μ |
\(10^{-9}\) | nano | n |
\(10^{-12}\) | pico | p |
Time Comparisons:
Time Unit | Approximate Equivalent in Seconds (s) |
---|---|
1 minute | \(0.6 \times 10^{2}\) s |
1 hour | \(0.36 \times 10^{4}\) s |
1 day | \(0.86 \times 10^{5}\) s |
1 week | \(0.60 \times 10^{6}\) s |
1 year | \(3.2 \times 10^{7}\) s |
1 decade | \(3.2 \times 10^{8}\) s |
1 century | \(3.2 \times 10^{9}\) s |
Light Travel:
Event | Time |
---|---|
Light travels 300 meters | 1 microsecond |
Light travels 2 miles | 10 microseconds |
Light travels around the Earth | 134 milliseconds |
Light travels from the Moon to Earth | 1.28 seconds |
Light travels from the Sun to Earth | 8.3 minutes |
Light travels across our galaxy | 100,000 years |
Light travels from near galaxies | 2.5 million years |
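The shorter entries in this table follow from dividing distance by the speed of light. A minimal Python sketch, assuming propagation in a vacuum; the function name `light_travel_time` and the Earth circumference of about 40,075 km are illustrative choices, not part of the original table:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def light_travel_time(distance_m: float) -> float:
    """Return the time in seconds for light to cover distance_m in a vacuum."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S


# 300 metres takes roughly one microsecond.
print(f"{light_travel_time(300) * 1e6:.2f} us")

# Once around the Earth (assumed circumference ~40,075 km) takes roughly 134 ms.
print(f"{light_travel_time(40_075_000) * 1e3:.0f} ms")
```

Real networks are slower than this bound: light in optical fibre travels at roughly two thirds of its vacuum speed, and routing adds more.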
Energy Units:
Energy Unit | Equivalent in Joules (J) |
---|---|
1 electronvolt (eV) | \(10^{-19}\) J |
1 kiloelectronvolt (keV) | \(10^{-16}\) J |
1 megaelectronvolt (MeV) | \(10^{-13}\) J |
1 gigaelectronvolt (GeV) | \(10^{-10}\) J |
1 teraelectronvolt (TeV) | \(10^{-7}\) J |
1 Joule (J) | \(10^{0}\) J |
1 kilocalorie (dietary “calorie”) | \(10^{4}\) J |
1 kilowatt-hour (kW·h) | \(10^{7}\) J |
1 ton of TNT | \(10^{10}\) J |
1 megaton of TNT | \(10^{16}\) J |
1 kilogram of mass-energy (mc^2) | \(10^{17}\) J |
Energy Comparisons:
Event | Approximate Energy in Joules (J) |
---|---|
The energy of a visible photon | \(10^{-18}\) J |
The energy to metabolize a glucose molecule | \(10^{-17}\) J |
The energy of an electron | \(10^{-14}\) J |
The energy to boil a nucleus | \(10^{-9}\) J |
The energy of a person descending 1 meter | \(10^{3}\) J |
The energy of a 40W bulb for 1 hour | \(10^{5}\) J |
The energy of a lightning bolt | \(10^{10}\) J |
The energy of the Krakatoa volcano explosion | \(10^{18}\) J |
The energy of the Lake Toba (Sumatra) volcano explosion | \(10^{20}\) J |
The energy of the Earth’s rotational energy | \(10^{29}\) J |
The energy of the Earth’s heat content | \(10^{31}\) J |
The energy of the Sun’s rotational energy | \(10^{35}\) J |
The energy of the Earth’s mass-energy (mc^2) | \(10^{42}\) J |
Energy kJoules:
Description | Calculation |
---|---|
Amount of energy consumed during the execution of the benchmark | Energy (kJ) = Power (W) * Time (s) / 1000 |
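As a quick sanity check of the formula, here is a small Python sketch. The function name `energy_kj` and the example figures (250 W for 10 minutes) are hypothetical, chosen only to show the unit conversion:

```python
def energy_kj(power_watts: float, time_seconds: float) -> float:
    """Energy in kilojoules for a constant (or average) power draw."""
    return power_watts * time_seconds / 1000


# Hypothetical benchmark run: 250 W average draw for 10 minutes (600 s).
print(energy_kj(250, 600))  # 150.0
```

Using the average power here gives the total energy; the maximum power only bounds it from above.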
Power Consumption:
Description | Measurement |
---|---|
Maximum power consumed during the execution of the system | Measured in Watts (W) |
Average power consumed during the execution of the system | Measured in Watts (W) |
Power Frequency Mean:
Description | Range |
---|---|
Ensure power frequencies are within the mean range | 50Hz - 60Hz ± 1% |
Voltage Frequency Mean:
Description | Range |
---|---|
Ensure the mean voltage is within range of the nominal value | 100V, 110V, 120V, 208V, 220V, 230V or 400V ± 5% |
Kaya Identity:
Variable | Description |
---|---|
F | Global CO2 emissions from human sources |
P | Global population |
G | World GDP |
g | Global per-capita GDP (G/P) |
E | Global primary energy consumption |
e | Energy intensity of world GDP (E/G) |
f | Carbon intensity of energy (F/E) |
The Kaya Identity can be expressed as:
F = P * g * e * f
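Substituting the definitions from the table shows why this is an identity rather than a model: the intermediate terms cancel, leaving \(F\) on both sides.

\[F = P \times \frac{G}{P} \times \frac{E}{G} \times \frac{F}{E} = F\]

Its usefulness is in the decomposition: a change in emissions can be attributed to a change in population, wealth per person, energy intensity or carbon intensity.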
Performance
File Sizes:
Power of 2 | Approximate Value | Full Name | Short Name |
---|---|---|---|
10 | 1 Thousand | 1 Kilobyte | 1 KB |
20 | 1 Million | 1 Megabyte | 1 MB |
30 | 1 Billion | 1 Gigabyte | 1 GB |
40 | 1 Trillion | 1 Terabyte | 1 TB |
50 | 1 Quadrillion | 1 Petabyte | 1 PB |
Latency Calculations:
Unit | Equivalent in Seconds (s) |
---|---|
1 nanosecond (ns) | \(10^{-9}\) s |
1 microsecond (µs) | \(10^{-6}\) s = 1,000 ns |
1 millisecond (ms) | \(10^{-3}\) s = 1,000 µs = 1,000,000 ns |
Powers of Two Scaling vs Bytes:
Power of 2 | Exact Value | Approx Value | Bytes |
---|---|---|---|
7 | 128 | | |
8 | 256 | | |
10 | 1,024 | 1 thousand | 1 KB |
16 | 65,536 | | 64 KB |
20 | 1,048,576 | 1 million | 1 MB |
30 | 1,073,741,824 | 1 billion | 1 GB |
32 | 4,294,967,296 | | 4 GB |
Common Data Type Sizes:
Data Type | Size (Bytes) |
---|---|
int | 4 |
float (double precision) | 8 |
boolean | 4 |
UTF-8 character | 1 |
UTF-8 in Chinese | 3 |
UNIX timestamp | 4 |
Latency for Sequential Data Fetch:
Formula | Description |
---|---|
latency = latency_resource_1 + latency_resource_2 | Latency for sequential data fetch |
Latency for Parallel Data Fetch:
Formula | Description |
---|---|
latency = max(latency_resource_1, latency_resource_2) | Latency for parallel data fetch |
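The two formulas can be sketched in Python as follows. The function names and the example latencies (a 2 ms cache lookup and a 30 ms database query) are hypothetical:

```python
def sequential_latency(*latencies_ms: float) -> float:
    """Fetches issued one after another: the latencies add up."""
    return sum(latencies_ms)


def parallel_latency(*latencies_ms: float) -> float:
    """Fetches issued concurrently: the slowest one dominates."""
    return max(latencies_ms)


# Hypothetical resources: a cache lookup (2 ms) and a database query (30 ms).
print(sequential_latency(2, 30))  # 32
print(parallel_latency(2, 30))    # 30
```

Both functions accept any number of resources, so the same reasoning extends beyond two fetches.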
Query per Second (Resource Fetch):
Formula | Description |
---|---|
QPS = number of CPU cores / average time for a request in seconds | Query per second (assuming peak traffic) |
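A minimal sketch of this estimate, assuming each core handles one request at a time and requests are CPU-bound. The function name `max_qps` and the example figures are hypothetical:

```python
def max_qps(cpu_cores: int, avg_request_seconds: float) -> float:
    """Rough upper bound on queries per second for CPU-bound requests."""
    if avg_request_seconds <= 0:
        raise ValueError("avg_request_seconds must be positive")
    return cpu_cores / avg_request_seconds


# Hypothetical server: 8 cores, 250 ms of CPU work per request.
print(max_qps(8, 0.25))  # 32.0
```

If requests spend most of their time waiting on I/O, a core can interleave several of them and the real ceiling may be higher than this.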
Scale Calculations:
Transactions Per Day | Transactions Per Second |
---|---|
1 million | ~12 |
5 million | ~60 |
30 million | ~360 |
100 million | ~1200 |
Note: If a user generates 50 API calls during their session, then we must support ~600 transactions per second for 1 million users per day.
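The table values come from dividing by the number of seconds in a day and rounding up generously. A short Python sketch (the function name `average_tps` is illustrative) shows the unrounded figures:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_tps(transactions_per_day: float) -> float:
    """Average transactions per second, assuming traffic is spread evenly."""
    return transactions_per_day / SECONDS_PER_DAY


for per_day in (1_000_000, 5_000_000, 30_000_000, 100_000_000):
    print(f"{per_day:>11,} per day -> {average_tps(per_day):7.1f} per second")

# 1 million users per day, each generating 50 API calls:
print(round(average_tps(1_000_000 * 50)))  # 579
```

The exact results sit slightly below the table because the table rounds 11.6 up to 12 and scales from there, which is a reasonable safety margin for capacity planning.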
Peak Times Calculations:
Transactions Per Day | Peak Transactions Per Second (10% Peak for 1 Hour) | Peak Transactions Per Second (30% Peak for 1 Hour) |
---|---|---|
1 million | ~30 | ~90 |
10 million | ~300 | ~900 |
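The peak columns assume that a fixed share of the day's traffic arrives within a single hour. A minimal Python sketch, with the function name `peak_tps` and its parameters chosen for illustration:

```python
def peak_tps(transactions_per_day: float, peak_fraction: float,
             peak_window_seconds: float = 3600) -> float:
    """Transactions per second if peak_fraction of daily traffic lands in one window."""
    return transactions_per_day * peak_fraction / peak_window_seconds


print(round(peak_tps(1_000_000, 0.10)))  # 28
print(round(peak_tps(1_000_000, 0.30)))  # 83
```

The table rounds these up to ~30 and ~90, and the 10 million row is simply ten times larger.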
Data Type Sizes:
Data Type | Size (Bytes) |
---|---|
Int32 | 4 |
Int64 | 8 |
Float | 4 or 8 |
JavaScript boolean | 4 |
UTF-8 Char in English | 1 |
UTF-8 Char in other languages | 1-4 |
Note: For languages like Chinese, consider UTF-8 characters as 3 bytes.
System Utilization and Queue Time Calculation:
Average Cycle Time | System Utilization | System Cycle Time Variability | Arrival Rate Variability | Queue Time (avg) | Cycle Time (avg) | Total Time | Process Efficiency |
---|---|---|---|---|---|---|---|
5 (minutes/hours/days) | 80% | 0.5 | 0.5 | 5 | 5 | 10 | 50% |
Impact of Variability on Utilization and Process Efficiency:
Variability Level | Cycle Time | Utilization | Cycle Time Variability | Arrival Rate Variability | Queue Time | Process Efficiency |
---|---|---|---|---|---|---|
Zero Variability | 5 | 99% | 0 | 0 | 0 | 100% |
Moderate Variability | 5 | 99% | 0.2 | 0.2 | 20 | 20% |
Higher Variability | 5 | 99% | 0.5 | 0.5 | 124 | 4% |
Note: The process efficiency is calculated as the ratio of hands-on time to total time. In software development, it typically ranges from 5-15%, indicating a high impact of excessive utilization.
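The queue times in both tables are consistent with Kingman's approximation for a single-server queue: queue time ≈ utilization / (1 − utilization) × (arrival variability² + cycle time variability²) / 2 × average cycle time. The formula is not named above, so treat its use here as an assumption that happens to reproduce the table values. A Python sketch, with illustrative function names:

```python
def kingman_queue_time(cycle_time: float, utilization: float,
                       arrival_cv: float, cycle_cv: float) -> float:
    """Approximate average queue time for a single-server queue (Kingman)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    variability = (arrival_cv ** 2 + cycle_cv ** 2) / 2
    return utilization / (1 - utilization) * variability * cycle_time


def process_efficiency(cycle_time: float, queue_time: float) -> float:
    """Share of total time spent on hands-on work rather than waiting."""
    return cycle_time / (cycle_time + queue_time)


# The four scenarios from the two tables: (utilization, variability).
for utilization, cv in ((0.80, 0.5), (0.99, 0.0), (0.99, 0.2), (0.99, 0.5)):
    queue = kingman_queue_time(5, utilization, cv, cv)
    print(f"u={utilization:.0%} cv={cv}: queue={queue:.1f}, "
          f"efficiency={process_efficiency(5, queue):.0%}")
```

The utilization term grows without bound as utilization approaches 100%, which is why even moderate variability is so costly at 99%.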
File sizes:
Item | Size |
---|---|
Storing 1 GiB/month on the cloud | $0.02 US |
Web site of my Twitter profile (@lemire), HTML alone | 296 KiB |
Web site of my Twitter profile (@lemire), all data | 296 KiB |
Google result for ‘Canada’, HTML alone | 3.9 MiB |
Google result for ‘Canada’, all data | 848 KiB |
Node JS runtime | 3.7 MiB |
Size of the Java (19) runtime | 164 MiB |
LLVM/clang compiler+runtime | 330 MiB |
Boost (C++) library (source) | 5.5 GiB |
Go runtime | 609 MiB |
Potential Savings from Config Tuning:
If the cost to run a service per year is represented by \(X\) (in millions), then the potential savings \(S\) from config tuning can be calculated as:
\[S = X \times \text{Increase Factor} - X\]
where the Increase Factor is between 1.5 and 2 (representing the 1.5x to 2x increase in capacity).
Impact of CFS Period Tuning:
If the original CFS period is represented by \(O\) (default is 100ms) and the new CFS period is represented by \(P\), then the reduction in worst-case throttling time \(R\) can be calculated as:
\[R = \frac{O - P}{O} \times 100\%\]
Impact of CPU Pinning and Isolation:
If the number of threads an application currently has is represented by \(T\) and the number of CPUs requested by the application is represented by \(C\), then the reduction in parallel threads \(D\) can be calculated as:
\[D = T - C\]
CPU Time for Single Core Execution:
If \(T\) represents the total CPU time for executing 2414 tool executions on a single core, then:
\[T = 16 \text{ hours}\]
CPU Time for Multi-Core Execution on a Single CPU:
If \(T_n\) represents the total CPU time for executing 2414 tool executions on \(n\) cores of a single CPU, then:
\[T_n = T + 0.5(n - 1) \text{ hours}\]
This formula assumes a linear increase in CPU time as the number of cores increases, as stated in the text.
CPU Time for Multi-Core Execution on Two CPUs:
If \(T_{2n}\) represents the total CPU time for executing 2414 tool executions on \(n\) cores of each of the two CPUs (i.e., \(2n\) cores in total), then:
\[T_{2n} = T + 0.75(n - 1) \text{ hours}\]
This formula assumes that using \(n\) cores on each of the two CPUs is slower than using \(n\) cores on only one CPU, with the maximum difference occurring for eight cores per CPU.
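The single-CPU and two-CPU formulas can be compared side by side with a short Python sketch. The function names are illustrative, and the constants 16, 0.5 and 0.75 are taken directly from the formulas above:

```python
SINGLE_CORE_CPU_HOURS = 16.0  # T, the single-core total given above


def cpu_time_one_cpu(n: int, base_hours: float = SINGLE_CORE_CPU_HOURS) -> float:
    """T_n = T + 0.5 (n - 1): total CPU time using n cores of one CPU."""
    return base_hours + 0.5 * (n - 1)


def cpu_time_two_cpus(n: int, base_hours: float = SINGLE_CORE_CPU_HOURS) -> float:
    """T_2n = T + 0.75 (n - 1): total CPU time using n cores on each of two CPUs."""
    return base_hours + 0.75 * (n - 1)


for n in (1, 2, 4, 8):
    print(f"n={n}: one CPU {cpu_time_one_cpu(n):.2f} h, "
          f"two CPUs {cpu_time_two_cpus(n):.2f} h")
```

Note that these are totals of CPU time summed over cores, not wall-clock time, so a growing total can still mean a faster overall run.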
Memory Bandwidth Advantage:
CPU_time = Memory_size / CPU_bandwidth
GPU_time = Memory_size / GPU_bandwidth
Latency Hiding with Thread Parallelism:
GPU_time = Max(Latency, Memory_size / (GPU_bandwidth * Number_of_threads))
Register Memory Advantage:
CPU_time = Register_memory_size / CPU_register_bandwidth
GPU_time = Register_memory_size / GPU_register_bandwidth
Matrix Multiplication Speed:
GPU_time = Matrix_size / GPU_bandwidth
NOTE: Actual CPU time will depend on various factors including the specific tool being executed, the workload, and the behavior of the CPUs/GPUs.
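The bandwidth formulas above translate directly into code. In this Python sketch the function names and all bandwidth figures are hypothetical, chosen only to exercise the formulas rather than to describe any particular hardware:

```python
def transfer_time(memory_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time in seconds to stream memory_bytes at the given bandwidth."""
    return memory_bytes / bandwidth_bytes_per_s


def gpu_time_with_latency_hiding(memory_bytes: float, bandwidth_bytes_per_s: float,
                                 threads: int, latency_s: float) -> float:
    """Max(Latency, Memory_size / (GPU_bandwidth * Number_of_threads)), as written above."""
    return max(latency_s, memory_bytes / (bandwidth_bytes_per_s * threads))


GIB = 2 ** 30
cpu_time = transfer_time(1 * GIB, 50e9)   # assumed 50 GB/s CPU memory bandwidth
gpu_time = transfer_time(1 * GIB, 750e9)  # assumed 750 GB/s GPU memory bandwidth
print(f"CPU: {cpu_time * 1e3:.1f} ms, GPU: {gpu_time * 1e3:.1f} ms")
```

The latency-hiding function returns the latency itself once the bandwidth term drops below it, which is the point where adding more threads stops helping.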
AI LLMs
TPU vs GPU Performance: A TPU is approximately 32% to 54% faster than a GPU for training BERT-like models. This can be represented as:
TPU_speed = GPU_speed * (1 + 0.32 to 0.54)
Impact of Data Type Size: Data type size has a significant impact on performance. For instance, TPUs would be approximately 5.3x faster when using 32-bit values compared to 16-bit values. This can be represented as:
TPU_speed_32_bit = TPU_speed_16_bit * 5.3
Potential Speedup with 8-bit Computing: If 8-bit computing can be made to work for general models, it could lead to significant speedups for transformers. For instance, GPUs could be 3.0x faster than TPUs with 8-bit computation. This can be represented as:
GPU_speed_8_bit = TPU_speed * 3.0
Potential Cost Savings:
Calculation | Formula |
---|---|
Cost Savings with Concise Prompts | Cost Savings = Original Cost - (Original Cost * Conciseness Factor) |
Tokens to Words Ratio | Number of Tokens = Number of Words * Tokens per Word Ratio |
Cost Ratio of Different Models | Cost of Operation = Cost of Model * Number of Operations |
GPU Memory Requirements | GPU Memory Requirement (bytes) = 2 * Number of Parameters |
GPU Memory Requirement for Output | GPU Memory Requirement for Output = Number of Tokens * Memory per Token |
Throughput Improvement from Batching | Throughput = Number of Queries / Total Time |
Where:
- Conciseness Factor is between 0.4 and 0.9.
- Tokens per Word Ratio is 1.3.
- Memory per Token is 1MB.
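A few of these rules of thumb can be sketched in Python. The function names and the example model (7 billion parameters, 512 output tokens) are hypothetical, and the "2 * Number of Parameters" rule is read here as bytes, i.e. 16-bit weights, which is an assumption:

```python
TOKENS_PER_WORD = 1.3         # ratio given above
BYTES_PER_PARAMETER = 2       # assumes the 2x rule means 2 bytes per parameter
MEMORY_PER_TOKEN_BYTES = 1e6  # "1MB" per output token, as given above


def tokens_for_words(words: int) -> int:
    """Estimated token count for a text of the given word count."""
    return round(words * TOKENS_PER_WORD)


def model_memory_bytes(parameters: float) -> float:
    """Estimated GPU memory needed to hold the model weights."""
    return BYTES_PER_PARAMETER * parameters


def output_memory_bytes(tokens: int) -> float:
    """Estimated GPU memory needed for the generated output."""
    return tokens * MEMORY_PER_TOKEN_BYTES


# Hypothetical 7-billion-parameter model generating 512 output tokens.
print(tokens_for_words(1000))                # 1300
print(model_memory_bytes(7e9) / 1e9, "GB")   # 14.0 GB
print(output_memory_bytes(512) / 1e6, "MB")  # 512.0 MB
```

These are planning estimates only: the per-token figure in particular depends on the model size and architecture.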
Sources:
Back of Envelope Calcs
LLM Numbers
Physical Constants
Mathematical Constants
Physics Notations