The Fun and Benefits of Optimizing JavaScript

An interactive high-quality performance optimization blog post recommended by antfu, the author has 20 years of experience, with really many details. I tried to translate it with a bit of my own polish, and strongly recommend reading the original English version for the interactive experience; here I have replaced it with screenshots for now.
This is my first translation, and any shortcomings are welcome for correction!
Link to this translated blog: https://ysx.cosine.ren/optimizing-javascript-translate
Original link: https://romgrk.com/posts/optimizing-javascript
Original author: romgrk

I often feel that regular JavaScript code runs much slower than before, and the reason is simple: it has not been properly optimized. Below is a summary of common optimization techniques I have discovered. It is important to note that the trade-off between performance and readability often favors readability, so the question of when to choose performance and when to choose readability is left for the reader to solve. I also want to say that discussing optimization inevitably requires discussing benchmarking. If a function's initial run time only accounts for a small portion of the actual total run time, then spending hours on micro-optimizing that function to make it run 100 times faster is meaningless. If you want to optimize, the most important first step is benchmarking. I will introduce this topic later. Also, note that microbenchmarks often have flaws, and what is presented here may include them. I have tried to avoid these pitfalls, but do not blindly apply any points presented here without benchmarking.

I have provided runnable examples for all possible scenarios. They default to showing the results I obtained on my machine (Brave 122 on Arch Linux), but you can run them yourself. Although I don't want to say this, Firefox has fallen a bit behind in the optimization game and currently accounts for only a small portion of traffic, so I do not recommend using results from Firefox as useful metrics.

0. Avoid Unnecessary Work#

This may sound obvious, but it needs to be here because there cannot be another first step in optimization: If you are trying to optimize, you should first consider avoiding unnecessary work. This includes concepts such as memoization, laziness, and incremental computation. Specific applications will vary depending on the context. For example, in React, this means applying memo(), useMemo(), and other applicable primitives.

1. Avoid String Comparisons#

JavaScript easily hides the true cost of string comparisons. If you need to compare strings in C, you would use the strcmp(a, b) function. JavaScript uses === for comparison, so you don't see strcmp. But it exists, and strcmp usually (but not always) requires comparing each character in one string with the characters in another; the time complexity of string comparison is O(n). A common JavaScript pattern to avoid is using strings as enums (strings-as-enums). However, with the advent of TypeScript, this should be easily avoided, as enum types default to integers.

// No
enum Position {
  TOP    = 'TOP',
  BOTTOM = 'BOTTOM',
}

// Yeppers
enum Position {
  TOP,    // = 0
  BOTTOM, // = 1
}

Here is a cost comparison:

// 1. string compare
const Position = {
  TOP: 'TOP',
  BOTTOM: 'BOTTOM',
}
 
let _ = 0
for (let i = 0; i < 1000000; i++) {
  let current = i % 2 === 0 ?
    Position.TOP : Position.BOTTOM
  if (current === Position.TOP)
    _ += 1
}

// 2. int compare
const Position = {
  TOP: 0,
  BOTTOM: 1,
}
 
let _ = 0
for (let i = 0; i < 1000000; i++) {
  let current = i % 2 === 0 ?
    Position.TOP : Position.BOTTOM
  if (current === Position.TOP)
    _ += 1
}

Comparison Results

2. Avoid Different Shapes#

JavaScript engines try to optimize code by assuming that objects have specific shapes and that functions will receive objects of the same shape. This allows them to store keys for all objects of that shape at once and store values in a separate flat array. Represented in JavaScript code, it looks like this:

// Objects accepted by the engine
const objects = [
  {
    name: 'Anthony',
    age: 36,
  },
  {
    name: 'Eckhart',
    age: 42
  },
]

// Optimized internal storage structure looks similar
const shape = [
  { name: 'name', type: 'string' },
  { name: 'age',  type: 'integer' },
]
 
const objects = [
  ['Anthony', 36],
  ['Eckhart', 42],
]

Note

I used the term "shape" to describe this concept, but note that you may also find "hidden class" or "map" used to describe it, depending on the engine.

For example, at runtime, if the function below receives two objects with the shape { x: number, y: number }, the engine will infer that future objects will have the same shape and generate machine code optimized for that shape.

function add(a, b) {
  return {
    x: a.x + b.x,
    y: a.y + b.y,
  }
}

If the objects we pass are not of shape { x, y } but of shape { y, x }, then the engine will need to retract its inference, and the function will suddenly become quite slow. My explanation here will be reserved because if you want to know more details, you should read mraleph's excellent article. But I want to emphasize that the V8 engine has three modes for access: monomorphic (1 shape), polymorphic (2-4 shapes), and megamorphic (5 or more shapes). Assume you really want to keep it monomorphic because the performance drop is quite severe:

Translator's Note:
To maintain code performance in monomorphic mode, you need to ensure that the objects passed to the function maintain the same shape. In the development of JavaScript and TypeScript, this means that when you define and manipulate objects, you need to ensure that the order of property addition is consistent, avoiding arbitrary addition or deletion of properties. This can help the V8 engine optimize object property access.
To improve performance, try to avoid changing the shape of objects at runtime. This involves avoiding adding or deleting properties or creating objects with the same properties in different orders. By maintaining consistency in object properties, you can help the JavaScript engine maintain efficient access to objects, thereby avoiding performance degradation due to shape changes.

// setup
let _ = 0

// 1. monomorphic
const o1 = { a: 1, b: _, c: _, d: _, e: _ }
const o2 = { a: 1, b: _, c: _, d: _, e: _ }
const o3 = { a: 1, b: _, c: _, d: _, e: _ }
const o4 = { a: 1, b: _, c: _, d: _, e: _ }
const o5 = { a: 1, b: _, c: _, d: _, e: _ } // all shapes are equal

// 2. polymorphic
const o1 = { a: 1, b: _, c: _, d: _, e: _ }
const o2 = { a: 1, b: _, c: _, d: _, e: _ }
const o3 = { a: 1, b: _, c: _, d: _, e: _ }
const o4 = { a: 1, b: _, c: _, d: _, e: _ }
const o5 = { b: _, a: 1, c: _, d: _, e: _ } // this shape is different

// 3. megamorphic
const o1 = { a: 1, b: _, c: _, d: _, e: _ }
const o2 = { b: _, a: 1, c: _, d: _, e: _ }
const o3 = { b: _, c: _, a: 1, d: _, e: _ }
const o4 = { b: _, c: _, d: _, a: 1, e: _ }
const o5 = { b: _, c: _, d: _, e: _, a: 1 } // all shapes are different

// test case
function add(a1, b1) {
  return a1.a + a1.b + a1.c + a1.d + a1.e +
         b1.a + b1.b + b1.c + b1.d + b1.e }
 
let result = 0
for (let i = 0; i < 1000000; i++) {
  result += add(o1, o2)
  result += add(o3, o4)
  result += add(o4, o5)
}

Comparison Results

So what should I do?

Easier said than done: create all objects with exactly the same shape. Even trivial things like writing React component props in a different order can trigger this situation.

For example, here are some simple cases I found in React's codebase, but they had a greater impact years ago on the same issue because they initialized an object with integers and then stored a floating point number. Yes, changing types also changes shapes. Yes, integer and floating point types are hidden behind number. Deal with it.

Note

The engine can often encode integers as values. For example, V8 represents 32-bit values, integers as compact Smi (SMall) values, while floating point numbers and large integers are passed as pointers, just like strings and objects. JSC uses 64-bit encoding, double-tagged, passing all numbers by value, just like SpiderMonkey, with the rest passed as pointers.

Translator's Note:
This encoding method allows JavaScript engines to efficiently handle various types of numbers without sacrificing performance. The use of Smi for small integers reduces the need to allocate heap memory for them, improving the efficiency of operations. For larger numbers, while they need to be accessed via pointers, this method still ensures that JavaScript maintains flexibility and performance when dealing with various numeric types.
This design also reflects the wisdom of JavaScript engine authors in balancing data representation and performance optimization, especially in dynamic typed languages, where this balance is particularly important. By doing so, the engine can minimize memory usage and maximize computation speed when executing JavaScript code.

3. Avoid Using Array/Object Methods#

Like others, I enjoy functional programming, but unless you are working in Haskell / OCaml / Rust, functional code is always slower than imperative code unless it is compiled into efficient machine code.

const result =
  [1.5, 3.5, 5.0]
    .map(n => Math.round(n))
    .filter(n => n % 2 === 0)
    .reduce((a, n) => a + n, 0)

The problem with these methods is:

They require making a complete copy of the array, which later needs to be released by the garbage collector. We will explore memory I/O issues in more detail in Section 5.
They loop N times for N operations, while a for loop only allows looping once.

// setup:
const numbers = Array.from({ length: 10_000 }).map(() => Math.random())

// 1. functional
const result =
  numbers
    .map(n => Math.round(n * 10))
    .filter(n => n % 2 === 0)
    .reduce((a, n) => a + n, 0)

// 2. imperative
let result = 0
for (let i = 0; i < numbers.length; i++) {
  let n = Math.round(numbers[i] * 10)
  if (n % 2 !== 0) continue
  result = result + n
}

Comparison Results

Methods like Object.values(), Object.keys(), and Object.entries() also encounter similar issues, as they allocate more data, and memory access is the root of all performance issues. I swear I will show you this in Section 5.

4. Avoid Indirect Access#

Another place to look for optimization gains is any indirect access; I can see three main sources:

const point = { x: 10, y: 20 }
 
// 1. Proxy objects are harder to optimize because their get/set functions may be running custom logic, so the engine cannot make its usual assumptions.
const proxy = new Proxy(point, { get: (t, k) => { return t[k] } })
// Some engines can eliminate the cost of proxies, but these optimizations are costly and easily broken.
const x = proxy.x
 
// 2. Often overlooked, but accessing objects via `.` or `[]` is also an indirect access. In simple cases, the engine is likely to optimize the cost:
const x = point.x
// But each additional access increases the cost and makes it harder for the engine to make assumptions about the state of "point":
const x = this.state.circle.center.point.x
 
// 3. Finally, function calls can also incur costs. The engine is generally good at inlining these functions:
function getX(p) { return p.x }
const x = getX(p)
// But this cannot be guaranteed. Especially if the function call does not come from a static function but from parameters, etc.:
function Component({ point, getX }) {
  return getX(point)
}

Proxy benchmarking is currently particularly brutal on V8. The last time I checked, proxy objects always returned from JIT to the interpreter, and from these results, it may still be the case.

// 1. Proxy access
const point = new Proxy({ x: 10, y: 20 }, { get: (t, k) => t[k] })
 
for (let _ = 0, i = 0; i < 100_000; i++) { _ += point.x }


// 2. Direct access
const point = { x: 10, y: 20 }
const x = point.x
 
for (let _ = 0, i = 0; i < 100_000; i++) { _ += x }

Comparison Results

I also wanted to show the comparison between accessing deeply nested objects and direct access, but the engine is good at optimizing object access through escape analysis when there are hot loops and constant objects here. To prevent this, I inserted some indirect methods.

Translator's Note:
A "hot loop" refers to a loop that runs frequently during program execution, i.e., a part of the code that is executed repeatedly in large quantities. Because this part of the code is executed so many times, it has a significant impact on the program's performance, making it a key point for performance optimization. This will appear later.

// 1. Nested access
const a = { state: { center: { point: { x: 10, y: 20 } } } }
const b = { state: { center: { point: { x: 10, y: 20 } } } }
const get = (i) => i % 2 ? a : b
 
let result = 0
for (let i = 0; i < 100_000; i++) {
  result = result + get(i).state.center.point.x }


// 2. Direct access
const a = { x: 10, y: 20 }.x
const b = { x: 10, y: 20 }.x
const get = (i) => i % 2 ? a : b
 
let result = 0
for (let i = 0; i < 100_000; i++) {
  result = result + get(i) }

Comparison Results

5. Avoid Cache Misses#

This point requires a bit of low-level knowledge, but it also has an impact even in JavaScript, so I will explain it. From the CPU's perspective, fetching memory from RAM is slow. To speed it up, it primarily uses two optimization methods.

5.1 Prefetching#

The first is prefetching: it fetches more memory in advance, hoping that this memory is of interest to you. It always guesses that if you request a memory address, you will be interested in the memory area that follows it. Therefore, accessing data in order is key. In the example below, we can observe the impact of accessing memory in random order.

// setup:
const K = 1024
const length = 1 * K * K
 
// These points are created one after another, so they are allocated in memory in order.
const points = new Array(length)
for (let i = 0; i < points.length; i++) {
  points[i] = { x: 42, y: 0 }
}
 
// This array contains the same data as above, but shuffled randomly.
const shuffledPoints = shuffle(points.slice())


// 1. Sequential
let _ = 0
for (let i = 0; i < points.length; i++) { _ += points[i].x }


// 2. Random
let _ = 0
for (let i = 0; i < shuffledPoints.length; i++) { _ += shuffledPoints[i].x }

Comparison of Objects

So what should I do?

Applying this concept in practice may be the most difficult, as JavaScript does not provide a way to specify the location of objects in memory, but you can leverage this knowledge to your advantage, such as manipulating data before reordering or sorting. You cannot assume that objects created in order will remain in the same location over time, as the garbage collector may move them. But there is one exception: numeric arrays, preferably TypedArray instances.

For more detailed examples, see this link *.

Note that it contains some optimizations that are now outdated, but overall it is still accurate.

Translator's Note:
JavaScript's TypedArray instances provide an efficient way to handle binary data. Compared to regular arrays, TypedArray not only uses memory more efficiently, but they access contiguous memory regions, allowing for better performance during data operations. This is closely related to the concept of prefetching, as handling contiguous memory blocks is usually faster than randomly accessing memory, especially when dealing with large amounts of data.
For example, if you are processing large numerical data, such as image processing or scientific computing, using TypedArray like Float32Array or Int32Array can yield better performance. Since these arrays operate directly on memory, they can read and write data faster, especially during sequential access. This not only reduces the overhead of the JavaScript runtime but also leverages modern CPU's prefetching and caching mechanisms to accelerate data processing.

5.2 Caches at L1/2/3#

The second optimization method used by CPUs is L1/L2/L3 caches: these caches act like faster RAMs but are also more expensive, so their capacity is much smaller. They contain data from RAM but act as LRU caches. Data enters when it is "hot" (being processed) and is written back to main RAM when new working data needs space. Therefore, the key here is to use as little data as possible, keeping the working data set in fast caches. In the example below, we can observe the effect of breaking each contiguous cache.

// setup:
const KB = 1024
const MB = 1024 * KB
 
// These are approximate sizes that fit these caches. If you don't get the same results on your computer, it may be because your sizes are different.
const L1  = 256 * KB
const L2  =   5 * MB
const L3  =  18 * MB
const RAM =  32 * MB
 
// We will access the same buffer for all test cases, but we will only access the first 0 to "L1" entries in the first case, 0 to "L2" entries in the second case, and so on.
const buffer = new Int8Array(RAM)
buffer.fill(42)
 
const random = (max) => Math.floor(Math.random() * max)

// 1. L1
let r = 0; for (let i = 0; i < 100000; i++) { r += buffer[random(L1)] }

// 2. L2
let r = 0; for (let i = 0; i < 100000; i++) { r += buffer[random(L2)] }

// 3. L3
let r = 0; for (let i = 0; i < 100000; i++) { r += buffer[random(L3)] }

// 4. RAM
let r = 0; for (let i = 0; i < 100000; i++) { r += buffer[random(RAM)] }

Comparison Results

So what should I do?

Ruthlessly eliminate every data or memory allocation that can be eliminated. The smaller the dataset, the faster the program runs. Memory I/O is the bottleneck for 95% of programs. Another good strategy is to break your work into chunks and ensure you process one small dataset at a time.

For more details on CPUs and memory, see this link.

Note

Regarding immutable data structures — immutability is great for clarity and correctness, but in terms of performance, updating an immutable data structure means copying containers, which leads to more memory I/O and cache flushing. You should avoid immutable data structures whenever possible.
Regarding the ... spread operator — it is very convenient, but using it creates a new object in memory each time. More memory I/O, slower cache!

6. Avoid Large Objects#

As mentioned in Section 2, engines use shapes to optimize objects. However, when shapes become too large, the engine has no choice but to use regular hash tables (like Map objects). As we saw in Section 5, cache misses can significantly degrade performance. Hash tables are prone to this because their data is often randomly and uniformly distributed across the memory regions they occupy. Let's see how this user mapping indexed by user ID performs.

// setup:
const USERS_LENGTH = 1_000

// setup:
const byId = {}
Array.from({ length: USERS_LENGTH }).forEach((_, id) => {
  byId[id] = { id, name: 'John'}
})
let _ = 0

// 1. [] access
Object.keys(byId).forEach(id => { _ += byId[id].id })

// 2. direct access
Object.values(byId).forEach(user => { _ += user.id })

Comparison Results

We can also observe how performance declines as the size of the object increases:

// setup:
const USERS_LENGTH = 100_000

Comparison Results

So what should I do?

As mentioned above, avoid frequently indexing large objects. It is better to convert objects to arrays in advance. Organizing IDs on the model can help you structure your data, as you can use Object.values() without having to reference key mappings to get IDs.

7. Use eval#

Some JavaScript patterns are difficult to optimize for the engine, and by using eval() or its derivatives, you can make these patterns disappear. In this example, we can observe how using eval() avoids the cost of creating objects with dynamic keys:

// setup:
const key = 'requestId'
const values = Array.from({ length: 100_000 }).fill(42)
// 1. without eval
function createMessages(key, values) {
  const messages = []
  for (let i = 0; i < values.length; i++) {
    messages.push({ [key]: values[i] })
  }
  return messages
}
 
createMessages(key, values)


// 2. with eval
function createMessages(key, values) {
  const messages = []
  const createMessage = new Function('value',
    `return { ${JSON.stringify(key)}: value }`
  )
  for (let i = 0; i < values.length; i++) {
    messages.push(createMessage(values[i]))
  }
  return messages
}
 
createMessages(key, values)

Comparison Results

Translator's Note:
In the version using eval (implemented via the Function constructor), the example first creates a function that dynamically generates an object with a dynamic key and given value. Then, this function is called in a loop to create the messages array. This method reduces runtime computation and object creation overhead by pre-compiling the function that generates objects.
While using eval and the Function constructor can avoid certain runtime overheads, they also introduce potential security risks since they can execute arbitrary code. Additionally, dynamically evaluated code may be difficult to debug and optimize, as it is generated at runtime, and the JavaScript engine may not be able to perform effective optimizations in advance.
Therefore, while this example demonstrates a performance optimization technique, it is recommended in practical applications to seek other methods to optimize performance while maintaining code safety and maintainability. In most cases, avoiding the need for extensive dynamic object creation and improving performance through static analysis and code refactoring would be a better choice.

Another great use case for eval is compiling a filtering predicate function, in which you can discard branches that you know will never be executed. Generally, any function that is to run in a very hot loop is suitable for this optimization.

Clearly, the common warnings about eval() also apply: do not trust user input, sanitize anything passed to eval() code, and avoid any potential XSS. Also, note that some environments do not allow access to eval(), such as browser pages with CSP.

8. Be Careful with Strings#

We have seen above that strings are more expensive than they seem. Well, I have good news and bad news here, and I will announce them in the only logical order (bad first, good later): strings are more complex than they appear, but they can also be used very efficiently.

String operations are a core part of JavaScript due to their context. To optimize code that heavily uses strings, engines must be creative. I mean, they must represent string objects using multiple string representations in C++. There are two general cases worth your concern because they apply to V8 (the most common engine so far) and generally to other engines as well.

First, strings concatenated using + do not create copies of the two input strings. This operation creates pointers to each substring. If it were TypeScript, it would look like this:

class String {
  abstract value(): char[] {}
}
 
class BytesString {
  constructor(bytes: char[]) {
    this.bytes = bytes
  }
  value() {
    return this.bytes
  }
}
 
class ConcatenatedString {
  constructor(left: String, right: String) {
    this.left = left
    this.right = right
  }
  value() {
    return [...this.left.value(), ...this.right.value()]
  }
}
 
function concat(left, right) {
  return new ConcatenatedString(left, right)
}
 
const first = new BytesString(['H', 'e', 'l', 'l', 'o', ' '])
const second = new BytesString(['w', 'o', 'r', 'l', 'd'])
 
// See, no array copies!
const message = concat(first, second)

Second, string slices (string slices) also do not need to create copies: they can simply point to a range in another string. Continuing from the example above:

class SlicedString {
  constructor(source: String, start: number, end: number) {
    this.source = source
    this.start = start
    this.end = end
  }
  value() {
    return this.source.value().slice(this.start, this.end)
  }
}
 
function substring(source, start, end) {
  return new SlicedString(source, start, end)
}
 
// This represents "He", but it still contains no array copies.
// It's a SlicedString to a ConcatenatedString to two BytesString
const firstTwoLetters = substring(message, 0, 2)

But the problem is: once you need to start changing these bytes, that is when you start incurring the cost of copying. Suppose we return to our String class and try to add a .trimEnd method:

class String {
  abstract value(): char[] {}
 
  trimEnd() {
    // `.value()` here might call our Sliced->Concatenated->2*Bytes string!
    const bytes = this.value()
 
    const result = bytes.slice()
    while (result[result.length - 1] === ' ')
      result.pop()
    return new BytesString(result)
  }
}

So let's jump to an example where we compare using mutation operations (mutation) and using concatenation operations (concatenation):

// setup:
const classNames = ['primary', 'selected', 'active', 'medium']

// 1. mutation
const result =
  classNames
    .map(c => `button--${c}`)
    .join(' ')

// 2. concatenation
const result =
  classNames
    .map(c => 'button--' + c)
    .reduce((acc, c) => acc + ' ' + c, '')

Comparison Results

So what should I do?

In general, avoid mutation as much as possible. This includes methods like .trim(), .replace(), etc. Consider how to avoid using these methods. In some engines, string templates may also be slower than +. Currently, this is the case in V8, but it may not be in the future, so always base it on benchmarks.

Regarding the SlicedString above, it is important to note that if a small substring of a large string still exists in memory, it may prevent the garbage collector from collecting the large string! So if you are dealing with large texts and extracting small strings from them, you may leak a lot of memory.

const large = Array.from({ length: 10_000 }).map(() => 'string').join('')
const small = large.slice(0, 50)
//    ^ will keep alive

The solution here is to use one of the mutation methods to our advantage. If we use one of these methods on small, it will force a copy, and the old pointer to large will be lost:

// replace a token that doesn't exist
const small = small.replace('#'.repeat(small.length + 1), '')

For more details, see string.h on V8 or JSString.h on JavaScriptCore.

Note

Regarding string complexity — I quickly skimmed over some things, but there are still many implementation details that add to the complexity of strings. Each string representation usually has a minimum length. For very small strings, concatenated strings may not be used. There are sometimes also restrictions, such as avoiding substrings pointing to substrings. Reading the C++ files linked above, even just the comments, can give a good understanding of the implementation details.

9. Specialization#

An important concept in performance optimization is specialization: tuning logic to fit the constraints of specific use cases. This often means figuring out which cases might occur and coding for those cases.

Suppose we are a merchant who sometimes needs to add tags to a product list. Based on experience, we know that tags are often empty. With this information, we can design our function specifically for this case:

// setup:
const descriptions = ['apples', 'oranges', 'bananas', 'seven']
const someTags = {
  apples: '::promotion::',
}
const noTags = {}
 
// Convert products to string, including their tags if applicable
function productsToString(description, tags) {
  let result = ''
  description.forEach(product => {
    result += product
    if (tags[product]) result += tags[product]
    result += ', '
  })
  return result
}
 
// Specialize it now
function productsToStringSpecialized(description, tags) {
  // We know `tags` is likely empty, so we check once in advance and can remove the `if` check from the inner loop
  if (isEmpty(tags)) {
    let result = ''
    description.forEach(product => {
      result += product + ', '
    })
    return result
  } else {
    let result = ''
    description.forEach(product => {
      result += product
      if (tags[product]) result += tags[product]
      result += ', '
    })
    return result
  }
}
function isEmpty(o) { for (let _ in o) { return false } return true }
 
// 1. not specialized
for (let i = 0; i < 100; i++) {
  productsToString(descriptions, someTags)
  productsToString(descriptions, noTags)
  productsToString(descriptions, noTags)
  productsToString(descriptions, noTags)
  productsToString(descriptions, noTags)
}


// 2. specialized
for (let i = 0; i < 100; i++) {
  productsToStringSpecialized(descriptions, someTags)
  productsToStringSpecialized(descriptions, noTags)
  productsToStringSpecialized(descriptions, noTags)
  productsToStringSpecialized(descriptions, noTags)
  productsToStringSpecialized(descriptions, noTags)
}

Comparison Results

This optimization can yield moderate improvements, but these improvements accumulate. They are a good complement to more significant optimizations like shapes and memory I/O. But be careful: if your conditions change, specialization may backfire, so be cautious when applying it.

Note

Branch prediction and branchless code — removing branches from code can greatly improve performance. For more details on branch predictors, read the classic answer on Stack Overflow: Why is processing a sorted array faster than processing an unsorted array?

10. Data Structures#

I won't go into too much detail about data structures, as this would require a separate article. But note that using the wrong data structure can have an impact on your use case that may be greater than any of the optimizations mentioned above. I recommend familiarizing yourself with native data structures like Map and Set, as well as linked lists, priority queues, trees (RB and B+), and tries.

But as a quick example, let's compare the performance of Array.includes and Set.has in a small list:

// setup:
const userIds = Array.from({ length: 1_000 }).map((_, i) => i)
const adminIdsArray = userIds.slice(0, 10)
const adminIdsSet = new Set(adminIdsArray)
// 1. Array
let _ = 0
for (let i = 0; i < userIds.length; i++) {
  if (adminIdsArray.includes(userIds[i])) { _ += 1 }
}
// 2. Set
let _ = 0
for (let i = 0; i < userIds.length; i++) {
  if (adminIdsSet.has(userIds[i])) { _ += 1 }
}

Comparison Results

As you can see, the choice of data structure can have a very significant impact.

As a real-world example, I once encountered a scenario where switching from an array to a linked list reduced the runtime of a function from 5 seconds to 22 milliseconds.

11. Benchmarking#

I left this section for last for one reason: I needed to establish credibility through the interesting parts above. Now that I have it (hopefully), let me tell you that benchmarking is the most important part of optimization work. It is not only the most important but also the most difficult. Even with 20 years of experience, I sometimes create flawed benchmarks or misuse profiling tools. Therefore, please do your best to create benchmarks correctly.

11.0 Top-Down#

Your top priority should always be to optimize the functions/code segments that account for the largest portion of run time. If you spend time optimizing parts that are not the most important, you are wasting time.

11.1 Avoid Micro-Benchmarks#

Run code in production mode and optimize based on those observations. JS engines are very complex, and their performance in micro-benchmarks often differs from real application scenarios. For example, look at this micro-benchmark:

const a = { type: 'div', count: 5, }
const b = { type: 'span', count: 10 }
 
function typeEquals(a, b) {
  return a.type === b.type
}
 
for (let i = 0; i < 100_000; i++) {
  typeEquals(a, b)
}

If you pay a little attention, you will notice that the engine will specialize the function for the shape { type: string, count: number }. But does this hold in real applications? Are a and b always this shape, or will they receive any shape? If you receive many shapes in production, then the behavior of this function will differ.

11.2 Question Your Results#

If you just optimized a function and it now runs 100 times faster than before, then question it. Try to overturn your results, run it in production mode, and stress test it. Similarly, also question your tools. Just using devtools to observe benchmarks can change their behavior.

11.3 Choose Your Targets#

Different engines optimize certain patterns better than others. You should benchmark against the engines that matter to you and prioritize which engine is more important. This is a real example in Babel, where improving V8 meant degrading JSC's performance.

12. Analysis and Tools#

Various discussions about analysis and development tools.

12.1 Browser Traps#

If you are profiling in a browser, make sure the browser profile you are using is clean and blank. For this, I even use a separate browser. If you have browser extensions enabled while profiling, they will disrupt your measurements. In particular, React devtools significantly affect results, making the rendering speed of the code appear slower than what is actually presented to users.

12.2 Sample vs Structural Analysis#

Browser performance profiling tools are sampling-based profilers that periodically sample your call stack. This has a significant drawback: some very small and frequently called functions may be called during these sampling gaps and may be severely underreported in the stack charts you see. To mitigate this, use Firefox devtools to customize the sampling interval or use Chrome devtools with CPU throttling.

12.3 Common Tools in Performance Optimization#

In addition to regular browser devtools, understanding these setting options may be helpful:

Chrome devtools has many experimental flags that can help you identify the reasons for slowdowns. The style invalidation tracker is very useful when you need to debug style/layout recalculations in the browser.
https://github.com/iamakulov/devtools-perf-features
The deoptexplorer-vscode extension allows you to load V8/chromium log files to understand when your code triggers optimizations and when you pass different shapes to functions. You don't need this extension to read log files, but it makes the experience more pleasant.
https://github.com/microsoft/deoptexplorer-vscode
You can also compile debug shells for each JS engine to gain more detailed insights into how they work. This allows you to run perf and other low-level tools and inspect the bytecode and machine code generated by each engine.
V8 Example | JSC Example | SpiderMonkey Example (missing)