Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: performance overhaul #26114

Draft
wants to merge 377 commits into
base: main
Choose a base branch
from
Draft

Conversation

GalacticHypernova
Copy link
Contributor

@GalacticHypernova GalacticHypernova commented Mar 6, 2024

NOTE:

This is a continuation and a rework of the previous PR that had some unexplainable issues so I decided to rewrite it. I have opened this to rework it more carefully and take care of all the edge cases and random errors. (For more info, please refer to #25771)

NOTE 2:

The 70 docs commits are due to me forgetting to flush the original patch-9 branch when I was done with that one. I usually start these PR's remotely, which means I don't get to name them something unique, and I just got onto the github app to try and re-run the failed tests, which still had the old patch-9 commits. That's my bad. I used the origin's files in every conflict so it ended up just not changing anything (as can be seen in the files changed).

πŸ”— Linked issue

Related: #25771

❓ Type of change

  • πŸ“– Documentation (updates to the documentation, readme or JSdoc annotations)
  • 🐞 Bug fix (a non-breaking change that fixes an issue)
  • πŸ‘Œ Enhancement (improving an existing functionality like performance)
  • ✨ New feature (a non-breaking change that adds functionality)
  • 🧹 Chore (updates to the build process or auxiliary tools and libraries)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)

πŸ“š Description

This is a PR that aims to improve both build time and runtime performance to the max by refactoring everything that comes to mind with performance best practices, some of which include smarter and fewer object/array iterations, reduced indexes, smarter variable usage for constant values, and more.

Some key improvements in this PR:

  • Refactors of known costly javascript methods:
    In some places throughout the app, there are operations that rely on certain javascript methods that are known to be more expensive at certain conditions than manual implementation
    (similar to endsWith in perf(vite): rework endsWith to direct indexΒ #24746
    and startsWith in perf(kit, schema, nuxt): rework startsWith to direct indexΒ #24744).
    One key change under this category is the usage of Array.prototype.includes.
    Array.prototype.includes appears to be slower when checking for existence of a certain input between at the very least 2 elements than directly checking each element with strict equality (like when dealing with strings).
    After thorough benchmarking, both browser and node, direct comparisons showed marginally better performance, benchmarks attached below in that same order with the following benchmark:

    const a = "#text"
    console.time("start")
    for(let i = 0; i< 1000000; i++) {
      const inc = ['#comment', '#text'].includes(a)
    }
    console.timeEnd("start")
    console.time("stop")
    for(let i = 0; i< 1000000; i++) {
      const inc  = a === "#comment" || a === "#text"
    }
    console.timeEnd("stop")

    Browser:

    image

    Node:

    image

  • Reduced and reworked iterations
    Originally the main change in this PR, many parts in nuxt involve some sort of array/object manipulations, and in some places they are either repeated, redundant, or simply costly in terms of performance. The main examples of such cases are the small side PR's I have submitted like perf(kit): avoid duplicate join operationΒ #24717, perf(nuxt): avoid duplicate iterations over layersΒ #24730, and numerous others. I couldn't get all of it at the same time so I have decided to group the rest here, and also handle the iteration logic as opposed to mere count.

  • Reduced indexes:
    Some places in nuxt include working with indexes, like the length property of array/strings. While that alone isn't expensive, it still makes that repeated index, which especially with bigger inputs (like in HTML parsing or AST traversal) will gradually cost more and more microseconds, and might eventually become slightly noticeable in page performance.

  • Early returns
    Similar to perf: don't manipulate an empty valueΒ #25647, there are some places where some iterations are made on potentially empty values, which increase the overhead and therefore decrease performance. This PR adds guards to ensure there won't be iterations over an empty value.

  • Avoiding amortized operations where possible
    Methods like Array.prototype.push have an amortized complexity (O(1) in the case of push), which for the most part would serve most use cases well. But if we want to really fine-tune performance and take care of that amortization, we can remove them in favor of plain complexity if we know the size of the array. In multiple places around the Nuxt source, there are usages of push where the exact array size is known. This PR uses a hybrid approach that refactors said parts with known sizes by using the new Array(size) constructor with direct index modification, which would result in plain O(1) complexity, potentially outperforming the amortized O(1) complexity of push, while keeping push for dynamic arrays, variable sizes, and other parts where the size cannot be accurately obtained without causing increased time complexity (like Object.keys to obtain the size of an object).

πŸ“ Checklist

  • I have linked an issue or discussion.
  • I have added tests (if possible).
  • I have updated the documentation accordingly.

Copy link

stackblitz bot commented Mar 6, 2024

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

Copy link
Contributor Author

@GalacticHypernova GalacticHypernova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes to self:

packages/nuxt/src/app/composables/state.ts Outdated Show resolved Hide resolved
@GalacticHypernova GalacticHypernova marked this pull request as ready for review June 9, 2024 04:58
@@ -89,7 +89,12 @@ export default defineUntypedSchema({
async $resolve (val, get) {
// TODO: remove in v3.10
val = val ?? await (get('experimental') as Promise<Record<string, any>>).then((e: Record<string, any>) => e?.inlineSSRStyles)
if (val === false || (await get('dev')) || (await get('ssr')) === false || (await get('builder')) === '@nuxt/webpack-builder') {
const [dev, ssr, builder] = await Promise.all([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems less performant in the most common case (running in dev mode) as previously we short-circuited if any of those values were true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, this might be a bit less performant in best case scenario, but it's difficult to make it more performant for the worst case scenario without sacrificing a tiny bit for the rest. Due to it being parallelized, the potential impact in best case is minimized and therefore justifies the potential improvement in worst case.
It's like #24718 , where for more than 1 plugin/middleware it could slow it down a bit due to the additional if check but in other cases it improves it, it's about the average improvement across a wide variety of possible scenarios.
Although I could run a benchmark real quick to see how it behaves, I'll report back with the results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, the benchmark seems to show that the second approach is much slower. (difference 10 seconds). That was my mistake, I apologize, and thanks for spotting that! Here's the benchmark I used:

async function get(key) {
  return new Promise((resolve) => {
    setTimeout(() => {
      const values = {
        dev: Math.random() > 0.5,
        ssr: Math.random() > 0.5,
        builder: '@nuxt/webpack-builder'
      };
      resolve(values[key]);
    }, 100);
  });
}

async function snippet1(val) {
  if (val === false || (await get('dev')) || (await get('ssr')) === false || (await get('builder')) === '@nuxt/webpack-builder') {
    return true;
  }
  return false;
}

async function snippet2(val) {
  const [dev, ssr, builder] = await Promise.all([
    get('dev'),
    get('ssr'),
    get('builder'),
  ]);
  if (val === false || dev || ssr === false || builder === '@nuxt/webpack-builder') {
    return true;
  }
  return false;
}

async function benchmark() {
  const iterations = 100;
  let start, end;
  
  console.time("e")
  for (let i = 0; i < iterations; i++) {
    await snippet1(false);
  }
  console.timeEnd("e")
  
  console.time("F")
  for (let i = 0; i < iterations; i++) {
    await snippet2(false);
  }
  console.timeEnd("F")
}

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, when testing with val = true, I get that the new version is almost twice as fast:
image

So it's more so about which is more likely to occur. Which approach do you think we should take?
Actually, I have an idea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed an update that combines the best of both worlds. If val is false it will immediately return false before awaiting the Promise.all, as per the results of the benchmark when val is set to false.
Otherwise it awaits promise.all and handles everything else. This actually doesn't impact the best case scenario at all, when dev is true
image

And it does make sense when I think about it, considering Promise.all parallelizes it, so each promise is handled in a different core, separate from the rest. There is no dependency between each promise so there's nothing to cause slowdown. Of course, as the if conditions progress to the next, the performance difference becomes much bigger (for example, here it is when dev is hardcoded to false)
image

@GalacticHypernova GalacticHypernova marked this pull request as draft June 11, 2024 18:54
@GalacticHypernova
Copy link
Contributor Author

Marking as draft to resolve type issues

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants