feat(ast): implement utf8 to utf16 span converter #8687

Merged 1 commit on Jan 24, 2025

@Boshen (Member) commented Jan 24, 2025

closes #8629

@Boshen Boshen requested a review from overlookmotel January 24, 2025 08:07

@github-actions github-actions bot added A-parser Area - Parser A-ast Area - AST C-enhancement Category - New feature or request labels Jan 24, 2025
@Boshen Boshen changed the title feat(estree): implement utf8 to utf16 span converter feat(ast): implement utf8 to utf16 span converter Jan 24, 2025
@overlookmotel (Contributor) commented:

I think I've fixed it. The algorithm I suggested in #8629 (comment) was slightly wrong. Have to go out now, but will check it in full later today.

@overlookmotel (Contributor) commented:

Is ModuleRecord also exposed via parser NAPI bindings? If so, I guess we'll need to translate those spans too.

codspeed-hq bot commented Jan 24, 2025

CodSpeed Performance Report

Merging #8687 will not alter performance

Comparing utf8-utf16 (b7f13e6) with main (10e5920)

Summary

✅ 32 untouched benchmarks
🆕 1 new benchmarks

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| 🆕 estree[checker.ts] | N/A | 207.5 ms | N/A |

@Boshen Boshen marked this pull request as ready for review January 24, 2025 16:11
@Boshen (Member, Author) commented Jan 24, 2025

> Is ModuleRecord also exposed via parser NAPI bindings? If so, I guess we'll need to translate those spans too.

Probably not, our aim is a normal estree. I'll work on the APIs in another set of PRs.

@Boshen (Member, Author) commented Jan 24, 2025

> I think I've fixed it. The algorithm I suggested in #8629 (comment) was slightly wrong. Have to go out now, but will check it in full later today.

I'll let you merge.

@overlookmotel overlookmotel added the 0-merge Merge with Graphite Merge Queue label Jan 24, 2025
@overlookmotel (Contributor) commented Jan 24, 2025

Merge activity

  • Jan 24, 11:57 AM EST: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • Jan 24, 11:57 AM EST: A user added this pull request to the Graphite merge queue.
  • Jan 24, 12:04 PM EST: A user merged this pull request with the Graphite merge queue.

@graphite-app graphite-app bot merged commit b7f13e6 into main Jan 24, 2025
28 checks passed
@graphite-app graphite-app bot deleted the utf8-utf16 branch January 24, 2025 17:04
@overlookmotel (Contributor) commented:

Added more tests. It seems to be good now. It's pretty slow due to (I'm guessing) repeated binary searches. We can optimize it later on.
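To illustrate the kind of lookup described here, the following is a minimal sketch of a UTF-8 to UTF-16 offset table queried by binary search. It is not the code merged in this PR: the names `Translation`, `OffsetTable`, and `to_utf16` are hypothetical, and the sketch assumes every offset passed in lies on a character boundary.

```rust
/// Hypothetical table entry: from `utf8_offset` onwards,
/// UTF-16 offset = UTF-8 offset - `diff`.
#[derive(Clone, Copy, Debug)]
struct Translation {
    utf8_offset: u32,
    diff: u32,
}

/// Hypothetical lookup table, stored as a single `Vec` of entries.
struct OffsetTable {
    translations: Vec<Translation>,
}

impl OffsetTable {
    /// Scan the source once, recording an entry after every character whose
    /// UTF-8 length differs from its UTF-16 length (i.e. every non-ASCII char).
    fn new(source: &str) -> Self {
        // Sentinel entry so every lookup finds at least one entry.
        let mut translations = vec![Translation { utf8_offset: 0, diff: 0 }];
        let mut diff = 0u32;
        for (byte_index, ch) in source.char_indices() {
            let utf8_len = ch.len_utf8() as u32;
            let utf16_len = ch.len_utf16() as u32;
            if utf8_len != utf16_len {
                // UTF-8 is never shorter than UTF-16 for the same char.
                diff += utf8_len - utf16_len;
                translations.push(Translation {
                    utf8_offset: byte_index as u32 + utf8_len,
                    diff,
                });
            }
        }
        Self { translations }
    }

    /// Convert one UTF-8 byte offset to a UTF-16 code unit offset.
    /// Each call performs one binary search over the table.
    fn to_utf16(&self, utf8_offset: u32) -> u32 {
        // Number of entries starting at or before `utf8_offset`;
        // at least 1 because of the sentinel at offset 0.
        let count = self
            .translations
            .partition_point(|t| t.utf8_offset <= utf8_offset);
        utf8_offset - self.translations[count - 1].diff
    }
}
```

With this sketch, converting both ends of every span in an AST costs two binary searches per node, which is consistent with the guess above about where the time goes. Because spans are mostly visited in source order, a cursor that remembers the last matching entry would be one possible way to avoid most of the searches.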

In case you're wondering why I switched Utf8ToUtf16 to being a single Vec, rather than 2 x Vecs SoA style: The problem with using 2 x Vecs is that when you push to them there's 2 x bounds checks instead of 1, so I imagine that'd hurt perf more than the gain from better cache locality.

To avoid the extra bounds check, we'd need a type which manages the 2 arrays together, growing both at the same time, and storing length + capacity only once. Then if an index is in bounds of one array, it's guaranteed to be in bounds of the other, and only 1 bounds check is required.
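A rough sketch of such a type follows, assuming two `u32` columns. It is not from this PR or from oxc: `PairedVecs` and its field names are hypothetical, and it uses `unsafe` raw allocation purely to show how one shared length and capacity lets a single check cover both arrays.

```rust
use std::alloc::{self, Layout};
use std::ptr::NonNull;

/// Hypothetical structure-of-arrays container: two `u32` columns that
/// share a single length and a single capacity.
struct PairedVecs {
    utf8_offsets: NonNull<u32>,
    diffs: NonNull<u32>,
    len: usize,
    capacity: usize,
}

impl PairedVecs {
    fn new() -> Self {
        Self {
            utf8_offsets: NonNull::dangling(),
            diffs: NonNull::dangling(),
            len: 0,
            capacity: 0,
        }
    }

    fn push(&mut self, utf8_offset: u32, diff: u32) {
        // The only capacity check: it covers both columns at once.
        if self.len == self.capacity {
            self.grow();
        }
        // SAFETY: `len < capacity`, and both columns have `capacity` slots.
        unsafe {
            self.utf8_offsets.as_ptr().add(self.len).write(utf8_offset);
            self.diffs.as_ptr().add(self.len).write(diff);
        }
        self.len += 1;
    }

    fn get(&self, index: usize) -> Option<(u32, u32)> {
        // The only bounds check: in bounds for one column means in bounds for both.
        if index >= self.len {
            return None;
        }
        // SAFETY: `index < len <= capacity`, and slots below `len` are initialized.
        unsafe {
            Some((
                self.utf8_offsets.as_ptr().add(index).read(),
                self.diffs.as_ptr().add(index).read(),
            ))
        }
    }

    /// Grow both columns together so they always have the same capacity.
    fn grow(&mut self) {
        let new_capacity = if self.capacity == 0 { 4 } else { self.capacity * 2 };
        let new_layout = Layout::array::<u32>(new_capacity).expect("capacity overflow");
        self.utf8_offsets = Self::grow_column(self.utf8_offsets, self.capacity, new_layout);
        self.diffs = Self::grow_column(self.diffs, self.capacity, new_layout);
        self.capacity = new_capacity;
    }

    fn grow_column(ptr: NonNull<u32>, old_capacity: usize, new_layout: Layout) -> NonNull<u32> {
        let raw = if old_capacity == 0 {
            // SAFETY: `new_layout` has non-zero size.
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<u32>(old_capacity).expect("capacity overflow");
            // SAFETY: `ptr` was allocated by the global allocator with `old_layout`.
            unsafe { alloc::realloc(ptr.as_ptr().cast::<u8>(), old_layout, new_layout.size()) }
        };
        match NonNull::new(raw.cast::<u32>()) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        }
    }
}

impl Drop for PairedVecs {
    fn drop(&mut self) {
        if self.capacity != 0 {
            let layout = Layout::array::<u32>(self.capacity).expect("capacity overflow");
            // SAFETY: both columns were allocated with exactly this layout.
            unsafe {
                alloc::dealloc(self.utf8_offsets.as_ptr().cast::<u8>(), layout);
                alloc::dealloc(self.diffs.as_ptr().cast::<u8>(), layout);
            }
        }
    }
}
```

By contrast, pushing to two separate `Vec<u32>` values checks each vector's capacity independently, which is the doubled check described above. Whether the saved check outweighs the cost of two allocations would need benchmarking.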

Same thing with Semantic's SoAs: oxc-project/backlog#11

Boshen added a commit that referenced this pull request Jan 26, 2025
## [0.48.1] - 2025-01-26

### Features

- b7f13e6 ast: Implement utf8 to utf16 span converter (#8687) (Boshen)
- 6589c3b mangler: Reuse variable names (#8562) (翠 / green)
- 29bd215 minifier: Minimize `Infinity.toString(radix)` to `'Infinity'` (#8732) (Boshen)
- e0117db minifier: Replace `const` with `let` for non-exported read-only variables (#8733) (sapphi-red)
- 9e32f55 minifier: Evaluate `Math.sqrt` and `Math.cbrt` (#8731) (sapphi-red)
- 360d49e minifier: Replace `Math.pow` with `**` (#8730) (sapphi-red)
- 2e9a560 minifier: `NaN.toString(radix)` is always `NaN` (#8727) (Boshen)
- cbe0e82 minifier: Minimize `foo(...[])` -> `foo()` (#8726) (Boshen)
- e9fb5fe minifier: Dce pure expressions such as `new Map()` (#8725) (Boshen)

### Bug Fixes

- 0944758 codegen: Remove parens from `new (import(''), function() {})` (#8707) (Boshen)
- 33de70a mangler: Handle cases where a var is declared in a block scope (#8706) (翠 / green)
- d982cdb minifier: `Unknown.fromCharCode` should not be treated as `String.fromCharCode` (#8709) (sapphi-red)
- e7ab96c transformer/jsx: Incorrect `isStaticChildren` argument for `Fragment` with multiple children (#8713) (Dunqing)
- 3e509e1 transformer/typescript: Enum merging when same name declared in outer scope (#8691) (branchseer)

### Performance

- dc0b0f2 mangler: Remove useless `tmp_bindings` (#8735) (Dunqing)
- e472ced mangler: Optimize handling of collecting lived scope ids (#8724) (Dunqing)
- 8587965 minifier: Normalize `undefined` to `void 0` before everything else (#8699) (Boshen)

### Refactor

- 58002e2 ecmascript: Remove the lifetime annotation on `MayHaveSideEffects` (#8717) (Boshen)
- 10e5920 linter: Move finishing default diagnostic message to `GraphicalReporter` (#8683) (Sysix)
- 52a37d0 mangler: Simplify initialization of `slots` (#8734) (Dunqing)
- 6bc906c minifier: Allow mutating arguments in methods called from `try_fold_known_string_methods` (#8729) (sapphi-red)
- bf8be23 minifier: Use `Ctx` (#8716) (Boshen)
- 0af0267 minifier: Side effect detection needs symbols resolution (#8715) (Boshen)
- 32e0e47 minifier: Clean up `Normalize` (#8700) (Boshen)
- c792068 semantic: Simplify `ScopeTree::iter_bindings` (#8723) (Dunqing)

### Testing

- 03229c5 minifier: Fix broken tests (#8722) (Boshen)

Co-authored-by: Boshen <[email protected]>
Linked issue (may be closed by this pull request): ast: an extra pass for utf8 to utf16 span offsets