Change ScriptSplitter to use charAt instead of toString endsWith #11130
Background
While developing some Testcontainers tests that use 60 separate SQL files, I noticed my containers were taking upwards of 6 minutes to load. Each script file contains a single, relatively small (under 100 lines) create/alter/insert statement.
After running a profiler, I found that the culprit for the load time was this ScriptSplitter method.

Problem
Specifically, StringBuilder.toString is called to check whether the last character is whitespace, which causes the entire StringBuilder array to be rebuilt every time the ScriptScanner encounters whitespace. Based on my very surface-level analysis, the result seems to be O(n*m) performance, where n is the length of a SQL file's content and m is the number of Lexem objects discovered by the ScriptScanner.

Proposed Solution
Using StringBuilder's charAt method instead should make this check O(1), since it is a direct lookup into the backing array to see whether the last character is a space. The change also uses StringBuilder.length() == 0 rather than StringBuilder.isEmpty, for the sake of backwards compatibility with older Java versions.
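
To illustrate the difference, here is a minimal sketch of the two ways of checking for a trailing space. It is not the actual ScriptSplitter code: the class name TrailingSpaceCheck and its method names are hypothetical and exist only for this comparison.

```java
public class TrailingSpaceCheck {

    // Old-style check: toString() copies the builder's whole contents
    // into a new String on every call, so the cost grows with the buffer.
    static boolean endsWithSpaceViaToString(StringBuilder sb) {
        return sb.toString().endsWith(" ");
    }

    // Proposed-style check: look only at the last character.
    // length() == 0 is used instead of isEmpty() so the code also
    // compiles on older Java versions.
    static boolean endsWithSpaceViaCharAt(StringBuilder sb) {
        if (sb.length() == 0) {
            return false;
        }
        return sb.charAt(sb.length() - 1) == ' ';
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("CREATE TABLE t (id INT) ");
        // Both checks agree on the result; only the cost differs.
        System.out.println(endsWithSpaceViaToString(sb));
        System.out.println(endsWithSpaceViaCharAt(sb));
    }
}
```

Both methods return the same answer for any input. The second one simply avoids materializing a String each time the check runs.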