[FLINK-34379][table] Fix adding catalogtable logic #24788
Conversation
hasAdded = true;
}
boolean alreadyExists = false;
for (ContextResolvedTable table : tables) {
Not sure how big the tables could be and how often the method will be called. Does it make sense to use e.g. a Map<K,V> to avoid the loop?
Good point. But note that tables is of type Set already.
Sorry, I don't get your point. I meant that looping over the Set might have a performance issue.
I think there are many reasons to use Map instead of Set:
- the logic is a point lookup instead of a loop search, as I mentioned below.
- O(1) instead of O(n) for better performance: the DynamicPartitionPruningUtils class will be used centrally for batch jobs [1], i.e. for large projects with many tables, it could become a bottleneck.
- less code when using e.g. Map.putIfAbsent(K, V)
[1] Line 103 in 0737220:
if (DynamicPartitionPruningUtils.isDppDimSide(leftSide)) {
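The suggested refactoring could look roughly like this (a minimal, self-contained sketch; ResolvedTable and the String identifiers are hypothetical stand-ins for Flink's ContextResolvedTable and ObjectIdentifier):

```java
import java.util.HashMap;
import java.util.Map;

public class PutIfAbsentSketch {
    // Hypothetical stand-in for Flink's ContextResolvedTable, keyed by its identifier.
    record ResolvedTable(String identifier) {}

    public static void main(String[] args) {
        // Keying the collection by identifier turns the duplicate check
        // into a single hash lookup instead of a loop over the whole Set.
        Map<String, ResolvedTable> tables = new HashMap<>();

        // putIfAbsent returns null iff the key was absent, i.e. the table was added.
        boolean added = tables.putIfAbsent("db.t1", new ResolvedTable("db.t1")) == null;
        boolean addedAgain = tables.putIfAbsent("db.t1", new ResolvedTable("db.t1")) == null;

        System.out.println(added);      // first insert succeeds
        System.out.println(addedAgain); // identifier already present
    }
}
```

With this shape, both the hasAdded and alreadyExists flags from the original code collapse into the return value of a single putIfAbsent call.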
Thanks @jeyhunkarimov, LGTM.
I think we should wait for @JingGe's +1 as well.
Force-pushed from 5ceaf0e to bdd150d, then from bdd150d to 01a2aaa.
LGTM
}
boolean alreadyExists = false;
for (ContextResolvedTable table : tables) {
if (table.getIdentifier().equals(catalogTable.getIdentifier())) {
This is typical hash-map logic.
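In other words, the identifier-equality loop re-implements the point lookup that a Map keyed by identifier provides for free. A minimal sketch contrasting the two (plain String identifiers stand in for Flink's ObjectIdentifier):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PointLookupSketch {
    public static void main(String[] args) {
        // Hand-rolled membership test: scan every element, O(n) per check.
        List<String> identifiers = List.of("db.t1", "db.t2");
        boolean alreadyExists = false;
        for (String id : identifiers) {
            if (id.equals("db.t1")) {
                alreadyExists = true;
            }
        }
        System.out.println(alreadyExists);

        // Equivalent point lookup with a Map keyed by identifier, O(1) expected.
        Map<String, Object> tables = new HashMap<>();
        tables.put("db.t1", new Object());
        System.out.println(tables.containsKey("db.t1"));
    }
}
```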
What is the purpose of the change
Brief change log
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
- @Public(Evolving): (no)
Documentation