Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize database access and ScoreSongOrder, MostSungSongOrder and FileTimeSongOrder #989

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

BWagener
Copy link
Contributor

@BWagener BWagener commented May 1, 2024

What does this PR do?

Currently with a large library using the ScoreSongOrder and MostSungSongOrder sorting options is very slow (for me 10s per sort and per keystroke when searching while the sort is active).

I took the following approach to improve on this:

  • ensure initialization of sorting data is only done once, after all songs are loaded
    • on master the setup of the sorting data for ScoreSongOrder, MostSungSongOrder and FileTimeSongOrder can be executed on every keystroke when searching
  • optimize access patterns for song data from db
    • store song id in the Song class (and in cache)
    • use song id for all database accesses
      • on master every access requires a loop through all songs/high scores and string comparison for artist&title
    • use optimal data structures to store database and sorting data
      • on master songs and high scores are stored as sets, which require iteration and artist&title matching to access a specific song's information - this PR updates all data structures for DB data to unordered_maps (find operation has complexity of O(1) on average and O(n) in the worst case, compared to O(log n) for sets)
      • on master the sorting data for ScoreSongOrder, MostSungSongOrder and FileTimeSongOrder is stored in maps - this PR updates all data structures for sorting data to unordered_maps (find operation has complexity of O(1) on average and O(n) in the worst case, compared to O(log n) for maps)

Here are some anecdotal results from measuring the performance

All tests conducted with Debug builds


Startup


Launch (~24k songs, empty cache, empty db)

no noticeable change

this pr (3 separate launches)

profiler-songloader/debug:  build-list (29424.2 ms)  load-cache (1.7 ms)
profiler-songloader/debug:  build-list (29719.4 ms)  load-cache (2.9 ms)
profiler-songloader/debug:  build-list (30092.3 ms)  load-cache (2.4 ms)

master (3 separate launches)

profiler-songloader/debug:  build-list (28741.3 ms)  load-cache (2.4 ms)
profiler-songloader/debug:  build-list (30956.9 ms)  load-cache (1.2 ms)
profiler-songloader/debug:  build-list (28616.9 ms)  load-cache (2.3 ms)

Launch (~24k songs, with cache, with db)

~8s vs ~12.5s - 50% faster

this pr (3 separate launches)

profiler-songloader/debug:  load-cache (4715.3 ms)  build-list (3427.8 ms)
profiler-songloader/debug:  load-cache (4828.0 ms)  build-list (3435.2 ms)
profiler-songloader/debug:  load-cache (4892.2 ms)  build-list (3417.1 ms)

master (3 separate launches)

profiler-songloader/debug:  build-list (7777.9 ms)  load-cache (4794.4 ms)
profiler-songloader/debug:  build-list (7472.5 ms)  load-cache (4854.5 ms)
profiler-songloader/debug:  build-list (7434.6 ms)  load-cache (5204.7 ms)

Sorting


ScoreSongOrder (~24k songs, ~1.7k high scores)

~333x faster initialization as well as subsequent sorts

this pr (repeatedly triggering the tested sort order)

profiler-initialize_sort_internal/debug:  initialize (35.3 ms)
profiler-sort_internal/debug:  sort (35.0 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (35.9 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (38.0 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (34.8 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (37.9 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (35.2 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (35.1 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (37.5 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (35.1 ms)  prepare (0.0 ms)

master (repeatedly triggering the tested sort order)

profiler-sort_internal/debug:  prepare (9430.8 ms)  sort (72.5 ms)
profiler-sort_internal/debug:  prepare (9014.5 ms)  sort (77.8 ms)
profiler-sort_internal/debug:  prepare (8880.3 ms)  sort (70.8 ms)
profiler-sort_internal/debug:  prepare (10097.6 ms)  sort (74.8 ms)
profiler-sort_internal/debug:  prepare (10493.0 ms)  sort (76.8 ms)
profiler-sort_internal/debug:  prepare (9587.7 ms)  sort (75.4 ms)
profiler-sort_internal/debug:  prepare (11241.9 ms)  sort (74.6 ms)
profiler-sort_internal/debug:  prepare (11127.1 ms)  sort (74.8 ms)
profiler-sort_internal/debug:  prepare (10586.0 ms)  sort (75.4 ms)

MostSungSongOrder (~24k songs, ~1.7k high scores)

similar initialization - subsequent sorts 333x faster

this pr (repeatedly triggering the tested sort order)

profiler-initialize_sort_internal/debug:  initialize (9148.6 ms)
profiler-sort_internal/debug:  sort (31.5 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (33.9 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (34.8 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (33.4 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (35.2 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (33.3 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (35.0 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (33.2 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (34.6 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (33.8 ms)  prepare (0.0 ms)

master (repeatedly triggering the tested sort order)

profiler-sort_internal/debug:  prepare (9529.9 ms)  sort (77.3 ms)
profiler-sort_internal/debug:  prepare (9530.5 ms)  sort (76.0 ms)
profiler-sort_internal/debug:  prepare (9296.7 ms)  sort (72.8 ms)
profiler-sort_internal/debug:  prepare (10461.5 ms)  sort (76.0 ms)
profiler-sort_internal/debug:  prepare (10108.4 ms)  sort (75.9 ms)
profiler-sort_internal/debug:  prepare (10937.5 ms)  sort (80.3 ms)
profiler-sort_internal/debug:  prepare (10962.7 ms)  sort (76.7 ms)
profiler-sort_internal/debug:  prepare (8994.9 ms)  sort (70.4 ms)
profiler-sort_internal/debug:  prepare (9338.1 ms)  sort (84.2 ms)
profiler-sort_internal/debug:  prepare (11059.4 ms)  sort (81.6 ms)

FileTimeSongOrder (~24k songs)

similar initialization - subsequent sorts 6x faster

this pr (repeatedly triggering the tested sort order)

profiler-initialize_sort_internal/debug:  initialize (313.3 ms)
profiler-sort_internal/debug:  sort (112.6 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (111.2 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (111.6 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (112.4 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (111.5 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (115.4 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (111.9 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (111.5 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (115.9 ms)  prepare (0.0 ms)
profiler-sort_internal/debug:  sort (113.8 ms)  prepare (0.0 ms)

master (repeatedly triggering the tested sort order)

profiler-sort_internal/debug:  prepare (301.6 ms)  sort (294.5 ms)
profiler-sort_internal/debug:  prepare (307.1 ms)  sort (295.8 ms)
profiler-sort_internal/debug:  prepare (302.6 ms)  sort (295.8 ms)
profiler-sort_internal/debug:  prepare (303.2 ms)  sort (293.8 ms)
profiler-sort_internal/debug:  prepare (303.0 ms)  sort (294.0 ms)
profiler-sort_internal/debug:  prepare (303.6 ms)  sort (293.3 ms)
profiler-sort_internal/debug:  prepare (302.0 ms)  sort (296.8 ms)
profiler-sort_internal/debug:  prepare (312.8 ms)  sort (297.6 ms)
profiler-sort_internal/debug:  prepare (311.0 ms)  sort (302.9 ms)
profiler-sort_internal/debug:  prepare (304.5 ms)  sort (302.2 ms)

Todos

  • Update SongOrders if mutable information on a song changed (new highest score, new high scores in general) - looking for advice on how to do this, see [1] and [2]

@@ -261,7 +261,6 @@ void ScreenSing::instrumentLayout(double time) {

void ScreenSing::activateNextScreen()
{
m_database.addSong(m_song);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking for advice why this was being called

@BWagener BWagener changed the title Draft: optimize most sung and highest score sorting optimize database access and ScoreSongOrder, MostSungSongOrder and FileTimeSongOrder May 4, 2024

Hiscore::HiscoreVector Hiscore::queryHiscore(std::optional<PlayerId> playerid, std::optional<SongId> songid, std::string const& track, std::optional<unsigned> max) const {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this method is very generic and could be used from many contexts, it was only being used from a single one. Accessing one of the maps was the more efficient alternative so I replaced the usage and deleted this function.

@@ -66,7 +66,7 @@ void Players::addPlayer (std::string const& name, std::string const& picture, st
pi.name = name;
pi.picture = picture;

pi.id = id.value_or(assign_id_internal());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assign_id_internal() would always be called, as value_or doesn't short circuit.

this pointer.
*/
std::shared_ptr<Song> song;

Copy link
Contributor Author

@BWagener BWagener May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is no longer being used anywhere

sort_internal();
}

// TODO: determine appropriate time and location to call this function
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking for help on this

@BWagener BWagener marked this pull request as ready for review May 4, 2024 17:43
@@ -36,73 +42,54 @@ void SongItems::save(xmlpp::Element* songs) {

SongId SongItems::addSongItem(std::string const& artist, std::string const& title, std::optional<SongId> _id) {
SongItem si;
si.id = _id.value_or(assign_id_internal());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assign_id_internal() would always be evaluated here, as value_or can't short circuit

@twollgam
Copy link
Contributor

twollgam commented May 4, 2024

@BWagener there are type issues (look at the mac build). I think somewhere the song id is an int but should be SongId.

@BWagener
Copy link
Contributor Author

BWagener commented May 4, 2024

@BWagener there are type issues (look at the mac build). I think somewhere the song id is an int but should be SongId.

Ah thanks, I was mistakenly assuming it's an unrelated issue as the other builds were successful.

Would it be okay to explicitly cast them to SongId? I intentionally defined the id as a signed int on the Song class to handle missing ids. With the changes made in this PR it is guaranteed every loaded song has an ID, I just needed to track the state of a new song to check it here

An alternative would be using an optional instead, what do you think?

@twollgam
Copy link
Contributor

twollgam commented May 4, 2024

IIRC I started with a value for missing/undefined ids but it was refactored away with a std::optional. Even if that was not my own idea I would suggest this solution. Reason: You tell (with code) where an id is defined and where it could be undefined. That does not work if you simply use an int or SongId. This is where in the past comments were used.

Copy link

sonarcloud bot commented Jun 10, 2024

Quality Gate Passed Quality Gate passed

Issues
33 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants