Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Affected Rows in Collection for counting #1280

Open
shopblocks opened this issue Jun 14, 2016 · 4 comments
Open

Use Affected Rows in Collection for counting #1280

shopblocks opened this issue Jun 14, 2016 · 4 comments

Comments

@shopblocks
Copy link

Performing a count on a RecordSet is very memory intensive.

Performing a count on 20,000 rows using

<?php count($result_set);

I can see that it uses the util/Collection class, which implements count like this:

<?php

public function count() {
    $count = iterator_count($this);
    $this->rewind();
    return $count;
}

I believe the slowness is caused by the iterator_count call, which loops over all of the records.

How come something like this can't be done:

<?php
public function count() {
    return count($this->_data);
}

A quick way to recreate the memory issues is to do the following:

<?php 
public function test_record_set()
    {
        $memory_start = memory_get_usage();

        $records = [];

        $num = 20000;

                foreach (range(1, $num) as $key => $value) {
                    $records[] = new Record(['data' => ['id' => $value]]);
                }

                $record_set = new RecordSet([
                    'data' => $records,
                ]);

        \app\libraries\utils\Debug::dd('Memory Used: ' . ( memory_get_usage() - $memory_start ));
    }
@nateabele
Copy link
Member

Trying to count 20k objects in memory is a bad idea. It's better to run a separate query with the same conditions explicitly to get the count. RecordSet fetches records lazily and stores them in $this->_data, so it has to iterate through the entire set before $this->_data would be any good for counting anyway.

@shopblocks
Copy link
Author

Should $this->_data not be keeping an internal counter perhaps, so that if that is available, count($object) returns the already-calculated value?

@mariuswilms
Copy link
Member

As @nateabele said, for your use case it's best to use a count query. You never want that many records in a result set at once. At one point you'll start processing them in some way.

I agree with @shopblocks that Collection could be optimized in a way: create a FixedCollection that allows us to pass a count manually, which will be returned by it instead of looking at it's items. The count would come from http://php.net/manual/de/pdostatement.rowcount.php and the MongoDB equivalent.

@mariuswilms mariuswilms changed the title Count on RecordSet is very memory intensive and slow Use Affected Rows in Collection for counting Jun 17, 2016
@shopblocks
Copy link
Author

Thats fine, we have changed our code to have a Model::count() query too, but this enhancement would still be useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants