Ticket #379 (closed defect: fixed)

Opened 3 years ago

Last modified 4 months ago

Memory Exhausted while processing a large table

Reported by: prakash.ved.v@… Owned by: hans
Priority: normal Milestone: 1.3.1
Component: Runtime (PHP5) Version: 1.2.1
Severity: normal Keywords: memoryerror bug largetable
Cc:

Description

I have a table with close to 5000 records, to process them I have written a small application thats suppose to run as a cron job. After processing 1500 records it fails with this error message

PHP Fatal error: Allowed memory size of 20971520 bytes exhausted (tried to allocate 1177 bytes) in /usr/home/project/e4s-demo/lib/symfony/vendor/creole/drivers /mysql/MySQLResultSet.php on line 57

Each Record size < 128 KB

I have also set all vars to null after use and unset them too, but no use. I think the problem is with Propel/Creole, I think it cannot handle large tables efficiently. On the other hand if I issue MySQL native commands it works fine. Increasing PHP memory size is a no solution.

Thanks and Regards

Attachments

propel Bug.txt Download (3.0 KB) - added by prakash.ved.v@… 3 years ago.
Propel code causing error in Symfony
propel.patch Download (0.7 KB) - added by tpk 21 months ago.
Proposed patch.

Change History

Changed 3 years ago by prakash.ved.v@…

Propel code causing error in Symfony

Changed 3 years ago by hans

  • status changed from new to closed
  • resolution set to invalid

The better approach to dealing with large recordsets is to hydrate the objects individually:

<?php
// example of how to manually hydrate objects
$rs = AuthorPeer::doSelectRS(new Criteria());
while($rs->next()) {
  $a = new Author();
  $a->hydrate($rs);
}

You may also wish to try Propel 1.3 (note that the API above will not work with 1.3, see wiki:Users/Documentation/1.3/Upgrading)

If this is an actual bug, then it may be a duplicate of memory leak issue described by #226.

Changed 3 years ago by prakash.ved.v

  • status changed from closed to reopened
  • resolution invalid deleted

I have already tried that, and still it gives me memory exhausted error.

$c = clone $crit;

JobPeer::addSelectColumns($c); $startcol = (JobPeer::NUM_COLUMNS - JobPeer::NUM_LAZY_LOAD_COLUMNS) + 1; EmployerPeer::addSelectColumns($c);

$c->addJoin(JobPeer::IDEMPLOYER, EmployerPeer::IDEMPLOYER); $rs = BasePeer::doSelect($c, $con); $results = array(); echo "Processing ... $jCount Jobs.\n"; while($rs->next()) {

$omClass = JobPeer::getOMClass();

$cls = Propel::import($omClass); $obj1 = new $cls(); $obj1->hydrate($rs);

$omClass = EmployerPeer::getOMClass();

$cls = Propel::import($omClass); $obj2 = new $cls(); $obj2->hydrate($rs, $startcol);

$newObject = true; foreach($results as $temp_obj1) {

$temp_obj2 = $temp_obj1->getEmployer(); if ($temp_obj2->getPrimaryKey() === $obj2->getPrimaryKey()) {

$newObject = false;

$temp_obj2->addJob($obj1); break;

}

} if ($newObject) {

$obj2->initJobs(); $obj2->addJob($obj1);

} $job = $obj1;

Perhaps this is a duplicate entry, do we have solution to this ?

Changed 3 years ago by prakash.ved.v

Even after upgrading to Propel 1.3 the problem still persists.

Changed 3 years ago by soenke

What you could do for debbuging is adding

echo memory_get_usage() . "\n";

before and after suspicious code pieces. Maybe we can see anything more then.

Thanks!

Changed 3 years ago by Remy Wetzels <remy@…>

Do you unset the objects directly after you stop using them? That might help somewhat...

<?php
// example of how to manually hydrate objects
$rs = AuthorPeer::doSelectRS(new Criteria());
while($rs->next()) {
  $a = new Author();
  $a->hydrate($rs);
  ... do your stuff ...
  $a->save();
  unset($a);
}

Changed 3 years ago by prakash.ved.v

Remy, I have tried all of these, infact I even tried releasing all the objects immidiately after I see they are not required any further in the code.

Here are some more debugging information.

WRT Propel Bug.txt [Attached] 1. setLimit(100) PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allo cate 6456 bytes) in C:\xampp\htdocs\e4s\lib\model\om\BaseJob.php on line 760 2. setLimit(10) PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allo cate 8904 bytes) in C:\xampp\htdocs\e4s\lib\symfony\vendor\propel\util\BasePeer. php on line 444 3. setLimit(1) PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allo cate 4752 bytes) in C:\xampp\htdocs\e4s\lib\symfony\vendor\propel\util\Criteria. php on line 1507

As we can see, the error received is not consistent. I would also like to add that records are getting processed, lets say if the limit is 100 it does process 800 records and then it throws fatal error, and 850 records when limit is 10.

I think we need to look beyond unset of variables and look into the flow of how propel handles queries for large tables. One thing is for sure that there is unnecessary garbage that propel keeps in memory after every execution, and it doesn't flush it out.

Changed 3 years ago by apinstein

  • status changed from reopened to closed
  • resolution set to duplicate

This is in fact a dupe of #226: http://propel.phpdb.org/trac/ticket/226

I was one of the first people to really thrash propel in such situations and did help find bugs that caused unbounded memory use like you're seeing. I think they fixed some things, but I think that one remains:

The main problem was creation of circular references between related objects. For instance, if you loaded one object and then its related object through $obj->getRelatedObj() then this caused a circular reference which prevented the PHP GC from correctly disposing of the object when it went out of scope.

For right now, this has to be fixed from userland, like so.

In my case, I have a MlsProperty object which is related to LOTS of things. It's the main object that I'm looping on in my loop where I "hydrate" the object.

I have added a function called "resetCaches()" to MlsProperty which I call at the end of each turn through the loop to "clear" out all of these circular references.

function resetCaches() {

// propagate if ($this->collMlsRooms) {

foreach ($this->collMlsRooms as $room) {

$room->resetCaches();

}

} if ($this->collMlsPropertyFeatures) {

foreach ($this->collMlsPropertyFeatures as $f) {

$f->resetCaches();

}

} if ($this->collMlsPropertyImages) {

foreach ($this->collMlsPropertyImages as $i) {

$i->resetCaches();

}

} $this->collMlsRooms = $this->collMlsPropertyFeatures = $this->collMlsPropertyImages = NULL;

}

I then have similar functions on each of the related objects that do the same thing.

It was very simple to implement and works in my case extremely well.

I see that the bug is still open in Propel, which is too bad. I made this fix about a year ago and haven't had any problems since, but it is difficult to figure out a way to make Propel do this automatically.

I think that we had decided to revisit this for 1.3 and I guess it wasn't figured out yet there either. "UniqueInstance" is supposed to make it easier, hopefully.

Good luck,

Alan

Changed 3 years ago by apinstein

Forgot one thing...

If you care to learn more about this issue, search the dev archive for "Circular reference" and you'll see the many postings where we worked on this.

I am very confident that the solution I gave you should work. Sorry I didn't speak up sooner; mine has worked for so long I had forgotten that it was fixed in MY code, not Propel's....

Alan

Changed 3 years ago by anonymous

  • status changed from closed to reopened
  • resolution duplicate deleted

This example doesnt have circular references and still chews memory

foreach ($somefilelist as $somefile) { .. extract info and do whatever ... $file = new File(); $file->setCategoryID($categoryID); $file->setName($name); $file->save(); unset($file); echo "Memory Used: ".memory_get_usage() . "\n"; }

Changed 2 years ago by hans

We can add the reset method to 1.3. We do already have (I believe) clear*() methods, but we can add a new method that clears all of the associated collections.

Changed 2 years ago by hans

(In [920]) Refs #379, #226 - Adding a clearAllReferences() method to provide userland workaround for PHP gc.

Changed 2 years ago by hans

I added a clearAllReferences() method in r920 (based on the code Alan provided) to provide a user-space workaround for the GC. Please give this a shot and let us (me) know if this does the trick.

Note that this method has a $deep param, which defaults to FALSE. So to clear all references on each associated object too, you would need to call:

<?php

$o->clearAllReferences($deep=true);

I'm going to mark this fixed. Re-open if you run into any issues w/ this method or if this isn't somehow doing the trick.

Changed 2 years ago by hans

  • status changed from reopened to closed
  • resolution set to fixed

Changed 21 months ago by tpk

  • status changed from closed to reopened
  • resolution fixed deleted

Just encountered the issue and tried to use clearAllReferences as a solution. It doesn't work, and upon inspection it becomes quite clear why: what gets removed are the collections, but not the object references themselves. To give you an example, if you had, say, a Product object with ProductDescription objects attached to it, the structure would look like:

Product

collProductDescriptions

ProductDescription

aProduct

now, calling $product->clearAllReferences(true) won't help, because even after unset($product) there are still as many references to $product as there were entries in collProductDescription (this is due to the fact that Product::addProductDescription()) creates a bi-directional link between the two.

To prove the theory I ran the same code with and without the proposed patch and the memory consumption dropped dramatically after applying it.

Changed 21 months ago by tpk

Proposed patch.

Changed 21 months ago by hans

Thanks for the patch!

Changed 21 months ago by hans

Oh, tpk, if you want to go ahead an apply that, please do. It does make sense since we're now doing bi-directional relationships.

Changed 20 months ago by andy@…

I see this patch was applied in [1062] - is this ticket due to be closed now?

Changed 19 months ago by ron

  • priority changed from high to normal
  • severity changed from major to normal
  • milestone changed from 1.3 to 1.3.1

Changing milestone to 1.3.1, and reducing severity to normal because chances are this is now fixed. Will close if for 1.3.1 if no new information is added.

Changed 4 months ago by francois

  • status changed from reopened to closed
  • resolution set to fixed
  • milestone changed from To be scheduled to 1.3.1
Note: See TracTickets for help on using tickets.