Batch Processing
================

This chapter shows you how to accomplish bulk inserts, updates and
deletes with Doctrine in an efficient way. The main problem with
bulk operations is usually running out of memory, and the strategies
presented here are designed to help you avoid that.

.. warning::

    An ORM tool is not primarily suited for mass
    inserts, updates or deletions. Every RDBMS has its own, most
    effective way of dealing with such operations, and if the options
    outlined below are not sufficient for your purposes we recommend
    you use the tools for your particular RDBMS for these bulk
    operations.

Bulk Inserts
------------

Bulk inserts in Doctrine are best performed in batches, taking
advantage of the transactional write-behind behavior of an
``EntityManager``. The following code shows an example of
inserting 10000 objects with a batch size of 20. You may need to
experiment with the batch size to find the size that works best for
you. Larger batch sizes mean more prepared statement reuse
internally but also mean more work during ``flush``.

.. code-block:: php

    <?php
    $batchSize = 20;
    for ($i = 1; $i <= 10000; ++$i) {
        $user = new CmsUser;
        $user->setStatus('user');
        $user->setUsername('user' . $i);
        $user->setName('Mr.Smith-' . $i);
        $em->persist($user);
        if (($i % $batchSize) === 0) {
            $em->flush(); // Executes all pending inserts.
            $em->clear(); // Detaches all objects from Doctrine!
        }
    }
    $em->flush(); // Persists objects that did not make up an entire batch.
    $em->clear();

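The flush-every-N pattern can also be wrapped in a small reusable helper.
The following is a minimal sketch, not part of Doctrine:
``processInBatches()`` and its callable parameters are hypothetical names
chosen for illustration. It is written against plain callables so that it
runs without an ``EntityManager``.

.. code-block:: php

    <?php
    /**
     * Hypothetical helper (not a Doctrine API): hands each item to $persist
     * and calls $flushAndClear after every $batchSize items, plus once more
     * for a trailing partial batch. Returns how often $flushAndClear ran.
     */
    function processInBatches(iterable $items, int $batchSize, callable $persist, callable $flushAndClear): int
    {
        if ($batchSize < 1) {
            throw new InvalidArgumentException('Batch size must be at least 1.');
        }

        $pending = 0;
        $flushes = 0;
        foreach ($items as $item) {
            $persist($item);
            if (++$pending === $batchSize) {
                $flushAndClear();
                $pending = 0;
                ++$flushes;
            }
        }

        // Flush whatever did not fill a complete batch.
        if ($pending > 0) {
            $flushAndClear();
            ++$flushes;
        }

        return $flushes;
    }

    // Stand-ins for the EntityManager so the sketch runs on its own.
    $queued = [];
    $written = [];
    $flushes = processInBatches(
        range(1, 45),
        20,
        function ($item) use (&$queued) { $queued[] = $item; },
        function () use (&$queued, &$written) {
            $written = array_merge($written, $queued);
            $queued = [];
        }
    );

In real code, ``$persist`` would call ``$em->persist()`` and
``$flushAndClear`` would call ``$em->flush()`` followed by ``$em->clear()``.
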
Bulk Updates
------------

There are two possibilities for bulk updates with Doctrine.

DQL UPDATE
~~~~~~~~~~

By far the most efficient way to perform bulk updates is to use a DQL
UPDATE query. Example:

.. code-block:: php

    <?php
    $q = $em->createQuery('update MyProject\Model\Manager m set m.salary = m.salary * 0.9');
    $numUpdated = $q->execute();

Iterating results
~~~~~~~~~~~~~~~~~

An alternative solution for bulk updates is to use the
``Query#iterate()`` facility to iterate over the query results step
by step instead of loading the whole result into memory at once.
The following example shows how to do this, combining the iteration
with the batching strategy that was already used for bulk inserts:

.. code-block:: php

    <?php
    $batchSize = 20;
    $i = 0;
    $q = $em->createQuery('select u from MyProject\Model\User u');
    $iterableResult = $q->iterate();
    foreach ($iterableResult as $row) {
        $user = $row[0];
        $user->increaseCredit();
        $user->calculateNewBonuses();
        ++$i; // Count the row first so the first flush happens after a full batch.
        if (($i % $batchSize) === 0) {
            $em->flush(); // Executes all updates.
            $em->clear(); // Detaches all objects from Doctrine!
        }
    }
    $em->flush(); // Executes the updates of the last, possibly partial batch.

.. note::

    Iterating results is not possible with queries that
    fetch-join a collection-valued association. The nature of such SQL
    result sets is not suitable for incremental hydration.

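Where the counter is incremented relative to the modulo check decides
when the first flush happens: checking before incrementing a counter
that starts at 0 flushes right after the first row, while incrementing
first flushes only after a full batch. The following sketch is
independent of Doctrine and only lists the iterations at which a flush
would be triggered inside the loop; ``flushIterations()`` is a
hypothetical helper, not a library function.

.. code-block:: php

    <?php
    /**
     * Hypothetical helper for illustration: returns the 1-based iteration
     * numbers at which the in-loop flush would run. $incrementFirst selects
     * whether the counter is increased before or after the modulo check.
     */
    function flushIterations(int $rows, int $batchSize, bool $incrementFirst): array
    {
        $points = [];
        $i = 0;
        for ($iteration = 1; $iteration <= $rows; ++$iteration) {
            if ($incrementFirst) {
                ++$i;
            }
            if (($i % $batchSize) === 0) {
                $points[] = $iteration;
            }
            if (!$incrementFirst) {
                ++$i;
            }
        }

        return $points;
    }

    $checkFirst = flushIterations(50, 20, false);    // [1, 21, 41]
    $incrementFirst = flushIterations(50, 20, true); // [20, 40]

In both variants the rows after the last in-loop flush are only written
by the final ``flush()`` call after the loop.
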
Bulk Deletes
------------

There are two possibilities for bulk deletes with Doctrine. You can
either issue a single DQL DELETE query or iterate over the
results, removing them one at a time.

DQL DELETE
~~~~~~~~~~

By far the most efficient way to perform bulk deletes is to use a DQL
DELETE query.

Example:

.. code-block:: php

    <?php
    $q = $em->createQuery('delete from MyProject\Model\Manager m where m.salary > 100000');
    $numDeleted = $q->execute();

Iterating results
~~~~~~~~~~~~~~~~~

An alternative solution for bulk deletes is to use the
``Query#iterate()`` facility to iterate over the query results step
by step instead of loading the whole result into memory at once.
The following example shows how to do this:

.. code-block:: php

    <?php
    $batchSize = 20;
    $i = 0;
    $q = $em->createQuery('select u from MyProject\Model\User u');
    $iterableResult = $q->iterate();
    while (($row = $iterableResult->next()) !== false) {
        $em->remove($row[0]);
        ++$i; // Count the row first so the first flush happens after a full batch.
        if (($i % $batchSize) === 0) {
            $em->flush(); // Executes all deletions.
            $em->clear(); // Detaches all objects from Doctrine!
        }
    }
    $em->flush(); // Executes the deletions of the last, possibly partial batch.

.. note::

    Iterating results is not possible with queries that
    fetch-join a collection-valued association. The nature of such SQL
    result sets is not suitable for incremental hydration.

Iterating Large Results for Data-Processing
-------------------------------------------

You can also use the ``iterate()`` method simply to iterate over a large
result, without any intention to UPDATE or DELETE. The ``IterableResult``
instance returned from ``$query->iterate()`` implements the
``Iterator`` interface, so you can process a large result without memory
problems using the following approach:

.. code-block:: php

    <?php
    $q = $this->_em->createQuery('select u from MyProject\Model\User u');
    $iterableResult = $q->iterate();
    foreach ($iterableResult as $row) {
        // do stuff with the data in the row, $row[0] is always the object

        // detach from Doctrine, so that it can be Garbage-Collected immediately
        $this->_em->detach($row[0]);
    }

.. note::

    Iterating results is not possible with queries that
    fetch-join a collection-valued association. The nature of such SQL
    result sets is not suitable for incremental hydration.

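Because the row-by-row loop relies only on the ``Iterator`` interface,
it can be factored into a function that is testable without a database.
The sketch below is an illustration under assumptions:
``processEachRow()`` and its callable parameters are hypothetical names,
and an ``ArrayIterator`` stands in for the ``IterableResult``.

.. code-block:: php

    <?php
    /**
     * Hypothetical helper (not a Doctrine API): walks any Iterator whose
     * rows carry the object at index 0, passes each object to $handle and
     * then to $release. Returns the number of rows processed.
     */
    function processEachRow(Iterator $rows, callable $handle, callable $release): int
    {
        $count = 0;
        foreach ($rows as $row) {
            $object = $row[0];
            $handle($object);
            $release($object); // e.g. detach the object so it can be freed
            ++$count;
        }

        return $count;
    }

    // An ArrayIterator stands in for the IterableResult in this sketch.
    $rows = new ArrayIterator([[new stdClass()], [new stdClass()], [new stdClass()]]);
    $handled = 0;
    $released = 0;
    $processed = processEachRow(
        $rows,
        function ($object) use (&$handled) { ++$handled; },
        function ($object) use (&$released) { ++$released; }
    );

With Doctrine, ``$rows`` would be the result of ``$q->iterate()`` and
``$release`` would detach the object from the ``EntityManager``, as in
the example above.
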