Location: PHPKode > projects > PHPCrawl > PHPCrawl_080/documentation/multiprocessing_modes.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" dir="ltr">
<head>
  <title>PHPCrawl webcrawler library for PHP</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <link type="text/css" rel="stylesheet" media="all" href="style.css" />
</head>

<body>

<div id="wrapper">

  <div id="page">
    
      <div id="top">
        <h1 style="margin: 0px; float: left;">PHPCrawl webcrawler library</h1>
        
        <div style="margin-left: 670px; margin-top: 14px; font-size: 12px;">Docs for version 0.8x</div>
      </div>
      
      <div id="container">
      
        <div id="left">
        
          <ul>
          <li><a href="index.html">About PHPCrawl</a></li>
          <li>
          Documentation
           <ul id="submenu">
           <li><a href="requirements.html">Requirements</a></li>
           <li><a href="quickstart.html">Installation & Quickstart</a></li>
           <li><a href="example.html">Example</a></li>
           <li><a href="multiprocesses.html">Using multi-processes</a></li>
           <li><a href="multiprocessing_modes.html">Multiprocessing Modes</a></li>
           <li><a href="spidering_huge_websites.html">Spidering huge websites</a></li>
           <li><a href="faq.html">FAQ</a></li>
           <li><a href="classreferences/index.html" target="blank"><u>Complete Class References</u></a></li>
           </ul>
          </li>
          
          <li class="fat"><a href="http://sourceforge.net/projects/phpcrawl/files/PHPCrawl/" target="_blank">Download PHPCrawl</a></li>
          <li><a href="testinterface.html">Testinterface</a></li>
          <li><a href="versionhistory.html">Version history</a></li>
          <li><a href="http://sourceforge.net/projects/phpcrawl/forums/forum/307696" target="_blank">Forum</a></li>
          <li><a href="http://sourceforge.net/tracker/?group_id=89439&atid=590146" target="_blank">Report a bug</a></li>
          </ul>
         
         <div id="sf">
         <a href="http://sourceforge.net/projects/phpcrawl"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=89439&amp;type=14" width="150" height="40" alt="Get PHPCrawl at SourceForge.net. Fast, secure and Free Open Source software downloads" /></a>
         </div>
         
         <div id="sf">
         <form action="https://www.paypal.com/cgi-bin/webscr" method="post">
         <input type="hidden" name="cmd" value="_s-xclick">
         <input type="hidden" name="hosted_button_id" value="M53G4LP6XNHM4">
         <input type="image" src="https://www.paypalobjects.com/en_US/i/btn/btn_donate_SM.gif" border="0" name="submit" alt="PayPal - The safer, easier way to pay online!">
         <img alt="" border="0" src="https://www.paypalobjects.com/de_DE/i/scr/pixel.gif" width="1" height="1">
         </form>
         </div>

        </div>
        
        <div id="content">
        <h3>Multiprocessing modes</h3><br />
        
        PHPCrawl supports two different types of multiprocessing.<br /><br />
        
        The first one and the default is "<b>MPMODE_PARENT_EXECUTES_USERCODE</b>". When running this mode, the overridable function handleDocumentInfo() containing
        usercode always gets executed by the main-process of the crawler. This means that the code provided by the user never gets executed simultaneously and so you
        don't have to care about concurrent file/database/handle-accesses or similar things. This is the <b>recommended</b> multiprocessing-mode.<br /><br />
        
        Example of starting the crawler in this mode:<br />
        
        <p id ="code">
        <span style="color: #000000">
        <span style="color: #0000BB">
        $crawler</span><span style="color: #007700">-&gt;</span><span style="color: #0000BB">goMultiProcessed</span><span style="color: #007700">(</span><span style="color: #0000BB">5</span><span style="color: #007700">,<br /></span><span style="color: #0000BB">PHPCrawlerMultiProcessModes</span><span style="color: #007700">::</span><span style="color: #0000BB">MPMODE_PARENT_EXECUTES_USERCODE</span><span style="color: #007700">);
        <br /></span>
        </span>
        </p>
        
        or simply
        
        <p id ="code">
        <span style="color: #000000">
        <span style="color: #0000BB">$crawler</span><span style="color: #007700">-&gt;</span><span style="color: #0000BB">goMultiProcessed</span><span style="color: #007700">(</span><span style="color: #0000BB">5</span><span style="color: #007700">);
        <br /></span>
        </span>
        </p>
        
        <br />
        
        The second one is "<b>MPMODE_CHILDS_EXECUTES_USERCODE</b>".<br />
        In this mode all used child-processes are calling the function handleDocumentInfo() directly from their process-context, so the code you provided to the overridden
        method handleDocumentInfo() probably will be executed simultaneously by the different child-processes. This may result in a better performance, but you always should take care
        of concurrent file/data/handle-accesses and all other typical things to take care of when using parallel-computing.<br /><br />
        
        When using the "MPMODE_CHILDS_EXECUTES_USERCODE" mode and you use any handles like database-connections or filestreams in your extended crawler-class, you should open
        them within the overridden mehtod <a href="classreferences/PHPCrawler/method_detail_tpl_method_initChildProcess.htm" target="blank">initChildProcess()</a> instead of opening them from the constructor.
        For more details see the documentation of the <a href="classreferences/PHPCrawler/method_detail_tpl_method_initChildProcess.htm" target="blank">initChildProcess()</a>-method.<br /><br />

        Example of starting the crawler in this mode:<br />
        <p id ="code">
        <span style="color: #000000">
        <span style="color: #0000BB">
        $crawler</span><span style="color: #007700">-&gt;</span><span style="color: #0000BB">goMultiProcessed</span><span style="color: #007700">(</span><span style="color: #0000BB">5</span><span style="color: #007700">,<br /></span><span style="color: #0000BB">PHPCrawlerMultiProcessModes</span><span style="color: #007700">::</span><span style="color: #0000BB">MPMODE_CHILDS_EXECUTES_USERCODE</span><span style="color: #007700">);
        <br /></span>
        </span>
        </p>
        </div>
        
      </div>
  
  </div>
  
  
  
</div>

</body>
</html>
Return current item: PHPCrawl