<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Fast Chinese Word Segmentation</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css">
body {
margin: 5px;
background-color: #FFFFFF;
color: #000000;
font-family: Arial, SimSun;
font-size: 100%;
}
td {
padding-right: 2ex;
background: #FFFFFF;
text-align: right;
font-size: 100%;
}
a {
color: #004080;
text-decoration: none;
}
a:visited {
color: #800080;
text-decoration: none;
}
a:hover {
color: #FF0000;
text-decoration: underline;
}
.header_title {
margin-bottom: 0.2em;
text-align: center;
font-size: x-large;
font-weight: bold;
}
.header_subtitle {
text-align: center;
line-height: 2.5ex;
font-size: small;
}
h1 {
margin-bottom: 0.5ex;
color: #006699;
font-size: large;
font-weight: bold;
}
h2 {
margin-bottom: 0.5ex;
color: #006699;
font-size: medium;
font-weight: bold;
}
pre {
padding: 1ex;
border: 1px solid #999999;
background-color: #EEEEEE;
color: #000000;
font-family: "Courier New";
}
</style>
</head>
<body>
<div class="header_title">Fast Chinese Word Segmentation</div>
<div class="header_subtitle">
Current version: <b>0.05.2</b>, last updated: <b>August 11, 2005</b><br>
Wudi &lt;<a href="mailto:wudicgi@yahoo.de">wudicgi-at-yahoo.de</a>&gt;, <a href="http://spaces.msn.com/members/wudicgi" target="_blank">MSN Space</a>
</div>
<br>
<h1>Name</h1>
<p>Fast Chinese Word Segmentation - fast segmentation of Chinese text into words</p>
<h1>Synopsis</h1>
<pre>include_once 'cwordseg_fast.lib.php';

// Sample Chinese sentence to segment
$str = '你不就是这样一天一天晃过来的嘛';

$Segmentation = new Segmentation;
// Load the dictionary file
$Segmentation->load('cwordict_fast.tab');
// Do not convert text to lowercase
$Segmentation->setLowercase(FALSE);
// Also segment English text
$Segmentation->setSegmentEnglish(TRUE);

$result = $Segmentation->segmentString($str);
echo $result;</pre>
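<h1>Introduction</h1>
<p>This class segments Chinese text using the reverse maximum matching (RMM) method, so some segmentation errors cannot be entirely avoided. The class can also perform simple processing of English text.</p>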
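<p>To illustrate the idea behind reverse maximum matching, here is a minimal sketch. It is not the code from cwordseg_fast.lib.php: the function name <code>rmmSegment</code>, the toy dictionary, and the <code>$maxLen</code> parameter are assumptions made for this example, and it splits UTF-8 text with PCRE instead of using the library's own byte handling.</p>
<pre>// Hypothetical helper for illustration; not part of cwordseg_fast.lib.php.
function rmmSegment($text, $dict, $maxLen)
{
    // Split UTF-8 text into single characters (assumes PCRE with UTF-8 support).
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    $words = array();
    $end = count($chars);
    while ($end > 0) {
        // Try the longest candidate ending at $end first, then shrink it.
        $len = min($maxLen, $end);
        while ($len > 1) {
            $candidate = implode('', array_slice($chars, $end - $len, $len));
            if (isset($dict[$candidate])) {
                break;
            }
            $len--;
        }
        // $len is 1 when no dictionary word matched: emit the single character.
        array_unshift($words, implode('', array_slice($chars, $end - $len, $len)));
        $end -= $len;
    }
    return $words;
}

// Toy dictionary (assumed); words are stored as array keys for fast lookup.
$dict = array_flip(array('研究', '研究生', '生命', '起源'));
echo implode(' ', rmmSegment('研究生命的起源', $dict, 3));
// prints: 研究 生命 的 起源</pre>
<p>Scanning from the end of the string keeps 生命 together in this sentence. With the same dictionary, a forward scan would first match 研究生 and leave 命 as a stray single character. Other sentences can still be split incorrectly, which is why errors cannot be entirely avoided.</p>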
<h1>Speed</h1>
<table cellSpacing="1" cellPadding="4" border="0" style="margin-top: 2ex; width: 82ex; background-color: #C0C0C0;">
<tr>
<td style="background-color: #EEEEEE;">Ratio of Chinese</td>
<td style="background-color: #EEEEEE;">File Size</td>
<td style="background-color: #EEEEEE;">Time</td>
<td style="background-color: #EEEEEE;">Speed</td>
<td style="background-color: #EEEEEE;">Change</td>
</tr>
<tr>
<td colspan="5" style="text-align: left;"><b>v0.05.2</b>, P-M 1.6G, WinXP SP2, PHP 5.0.4, CLI, default options, dictionary size: 73,270 words</td>
</tr>
<tr>
<td>99%</td>
<td>211KB</td>
<td>2.65s</td>
<td>79.62KB/s</td>
<td>- 02.16KB/s</td>
</tr>
<tr>
<td>39%</td>
<td>213KB</td>
<td>2.05s</td>
<td>103.90KB/s</td>
<td>+ 12.09KB/s</td>
</tr>
<tr>
<td>0%</td>
<td>413KB</td>
<td>3.25s</td>
<td>127.08KB/s</td>
<td>+ 20.09KB/s</td>
</tr>
<tr>
<td colspan="5" style="text-align: left;"><b>v0.05.0 - v0.05.1</b>, P-M 1.6G, WinXP SP2, PHP 5.0.4, CLI, default options, dictionary size: 73,270 words</td>
</tr>
<tr>
<td>99%</td>
<td>211KB</td>
<td>2.58s</td>
<td>81.78KB/s</td>
<td>+ 00.00KB/s</td>
</tr>
<tr>
<td>39%</td>
<td>213KB</td>
<td>2.32s</td>
<td>91.81KB/s</td>
<td>+ 00.00KB/s</td>
</tr>
<tr>
<td>0%</td>
<td>413KB</td>
<td>3.86s</td>
<td>106.99KB/s</td>
<td>+ 00.00KB/s</td>
</tr>
</table>
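<p>The Speed and Change columns are consistent with simple arithmetic: speed is file size divided by time, rounded to two decimals, and change is the difference between the rounded speeds of the two versions. The sketch below recomputes the v0.05.2 rows under that assumption; the helper name <code>speedKbPerSec</code> is made up for this example.</p>
<pre>// Hypothetical helper for illustration only.
function speedKbPerSec($sizeKb, $seconds)
{
    // File size divided by elapsed time, rounded to two decimals.
    return round($sizeKb / $seconds, 2);
}

// Rows copied from the table:
// Chinese ratio, size in KB, time for v0.05.2, time for v0.05.0 - v0.05.1.
$rows = array(
    array('99%', 211, 2.65, 2.58),
    array('39%', 213, 2.05, 2.32),
    array('0%',  413, 3.25, 3.86),
);
foreach ($rows as $row) {
    list($ratio, $sizeKb, $newTime, $oldTime) = $row;
    $newSpeed = speedKbPerSec($sizeKb, $newTime);
    $oldSpeed = speedKbPerSec($sizeKb, $oldTime);
    // Change is the difference between the two rounded speeds.
    printf("%s: %.2f KB/s (%+.2f KB/s)\n", $ratio, $newSpeed, $newSpeed - $oldSpeed);
}</pre>
<p>Taking the difference of the rounded speeds matters for the last row: the unrounded difference would come out as 20.08KB/s instead of the 20.09KB/s shown in the table.</p>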
<h1>Methods</h1>
<h2>bool load ( string filename )</h2>
<h2>bool setLowercase ( bool enable )</h2>
<h2>bool setSegmentEnglish ( bool enable )</h2>
<h2>string segmentString ( string data )</h2>
<h2>string segmentFile ( string filename )</h2>
<p></p>
<h1>History</h1>
<h2>v0.05.2 (08/11/2005)</h2>
<ul type="square">
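<li> Added the method getDictName()</li>
<li> Added the private method _segmentLines()</li>
<li> The dictionary format has changed: the first line now describes the dictionary type and name, so the dictionary file must be replaced or the program will not work</li>
<li> The value of the low byte is no longer checked; any character whose high byte has a value greater than 0x81 is treated as a Chinese character</li>
<li> Some minor changes</li>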
</ul>
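<p>The high-byte rule from this release can be pictured with the following sketch. It is an illustration under assumptions, not code taken from the library: the function name <code>splitDoubleByte</code> is hypothetical, and the guard against a truncated trailing byte is an addition made here for safety.</p>
<pre>// Hypothetical helper for illustration; not taken from cwordseg_fast.lib.php.
function splitDoubleByte($bytes)
{
    $chars = array();
    $i = 0;
    $n = strlen($bytes);
    while ($n > $i) {
        // A high byte above 0x81 starts a two-byte character; the low byte
        // is not inspected. The second test only guards against a truncated
        // final byte (an assumption added for this sketch).
        if (ord($bytes[$i]) > 0x81 and $n > $i + 1) {
            $chars[] = substr($bytes, $i, 2);
            $i += 2;
        } else {
            $chars[] = $bytes[$i];
            $i += 1;
        }
    }
    return $chars;
}

// "\xD6\xD0\xCE\xC4" is the GB2312 byte sequence for 中文.
$parts = splitDoubleByte("a\xD6\xD0\xCE\xC4b");
echo count($parts);
// prints: 4</pre>
<p>Skipping the low-byte check saves one comparison per character, at the cost of accepting byte pairs that are not valid Chinese characters.</p>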
<h2>v0.05.1 (08/04/2005)</h2>
<ul type="square">
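<li> The dictionary file is now checked for existence before it is loaded</li>
<li> Method names now start with a lowercase first word, with each following word capitalized</li>
<li> Some minor changes</li>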
</ul>
<h2>v0.05.0 (07/12/2005)</h2>
<ul type="square">
<li> Initial public beta release</li>
</ul>
<h1>Author</h1>
<p>2005, Wudi &lt;<a href="mailto:wudicgi@yahoo.de">wudicgi-at-yahoo.de</a>&gt;, <a href="http://spaces.msn.com/members/wudicgi" target="_blank">MSN Space</a></p>
<br>
</body>
</html>