<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>-Flyぁ梦- &#187; 字符串</title>
	<atom:link href="http://blog.11034.org/tag/%e5%ad%97%e7%ac%a6%e4%b8%b2/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.11034.org</link>
	<description></description>
	<lastBuildDate>Sun, 22 Jun 2025 08:59:05 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.2.38</generator>
	<item>
		<title>A roundup of string data structures and algorithms</title>
		<link>http://blog.11034.org/2012-12/string.html</link>
		<comments>http://blog.11034.org/2012-12/string.html#comments</comments>
		<pubDate>Thu, 06 Dec 2012 13:27:52 +0000</pubDate>
		<dc:creator><![CDATA[-Flyぁ梦-]]></dc:creator>
				<category><![CDATA[ACM]]></category>
		<category><![CDATA[数据结构和算法]]></category>
		<category><![CDATA[字符串]]></category>
		<category><![CDATA[树]]></category>

		<guid isPermaLink="false">http://blog.stariy.org/?p=1412</guid>
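		<description><![CDATA[Strings look simple, but they involve a surprising number of data structures and algorithms, and mastering all of them is genuinely hard. Here I list [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Strings look simple, but they involve a surprising number of data structures and algorithms, and mastering all of them is genuinely hard. Here I list the ones I ran into while solving problems on ZOJ and walk through each with problems I have actually done. For the full description and code of each algorithm and data structure, please search on your own.<span id="more-1412"></span></p>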
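<h2>Algorithms</h2>
<h3>The KMP algorithm</h3>
<p>In everyday projects, the most common string tasks are matching and searching. KMP preprocesses the short (pattern) string in O(M) time and then scans the long (text) string in O(N) time, which makes it the classic string-matching algorithm; the way it precomputes the pattern's next array also generalizes well to other problems. KMP is not easy to grasp at first. I recommend searching Baidu for the article “Matrix67 KMP”, which is relatively easy to follow. Write KMP yourself once, then solve a few OJ problems with it, and it starts to click.</p>
<p><strong>zoj 2177 Period</strong> : find every prefix that is built from non-overlapping repetitions of a substring. This uses the idea behind the next function.</p>
<p><strong>zoj 2177 Crack</strong> : KMP over int values. After a match is found, keep matching forward and backward from it and compare to find the maximum.</p>
<p><strong>zoj 3587 Marlon&#8217;s String</strong> : given strings A and B, count the ways to get A[i1, j1] + A[i2, j2] = B. In KMP, every match between i and j means a prefix has been matched; suffixes are handled the same way as prefixes by reversing the strings.</p>
<p><strong>zoj 3643 Keep Deleting</strong> : KMP combined with a stack.</p>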
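<p>As a rough illustration of the next-array idea described in this section, here is a minimal C++ sketch. It is not the code from any of the problems listed: the names build_failure and kmp_search are my own, and the table is 0-indexed, which may differ from the next array used in other write-ups.</p>

```cpp
#include <string>
#include <vector>

// fail[i] = length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
std::vector<int> build_failure(const std::string& pattern) {
    int m = static_cast<int>(pattern.size());
    std::vector<int> fail(m, 0);
    int k = 0;
    for (int i = 1; i < m; ++i) {
        while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        fail[i] = k;
    }
    return fail;
}

// Returns the start index of every (possibly overlapping) occurrence
// of pattern in text. An empty pattern yields no matches.
std::vector<int> kmp_search(const std::string& text, const std::string& pattern) {
    std::vector<int> hits;
    int m = static_cast<int>(pattern.size());
    if (m == 0) return hits;
    std::vector<int> fail = build_failure(pattern);
    int k = 0;  // number of pattern characters currently matched
    for (int i = 0; i < static_cast<int>(text.size()); ++i) {
        while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == m) {
            hits.push_back(i + 1 - m);
            k = fail[k - 1];  // keep going so overlapping matches are found
        }
    }
    return hits;
}
```

<p>For example, kmp_search("abababa", "aba") reports the overlapping matches at positions 0, 2 and 4.</p>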
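<h3>Minimal representation</h3>
<p>The lexicographically smallest rotation of a string. It can be used to decide whether strings are cyclically isomorphic (rotations of one another).</p>
<p>I understood it at first and have since forgotten the details, but knowing the algorithm exists is enough to plug it in when needed, which seems fairly easy.</p>
<p><strong>zoj 1729 Hidden Password</strong> : find where the minimal representation of a string starts in the original string.</p>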
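<p>A small sketch of the usual two-pointer way to find the minimal representation, assuming the goal is the start index of the smallest rotation. The function name min_rotation is my own; this is an illustration, not a submitted solution to the problem above.</p>

```cpp
#include <algorithm>
#include <string>

// Returns the start index of the lexicographically smallest rotation of s.
// i and j are two candidate start positions; k is how many characters
// of the two rotations have compared equal so far.
int min_rotation(const std::string& s) {
    int n = static_cast<int>(s.size());
    int i = 0, j = 1, k = 0;
    while (i < n && j < n && k < n) {
        char a = s[(i + k) % n];
        char b = s[(j + k) % n];
        if (a == b) { ++k; continue; }
        // The losing candidate, and the k starts right after it, cannot be minimal.
        if (a > b) i += k + 1; else j += k + 1;
        if (i == j) ++j;
        k = 0;
    }
    return std::min(i, j);
}
```

<p>Two strings of equal length are cyclically isomorphic exactly when their rotations starting at min_rotation are identical, so one call per string plus a comparison is enough.</p>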
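<h3>Manacher's algorithm</h3>
<p>An algorithm for finding the longest palindromic substring of a string. It is both faster and more memory-efficient than the suffix-array approach. I never fully understood it, and this seems to be its only use, so I just barely remember that it exists =.=.</p>
<p><strong>poj 3974 Palindrome</strong> : find the length of the longest palindromic substring.</p>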
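<p>For reference, a compact sketch of Manacher's algorithm that returns only the length of the longest palindromic substring. The function name longest_palindrome and the '#' separator are choices made for this example.</p>

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the longest palindromic substring of s, in O(n) time.
int longest_palindrome(const std::string& s) {
    // Interleave a separator so every palindrome has odd length:
    // "abba" becomes "#a#b#b#a#".
    std::string t = "#";
    for (char c : s) { t += c; t += '#'; }
    int n = static_cast<int>(t.size());
    std::vector<int> radius(n, 0);  // radius[i] = palindrome radius around t[i]
    int center = 0, right = 0, best = 0;
    for (int i = 0; i < n; ++i) {
        // Reuse the mirrored position's result while inside the known palindrome.
        if (i < right) radius[i] = std::min(right - i, radius[2 * center - i]);
        while (i - radius[i] - 1 >= 0 && i + radius[i] + 1 < n &&
               t[i - radius[i] - 1] == t[i + radius[i] + 1]) {
            ++radius[i];
        }
        if (i + radius[i] > right) { center = i; right = i + radius[i]; }
        best = std::max(best, radius[i]);
    }
    // A radius in t equals the palindrome's length in s.
    return best;
}
```

<p>The radius array costs one int per character of the padded string, which is where the memory advantage over a suffix array comes from.</p>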
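<h3>Stack-based processing</h3>
<p>Mostly bracket-matching problems in strings, where a stack is the natural tool.</p>
<p><strong>zoj 1423 (Your)((Term)((Project)))</strong> : remove redundant parentheses.</p>
<p><strong>zoj 2483 Boolean Expressions</strong> : evaluate a boolean expression.</p>
<p><strong>zoj 2704 Brackets</strong> : check whether an expression with round and square brackets is well-formed.</p>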
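<p>The bracket check can be sketched in a few lines. This is a generic illustration with a made-up function name, brackets_balanced; it ignores every character that is not a round or square bracket, which may not match the exact input rules of the problems above.</p>

```cpp
#include <stack>
#include <string>

// True if every '(' and '[' in s is closed by the matching bracket
// in the right order. Other characters are skipped.
bool brackets_balanced(const std::string& s) {
    std::stack<char> open;
    for (char c : s) {
        if (c == '(' || c == '[') {
            open.push(c);
        } else if (c == ')' || c == ']') {
            char expected = (c == ')') ? '(' : '[';
            if (open.empty() || open.top() != expected) return false;
            open.pop();
        }
    }
    return open.empty();  // leftover openers mean unbalanced input
}
```

<p>So "([])" passes, while "([)]" fails because the top of the stack is '[' when ')' arrives.</p>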
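<h2>Data structures</h2>
<h3>Trie</h3>
<p>This is the simplest of the lot, a data structure that is easy to understand and apply. It checks whether a string is in a set of strings, with a lookup costing O(N) time, where N is the length of the query string (Java's HashMap together with String's built-in hashCode method also works quite well here). A trie fits the case where the query is a single word and you want to know whether it matches, as a whole, one of the words in the source set.</p>
<p><strong>zoj 1109 Language of FatMouse</strong> : the trie at its most basic.</p>
<p><strong>zoj 1888 Zipf&#8217;s Law</strong> : find the words that occur exactly N times in a passage. Another ordinary use of a trie; traversing the whole trie at the end works, though it can of course be optimized.</p>
<p><strong>zoj 2346 Shortest Prefixes</strong> : given N strings, shorten each to the shortest prefix that still identifies it uniquely, with no conflicts among the N strings.</p>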
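<p>A minimal trie sketch for whole-word lookup, assuming the words contain only the lowercase letters a to z. The class and member names are invented for this example, and it uses C++11 smart pointers instead of manual cleanup.</p>

```cpp
#include <array>
#include <memory>
#include <string>

class Trie {
private:
    struct Node {
        bool is_word = false;                         // a stored word ends here
        std::array<std::unique_ptr<Node>, 26> child;  // one slot per letter
    };
    Node root_;

public:
    // Inserts word; assumes it contains only 'a'..'z'.
    void insert(const std::string& word) {
        Node* cur = &root_;
        for (char c : word) {
            std::unique_ptr<Node>& next = cur->child[c - 'a'];
            if (!next) next.reset(new Node());
            cur = next.get();
        }
        cur->is_word = true;
    }

    // True only if word was inserted as a whole word (a prefix is not enough).
    bool contains(const std::string& word) const {
        const Node* cur = &root_;
        for (char c : word) {
            cur = cur->child[c - 'a'].get();
            if (cur == nullptr) return false;
        }
        return cur->is_word;
    }
};
```

<p>Both operations touch one node per character, which is the O(N) lookup cost mentioned above. After inserting "cat" and "car", contains("cat") is true but contains("ca") is false.</p>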
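<h3>AC automaton (Aho-Corasick) &#8211; an upgraded trie</h3>
<p>An AC automaton is a trie in which every node gets an extra fail pointer, similar in spirit to KMP's next array: after a mismatch, matching can carry on without starting over. As for why it is called an automaton, that should be easy to see if you have studied the theory of computation. To build one, first build the trie, then add one more pass that sets up the fail pointers by traversing the trie with BFS. During a search, each time a node is matched you also follow the fail pointers up towards the root, looking for nodes marked as the end of a word; every such node is a match.</p>
<p>An AC automaton suits the case of finding a set of words in a long piece of text: the query is one long string, and you search for parts of it that match words in the source set.</p>
<p>Here is an AC automaton implementation I wrote myself: <a title="一个OOP的AC自动机代码" href="/2012-12/ac_automachine.html" target="_blank">Click here</a>.</p>
<p><strong>zoj 3228 Searching the String</strong> : count how many times a string occurs, both with and without overlaps allowed.</p>
<p><strong>zoj 3430 Detect the Virus</strong> : Base64-decode first, then match. A rather nasty problem&#8230; The alphabet here has 256 characters, including &#8217;\0&#8217;, so an insert(char *s, int len) signature works better. The other nasty bit, which I only discovered after getting used to Java, is that char in C++ was signed by default on the judge (its signedness is implementation-defined)&#8230; It has to be converted to unsigned char, otherwise the OJ keeps returning SF (segmentation fault).</p>
<h3>Suffix trees and suffix automata</h3>
<p>Take every suffix of a string as a separate string and insert them all into a trie; the resulting tree, once its single-child chains are compressed, is the suffix tree. Roughly speaking, adding fail-style links in the same way leads to the suffix automaton. Both structures process one long string: finding certain substrings, finding the longest palindromic substring, or finding properties shared by two strings (such as the longest common part). But suffix-tree algorithms and code are fairly complex and seem to be rarely used when solving problems; the simpler suffix array is used instead.</p>
<h3>Suffix array</h3>
<p>A simplified version of the suffix tree that records different properties of the suffixes in a few arrays. In the template code that circulates online, the sa array lists the suffix indices in lexicographic order (sa[rk] = index), and the rank array is its inverse, giving the lexicographic rank of each suffix by index (rank[index] = rk). To emulate the suffix tree there is also the height array, with height[i] = LCP(sa[i - 1], sa[i]), where LCP is the Longest Common Prefix; it plays the role that the Lowest Common Ancestor plays in a suffix tree. All these arrays make a suffix array memory-hungry, so it is not suited to problems with very long strings (over a million characters).</p>
<p>There are two rather intricate algorithms for building a suffix array: the doubling algorithm, at O(NlogN), and DC3, at O(N). I cannot follow either one or write them myself; I use a template found online. The height values are computed by an O(N) algorithm that I do not understand either. The LCP of any two suffixes is the minimum of the height values between their positions in sa (that is, between their rank values), which is an RMQ (Range Minimum Query) problem.</p>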
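<p>To make the sa / rank / height relationship and the range-minimum LCP query concrete, here is a small C++ sketch. It is not the template linked from this post: it uses 0-based arrays, sorts the suffixes naively (far slower than doubling or DC3), and answers the query with a linear scan where a real solution would use an RMQ structure such as a sparse table. The names SuffixInfo, build_suffix_info and lcp are made up for this illustration.</p>

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

struct SuffixInfo {
    std::vector<int> sa;      // sa[k] = start index of the k-th smallest suffix
    std::vector<int> rank;    // rank[i] = position of suffix i inside sa
    std::vector<int> height;  // height[k] = LCP(suffix sa[k-1], suffix sa[k]); height[0] = 0
};

SuffixInfo build_suffix_info(const std::string& s) {
    int n = static_cast<int>(s.size());
    SuffixInfo info;
    info.sa.resize(n);
    info.rank.resize(n);
    info.height.assign(n, 0);
    std::iota(info.sa.begin(), info.sa.end(), 0);
    // Naive comparison sort of the suffixes: for illustration only.
    std::sort(info.sa.begin(), info.sa.end(), [&s](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    for (int k = 0; k < n; ++k) info.rank[info.sa[k]] = k;
    // O(n) height computation: going from suffix i to suffix i+1,
    // the LCP with the predecessor in sa drops by at most one.
    int h = 0;
    for (int i = 0; i < n; ++i) {
        if (info.rank[i] == 0) { h = 0; continue; }
        int j = info.sa[info.rank[i] - 1];
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
        info.height[info.rank[i]] = h;
        if (h > 0) --h;
    }
    return info;
}

// LCP of the suffixes starting at a and b: the minimum height strictly
// after the smaller rank, up to and including the larger rank.
int lcp(const SuffixInfo& info, int a, int b) {
    if (a == b) return static_cast<int>(info.sa.size()) - a;
    int lo = std::min(info.rank[a], info.rank[b]);
    int hi = std::max(info.rank[a], info.rank[b]);
    return *std::min_element(info.height.begin() + lo + 1,
                             info.height.begin() + hi + 1);
}
```

<p>For "banana" the sorted suffixes start at 5, 3, 1, 0, 4, 2, and lcp(info, 1, 3) is 3 because "anana" and "ana" share "ana".</p>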
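<p>For details on suffix arrays, read the paper 《后缀数组—处理字符串的有力工具》 (roughly, "Suffix Arrays: A Powerful Tool for String Processing"), written by a high-school student, respect&#8230; I think some of its code and solutions have problems, but the algorithms are described in great detail and it is well worth reading closely.</p>
<p>Here is a suffix array implementation I found online that works well: <a title="一套可用的后缀数组代码" href="/2012-12/suffix_array.html" target="_blank">Click here</a>.</p>
<p><strong>ural 1297 Palindrome</strong> : find the longest palindromic substring. poj has a similar problem, but its input is too large; it can be solved with the Manacher algorithm mentioned earlier.</p>
<p><strong>zoj 2737 Occurrence</strong> : count the total occurrences in string A of all strings cyclically isomorphic to string B. To deal with the rotation, let B&#8217; = B + B and S = B&#8217; + &#8216;$&#8217; + A, then build a suffix array over S.</p>
<p><strong>zoj 3199 Longest Repeated Substring</strong> : find a substring made up of a repeated block whose copies are adjacent and do not overlap.</p>
<p><strong>zoj 3296 Connecting the Segments</strong> : a truly impressive problem that combines suffix arrays, RMQ, and greedy minimum interval cover; a classic well worth practising. The suffix array here does not find every palindromic substring (it cannot find palindromes properly contained in another: in aabaa, aabaa itself is a palindromic substring and so is the aba properly contained in it, but aba is not found), yet that does not affect the interval cover.</p>
<p><strong>zoj 3395 Stammering Aliens</strong> : find a substring that occurs at least k times, overlaps allowed. The height array has to be split into groups here; many suffix-array problems need this, for the reason mentioned above that an LCP is a minimum over height values.</p>
<p>Learning suffix arrays honestly took me many days. At first I could only apply them by following the standard recipes and published solutions; after a few problems I gradually understood what a suffix array is for and could work out my own ways to use it. Without solving problems and coding, it does not sink in. The painful part is that after a while without practice I forget it again. How do I fix that T_T.</p>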
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-to be continued&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.11034.org/2012-12/string.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A usable suffix array implementation</title>
		<link>http://blog.11034.org/2012-12/suffix_array.html</link>
		<comments>http://blog.11034.org/2012-12/suffix_array.html#comments</comments>
		<pubDate>Thu, 06 Dec 2012 13:14:53 +0000</pubDate>
		<dc:creator><![CDATA[-Flyぁ梦-]]></dc:creator>
				<category><![CDATA[数据结构和算法]]></category>
		<category><![CDATA[字符串]]></category>

		<guid isPermaLink="false">http://blog.stariy.org/?p=1420</guid>
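		<description><![CDATA[The algorithm behind suffix array code is fairly hard, and I have not really studied it and do not plan to. After a long search online I found an implementation that is easy to use and also fairly efficient, reportedly [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>The algorithm behind suffix array code is fairly hard, and I have not really studied it and do not plan to. After a long search online I found an implementation that is easy to use and also fairly efficient; reportedly it uses the doubling algorithm? I cannot follow it =.= <span id="more-1420"></span></p>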

<div class="wp_syntax"><table><tr><td class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">// r : the string converted to an int array; indices 0 .. len-1 hold the source string, r[len] = 0</span>
<span style="color: #666666;">// sa : valid for indices 1 .. len; values 0 .. len-1 are suffix start indices</span>
<span style="color: #666666;">// rank : valid for indices 0 .. len-1; values 1 .. len are the lexicographic ranks of the suffixes</span>
<span style="color: #666666;">// height : valid for indices 2 .. len; height[i] is the LCP of suffixes sa[i] and sa[i-1]</span>
&nbsp;
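<span style="color: #666666;">// da : n is the string length + 1, m is the alphabet size (used by the radix sort)</span>
<span style="color: #666666;">// calh : n is the string length</span>
&nbsp;
<span style="color: #339900;">#include &lt;cstdio&gt;</span>
<span style="color: #339900;">#include &lt;cstring&gt;</span>
<span style="color: #339900;">#include &lt;algorithm&gt;</span>
<span style="color: #0000ff;">using</span> std<span style="color: #008080;">::</span>swap<span style="color: #008080;">;</span>
&nbsp;
<span style="color: #339900;">#define MAX 40010</span>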
<span style="color: #0000ff;">int</span> height<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span>,rank<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span>,r<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span>,sa<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
<span style="color: #0000ff;">int</span> ts<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span>,ta<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span>,tb<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span>,tv<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span>,pos<span style="color: #008080;">;</span>
<span style="color: #0000ff;">bool</span> cmp<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> <span style="color: #000040;">*</span>y,<span style="color: #0000ff;">int</span> a,<span style="color: #0000ff;">int</span> b,<span style="color: #0000ff;">int</span> l<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">return</span> y<span style="color: #008000;">&#91;</span>a<span style="color: #008000;">&#93;</span><span style="color: #000080;">==</span>y<span style="color: #008000;">&#91;</span>b<span style="color: #008000;">&#93;</span><span style="color: #000040;">&amp;&amp;</span>y<span style="color: #008000;">&#91;</span>a<span style="color: #000040;">+</span>l<span style="color: #008000;">&#93;</span><span style="color: #000080;">==</span>y<span style="color: #008000;">&#91;</span>b<span style="color: #000040;">+</span>l<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span>
<span style="color: #666666;">// compute the sa array (the rank array itself is filled in by calh)</span>
<span style="color: #0000ff;">void</span> da<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> n,<span style="color: #0000ff;">int</span> m<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">int</span> i,j,<span style="color: #000040;">*</span>x<span style="color: #000080;">=</span>ta,<span style="color: #000040;">*</span>y<span style="color: #000080;">=</span>tb,p<span style="color: #008080;">;</span>
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>m<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> ts<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> ts<span style="color: #008000;">&#91;</span>x<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>r<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000040;">++</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>m<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> ts<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000040;">+</span><span style="color: #000080;">=</span>ts<span style="color: #008000;">&#91;</span>i<span style="color: #000040;">-</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span>n<span style="color: #000040;">-</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&gt;=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000040;">--</span><span style="color: #008000;">&#41;</span> sa<span style="color: #008000;">&#91;</span><span style="color: #000040;">--</span>ts<span style="color: #008000;">&#91;</span>x<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>i<span style="color: #008080;">;</span>
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>j<span style="color: #000080;">=</span><span style="color: #0000dd;">1</span>,p<span style="color: #000080;">=</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>p<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>j<span style="color: #000040;">*</span><span style="color: #000080;">=</span><span style="color: #0000dd;">2</span>,m<span style="color: #000080;">=</span>p<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
        p<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span>n<span style="color: #000040;">-</span>j<span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> y<span style="color: #008000;">&#91;</span>p<span style="color: #000040;">++</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>i<span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>sa<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000080;">&gt;=</span>j<span style="color: #008000;">&#41;</span> y<span style="color: #008000;">&#91;</span>p<span style="color: #000040;">++</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>sa<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000040;">-</span>j<span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>m<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> ts<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> tv<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>x<span style="color: #008000;">&#91;</span>y<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> ts<span style="color: #008000;">&#91;</span>tv<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000040;">++</span><span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>m<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> ts<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000040;">+</span><span style="color: #000080;">=</span>ts<span style="color: #008000;">&#91;</span>i<span style="color: #000040;">-</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span>n<span style="color: #000040;">-</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&gt;=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000040;">--</span><span style="color: #008000;">&#41;</span> sa<span style="color: #008000;">&#91;</span><span style="color: #000040;">--</span>ts<span style="color: #008000;">&#91;</span>tv<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>y<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
        swap<span style="color: #008000;">&#40;</span>x,y<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
        x<span style="color: #008000;">&#91;</span>sa<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">0</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
        p<span style="color: #000080;">=</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
            <span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>cmp<span style="color: #008000;">&#40;</span>y,sa<span style="color: #008000;">&#91;</span>i<span style="color: #000040;">-</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span>,sa<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span>,j<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span> x<span style="color: #008000;">&#91;</span>sa<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>p<span style="color: #000040;">-</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>
            <span style="color: #0000ff;">else</span> x<span style="color: #008000;">&#91;</span>sa<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>p<span style="color: #000040;">++</span><span style="color: #008080;">;</span>
        <span style="color: #008000;">&#125;</span>
    <span style="color: #008000;">&#125;</span>
<span style="color: #008000;">&#125;</span>
<span style="color: #666666;">// compute the rank and height arrays</span>
<span style="color: #0000ff;">void</span> calh<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> n<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">int</span> i,k,tmp<span style="color: #008080;">;</span>
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;=</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> rank<span style="color: #008000;">&#91;</span>sa<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>i<span style="color: #008080;">;</span>
    k<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span>i<span style="color: #000080;">=</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i<span style="color: #000080;">&lt;</span>n<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
        tmp<span style="color: #000080;">=</span>sa<span style="color: #008000;">&#91;</span>rank<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000040;">-</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
        <span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span><span style="color: #008080;">;</span>r<span style="color: #008000;">&#91;</span>i<span style="color: #000040;">+</span>k<span style="color: #008000;">&#93;</span><span style="color: #000080;">==</span>r<span style="color: #008000;">&#91;</span>tmp<span style="color: #000040;">+</span>k<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>k<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> <span style="color: #008080;">;</span>
        height<span style="color: #008000;">&#91;</span>rank<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#93;</span><span style="color: #000080;">=</span>k<span style="color: #008080;">;</span>
        k<span style="color: #008080;">?</span><span style="color: #000040;">--</span>k<span style="color: #008080;">:</span><span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
    <span style="color: #008000;">&#125;</span>
<span style="color: #008000;">&#125;</span>
&nbsp;
&nbsp;
<span style="color: #0000ff;">int</span> main<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
	<span style="color: #0000ff;">int</span> N, len<span style="color: #008080;">;</span>
	<span style="color: #0000ff;">char</span> line<span style="color: #008000;">&#91;</span>MAX<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
	<span style="color: #0000ff;">while</span><span style="color: #008000;">&#40;</span><span style="color: #0000dd;">scanf</span><span style="color: #008000;">&#40;</span><span style="color: #FF0000;">&quot;%d&quot;</span>, <span style="color: #000040;">&amp;</span>N<span style="color: #008000;">&#41;</span> <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">EOF</span> <span style="color: #000040;">&amp;&amp;</span> N <span style="color: #000080;">&gt;</span> <span style="color: #0000dd;">0</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
		<span style="color: #0000dd;">getchar</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
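		<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span><span style="color: #0000dd;">fgets</span><span style="color: #008000;">&#40;</span>line, MAX, <span style="color: #0000ff;">stdin</span><span style="color: #008000;">&#41;</span> <span style="color: #000080;">==</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">break</span><span style="color: #008080;">;</span>	<span style="color: #666666;">// fgets replaces gets(), which is unsafe and was removed in C++14</span>
		len <span style="color: #000080;">=</span> <span style="color: #0000dd;">strlen</span><span style="color: #008000;">&#40;</span>line<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
		<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>len <span style="color: #000080;">&gt;</span> <span style="color: #0000dd;">0</span> <span style="color: #000040;">&amp;&amp;</span> line<span style="color: #008000;">&#91;</span>len <span style="color: #000040;">-</span> <span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">==</span> <span style="color: #FF0000;">'\n'</span><span style="color: #008000;">&#41;</span> line<span style="color: #008000;">&#91;</span><span style="color: #000040;">--</span>len<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #FF0000;">'\0'</span><span style="color: #008080;">;</span>	<span style="color: #666666;">// fgets keeps the trailing newline, so drop it</span>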
		<span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i <span style="color: #000080;">&lt;</span> len<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> r<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> line<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">-</span> <span style="color: #FF0000;">'a'</span> <span style="color: #000040;">+</span> <span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>	<span style="color: #666666;">// every value in r must be &gt; 0</span>
		r<span style="color: #008000;">&#91;</span>len<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
		da<span style="color: #008000;">&#40;</span>len <span style="color: #000040;">+</span> <span style="color: #0000dd;">1</span>, <span style="color: #0000dd;">27</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span> <span style="color: #666666;">// 27: every value in r must be less than this; r ranges over 1 .. 26 (radix sort)</span>
		calh<span style="color: #008000;">&#40;</span>len<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
	<span style="color: #008000;">&#125;</span>
<span style="color: #008000;">&#125;</span></pre></td></tr></table></div>

]]></content:encoded>
			<wfw:commentRss>http://blog.11034.org/2012-12/suffix_array.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>An OOP-style AC automaton implementation</title>
		<link>http://blog.11034.org/2012-12/ac_automachine.html</link>
		<comments>http://blog.11034.org/2012-12/ac_automachine.html#comments</comments>
		<pubDate>Thu, 06 Dec 2012 12:56:18 +0000</pubDate>
		<dc:creator><![CDATA[-Flyぁ梦-]]></dc:creator>
				<category><![CDATA[数据结构和算法]]></category>
		<category><![CDATA[oop]]></category>
		<category><![CDATA[字符串]]></category>
		<category><![CDATA[树]]></category>

		<guid isPermaLink="false">http://blog.stariy.org/?p=1416</guid>
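		<description><![CDATA[There is plenty of code online, but most of it is in ACMer style and, well, not to be rude, rather lacking in readability and encapsulation&#8230; Perhaps Java [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>There is plenty of code online, but most of it is in ACMer style and, well, not to be rude, rather lacking in readability and encapsulation&#8230; Perhaps that is the one edge a coder who started out in Java still has&#8230; I typed all of this by hand. The build_ac function (which sets up the fail pointers) was written by imitating online tutorials after studying them, and there is a clear() that frees the memory.<span id="more-1416"></span></p>
<p>The AC automaton is built on a trie, and the trie on an alphabet table: the macro SIZE gives the alphabet size and the macro MINCHAR the smallest character in the alphabet. The definitions here only suit the 26 lowercase English letters.</p>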

<div class="wp_syntax"><table><tr><td class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #339900;">#define SIZE 26</span>
<span style="color: #339900;">#define MINCHAR ('a')</span>
<span style="color: #0000ff;">struct</span> TrieNode<span style="color: #008000;">&#123;</span>
	<span style="color: #0000ff;">int</span> tail<span style="color: #008080;">;</span>		<span style="color: #666666;">// marks this node as the end of some string</span>
	TrieNode<span style="color: #000040;">*</span> fail<span style="color: #008080;">;</span>		<span style="color: #666666;">// fail pointer</span>
	TrieNode<span style="color: #000040;">*</span> nodes<span style="color: #008000;">&#91;</span>SIZE<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>	<span style="color: #666666;">// array of child node pointers</span>
	TrieNode<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
		<span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i <span style="color: #000080;">&lt;</span> SIZE<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span> nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008080;">;</span>
		tail <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span>, fail <span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008080;">;</span>
	<span style="color: #008000;">&#125;</span>
	<span style="color: #0000ff;">void</span> clear<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
		<span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i <span style="color: #000080;">&lt;</span> SIZE<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
			<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
				nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>clear<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
				<span style="color: #0000dd;">delete</span> nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
				nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008080;">;</span>
			<span style="color: #008000;">&#125;</span>
		<span style="color: #008000;">&#125;</span>
	<span style="color: #008000;">&#125;</span>
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span>
<span style="color: #0000ff;">struct</span> Trie<span style="color: #008000;">&#123;</span>
	TrieNode<span style="color: #000040;">*</span> root<span style="color: #008080;">;</span>
&nbsp;
	Trie<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span> root <span style="color: #000080;">=</span> <span style="color: #0000dd;">new</span> TrieNode<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span> <span style="color: #008000;">&#125;</span>
&nbsp;
	<span style="color: #666666;">//clear the Trie and free its memory (the root node itself is kept)</span>
	<span style="color: #0000ff;">void</span> clear<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span> root<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>clear<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span> <span style="color: #008000;">&#125;</span>
&nbsp;
	<span style="color: #666666;">//insert a string</span>
	<span style="color: #0000ff;">void</span> insert<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">char</span> <span style="color: #000040;">*</span>s, <span style="color: #0000ff;">int</span> len<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
		TrieNode<span style="color: #000040;">*</span> N <span style="color: #000080;">=</span> root<span style="color: #008080;">;</span>
		<span style="color: #0000ff;">int</span> idx<span style="color: #008080;">;</span>
		<span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i <span style="color: #000080;">&lt;</span> len<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
			idx <span style="color: #000080;">=</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">char</span><span style="color: #008000;">&#41;</span>s<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">-</span> MINCHAR<span style="color: #008080;">;</span>
			<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>N<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>idx<span style="color: #008000;">&#93;</span> <span style="color: #000080;">==</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
				N<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>idx<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">new</span> TrieNode<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
			<span style="color: #008000;">&#125;</span>
			N <span style="color: #000080;">=</span> N<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>idx<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
		<span style="color: #008000;">&#125;</span>
		N<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>tail <span style="color: #000080;">=</span> <span style="color: #0000dd;">1</span><span style="color: #008080;">;</span>
	<span style="color: #008000;">&#125;</span>
&nbsp;
	<span style="color: #666666;">//after all strings are inserted, build the fail pointers</span>
	<span style="color: #0000ff;">void</span> build_ac<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
		queue<span style="color: #000080;">&lt;</span>TrieNode<span style="color: #000040;">*</span><span style="color: #000080;">&gt;</span> q<span style="color: #008080;">;</span>
		TrieNode <span style="color: #000040;">*</span>nd, <span style="color: #000040;">*</span>child, <span style="color: #000040;">*</span>pt<span style="color: #008080;">;</span>
		root<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail <span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008080;">;</span>
		q.<span style="color: #007788;">push</span><span style="color: #008000;">&#40;</span>root<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
		<span style="color: #0000ff;">while</span><span style="color: #008000;">&#40;</span><span style="color: #000040;">!</span>q.<span style="color: #007788;">empty</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
			nd <span style="color: #000080;">=</span> q.<span style="color: #007788;">front</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
			q.<span style="color: #007788;">pop</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
			pt <span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008080;">;</span>
			<span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i <span style="color: #000080;">&lt;</span> SIZE<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
				child <span style="color: #000080;">=</span> nd<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
				<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>child <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
					<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>nd <span style="color: #000080;">==</span> root<span style="color: #008000;">&#41;</span> child<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail <span style="color: #000080;">=</span> root<span style="color: #008080;">;</span>
					<span style="color: #0000ff;">else</span><span style="color: #008000;">&#123;</span>
						pt <span style="color: #000080;">=</span> nd<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail<span style="color: #008080;">;</span>
						<span style="color: #0000ff;">while</span><span style="color: #008000;">&#40;</span>pt <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
							<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>pt<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
								child<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail <span style="color: #000080;">=</span> pt<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
								<span style="color: #0000ff;">break</span><span style="color: #008080;">;</span>
							<span style="color: #008000;">&#125;</span>
							pt <span style="color: #000080;">=</span> pt<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail<span style="color: #008080;">;</span>
						<span style="color: #008000;">&#125;</span>
						<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>pt <span style="color: #000080;">==</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span> child<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail <span style="color: #000080;">=</span> root<span style="color: #008080;">;</span>
					<span style="color: #008000;">&#125;</span>
					q.<span style="color: #007788;">push</span><span style="color: #008000;">&#40;</span>child<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
				<span style="color: #008000;">&#125;</span>
			<span style="color: #008000;">&#125;</span>
		<span style="color: #008000;">&#125;</span>
	<span style="color: #008000;">&#125;</span>
&nbsp;
	<span style="color: #666666;">//after the fail pointers are built, search the given text for the inserted strings</span>
	<span style="color: #0000ff;">void</span> process<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">char</span> <span style="color: #000040;">*</span>s, <span style="color: #0000ff;">int</span> len<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
		TrieNode <span style="color: #000040;">*</span>nd <span style="color: #000080;">=</span> root, <span style="color: #000040;">*</span>nd2<span style="color: #008080;">;</span>
		<span style="color: #0000ff;">int</span> idx<span style="color: #008080;">;</span>
		<span style="color: #0000ff;">for</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>i <span style="color: #000080;">&lt;</span> len<span style="color: #008080;">;</span>i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
			idx <span style="color: #000080;">=</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">char</span><span style="color: #008000;">&#41;</span>s<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">-</span> MINCHAR<span style="color: #008080;">;</span>
			<span style="color: #0000ff;">while</span><span style="color: #008000;">&#40;</span>nd <span style="color: #000040;">!</span><span style="color: #000080;">=</span> root <span style="color: #000040;">&amp;&amp;</span> nd<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>idx<span style="color: #008000;">&#93;</span> <span style="color: #000080;">==</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span> nd <span style="color: #000080;">=</span> nd<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail<span style="color: #008080;">;</span>
			<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>nd<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>idx<span style="color: #008000;">&#93;</span> <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span>
				nd <span style="color: #000080;">=</span> nd<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>nodes<span style="color: #008000;">&#91;</span>idx<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
				nd2 <span style="color: #000080;">=</span> nd<span style="color: #008080;">;</span>
				<span style="color: #0000ff;">while</span><span style="color: #008000;">&#40;</span>nd2 <span style="color: #000040;">!</span><span style="color: #000080;">=</span> root<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
					<span style="color: #0000ff;">if</span><span style="color: #008000;">&#40;</span>nd2<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>tail <span style="color: #000080;">==</span> <span style="color: #0000dd;">1</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#123;</span>
						<span style="color: #666666;">//found a match</span>
					<span style="color: #008000;">&#125;</span>
					nd2 <span style="color: #000080;">=</span> nd2<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>fail<span style="color: #008080;">;</span>
				<span style="color: #008000;">&#125;</span>
			<span style="color: #008000;">&#125;</span>
		<span style="color: #008000;">&#125;</span>
	<span style="color: #008000;">&#125;</span>
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span></pre></td></tr></table></div>
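<p>The process function above leaves the "found a match" branch empty. As a rough sketch of one way to fill it in, the self-contained variant below counts how many times any inserted pattern occurs in a text. It is not the code from this post: the AhoCorasick struct, its index-based node storage, and the names kSize, ends, build and count are assumptions made for illustration, and it still assumes patterns made of lowercase a-z.</p>

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical compact variant: nodes live in a vector and are addressed by index.
struct AhoCorasick {
    static const int kSize = 26;           // alphabet size, same role as SIZE
    struct Node {
        int next[kSize];                   // child index per character, -1 if absent
        int fail;                          // index of the fail node (0 is the root)
        int ends;                          // number of patterns ending exactly here
        Node() : fail(0), ends(0) { for (int i = 0; i < kSize; i++) next[i] = -1; }
    };
    std::vector<Node> nodes;

    AhoCorasick() : nodes(1) {}            // node 0 is the root

    // Insert one pattern; assumes every character is in 'a'..'z'.
    void insert(const std::string& s) {
        int cur = 0;
        for (std::size_t i = 0; i < s.size(); i++) {
            int idx = s[i] - 'a';
            if (nodes[cur].next[idx] == -1) {
                int created = static_cast<int>(nodes.size());
                nodes.push_back(Node());
                nodes[cur].next[idx] = created;
            }
            cur = nodes[cur].next[idx];
        }
        nodes[cur].ends++;
    }

    // Breadth-first pass that sets the fail index of every node.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < kSize; c++) {
            int child = nodes[0].next[c];
            if (child != -1) { nodes[child].fail = 0; q.push(child); }
        }
        while (!q.empty()) {
            int nd = q.front();
            q.pop();
            for (int c = 0; c < kSize; c++) {
                int child = nodes[nd].next[c];
                if (child == -1) continue;
                int pt = nodes[nd].fail;
                while (pt != 0 && nodes[pt].next[c] == -1) pt = nodes[pt].fail;
                int target = nodes[pt].next[c];
                nodes[child].fail = (target != -1) ? target : 0;
                q.push(child);
            }
        }
    }

    // Count occurrences of all patterns in text (overlaps included).
    int count(const std::string& text) const {
        int cur = 0, total = 0;
        for (std::size_t i = 0; i < text.size(); i++) {
            int idx = text[i] - 'a';
            if (idx < 0 || idx >= kSize) { cur = 0; continue; }  // outside the alphabet: restart at the root
            while (cur != 0 && nodes[cur].next[idx] == -1) cur = nodes[cur].fail;
            if (nodes[cur].next[idx] != -1) cur = nodes[cur].next[idx];
            for (int t = cur; t != 0; t = nodes[t].fail) total += nodes[t].ends;
        }
        return total;
    }
};
```

<p>For example, after inserting he, she, his and hers and calling build(), count("ushers") returns 3 (she, he and hers). Keeping the nodes in a vector means no manual clear() is needed, at the cost of working with indices instead of pointers.</p>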

]]></content:encoded>
			<wfw:commentRss>http://blog.11034.org/2012-12/ac_automachine.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sensitive-word filtering with a Trie implemented in PHP</title>
		<link>http://blog.11034.org/2012-07/trie_in_php.html</link>
		<comments>http://blog.11034.org/2012-07/trie_in_php.html#comments</comments>
		<pubDate>Tue, 03 Jul 2012 14:43:25 +0000</pubDate>
		<dc:creator><![CDATA[-Flyぁ梦-]]></dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Trie]]></category>
		<category><![CDATA[UTF-8]]></category>
		<category><![CDATA[Strings]]></category>
		<category><![CDATA[Data structures]]></category>

		<guid isPermaLink="false">http://blog.stariy.org/?p=1174</guid>
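		<description><![CDATA[The project required sensitive-word filtering. Managing the sensitive words themselves is just a simple CRUD module; the harder part is detecting sensitive words in all kinds of input. [&#8230;]]]></description>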
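				<content:encoded><![CDATA[<p>The project required sensitive-word filtering. Managing the sensitive words themselves is just a simple CRUD module; the harder part is detecting sensitive words in all kinds of input. A Trie is a fairly common way to do this. I had never had a chance to use this data structure before, so I took the opportunity to write one.</p>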
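<p>Since the implementation is in PHP, associative arrays are very convenient to use. The first problem to solve is the character set. In Java this is easy because strings are uniformly Unicode; PHP code commonly works with UTF-8, where a character takes 1 to 4 bytes, so I wrote a Util class that converts an ordinary UTF-8 string into an array of characters, each element being the byte sequence of one UTF-8 character. This is fairly easy to implement by following the UTF-8 encoding format.<br />
<span id="more-1174"></span></p>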

<div class="wp_syntax"><table><tr><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> static <span style="color: #000000; font-weight: bold;">function</span> get_chars<span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$s</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$utf8_str</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$len</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$s</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$len</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">return</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$chars</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$s</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$n</span> <span style="color: #339933;">=</span> <span style="color: #990000;">ord</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$c</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">7</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>		<span style="color: #666666; font-style: italic;">//0xxx xxxx, ASCII, single byte</span>
			<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$c</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #b1b100;">else</span> <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">15</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span> 	<span style="color: #666666; font-style: italic;">//1111 xxxx, first byte of a four-byte char</span>
			<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$c</span><span style="color: #339933;">.</span><span style="color: #000088;">$s</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">.</span><span style="color: #000088;">$s</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">.</span><span style="color: #000088;">$s</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">3</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
				<span style="color: #000088;">$i</span> <span style="color: #339933;">+=</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #b1b100;">else</span> <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">5</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">7</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span> 	<span style="color: #666666; font-style: italic;">//111x xxxx, first byte of a three-byte char</span>
			<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$c</span><span style="color: #339933;">.</span><span style="color: #000088;">$s</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">.</span><span style="color: #000088;">$s</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
				<span style="color: #000088;">$i</span> <span style="color: #339933;">+=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #b1b100;">else</span> <span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">6</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span> 	<span style="color: #666666; font-style: italic;">//11xx xxxx, first byte of a two-byte char</span>
			<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$c</span><span style="color: #339933;">.</span><span style="color: #000088;">$s</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
				<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$chars</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>
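<p>For comparison with the C++ code elsewhere on this blog, the same lead-byte idea can be sketched in C++. This is only an illustration under assumptions, not a port of the Util class: the names utf8_length and utf8_chars are made up here, and like get_chars above it silently drops stray continuation bytes and truncated sequences without validating the bytes that follow a lead byte.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Number of bytes in the UTF-8 character that starts with `lead`,
// or 0 if `lead` is a continuation byte (10xx xxxx) or not a valid lead byte.
static int utf8_length(unsigned char lead) {
    if ((lead >> 7) == 0x00) return 1;  // 0xxx xxxx
    if ((lead >> 5) == 0x06) return 2;  // 110x xxxx
    if ((lead >> 4) == 0x0E) return 3;  // 1110 xxxx
    if ((lead >> 3) == 0x1E) return 4;  // 1111 0xxx
    return 0;
}

// Split a UTF-8 string into one std::string per character.
std::vector<std::string> utf8_chars(const std::string& s) {
    std::vector<std::string> chars;
    std::size_t i = 0;
    while (i < s.size()) {
        int n = utf8_length(static_cast<unsigned char>(s[i]));
        if (n == 0 || i + n > s.size()) {  // stray or truncated byte: skip it
            i++;
            continue;
        }
        chars.push_back(s.substr(i, n));
        i += n;
    }
    return chars;
}
```

<p>Each lead-byte pattern is tested exactly (for instance 110x xxxx rather than 11xx xxxx), so the order of the checks does not matter in this sketch.</p>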

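<p>With the character unit settled, the next step is the Trie itself. The algorithm is simple: starting from the root, each character gets an associative array of its own, and when a string ends, a null entry marks the end.</p>
<p>To remove a string, walk it character by character: as soon as some character has exactly one child, only this string remains below that point, so the whole branch can be deleted; if the number of children is greater than 1, keep following the characters until the terminating null.</p>
<p>To look up a string (exact match), follow its characters all the way to the null: reaching it means the string exists, while any missing character means it does not.</p>
<p>To check whether a long string contains any stored string, the algorithm here is rather crude: it searches the Trie once starting from every character, so it retraces a lot of ground. The complexity is O(n * m), where n is the length of the long string and m is the depth of the Trie. Because a Trie of Chinese words is very shallow this is just about acceptable (English strings make it much deeper).</p>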
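<p>The O(n * m) scan described above can be sketched as follows. The sketch is in C++ rather than PHP and rests on assumptions: CharTrie, insert and contains_any are hypothetical names, a boolean end flag stands in for the null entry, and the input is taken to be already split into characters (one string per UTF-8 character).</p>

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical trie keyed by whole characters instead of bytes.
struct CharTrie {
    struct Node {
        bool end;                            // true if a stored string ends here
        std::map<std::string, int> next;     // character -> index of the child node
        Node() : end(false) {}
    };
    std::vector<Node> nodes;

    CharTrie() : nodes(1) {}                 // node 0 is the root

    // Insert one string, given as a list of characters.
    void insert(const std::vector<std::string>& chars) {
        int cur = 0;
        for (std::size_t i = 0; i < chars.size(); i++) {
            auto it = nodes[cur].next.find(chars[i]);
            if (it == nodes[cur].next.end()) {
                int created = static_cast<int>(nodes.size());
                nodes.push_back(Node());
                nodes[cur].next[chars[i]] = created;
                cur = created;
            } else {
                cur = it->second;
            }
        }
        nodes[cur].end = true;
    }

    // Naive check: restart from the root at every position of the text.
    // At most n start positions, each walking at most m levels deep.
    bool contains_any(const std::vector<std::string>& text) const {
        for (std::size_t start = 0; start < text.size(); start++) {
            int cur = 0;
            for (std::size_t i = start; i < text.size(); i++) {
                auto it = nodes[cur].next.find(text[i]);
                if (it == nodes[cur].next.end()) break;  // dead end, try the next start
                cur = it->second;
                if (nodes[cur].end) return true;         // a stored string ends here
            }
        }
        return false;
    }
};
```

<p>The inner loop stops as soon as the Trie has no matching child, which is why a shallow Trie keeps the cost tolerable even though every start position is retried.</p>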
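<p>PHP also has no global cache mechanism, and reading every sensitive word from the database and rebuilding the Trie before each match would be too much trouble. Instead, the Trie's internal associative array is serialized and stored directly in the database; each request only needs to read that one record and unserialize it to get the Trie back. Inserting or removing a string, of course, updates the serialized data.</p>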
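<p>Possible improvements:</p>
<ol>
<li>When a path carries only one string, that is, each associative array along it has a single entry, the subtree can be compressed</li>
<li>Turn the Trie into an AC automaton by giving every node a fail pointer that says which node to fall back to after a mismatch; scanning the text is then O(n), apart from the work of reporting matches. Building the automaton is more involved, since it processes the substrings of every inserted string, but queries at run time are very efficient</li>
</ol>
<p>The code:</p>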

<div class="wp_syntax"><table><tr><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">class</span> TrieTree<span style="color: #009900;">&#123;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000088;">$tree</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> insert<span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$chars</span> <span style="color: #339933;">=</span> UTF8Util<span style="color: #339933;">::</span><span style="color: #004000;">get_chars</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900; font-weight: bold;">null</span><span style="color: #339933;">;</span>	<span style="color: #666666; font-style: italic;">//end-of-string marker</span>
		<span style="color: #000088;">$count</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$chars</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">tree</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$count</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span><span style="color: #990000;">array_key_exists</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$c</span><span style="color: #339933;">,</span> <span style="color: #000088;">$T</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$T</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$c</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>	<span style="color: #666666; font-style: italic;">// insert the new character as an associative array</span>
			<span style="color: #009900;">&#125;</span>
			<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$T</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$c</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> remove<span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$chars</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span>UTF8Util<span style="color: #339933;">::</span><span style="color: #004000;">get_chars</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900; font-weight: bold;">null</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span>_find<span style="color: #009900;">&#40;</span><span style="color: #000088;">$chars</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>	<span style="color: #666666; font-style: italic;">// first make sure the string is in the tree</span>
			<span style="color: #000088;">$count</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$chars</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">tree</span><span style="color: #339933;">;</span>
			<span style="color: #666666; font-style: italic;">// Track the deepest node on the path that has more than one child:</span>
			<span style="color: #666666; font-style: italic;">// only the branch below it belongs solely to this string.</span>
			<span style="color: #000088;">$branch</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">tree</span><span style="color: #339933;">;</span>
			<span style="color: #000088;">$branch_key</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$count</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
				<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$T</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>		<span style="color: #666666; font-style: italic;">// other strings pass through this node</span>
					<span style="color: #000088;">$branch</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$T</span><span style="color: #339933;">;</span>
					<span style="color: #000088;">$branch_key</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$c</span><span style="color: #339933;">;</span>
				<span style="color: #009900;">&#125;</span>
				<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$T</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$c</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			<span style="color: #990000;">unset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$branch</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$branch_key</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>	<span style="color: #666666; font-style: italic;">// drop only the part unique to this string</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">function</span> _find<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span><span style="color: #000088;">$chars</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$count</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$chars</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">tree</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$count</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span><span style="color: #990000;">array_key_exists</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$c</span><span style="color: #339933;">,</span> <span style="color: #000088;">$T</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
				<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$T</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$c</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> find<span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$chars</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span>UTF8Util<span style="color: #339933;">::</span><span style="color: #004000;">get_chars</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900; font-weight: bold;">null</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span>_find<span style="color: #009900;">&#40;</span><span style="color: #000088;">$chars</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> contain<span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #339933;">,</span> <span style="color: #000088;">$do_count</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$chars</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span>UTF8Util<span style="color: #339933;">::</span><span style="color: #004000;">get_chars</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$utf8_str</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900; font-weight: bold;">null</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$len</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$chars</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$Tree</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">tree</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$count</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span><span style="color: #339933;">;</span><span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">array_key_exists</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$c</span><span style="color: #339933;">,</span> <span style="color: #000088;">$Tree</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>	<span style="color: #666666; font-style: italic;">// first character matches</span>
				<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$Tree</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$c</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
				<span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span><span style="color: #339933;">;</span><span style="color: #000088;">$j</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
					<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$chars</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
					<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">array_key_exists</span><span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">null</span><span style="color: #339933;">,</span> <span style="color: #000088;">$T</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
						<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$do_count</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
							<span style="color: #000088;">$count</span><span style="color: #339933;">++;</span>
						<span style="color: #009900;">&#125;</span>
						<span style="color: #b1b100;">else</span><span style="color: #009900;">&#123;</span>
							<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #339933;">;</span>
						<span style="color: #009900;">&#125;</span>
					<span style="color: #009900;">&#125;</span>
					<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span><span style="color: #990000;">array_key_exists</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$c</span><span style="color: #339933;">,</span> <span style="color: #000088;">$T</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
						<span style="color: #b1b100;">break</span><span style="color: #339933;">;</span>
					<span style="color: #009900;">&#125;</span>
					<span style="color: #000088;">$T</span> <span style="color: #339933;">=</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$T</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$c</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
				<span style="color: #009900;">&#125;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$do_count</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #000088;">$count</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #b1b100;">else</span><span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
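	<span style="color: #666666; font-style: italic;">// Note: despite the name, this returns true as soon as any one string in $str_array contains an inserted word.</span>
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> contain_all<span style="color: #009900;">&#40;</span><span style="color: #000088;">$str_array</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>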
		<span style="color: #b1b100;">foreach</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$str_array</span> <span style="color: #b1b100;">as</span> <span style="color: #000088;">$str</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">contain</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$str</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
				<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> export<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #990000;">serialize</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">tree</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> import<span style="color: #009900;">&#40;</span><span style="color: #000088;">$str</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
		<span style="color: #666666; font-style: italic;">// Only pass data produced by export(); unserialize() is unsafe on untrusted input.</span>
		<span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">tree</span> <span style="color: #339933;">=</span> <span style="color: #990000;">unserialize</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$str</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

]]></content:encoded>
			<wfw:commentRss>http://blog.11034.org/2012-07/trie_in_php.html/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>
