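Use DOMDocument and its good friend XPath to reliably extract the heading tags from your valid HTML.
Use nodeValue (a property, not a method) to produce a tag-free string from the heading tag's inner HTML. (Demo of what nodeValue produces)
Use preg_match() to skip leading whitespace and a leading number, then match the first one, two, or three words. (Demo of a slightly modified pattern)
If the match contains at least one word, lowercase it, replace the spaces with hyphens, and add the resulting string as the id attribute.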
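The middle two steps can be tried in isolation on a single heading. The following is a minimal sketch, not part of the original answer: it loads one of the sample `<h3>` elements, prints what `nodeValue` yields, and then prints what the pattern keeps. The quantifier is written `{0,2}` here so that a one-word heading would also match, in line with the "one, two, or three words" description; the variable names are illustrative only.

```php
<?php
// Sketch: what nodeValue yields for a heading, and what the pattern keeps.
$dom = new DOMDocument;
$dom->loadHTML(
    '<h3>1. <a href="https://www.example1.com/">This wont work</a></h3>',
    LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
);
$heading = $dom->getElementsByTagName('h3')->item(0);

// nodeValue is a property: the concatenated text of all descendants, tags dropped.
$text = $heading->nodeValue;
echo $text, "\n";   // 1. This wont work

// Skip leading whitespace and an optional "1." prefix, reset the match start
// with \K, then keep one to three whitespace-separated words.
$pattern = '~^\s*(?:\d+\.)?\s*\K\S+(?:\s+\S+){0,2}~';
if (preg_match($pattern, $text, $m)) {
    echo $m[0], "\n";   // This wont work
}
```

Because `\K` discards everything matched before it, `$m[0]` holds only the words and no capture group is needed.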
Code: (Demo)
$html = <<<HTML
<div>
<p>Paragraph one is okay </p>
<h2>This will work without problem</h2>
<p>Paragraph two is okay </p>
<h2><a href="#">This heading has anchor</a></h2>
<p>Paragraph one is okay </p>
<h2> This heading start with space</h2>
<p>Paragraph two is okay </p>
<h3>1. <a href="https://www.example1.com/">This wont work</a></h3>
<p>Paragraph one is okay </p>
<h3>2. <a href="https://www.example2.com/">Not working</a></h3>
<p>Paragraph two is okay </p>
<h3>3. Neither this one</h3>
<h3>But this works again</h3>
</div>
HTML;
$dom = new DOMDocument;
// The flags stop libxml from adding <html>/<body> wrappers and a doctype.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//h2 | //h3") as $node) {
    // Skip leading whitespace and an optional "1." prefix, then keep the first one to three words.
    if (preg_match('~^\s*(?:\d+\.)?\s*\K\S+(?:\s+\S+){0,2}~', $node->nodeValue, $m)) {
        $node->setAttribute('id', str_replace(' ', '-', strtolower($m[0])));
    }
}
echo $dom->saveHTML();
Output:
<div>
<p>Paragraph one is okay </p>
<h2 id="this-will-work">This will work without problem</h2>
<p>Paragraph two is okay </p>
<h2 id="this-heading-has"><a href="#">This heading has anchor</a></h2>
<p>Paragraph one is okay </p>
<h2 id="this-heading-start"> This heading start with space</h2>
<p>Paragraph two is okay </p>
<h3 id="this-wont-work">1. <a href="https://www.example1.com/">This wont work</a></h3>
<p>Paragraph one is okay </p>
<h3 id="not-working">2. <a href="https://www.example2.com/">Not working</a></h3>
<p>Paragraph two is okay </p>
<h3 id="neither-this-one">3. Neither this one</h3>
<h3 id="but-this-works">But this works again</h3>
</div>
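For reuse, the same steps could be wrapped in a function. The sketch below is an assumption-laden variant rather than the answer's own code: `addHeadingIds` is a hypothetical name, the quantifier is `{0,2}` so that one-word headings also get an id, and `preg_replace('~\s+~', '-', ...)` is swapped in for `str_replace()` so that runs of spaces, tabs, or newlines between words collapse into a single hyphen.

```php
<?php
// Hypothetical helper (not from the original answer) wrapping the same steps.
function addHeadingIds(string $html): string
{
    $dom = new DOMDocument;
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    $xpath = new DOMXPath($dom);

    foreach ($xpath->query('//h2 | //h3') as $node) {
        if (preg_match('~^\s*(?:\d+\.)?\s*\K\S+(?:\s+\S+){0,2}~', $node->nodeValue, $m)) {
            // Collapse any run of whitespace between words into one hyphen.
            $node->setAttribute('id', preg_replace('~\s+~', '-', strtolower($m[0])));
        }
    }
    return $dom->saveHTML();
}

$result = addHeadingIds('<div><h2>Intro</h2><h3>4.  Two   spaces here now</h3></div>');
echo $result;
```

Note that headings sharing the same first three words would receive identical ids; this sketch does not attempt to deduplicate them.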