Web scraping: randomly getting a non-object type when traversing with PHP DOMDocument

Here is my code:

$xpath = new DOMXPath($doc);
// Start from the root element
$query = '//div[contains(@class,"hudpagepad")]/div/ul/li/a';
$nodeList = @$xpath->query($query);

// The size is 104
$size = $nodeList->length;

for ( $i = 1; $i <= $size; $i++ ) {
    $node = $nodeList->item($i-1);
    $url = $node->getAttribute("href");

    $error = scrapeURL($url);
}

function scrapeURL($url) {
    $cfm = new DOMDocument();
    $cfm->loadHTMLFile($url);
    $cfmpath = new DOMXPath($cfm);
    $pointer = $cfm->getElementById('content-area');
    $filter = 'table/tr';

    // The problem lies here
    $state = $pointer->firstChild->nextSibling->nextSibling->nodeValue;

    $nodeList = $cfmpath->query($filter, $pointer);
}

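Basically, this iterates over the list of links and scrapes each one with the scrapeURL function.

I don't know what the problem is, but at random I get a non-object error when trying to use $pointer. At other times the code runs without any error and the values are correct.

Does anyone know what the problem is here? My guess is that it happens when the page has not loaded correctly.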
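
One way to test that guess is to guard every step that can come back empty. Here is a minimal sketch, not a confirmed fix: the name scrapeURLSafely and its return shape are hypothetical and not part of the code above. It relies on loadHTMLFile() returning false on failure and getElementById() returning null when no element matches. It assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical guarded variant of scrapeURL(): returns null instead of
// raising a non-object error when the page or the expected nodes are missing.
function scrapeURLSafely(string $url): ?array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $cfm = new DOMDocument();
    $loaded = $cfm->loadHTMLFile($url);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        return null; // the page could not be fetched or parsed
    }

    $pointer = $cfm->getElementById('content-area');
    if ($pointer === null) {
        return null; // the page loaded, but the expected element is absent
    }

    // Walk firstChild -> nextSibling -> nextSibling, stopping early
    // if any node along the way does not exist.
    $node = $pointer->firstChild;
    for ($step = 0; $step < 2 && $node !== null; $step++) {
        $node = $node->nextSibling;
    }
    $state = $node !== null ? $node->nodeValue : null;

    $cfmpath = new DOMXPath($cfm);
    $rows = $cfmpath->query('table/tr', $pointer);

    return [
        'state' => $state,
        'rows'  => $rows === false ? 0 : $rows->length,
    ];
}
```

With this in place, the loop can log each $url for which the function returns null, which should show whether the failures line up with pages that did not load or that lack the content-area element. Whitespace between tags also counts as a text node, so the firstChild/nextSibling chain can land on a different node from page to page even when the load succeeds.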