XPATH partial match tr id with Python, Selenium

What is the correct XPath to also extract the tr id="review_..." elements?
I managed to get the elements, but had no luck with the IDs, because they only match partially.

<table class="admin">
  <thead>"snip"</thead>
  <tbody>
    <tr id="review_984669" class="">
      <td>weird_wild_and_wonderful_mammals</td>
      <td>1</td>
      <td><input type="checkbox" name="book_review[approved]" id="approved" value="1" class="attribute_toggle"></td>
      <td><input type="checkbox" name="book_review[rejected]" id="rejected" value="1" class="attribute_toggle"></td>
      <td>February 27, 2019 03:56</td>
      <td>Show</td>
      <td>
        <span class="rest-in-place" data-attribute="review" data-object="book_review" data-url="/admin/new_book_reviews/984669">
          bad
        </span>
      </td>
    </tr>
    <tr id="review_984670" class="striped">

I am using Selenium with Chrome to extract the only table on the page.

Table_Selenium_Elements = driver.find_element_by_xpath('//*[@id="admin"]/table')

I then use the following to get the data from each row.

for Pri_Key, element in enumerate(Table_Selenium_Elements.find_elements_by_xpath('.//tr')):
    # Create an empty secondary dict for each new Pri Key
    sec = {}
    # Secondary dictionary needs a Key. Keys are items in column_headers list
    for counter, Sec_Key in enumerate(column_headers):
        # Secondary dictionary needs Values for each key.
        # Values are individual items in each sub-list of column_data list
        # Slice the sub list with the counter to get each item
        sec[Sec_Key] = element.get_attribute('innerHTML')[counter]
    pri[Pri_Key] = sec

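This only shows the data in each td, i.e.
"weird_wild_and_wonderful_mammals", "1"

But I actually need the tr id = review_xxx as well, and I don't know how to get it.
The ID number changes, so it probably needs an XPath 'contains' expression or an XPath 'begins_with' expression.

As I am a newbie, I think I may already have captured the review_ ID but am not extracting it correctly in the for loop.

Can someone tell me the correct XPath to extract the parent tr and the child tds?
...then I will adapt the for loop.
Thanks,
Sam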

Related discussion

driver.find_element_by_class_name('striped')

or

# If it is the last row in the table.
driver.find_elements_by_css_selector('tbody tr')[-1]

or

# If it is surely the 2nd row in the table.
driver.find_elements_by_css_selector('tbody tr')[1]

Related discussion

Based on the HTML, you can get all the rows with any of the following example selectors:

admin_table_rows = driver.find_elements_by_css_selector(".admin tbody > tr")
admin_table_rows = driver.find_elements_by_css_selector(".admin tr[id^='review_']")
admin_table_rows = driver.find_elements_by_xpath("//table[@class='admin']//tr[starts-with(@id,'review_')]")
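
The CSS [id^='review_'] selector and the XPath starts-with() function both test the beginning of the id, while XPath contains() matches the text anywhere in the value. Here is a minimal sketch of that difference, using plain Python string checks as a stand-in for the selector semantics. Only the two review_98... ids come from the HTML in the question; the other ids and the helper names are made up for illustration.

```python
# Sample id values. Only the two "review_98..." ids appear in the question's
# HTML; "old_review_12" and "header" are hypothetical.
row_ids = ["review_984669", "review_984670", "old_review_12", "header"]


def matches_starts_with(row_id, prefix="review_"):
    """Mirror XPath starts-with(@id, 'review_') and CSS [id^='review_']."""
    return row_id.startswith(prefix)


def matches_contains(row_id, fragment="review_"):
    """Mirror XPath contains(@id, 'review_') and CSS [id*='review_']."""
    return fragment in row_id


prefix_hits = [r for r in row_ids if matches_starts_with(r)]
contains_hits = [r for r in row_ids if matches_contains(r)]

print(prefix_hits)
print(contains_hits)
```

If other elements on the page could carry "review_" in the middle of their id, the prefix form is probably the safer choice.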

To get the id attribute, you can use the element.get_attribute("id") method.
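
If only the numeric part of the id is needed, the string returned by get_attribute("id") can be split after the prefix. The sketch below assumes ids always look like review_<digits>, as in the question's HTML; review_number is a hypothetical helper name, not part of Selenium.

```python
def review_number(row_id, prefix="review_"):
    """Return the numeric part of an id such as 'review_984669' as an int.

    Raises ValueError if the id does not start with the expected prefix
    or if the remainder is not a number.
    """
    if not row_id.startswith(prefix):
        raise ValueError(f"unexpected row id: {row_id!r}")
    return int(row_id[len(prefix):])


print(review_number("review_984669"))
```

With a Selenium row element this would be called as review_number(row.get_attribute("id")).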

Here is an example of how to scrape the data:

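from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# driver is assumed to be an already running WebDriver, e.g. webdriver.Chrome()
wait = WebDriverWait(driver, 10)

# Wait until every row whose id starts with "review_" is visible
admin_table_rows = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".admin tr[id^='review_']")))

for row in admin_table_rows:
    # "review_984669" -> "984669"
    row_id = row.get_attribute("id").replace("review_", "")
    label = row.find_element_by_css_selector("td:nth-child(1)").text
    num = row.find_element_by_css_selector("td:nth-child(2)").text
    # The date is the 5th cell in the HTML above; cells 3 and 4 hold the checkboxes
    date = row.find_element_by_css_selector("td:nth-child(5)").text
    # Assumes the row contains a link, which the HTML snippet does not show
    href = row.find_element_by_css_selector("a").get_attribute("href")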
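
To end up with the nested dictionary from the question (one entry per row, keyed by the tr id), the id and the cell texts can be collected first and then zipped with the column headers. Note that indexing the innerHTML string with a counter, as in the question's loop, returns single characters rather than cells. The sketch below is one possible shape, not the asker's actual code: build_review_dict and the header names "title" and "count" are assumptions, and the function works on plain (id, cell texts) pairs so that it can be tried without a browser.

```python
def build_review_dict(rows, column_headers):
    """Map each row id to a {header: cell text} dictionary.

    `rows` is an iterable of (row_id, cell_texts) pairs. With Selenium they
    could be produced roughly like this (API style as used in this thread;
    newer Selenium releases use find_elements(By.CSS_SELECTOR, "td")):

        rows = [(row.get_attribute("id"),
                 [td.text for td in row.find_elements_by_css_selector("td")])
                for row in admin_table_rows]
    """
    pri = {}
    for row_id, cell_texts in rows:
        # zip stops at the shorter sequence, so extra cells are ignored
        pri[row_id] = dict(zip(column_headers, cell_texts))
    return pri


# Hypothetical header names; the cell values come from the question's HTML.
column_headers = ["title", "count"]
sample_rows = [("review_984669", ["weird_wild_and_wonderful_mammals", "1"])]

result = build_review_dict(sample_rows, column_headers)
print(result)
```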

Related discussion

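Are you just asking for an XPath to locate the table element itself?

In your example, the XPath you use to find the table contains

[@id="admin"]

'admin' is the class, not the ID. Does it work if you simply switch it to [@class="admin"]? Since the class sits on the table element itself, the call would become:

Table_Selenium_Elements = driver.find_element_by_xpath('//table[@class="admin"]')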
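
One caveat: an exact comparison such as @class="admin" matches the whole attribute value, so it would stop matching if the table ever carried a second class. A common XPath 1.0 idiom for matching a single class token pads the attribute with spaces and uses contains(). The sketch below only builds the expression string; xpath_for_class is a hypothetical helper, and the resulting string would be passed to the same find_element call used above.

```python
def xpath_for_class(tag, class_name):
    """Build an XPath that matches `tag` elements having `class_name`
    as one of their space-separated classes."""
    token = f" {class_name} "
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"'{token}')]"
    )


table_xpath = xpath_for_class("table", "admin")
print(table_xpath)
```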
