I ran into the same problem on WhatsApp and solved it with the following line:
driver.find_element(By.ID, 'pane-side').send_keys(Keys.PAGE_DOWN)
Here is the complete code, which I have tested and which works:
Import the required libraries:
import tempfile
import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
Main logic:
# Start the Firefox driver and open WhatsApp Web
driver = webdriver.Firefox()
driver.get('https://web.whatsapp.com/')
driver.maximize_window()
# Pause so the QR code can be scanned and the chat list can load
time.sleep(25)
phone = []  # phone numbers (or contact names) collected so far
date = []   # date shown for each chat
msg = []    # last message shown for each chat
SCROLL_PAUSE_TIME = 5.5  # seconds to wait after each scroll
while True:
    # Chat rows currently rendered in the side pane
    elements = WebDriverWait(driver, 40).until(lambda driver: driver.find_elements(By.XPATH, '//div[@class="lhggkp7q ln8gz9je rx9719la"]'))
    # Locate the chat list by its ID and scroll it down one page
    driver.find_element(By.ID, 'pane-side').send_keys(Keys.PAGE_DOWN)
    rep = 0  # number of already-seen rows in this batch
    for element in elements:
        elem1 = element.find_element(By.XPATH, './/div[@class="_21S-L"]').text
        if elem1 not in phone:
            phone.append(elem1)
            print(elem1)
            elem2 = element.find_element(By.XPATH, './/span[@class="aprpv14t"]').text
            date.append(elem2)
            elem3 = element.find_element(By.XPATH, './/div[@class="vQ0w7"]').text
            msg.append(elem3)
        else:
            rep += 1
    time.sleep(SCROLL_PAUSE_TIME)
    # More than 2 repeated rows means the list has stopped advancing, so leave the loop
    if rep > 2:
        print('Exiting the infinite loop')
        break
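To make the stop condition easier to follow, here is a minimal browser-free sketch of the same idea. It is only an illustration, not part of the tested code above: `collect_until_repeats`, `pages` and `max_repeats` are hypothetical names, and `pages` simply stands in for the rows that would be visible after each PAGE_DOWN.

```python
def collect_until_repeats(pages, max_repeats=2):
    """Collect unseen names page by page; stop once a page repeats too often.

    `pages` is a hypothetical stand-in for what the browser shows after each
    PAGE_DOWN: an iterable of lists of contact names.
    """
    seen = []
    for visible in pages:
        repeats = 0  # reset for every page, like `rep = 0` in the loop above
        for name in visible:
            if name not in seen:
                seen.append(name)
            else:
                repeats += 1
        if repeats > max_repeats:
            break  # mostly old rows: the list has stopped advancing
    return seen


pages = [
    ["Ana", "Ben", "Cy"],
    ["Cy", "Dee", "Eli"],   # 1 repeat, keep going
    ["Dee", "Eli", "Fay"],  # 2 repeats, keep going
    ["Dee", "Eli", "Fay"],  # 3 repeats, stop
    ["Gus"],                # never reached
]
print(collect_until_repeats(pages))
# prints ['Ana', 'Ben', 'Cy', 'Dee', 'Eli', 'Fay']
```

The threshold of 2 mirrors `rep > 2` in the loop. If consecutive pages overlap by more rows than that in your window size, the loop would stop early, so the threshold may need tuning.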
Print the results:
import pandas as pd
result = [phone, date, msg]
df = pd.DataFrame(result).T
print(len(phone))
print(phone)
df.columns = ['PHONE', 'DATE', 'MESSAGE']
df
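The transpose works because `phone`, `date` and `msg` are filled together and therefore have the same length. As a rough sketch using only the standard library, the same row-by-row pairing can be checked and written out as CSV text. The helper name `rows_to_csv` and the sample values are made up for illustration.

```python
import csv
import io


def rows_to_csv(phone, date, msg):
    """Pair the three parallel lists row by row and return CSV text."""
    if not (len(phone) == len(date) == len(msg)):
        raise ValueError("phone, date and msg must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["PHONE", "DATE", "MESSAGE"])
    # zip() yields one (phone, date, message) tuple per chat
    writer.writerows(zip(phone, date, msg))
    return buffer.getvalue()


# Hypothetical sample data, not real scraped output
print(rows_to_csv(["+1 555 0100"], ["10:15"], ["hello"]))
```

The length check is there because a mismatch would mean one of the per-row lookups failed partway through, and the columns would no longer line up.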