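This repository does not declare an open source license file (LICENSE). Before use, check the project description and the upstream dependencies of its code.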
process.py 1.45 KB
zhangzhijintong committed on 2023-11-13 13:58 . crt.sh backlog processor
from bs4 import BeautifulSoup
import json
import sys


def process(htmlpath):
    # Example input: "./crtsh/test.html"; the output is written to the matching .json path.
    with open(htmlpath, "r", encoding='utf-8') as html_file:
        html = html_file.read()
    soup = BeautifulSoup(html, "html.parser")
    logs = []
    Table1 = soup.find_all("table")[0]
    trs = Table1.find_all("tr")
    # Skip the first three rows and the last row of the table.
    trs = trs[3:-1]
    for tr in trs:
        infos = tr.find_all('td')
        try:
            # Build a fresh dict per row so the entries in logs do not all
            # point at the same object.
            log = {}
            log["log_name"] = infos[0].text
            log["url"] = infos[1].text
            log["MMD(hrs)"] = infos[2].text
            log["Latest STH(UTC)"] = infos[3].text
            log["Entries"] = {"Tree Size": infos[4].text, "Backlog": infos[5].text,
                              "Latest Entry Age": infos[6].text}
            log["Last get-sth call(UTC)"] = infos[7].text
            log["Google Uptime%"] = infos[8].text
            log["Chrome (Status)"] = infos[9].text
            log["Chrome Roots Missing"] = infos[10].text
            log["Apple (Status)"] = infos[11].text
            log["Apple Roots Missing"] = infos[12].text
            logs.append(log)
        except IndexError:
            # The row has fewer cells than expected; print it and move on.
            print(infos)
    with open(htmlpath.replace(".html", ".json"), 'w') as jsonfile:
        json.dump(logs, jsonfile)


if __name__ == "__main__":
    htmlpath = sys.argv[1]
    process(htmlpath)