Tianyancha company news script
start.bat: launcher script; double-click it to start
tyc_qydt.py: the script that collects company news from Tianyancha

Script logic
1: Read the Tianyancha id of the next company to collect
select id,xydm,tycid from ssqy_tyc where state=1  order by date_time asc  limit 1
2: Call the news list API with the Tianyancha id and request the first page
url = f'https://capi.tianyancha.com/cloud-yq-news/company/detail/publicmsg/news/webSimple?_={t}&id={tyc_code}&ps={pageSize}&pn=1&emotion=-100&event=-100'
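A minimal sketch of building the same URL with `urllib.parse.urlencode` instead of string interpolation, so the parameters are escaped consistently. The helper name `build_news_url` is hypothetical, and treating `_` as a millisecond timestamp is an assumption; the README only shows it as `{t}`.

```python
import time
from urllib.parse import urlencode

BASE_URL = ("https://capi.tianyancha.com/cloud-yq-news/company/detail/"
            "publicmsg/news/webSimple")


def build_news_url(tyc_code, page_size, page_no=1, t=None):
    """Build the news list URL for one page (hypothetical helper)."""
    if t is None:
        # Assumption: `_` is a cache-busting timestamp in milliseconds.
        t = int(time.time() * 1000)
    params = {
        "_": t,
        "id": tyc_code,
        "ps": page_size,
        "pn": page_no,
        "emotion": -100,
        "event": -100,
    }
    return f"{BASE_URL}?{urlencode(params)}"
```

Passing `page_no` as an argument lets the same helper serve the first-page request here and the per-page requests in step 4.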
3: Get the total count and compute the number of pages
if total > 0:
    if total % pageSize == 0:
        totalPage = total // pageSize
    else:
        totalPage = total // pageSize + 1
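The page count above is a ceiling division. A compact sketch is shown here; the function name `page_count` is hypothetical. In the snippet as written, `totalPage` is only assigned when `total > 0`, so this version returns 0 for an empty result, which makes the later `range(1, totalPage+1)` loop run zero times.

```python
def page_count(total, page_size):
    """Return the number of pages needed to hold `total` items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if total <= 0:
        # No results: zero pages, so the paging loop does nothing.
        return 0
    # Integer ceiling division without floating point.
    return (total + page_size - 1) // page_size
```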
4: Loop over the pages, fetching one page of data at a time
 for num in range(1, totalPage+1):
5: Loop over the items on each page and check whether the database already holds that news item for the company
sel_sql = '''select social_credit_code from brpa_source_article where source_address = %s and social_credit_code=%s '''
If it does not exist, fetch the article body with smart extraction
    contentText = smart.extract_by_url(link).text
    On success
        Save the news item to the database
        insert_sql = '''insert into brpa_source_article(social_credit_code,title,summary,content,publish_date,source_address,origin,author,type,lang) values(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)'''
    On failure
        Save it to the failure list
        insert_err_sql = f"insert into dt_err(xydm,`from`,url,title,pub_date,zhaiyao,create_date,state,pageNo,pageIndex) values('{social_code}','{source}','{link}','{title}','{time_format}','{info_page['abstracts']}',now(),1,{num},{pageIndex})"

If it already exists
Skip it
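The control flow of step 5 can be sketched as below. This is an illustration, not the script's actual code: the function `process_page`, the `link` key on each item, and the four callbacks are hypothetical stand-ins for the database checks, the smart extraction call, and the two inserts. It also assumes that "failure" means the extractor raised an exception or returned empty text.

```python
def process_page(items, social_code, exists, extract, save, save_error):
    """Process one page of news items; return (ok, error, repeat) counts."""
    ok = err = repeat = 0
    for item in items:
        link = item["link"]  # hypothetical key for the article URL
        if exists(link, social_code):
            # Already stored for this company: skip it.
            repeat += 1
            continue
        try:
            content = extract(link)
        except Exception:
            # Assumption: any extraction error counts as a failure.
            content = None
        if content:
            save(item, content)
            ok += 1
        else:
            save_error(item)
            err += 1
    return ok, err, repeat
```

Returning the three counters per page makes it straightforward to sum them across pages for the status update in step 6.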
6: Update the company's state, along with the success count, total, failure count, and related fields
updateEndSql = f"update ssqy_tyc set state={stateNum},total={total},okCount={okCount},errorCount={errorCount},repetCount={repetCount} ,date_time=now() where id={id}"
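The select and insert statements in step 5 pass values through `%s` placeholders, while this update (like the failure insert) interpolates values into the SQL string with an f-string. A hedged sketch of the same update in placeholder form follows; the helper name `build_update` is hypothetical, and it assumes the database driver accepts `%s` placeholders, as the step 5 statements suggest.

```python
def build_update(row_id, state, total, ok_count, error_count, repeat_count):
    """Return (sql, params) for the final status update (hypothetical helper)."""
    sql = (
        "update ssqy_tyc set state=%s,total=%s,okCount=%s,"
        "errorCount=%s,repetCount=%s,date_time=now() where id=%s"
    )
    # Parameter order must match the placeholder order in the statement.
    params = (state, total, ok_count, error_count, repeat_count, row_id)
    return sql, params
```

With a DB-API cursor this would be run as `cursor.execute(sql, params)`, leaving the quoting of values to the driver rather than to string formatting.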


