Commit dcf7eea1 Author: 薛凌堃

Adjust author filtering for Qiushi (求是网)

Parent ad1e6d7b
......@@ -113,16 +113,20 @@ if __name__=='__main__':
 author = new.find('font', face='楷体').text.replace('/', '').replace('\u3000', ' ').replace('\xa0', '')
 except:
 continue
-if len(author)>4:
-continue
+# if len(author)>4:
+# continue
 # if '(' in author or '本刊' in author or '国家' in author\
 # or '中共' in author or '记者' in author or '新闻社' in author\
 # or '党委' in author or '调研组' in author or '研究中心' in author\
 # or '委员会' in author or '博物' in author or '大学' in author or '联合会' in author :
-if '(' in author or '本刊' in author or '国家' in author \
-or '中共' in author or '记者' in author or '新闻社' in author \
-or '党委' in author or '”' in author\
-or '大学' in author or '洛桑江村' in author:
+# if '(' in author or '本刊' in author \
+# or '记者' in author or '新闻社' in author \
+# or '”' in author\
+# or '大学' in author or '洛桑江村' in author:
+# continue
+if '国资委党委' in author:
+pass
+else:
 continue
 new_href = new.find('a')['href']
 is_member = r.sismember('qiushileaderspeech::' + period_title, new_href)
......@@ -159,7 +163,7 @@ if __name__=='__main__':
 'sourceAddress': new_href,
 "createDate": nowDate
 }
-# log.info(dic_news)
+log.info(dic_news)
 if sendKafka(dic_news):
 r.sadd('qiushileaderspeech::' + period_title, new_href)
 log.info(f'采集成功----{dic_news["sourceAddress"]}')
......
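The second hunk records a URL in the Redis set only after `sendKafka` reports success, so a failed send is retried on the next run. A minimal sketch of that check, send, then record pattern follows. `InMemorySetStore` is a hypothetical stand-in for the Redis client `r` (it mirrors only `sismember` and `sadd`), and the `send` callback stands in for `sendKafka`. Neither is the repository's real code.

```python
from typing import Callable, Dict, Set


class InMemorySetStore:
    """Hypothetical stand-in for the Redis client `r`; mirrors sismember/sadd only."""

    def __init__(self) -> None:
        self._sets: Dict[str, Set[str]] = {}

    def sismember(self, name: str, value: str) -> bool:
        return value in self._sets.get(name, set())

    def sadd(self, name: str, *values: str) -> int:
        members = self._sets.setdefault(name, set())
        before = len(members)
        members.update(values)
        return len(members) - before  # number of newly added members


def publish_once(store, send: Callable[[dict], bool], period_title: str, news: dict) -> bool:
    """Send `news` unless its URL is already recorded; record it only on success."""
    key = 'qiushileaderspeech::' + period_title  # same key prefix as in the diff
    url = news['sourceAddress']
    if store.sismember(key, url):
        return False  # already collected in an earlier run
    if not send(news):
        return False  # not recorded, so a later run can retry it
    store.sadd(key, url)
    return True
```

Recording the URL before sending would be simpler, but it would silently drop any article whose send failed, which is presumably why the diff records it afterwards.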
qiushi_leaderspeech.py
qiushi_leaderspeech.py
11.06:
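Deployed to 231 in Task Scheduler (任务计划程序); scheduled to run every day at 7:00.
todo: still need to confirm the list of leaders to collect, including Xi, SASAC (国资委), and leaders of other ministries and commissions.
The deployed version currently collects only state leaders.
The author uploaded from the local run is 国务院国资委 (SASAC of the State Council).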
\ No newline at end of file
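Since the todo above leaves the list of leaders open, one option is to drive the author check from a configurable allowlist instead of the hard-coded `'国资委党委' in author` test. The sketch below assumes the intent is to keep only authors matching the list. `ALLOWED_AUTHOR_KEYWORDS`, `clean_author`, and `is_wanted_author` are hypothetical names, not part of `qiushi_leaderspeech.py`.

```python
from typing import Iterable

# Hypothetical allowlist; the real list of leaders is still to be confirmed.
ALLOWED_AUTHOR_KEYWORDS = ('国资委党委',)


def clean_author(raw: str) -> str:
    """Same replacements as the diff: drop '/' and NBSP, turn U+3000 into a space."""
    return raw.replace('/', '').replace('\u3000', ' ').replace('\xa0', '')


def is_wanted_author(author: str, keywords: Iterable[str] = ALLOWED_AUTHOR_KEYWORDS) -> bool:
    """Return True if the author string contains any allowlisted keyword."""
    return any(keyword in author for keyword in keywords)
```

In the scraping loop this would read `if not is_wanted_author(author): continue`, and extending coverage to more leaders becomes a change to the tuple rather than to the condition.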