Daily Producer Skill

A complete production system for personalized daily news digests.

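Startup protocol

When the user asks to "generate the daily report / run daily / today's daily report":

Read config/profile.yaml
If it does not exist → read and execute the init/daily-init.md initialization flow (use reference/profile_template.yaml as the template)
If it exists → enter the daily report production flow

Prohibited: asking the user "what topics do you follow" before reading the profile; triggering initialization when the profile already exists.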
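
The branching above can be sketched in a few lines. This is a minimal illustration, not the skill's actual entry point: the function name `next_action` and its return values are hypothetical, and only the path config/profile.yaml comes from this document.

```python
from pathlib import Path

# Path taken from this document, relative to the skill root.
PROFILE_PATH = Path("config/profile.yaml")


def next_action(root: Path = Path(".")) -> str:
    """Decide which flow to enter (hypothetical helper).

    Returns "produce" when the profile exists, otherwise "init".
    The profile is checked first, so the user is never asked
    about their interests before the file has been looked up.
    """
    if (root / PROFILE_PATH).is_file():
        return "produce"  # profile exists: go straight to the pipeline
    return "init"         # no profile: run the init/daily-init.md flow
```

Keeping the decision in one place makes it harder to trigger initialization by accident when a profile already exists.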

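Runtime instruction: user-provided sources

When the user mentions a website, account, platform, or URL in conversation and wants it included in the daily report's collection scope, the agent must write it to config/profile.yaml correctly rather than using it only for the current session.

Determine the type and write to the matching field

Case 1: website / media outlet / official blog (has a URL, no opencli adapter) → write to sources.websites.cn or sources.websites.global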

sources:
  websites:
    global:
      - name: "The Verge AI"
        url: "https://www.theverge.com/ai-artificial-intelligence"
        type: "media"       # media | official | community

Case 2: direct URL (a fixed page that must be checked every run; skip search and fetch it directly) → write to sources.direct

sources:
  direct:
    - "https://openai.com/news/"
    - "https://www.anthropic.com/news"

Case 3: platform with an opencli adapter (Weibo / Zhihu / Twitter / Reddit, etc.) → write to sources.platforms.cn or sources.platforms.global; check reference/opencli_platforms.yaml to confirm the platform name

sources:
  platforms:
    cn:
      - name: "小红书"
        opencli: "xiaohongshu"
        commands:
          - "search \"{keyword}\" --limit 10"
        login_required: yes

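Procedure

Read the current config/profile.yaml
Determine the type and write to the matching field (append; do not overwrite existing content)
Tell the user "Added to profile.yaml; it takes effect the next time a daily report is generated"
If the user wants it to take effect immediately (included in today's report as well), handle that source additionally during Step 02 collection

Prohibited: remembering the source only within the conversation without writing it to profile.yaml; overwriting the existing sources list.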
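
The append-without-overwrite rule can be illustrated on an in-memory profile. This is a sketch under stated assumptions: the profile is assumed to have been parsed into nested dicts and lists by some YAML library (loading and saving are left out), and `add_source` is a hypothetical helper, not part of the skill's scripts.

```python
def add_source(profile: dict, path: list, entry) -> bool:
    """Append `entry` to the list found at `path` inside `profile`.

    Missing intermediate keys are created; existing entries are never
    replaced. Returns False when the entry is already present.
    """
    node = profile
    for key in path[:-1]:
        node = node.setdefault(key, {})
    items = node.setdefault(path[-1], [])
    if entry in items:
        return False  # already configured, nothing to do
    items.append(entry)
    return True


# Usage: an existing sources.direct list is extended, not overwritten.
profile = {"sources": {"direct": ["https://openai.com/news/"]}}
added = add_source(profile, ["sources", "direct"], "https://www.anthropic.com/news")
duplicate = add_source(profile, ["sources", "direct"], "https://openai.com/news/")
add_source(
    profile,
    ["sources", "websites", "global"],
    {
        "name": "The Verge AI",
        "url": "https://www.theverge.com/ai-artificial-intelligence",
        "type": "media",
    },
)
```

Because the helper only ever appends, running it twice with the same source leaves the profile unchanged.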

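Production pipeline

There are 11 steps in total. Steps 01-05 and 07-09 have automation scripts, step 06 is performed by the AI, and steps 00 and 10 integrate the feedback system.

Detailed descriptions, parameters, and input/output formats for each step are in the reference/pipeline/ directory.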

profile.yaml
    ↓
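00  [Load past feedback]       automatically load the previous day's data/feedback/{date}.json
    ↓
01  build_queries.py           generate search queries
    ↓
02  collect_sources_with_opencli.py  collect the candidate pool
    ↓
03  filter_index.py            filter by time
    ↓
04  collect_detail.py          fetch full article text
    ↓
05  prepare_payload.py         denoise and score (reads feedback automatically for weighting)
    ↓
06  [AI]                       generate the daily report JSON
    ↓
07  validate_payload.py        validate the JSON
    ↓
08  render_daily.py            render HTML
    ↓
09  send_feishu_card.py        Feishu card notification (interactive card; never downgrade to plain text)
    ↓
10  feedback_server.py         start the feedback service (in the background, kept running)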

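Step 01: Generate search queries

Generates two kinds of queries from the profile's topics/keywords: platform (plain keywords for each platform's own search) and google (with an after: date filter).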

python3 scripts/build_queries.py --date {date} --window 3

→ See reference/pipeline/01_build_queries.md for details
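
To make the two query kinds concrete, here is a minimal sketch. It is not the code of build_queries.py: the function name and output shape are hypothetical, and computing the after: date as the run date minus the window is an assumption about how --window is used.

```python
from datetime import date, timedelta


def build_queries(keywords, run_date: date, window: int = 3) -> dict:
    """Build platform and google queries from a keyword list (hypothetical shape)."""
    # Assumption: the window counts back in days from the run date.
    after = (run_date - timedelta(days=window)).isoformat()
    return {
        # Platform searches get the bare keyword.
        "platform": list(keywords),
        # Google searches get the keyword plus an after: date filter.
        "google": [f"{kw} after:{after}" for kw in keywords],
    }


queries = build_queries(["AI agent", "大模型"], date(2024, 5, 10), window=3)
```

With a run date of 2024-05-10 and a 3-day window, each google query ends with after:2024-05-07.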

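Step 02: Collect the candidate pool

Uses opencli to collect news from every platform configured in the profile (Weibo / Xiaohongshu / Bilibili / Twitter / Reddit, etc.) and every website (机器之心 / 量子位 / TechCrunch, etc.).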

python3 scripts/collect_sources_with_opencli.py --date {date} --max-keywords 5 --max-results 5

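Runs opencli doctor automatically before collection to check connectivity
cn keywords are dispatched to domestic (Chinese) platforms, en keywords to global platforms
For Reddit, opencli availability is probed automatically; if it is unreachable, collection falls back to the API through a proxy
Requests are spaced 3 seconds apart to avoid rate limiting

→ See reference/pipeline/02_collect_sources.md for details
→ For each platform's output fields, see reference/opencli_output_formats.md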
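
The keyword routing and request pacing described above might look roughly like this. It is a sketch only: `dispatch`, `run_jobs`, and the `fetch` callback are hypothetical names, the real script calls opencli itself, and treating --max-keywords as a per-language cap is an assumption.

```python
import time


def dispatch(keywords: dict, platforms: dict, max_keywords: int = 5) -> list:
    """Pair each platform with the keywords of its language group."""
    # cn keywords go to domestic platforms, en keywords to global ones.
    lang_for_region = {"cn": "cn", "global": "en"}
    jobs = []
    for region, lang in lang_for_region.items():
        for platform in platforms.get(region, []):
            for kw in keywords.get(lang, [])[:max_keywords]:
                jobs.append((platform, kw))
    return jobs


def run_jobs(jobs, fetch, delay: float = 3.0, sleep=time.sleep) -> list:
    """Run jobs one by one, pausing `delay` seconds between requests."""
    results = []
    for i, (platform, kw) in enumerate(jobs):
        if i:  # no pause before the very first request
            sleep(delay)
        results.append(fetch(platform, kw))
    return results


jobs = dispatch(
    {"cn": ["大模型"], "en": ["LLM"]},
    {"cn": ["weibo"], "global": ["reddit"]},
)
```

Passing `sleep` as a parameter keeps the pacing logic testable without actually waiting.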

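Step 03: Filter by time

Filters out old content that falls outside the time window. Entries with no time field are dropped outright; website entries are kept as-is, because the Google site: search that produced them already applies a time filter.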

python3 scripts/filter_index.py --date {date} --window 3

→ See reference/pipeline/03_filter_index.md for details
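
The three filtering rules can be expressed as a single predicate. This is an illustrative sketch, not filter_index.py itself: the field names `source_type` and `published` are hypothetical, timestamps are assumed to be ISO-formatted strings, and treating the window boundary as inclusive is an assumption.

```python
from datetime import date, timedelta


def keep_entry(entry: dict, run_date: date, window: int = 3) -> bool:
    """Return True if the entry should survive the time filter."""
    # Website entries were already date-filtered by the search query.
    if entry.get("source_type") == "website":
        return True
    published = entry.get("published")
    if not published:
        return False  # no time field: dropped outright
    try:
        # Accept "YYYY-MM-DD" or a longer ISO timestamp starting with it.
        published_day = date.fromisoformat(published[:10])
    except ValueError:
        return False  # unparseable timestamps are treated as missing
    cutoff = run_date - timedelta(days=window)
    return published_day >= cutoff


run_date = date(2024, 5, 10)
entries = [
    {"source_type": "platform", "published": "2024-05-09T08:00:00"},
    {"source_type": "platform", "published": "2024-05-01"},
    {"source_type": "platform"},
    {"source_type": "website"},
]
kept = [e for e in entries if keep_entry(e, run_date)]
```

In this example the recent platform entry and the website entry are kept, while the old entry and the entry without a timestamp are dropped.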
