Ask HN: Do you know what data your AI coding agent sends to the cloud?

Every session my AI coding agent reads files, runs commands, makes API calls. I have no idea exactly what ends up in the cloud. Is anyone actually tracking this at a granular level, or do we just trust the tool?

5 points | by lbrauer 2 days ago

10 comments

  • boundless88 1 day ago
    I'm really curious: if you use an open-source tool like Codex and actually go through the source code carefully to make sure there are no backdoors, doesn't that mean you can use it without worrying about your data getting leaked, at least to some degree?
  • zambelli 2 days ago
    I trust the tool only in the sense that I don't send anything sensitive to it! Unless I built it, I assume the data is going somewhere.

    We have a policy at work on this: our most sensitive data can only be passed to on-prem models.

    That said, I have no evidence of anything going to the cloud, or of frontier providers doing anything with chat history other than storing it for later.

    Self-hosted + custom harness for anything I don't want getting out at all.

    • lbrauer 2 days ago
      Makes sense. Does your custom harness give you a record of what actually crossed the boundary, or is it mostly trust-based blocking?
      • zambelli 1 day ago
        My harness is only used with on-prem models, so I don't have any checks in place. If the GGUF is somehow calling home, I'm not catching it.
  • You don't. Even if you read the policy, it would be buried in legalese. Instead, give the tool access only to the kind of data you are okay with being sent to the cloud. Also, the fact that the company's reputation is at stake matters more than its written policies.
  • I have started treating AI coding tools more like a contractor with temporary access to my machine than like autocomplete.
  • warren455 1 day ago
    [flagged]
  • utilvox 1 day ago
    [dead]
  • lukassbrad 2 days ago
    [flagged]
  • [dead]
  • Leena-ch 1 day ago
    [flagged]
  • TuahaJawaid 1 day ago
    [dead]