Ask HN: Do you know what data your AI coding agent sends to the cloud?

Every session, my AI coding agent reads files, runs commands, and makes API calls. I have no idea exactly what ends up in the cloud. Is anyone actually tracking this at a granular level, or do we just trust the tool?

5 points | by lbrauer 2 days ago

10 comments

boundless88 1 day ago
I'm really curious: if you use open-source tools like Codex and actually go through the source code carefully to make sure there are no backdoors, doesn't that mean you can use them without worrying about your data getting leaked, at least to some degree?
- unknown_2045 1 day ago
  [flagged]
- lbrauer 1 day ago
  [flagged]
zambelli 2 days ago
I trust the tool only in the sense that I don't send it anything sensitive! Unless I built it myself, I assume the data is going somewhere.
We have a policy at work around this: our most sensitive data can only be passed to on-prem models.
That being said, I have no evidence of anything going to the cloud, or of frontier providers doing anything with chat history other than storing it for later.
Self-hosted models plus a custom harness for anything I don't want getting out at all.
- lbrauer 2 days ago
  Makes sense. Does your custom harness give you a record of what actually crossed the boundary, or is it mostly trust-based blocking?
  - zambelli 1 day ago
    My harness is only used with on-prem models, so I don't have any checks in place. If the GGUF is somehow calling home, I'm not catching it.
aianisulislam 2 days ago
You don't. Even if you read the policy, it's buried in legalese. Instead, give the agent access only to the kind of data you're okay with being sent to the cloud. Also, the company's reputation being at stake matters more than its policies.
SyntaxErrorist 2 days ago
I've started treating AI coding tools more like a temporary contractor with access to my machine than like autocomplete.
- lbrauer 1 day ago
  [flagged]
warren455 1 day ago
[flagged]
utilvox 1 day ago
[dead]
lukassbrad 2 days ago
[flagged]
maryamshafaqat 2 days ago
[dead]
Leena-ch 1 day ago
[flagged]
TuahaJawaid 1 day ago
[dead]