Support for application level artifacts #135
Comments
Hey @splitice The first challenge was redirecting the output of The next challenge is:
As you are probably aware this project "Just works" by configuring the

Here are the options I've thought about. There may be others, so feel free to add to the list:

EmptyDir

The emptyDir option creates storage that only lasts as long as the workload pod, although it does survive container crashes. Today you can use a sidecar attached at pod startup to monitor the heapdump location and upload the heapdump. The sidecar approach adds additional workload and complexity to each deployment. It could possibly be improved by extending the coredump agent to set up inotify events on predefined locations within the container file system in

The monitoring could be set up either when specifically annotated pods are created or, preferably, when pods with a specifically named emptyDir are created. There is a sample Rust pod watcher as part of kube-rs that should help as a starting point. This is quite complex, but it has the benefit of running on plain Kubernetes without any privileges or storage configuration beyond what is currently required.

HostPath

The HostPath option is more straightforward, as the heapdump could be stored beyond the lifecycle of the pod and the agent could be extended to monitor an additional folder. This is more in line with how the current agent works, but as it requires the workload pods to have elevated privileges I don't think it's the best approach.

ReadWriteMany

Similar to the HostPath option, it's possible to assign the workload pod and the agent pod a shared ReadWriteMany volume. This has the benefit of not requiring privileged access to the host, but it will require RWX-compatible storage in the k8s environment, which may not be available by default in some common scenarios. If the emptyDir scenario doesn't work then this would be my preferred next option.

Adding @mhdawson as we discussed the status of dump support in Node recently and he may have found other options.
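To make the "monitor the heapdump location" idea concrete, here is a minimal sketch of the detection step. It is not how the agent works today: it swaps the inotify approach for plain polling so it stays standard-library only, and the function name `find_stable_dumps` and the `.heapsnapshot` suffix filter are assumptions made for illustration. A file is only reported once its size has stopped changing between two scans, which is a cheap guard against picking up a dump that is still being written.

```python
import os
import tempfile


def find_stable_dumps(directory, seen_sizes, suffix=".heapsnapshot"):
    """Return paths whose size is unchanged since the previous scan.

    seen_sizes maps path -> size from the last scan and is updated in place.
    Each file is reported exactly once, on the first scan where its size
    matches the size recorded by the scan before it.
    """
    ready = []
    for entry in os.scandir(directory):
        if not entry.is_file() or not entry.name.endswith(suffix):
            continue
        size = entry.stat().st_size
        previous = seen_sizes.get(entry.path)
        if previous == size:
            ready.append(entry.path)
            seen_sizes[entry.path] = -1  # sentinel: already reported
        elif previous != -1:
            seen_sizes[entry.path] = size  # new or still growing
    return sorted(ready)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dump_dir:
        seen = {}
        path = os.path.join(dump_dir, "example.heapsnapshot")
        with open(path, "w") as handle:
            handle.write("{}")
        print(find_stable_dumps(dump_dir, seen))  # first scan only records the size
        print(find_stable_dumps(dump_dir, seen))  # second scan reports the file
```

A real agent would call this in a loop with a sleep between scans, or replace the polling with inotify events as suggested above.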
@No9 I love your response. Everything I was thinking (but in substantially more depth).

My opinion would be to avoid ReadWriteMany; it's simply not available on every K8s platform.

EmptyDir would be ideal if no other limitations prevent implementation. I think the lifecycle limitation could be avoided by only supporting multi-container pods, which should keep the emptyDir around on a container restart (according to my understanding).

Otherwise I think hostPath is actually possible without elevated permissions (as long as the parent dir is chmod 0777), but I might be wrong.

A fourth option (perhaps as a last resort) would be a custom CSI driver. We have one for overlay filesystem mounts, and I suspect a custom mount (fuse or tmpfs) accessible to the collector could also work.
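For reference, the multi-container emptyDir layout mentioned above could look roughly like the sketch below. It builds a Pod manifest as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. The helper name `heapdump_pod`, the container names, the images and the `/dumps` mount path are all hypothetical; only the `emptyDir`, `volumes` and `volumeMounts` fields are standard Pod spec.

```python
import json


def heapdump_pod(name, app_image, sidecar_image, mount_path="/dumps"):
    """Build a Pod manifest where two containers share one emptyDir volume."""
    volume = {"name": "heapdumps", "emptyDir": {}}
    mount = {"name": volume["name"], "mountPath": mount_path}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                # Workload container that writes heapdumps into the volume.
                {"name": "app", "image": app_image, "volumeMounts": [mount]},
                # Hypothetical sidecar that watches the same directory.
                {
                    "name": "dump-uploader",
                    "image": sidecar_image,
                    "volumeMounts": [mount],
                },
            ],
            "volumes": [volume],
        },
    }


if __name__ == "__main__":
    pod = heapdump_pod("node-app", "example/node-app:latest", "example/uploader:latest")
    print(json.dumps(pod, indent=2))
```

Because the volume belongs to the pod rather than to either container, a restart of the app container leaves the files in place for the sidecar. It does not help if the whole pod is evicted.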
Thanks @splitice. It was a bit of a brain dump but glad it made sense.

Agreed, the ReadWriteMany config is a heavy requirement.

Thinking about it further, I'm not sure about approaching EmptyDir by using sidecar containers, either to keep the file system alive or to upload the heapdump. This is especially true for memory issues, as k8s could evict the whole pod, not just the individual containers, and then we've lost the heapdump.

No matter what the permissions on the folder, HostPaths in OpenShift require specific security exceptions as they are an attack vector. The general position across the k8s ecosystem is that they shouldn't be used for regular workloads if possible, and I think they should be avoided.

This project used an object storage fuse when it first started. It was very platform specific, and while there are providers for most clouds, consolidating them into a single release was beyond the scope of this project.

Depending on what we can find out about CSI my current approach would be:
Now that I have written it out: if we took the approach of 1-4, there is no obvious benefit to hosting it in this project. The zip and info file creation currently relies on access to the host crictl, which wouldn't be available, so it would have to be rewritten, although the upload config may be useful.

This probably needs a bit more research, as we should look to see if there is prior art that might be useful in things like logging agents.
Would it be possible to mark an emptyDir volume for collection in place of (or as well as) native core dumps?
The reason I ask is that we also want to collect heapdumps made on OOM from Node.js (there are probably other applications that exhibit similar behaviour). Node does this with --heapsnapshot-near-heap-limit and could be configured to output to an emptyDir volume or similar.

My theory would be to have core-dump-handler responsible for uploading any files it finds in these folders to its object store, along with some basic metadata.
Thoughts? Aligned enough with the goals of this project, or something separate?
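As a rough illustration of "upload any files it finds along with some basic metadata", here is a minimal sketch of the collection step. Nothing here is core-dump-handler's actual behaviour: `describe_file`, `collect`, the metadata fields and the `upload` callback are all hypothetical names, and the callback stands in for whatever object store client would really be used.

```python
import hashlib
import json
import os
import tempfile


def describe_file(path):
    """Return basic metadata for one file: name, size, mtime and SHA-256."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large heapdumps are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    info = os.stat(path)
    return {
        "file": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_unix": int(info.st_mtime),
        "sha256": digest.hexdigest(),
    }


def collect(directory, upload):
    """Call upload(path, metadata_json) for every regular file in directory."""
    uploaded = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if entry.is_file():
            metadata = json.dumps(describe_file(entry.path), sort_keys=True)
            upload(entry.path, metadata)
            uploaded.append(entry.name)
    return uploaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dump_dir:
        with open(os.path.join(dump_dir, "example.heapsnapshot"), "wb") as handle:
            handle.write(b"{}")
        # Stand-in uploader that just prints what would be sent.
        collect(dump_dir, lambda path, meta: print(path, meta))
```

Keeping the uploader as a callback means the same scan could feed an S3-compatible client, a local copy, or a test double without changing the collection logic.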