Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dumping changes to a config.yaml file places changed lines at the beginning #100

Open
bthorsted opened this issue Jun 26, 2020 · 7 comments

Comments

@bthorsted
Copy link

bthorsted commented Jun 26, 2020

As the title says, when I make a change to some of the config variables and dump it to the config.yaml file, the edited variables are now inserted at the very top of the file rather than the position they used to be in. This is especially frustrating when the config file is "supposed" to start with a few lines of comments indicating authorship and explanations for new users.

OS and versions

OS: Ubuntu 18.04
Python: 3.8.2 64-bit
confuse: 1.1.0

Steps to reproduce

  1. Create a config.yaml with some leading comments and some parameters on the following lines
  2. load a config.yaml file
  3. read a parameter from the file
  4. change the parameter
  5. dump the edited configuration to config.yaml
  6. open config.yaml in a text editor

Example config.yaml

############################################
#          AAA_image_segmentation          #
#               Config file                #
#          Author: Bjarne Thorsted         #
#      e-mail: bt*******@*************     #
############################################

appName: AAAml
# Usage instructions:
...
active_labels: [2, 3, 4, 5]

Example python script

import os
import confuse

os.environ['AAAMLDIR'] = os.curdir

config = confuse.Configuration("AAAml")
config.sources[0].default = True

active_labels = config["active_labels"].get()
config["active_labels"].set([2, 3])

with open("config copy.yaml", mode="w") as cfg_file:
    cfg_file.write(config.dump())

with open("config copy.yaml", mode="r") as cfg_file:
    print(cfg_file.read())

console output:

active_labels: [2, 3]
############################################
#          AAA_image_segmentation          #
#               Config file                #
#          Author: Bjarne Thorsted         #
#      e-mail: bt*******@*************     #
############################################

appName: AAAml
# Usage instructions:
...
@sampsyo
Copy link
Member

sampsyo commented Jun 26, 2020

Interesting use case! Indeed, this will not work yet. To clarify, our current round-tripping of comments is quite hacky, but it could plausibly be greatly improved by switching to ruamel.yaml (cf #52). Regardless, however, the scheme works by trying to preserve the formatting of a given source, whereas this use case is about merging multiple sources while preserving the formatting of one of them. It certainly sounds possible, but it will take some creativity to figure out exactly how to do that in a predictable way!

@bthorsted
Copy link
Author

bthorsted commented Jun 28, 2020

Thank you for the reply. I think you may have misunderstood my use case (or I have not used confuse the way I think I have). I am trying to load a config.yaml, update one of the values in said file and then write the changes back to the same file. In the code above, I use "config copy.yaml" to make sure I don't ruin the original config file. I suppose I could just have used the copy as backup and revert if something got screwed up, but this way seemed like an easier approach.

@sampsyo
Copy link
Member

sampsyo commented Jun 28, 2020

Indeed—that's the tricky thing here. When you do config["active_labels"].set([2, 3]), you are actually not modifying any configuration data structure in place! Confuse works by "layering" together multiple configurations and then providing you a "merged" view on them. So .set() is non-destructive, in the sense that it is not modifying the original configuration—it is just creating a new configuration that has those new values in it, as an overlay on top of the old configuration.

That's why this is harder than it looks for Confuse to do: it needs to merge configuration sources together, even when they do not align perfectly, when you do dump.

@bthorsted
Copy link
Author

Well, now I am confused (pardon the pun). What is the purpose of preserving comment lines during dump(), if I can only write an unmodified version of the default config file? And am I correct in gathering from your response that as of version 1.3.0 I cannot modify a config file and dump() while preserving comments?

@sampsyo
Copy link
Member

sampsyo commented Jun 28, 2020

Well, here's the thing: you can actually dump a "changed" version of the config file, but the way that "change" works is pretty subtle. You're not actually modifying one data structure in place, as d[k] = v typically does in Python. Instead, Confuse works by maintaining multiple layers and merging them together. So when you dump a "changed" config, you do see the updated data—but because Confuse is preserving the illusion that data was actually modified, whereas it has actually kept the original data and the updates separate.

I think we may be getting a little side-tracked though. The main thing is that preserving the order of things when dumping, even after these "changes," would be great but it is more complicated than it sounds. I'd be interested to see if we can do it anyway but it will take some effort.

@bthorsted
Copy link
Author

As you mentioned earlier, ruamel.yaml has some features that should make it easier. Since, I haven't yet had a look at any of this, maybe you can tell me how far away you are from implementing something similar?

Also, I've read about bespON, which is somewhat similar to YAML and ships with complete round-tripping. Maybe there are some insights to be gleaned from the source code?

@sampsyo
Copy link
Member

sampsyo commented Jun 29, 2020

I don't currently have plans to implement it, so any help would be greatly appreciated!

However, part of the point of the discussion above was to emphasize plain round-tripping from the serialization library, while helpful, is not enough. There are also Confuse-specific issues related to the "layering" concept that would need to be resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants