-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Representing file contents with FSTree #71
Comments
I'm glad to see you find this project interesting, these ideas are good. Some of my first thoughts (but I will digest what you wrote more thorough over the next day or so):
node having a relatively small available heap may cause grief here. It maybe possible to keep files under some size threshold in memory though. In practice, the number of files that change between rebuilds is fairly small. The disk accesses associated with them stay in relative check. This admittedly doesn't help when large changsets occur (such as initial build, or branch change or rebase..)
Ya i believe that is the idea, Merge's become in-memory and on access, rather then eager and on-disk. More specifically the merge becomes more a |
Yes, you are probably right. I thought for some reason that broccoli-filter was currently keeping stuff in memory, but that's totally wrong. It writes to disk. So no
I'm not sure I understand exactly what you mean by "on access". Are you wanting for merges to be lazy, i.e. we perform them only when the consuming plugin accesses files? It seems like this would add a lot of complexity for fairly moderate performance gains; and there's the technical issues that merges can fail, so if we perform them conditionally, we might miss errors. |
So, to be clear, what I'm suggesting in my original post is that we do perform the merge fully, just in-memory with FSTree structures, so if nothing has changed, then no file system access happens. |
👍 |
Another thought: Any time it looks like we're emulating a LOAD_PATH type of thing with MergeTrees, it might be a good idea to explore getting the compiler to actually take multiple search paths instead of merging stuff. I suspect that compiler authors don't always appreciate how important it is to have multiple search paths, and that merging makes it very hard to generate good error messages. As a result we often end up with compilers that take a single base path when they should really take an arbitrary number. |
@joliss one of the ideas was, for FSTree via FSMergeTree to provide an FS abstraction that presented the LOAD_PATH (and filter) concepts transparently. In many cases (like the typescript compiler or sass or similar) this would be enough. This may allow more complex projections to be created (via fs-tree) without compilers even noticing.. |
I'd like to turn FSTree into an abstraction that can represent directory trees including their file contents.
Let me start with my current use case:
I'm working on generalizing the broccoli-filter baseclass to allow for outputting multiple files per input file. Let us call it BetterFilter. Here's what I'm planning to do -- bear with me, FSTree will come in in a bit. Say we have an input node
IN
containingand we want to write a plugin that outputs
To facilitate this, the BetterFilter baseclass will for each input file create a directory in its cache, like
CACHE/foo.js-output/
, into which the plugin can write all of its files, like so:On each rebuild, BetterFilter checks the mtimes of each input file and rebuilds those that have changed. Then, it merges all the
CACHE/*-output
directories into itsOUT
(outputPath
) directory.I'd like to use FSTree to address
two problemsthe following problem:Merging every time will likely be a bottleneck. So I'd like to do the merging in-memory and enlist FSTree's help to update only the files that have changed. At the moment,
FSTree.applyPatch
assumes that all files come from the sameinput
directory, so this isn't yet possible.To address this, we could imagine adding something like an
entries.contentPath
property to tell FSTree where the file comes from (i.e. where the symlink should point).Some plugins might not need to write the file to disk. For example, we could re-implement the current Filter by subclassing BetterFilter. But then we would end up writing the result ofprocessString
into the cache directory and symlink it, rather than writing it directly into the output directory. So it would be great if BetterFilter allowed plugins, as an alternative to writing files intoCACHE/foo.js-output/
, to return an FSTree instance that stores all the file contents in memory.To address this, we could imagine adding something like anentries.contentBuffer
property to tell FSTree to write out the buffer directly rather than symlinking some file.Before I take a stab at implementing something along these lines, I just wanted to check if you're conceptually on board with this. Then I'll send PRs (for fs-tree-diff and for node-walk-sync) and we can discuss details.
One particular question: I suspect that for a clean implementation, rather than being able to specify a different input path every time we call
applyPatch
, we may have to require that theinput
directory be fixed when we create the FSTree, so that we can store paths inentry.contentPath
. My guess is that this will be OK, since in Broccoli >=1.0, input paths are stable, but I wanted to double-check if that's actually the case.The text was updated successfully, but these errors were encountered: