If a node has exceeded its normal maximum disk usage, it will look for things to free. It begins by cancelling any outstanding prefetches (see the above paragraph) and freeing any partially downloaded inodes which have not been used recently. If it is still under storage pressure, it searches for inodes that it holds but of which it is not the only owner, and frees those after a suitable handshake (so that the other owners don't free their own copies at the same time). Finally, if it can't find anything suitable to free, it starts returning -ENOSPC to write requests.
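The freeing order described above can be sketched roughly as follows. This is only an illustration: the `Inode` and `Node` classes, their fields, and `handshake_ok` are hypothetical names, and the peer handshake is reduced to a placeholder check. As a simplification, the sketch reports -ENOSPC as soon as nothing more can be freed while usage is still above the maximum.

```python
import errno
from dataclasses import dataclass, field


@dataclass
class Inode:
    # Hypothetical cache entry; field names are illustrative only.
    ino: int
    size: int
    complete: bool = True        # False while only partially downloaded
    recently_used: bool = False
    other_owners: int = 0        # number of peers that also hold a copy


@dataclass
class Node:
    max_bytes: int
    inodes: dict = field(default_factory=dict)
    prefetches: list = field(default_factory=list)

    def used(self):
        return sum(i.size for i in self.inodes.values())

    def handshake_ok(self, inode):
        # Placeholder: a real node would ask another owner to promise to
        # keep its copy before this node drops its own.
        return inode.other_owners > 0

    def relieve_pressure(self):
        """Free space in the order described above; return 0 or -ENOSPC."""
        if self.used() <= self.max_bytes:
            return 0
        # Step 1: cancel outstanding prefetches.
        self.prefetches.clear()
        # Step 2: drop partial downloads that have not been used recently.
        stale = [k for k, i in self.inodes.items()
                 if not i.complete and not i.recently_used]
        for ino in stale:
            del self.inodes[ino]
        # Step 3: drop inodes that other nodes also own, after a handshake.
        shared = [k for k, i in self.inodes.items() if i.other_owners > 0]
        for ino in shared:
            if self.used() <= self.max_bytes:
                break
            if self.handshake_ok(self.inodes[ino]):
                del self.inodes[ino]
        # Step 4: nothing else can be freed, so report the outcome.
        return 0 if self.used() <= self.max_bytes else -errno.ENOSPC
```

Sole-owner inodes are never touched in this sketch, which is why a node holding only such inodes ends up returning -ENOSPC.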
* use the L<Net-Cluster> library for communication
* cache data/metadata in a local filesystem store
* download data from other peers as needed
* distribute data around the network before it is needed
* attempt to make the storage network as redundant as possible
The general idea here is like a network RAID5, where the system will intelligently send inode data around to make sure at least one copy of it exists in every location. It could also be configured in a sort of network RAID1 mode, in which data is forcibly distributed to all nodes before a write call returns.
Disk pressure is calculated according to 3 settings: min, max, and hardmax, where "min" < "max" < "hardmax". "hardmax" is not to be exceeded under any circumstances; running into it will result in -ENOSPC. "max" is the reasonable maximum storage size; exceeding it will cause something else to get freed, as above. "min" is a minimum size the filesystem will seek to fill; if the storage layer is using less than "min", the filesystem will seek something off-node to fill it.
The distance between "min" and "max", and the distance between "max" and "hardmax", should each be at least as large as any file you expect to store in the filesystem. (This could therefore be as large as several gigs, in many cases.) Having too little room between max and hardmax will result in -ENOSPC errors occurring more often, because the disk-freeing process is asynchronous. Having too little room between min and max will result in cache grinding, and added bandwidth consumption.
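As a rough illustration of how the three settings interact, the sketch below assumes they are ordered min < max < hardmax and expressed in bytes. The `DiskLimits` class, its field names, and the action labels are hypothetical and are not taken from the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskLimits:
    # Hypothetical settings container; all values are in bytes.
    min_bytes: int
    max_bytes: int
    hardmax_bytes: int

    def validate(self, largest_file):
        """Return a list of configuration problems (empty if none)."""
        problems = []
        if not (self.min_bytes < self.max_bytes < self.hardmax_bytes):
            problems.append("limits must satisfy min < max < hardmax")
        if self.max_bytes - self.min_bytes < largest_file:
            problems.append("min..max gap smaller than largest file: "
                            "expect cache grinding")
        if self.hardmax_bytes - self.max_bytes < largest_file:
            problems.append("max..hardmax gap smaller than largest file: "
                            "expect more frequent -ENOSPC")
        return problems

    def action(self, used):
        """Classify the disk usage that a request would result in."""
        if used > self.hardmax_bytes:
            return "enospc"   # hard limit: refuse the write
        if used > self.max_bytes:
            return "free"     # start freeing inodes asynchronously
        if used < self.min_bytes:
            return "fill"     # fetch off-node data to use spare room
        return "idle"


GiB = 1 << 30
limits = DiskLimits(10 * GiB, 20 * GiB, 30 * GiB)
print(limits.validate(largest_file=4 * GiB))
print(limits.action(25 * GiB))
```

With 10 GiB between each pair of limits and a largest expected file of 4 GiB, the configuration passes both gap checks.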
See [[/software/NoidfsDesign|NoidfsDesign]] for more specifics.
All read requests are served out of local storage if possible, and wait for the data to be fetched from the network otherwise. If fetching the requested data would bump the disk usage above "max", presumably something else will get knocked out of the cache. If fetching the requested data would bump the disk usage above "hardmax", the data is served in "degraded mode"; the data is fetched over the network and returned directly to the application, without being cached (similar in concept to NFS's normal mode of operation).
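The read path above might look something like the following sketch. The function name, its parameters, and the `fetch` and `evict` callbacks are assumptions made for illustration; a real node would probably learn the size from metadata before fetching, and eviction would run asynchronously rather than inline.

```python
def read_inode(ino, cache, fetch, used, max_bytes, hardmax_bytes, evict):
    """Return (data, mode), where mode is 'local', 'cached' or 'degraded'.

    cache -- dict mapping inode number to bytes held in local storage
    fetch -- callable that retrieves an inode's data from a peer
    used  -- current local disk usage in bytes
    evict -- callable asked to free at least the given number of bytes
    """
    # Serve from local storage whenever possible.
    if ino in cache:
        return cache[ino], "local"

    # Otherwise wait for the data to arrive from the network.
    data = fetch(ino)
    projected = used + len(data)

    # Degraded mode: caching would exceed hardmax, so pass the data
    # straight through to the application without storing it.
    if projected > hardmax_bytes:
        return data, "degraded"

    # Above max but below hardmax: cache it and ask for space back.
    if projected > max_bytes:
        evict(projected - max_bytes)

    cache[ino] = data
    return data, "cached"
```

Degraded reads cost a network round trip every time, which is one more reason to leave a generous gap between "max" and "hardmax".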