FileReferences

Category: Non-GUI proposals and ideas

Intelligent file references

The lowest common denominator for referencing a file is a string path. Some file systems, such as FAT, rely on paths entirely. Others have more intelligent referencing but can fall back to a path. The Macintosh, for example, reserves one character, ':', for path representations of file locations, although files are more often represented by the {volume ID, directory ID, filename} triplet.

One of the biggest drawbacks of string paths is that symlinks break if you move their target. It is also unwise to move or rename a file that is currently referenced by a program but not locked for access, because some programs lose track of the file in the process. Mac OS has several ways to reference files, and some programs can track files that have moved since you last had their window up. Others can't, and it's hard to tell which programs do and which don't.

My favourite method for referencing a file that gets around all this is nothing more than a mere UInt32 (or UInt64 if you're desperate). That's it. The numeric ID references an entry in the central file system directory or database, and remains constant no matter where you place the file or what you call it or do to it. Ask for the file by ID and the system can instantly find it anywhere on the disc.
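The idea can be sketched as a tiny in-memory catalogue in which the numeric ID is the permanent key and the path is just a mutable attribute. Everything here (`Catalog`, `catalog_add`, `catalog_move`, the fixed table sizes) is a hypothetical illustration rather than any real file system's API, and the linear scan stands in for whatever index a real implementation would use to make the lookup fast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FILES 16       /* arbitrary limits for the sketch */
#define MAX_PATH_LEN 128

/* Hypothetical catalogue entry: the ID never changes, the path may. */
typedef struct {
    uint32_t id;                  /* 0 marks an unused slot */
    char path[MAX_PATH_LEN];      /* current location of the file */
} CatalogEntry;

typedef struct {
    CatalogEntry entries[MAX_FILES];
    uint32_t next_id;
} Catalog;

void catalog_init(Catalog *c) {
    memset(c, 0, sizeof *c);
    c->next_id = 1;
}

/* Find the entry for an ID, or NULL if the ID is unknown. */
static CatalogEntry *catalog_find(Catalog *c, uint32_t id) {
    if (id == 0) return NULL;
    for (size_t i = 0; i < MAX_FILES; i++)
        if (c->entries[i].id == id) return &c->entries[i];
    return NULL;
}

/* Register a file; returns its permanent ID, or 0 if the table is full. */
uint32_t catalog_add(Catalog *c, const char *path) {
    assert(strlen(path) < MAX_PATH_LEN);
    for (size_t i = 0; i < MAX_FILES; i++) {
        if (c->entries[i].id == 0) {
            c->entries[i].id = c->next_id++;
            strcpy(c->entries[i].path, path);
            return c->entries[i].id;
        }
    }
    return 0;
}

/* Ask for the file by ID; returns its current path, or NULL. */
const char *catalog_path(Catalog *c, uint32_t id) {
    CatalogEntry *e = catalog_find(c, id);
    return e ? e->path : NULL;
}

/* Move or rename: only the path changes. Returns 0, or -1 if the ID is unknown. */
int catalog_move(Catalog *c, uint32_t id, const char *new_path) {
    CatalogEntry *e = catalog_find(c, id);
    if (e == NULL) return -1;
    assert(strlen(new_path) < MAX_PATH_LEN);
    strcpy(e->path, new_path);
    return 0;
}
```

A program that stored only the ID returned by `catalog_add` would still find the file after any number of `catalog_move` calls, which is the property described above.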

This is not an entirely new idea. UNIX file systems number their inodes, so that any file can be referenced numerically at a low level. However, files are not user-accessible without first pointing a path at that inode number; this is to facilitate a hard link system. You can't bypass this because UNIX checks access along the path: permissions on every directory leading to the file are consulted, and thus a full path must be presented to API calls in order to determine access rights for that file.
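The inode number is visible from user space through the POSIX `stat()` call, even though a path is still needed to reach it. The sketch below, with the hypothetical helper names `inode_of` and `inode_survives_rename`, renames a file and checks that its inode number is unchanged. It assumes a POSIX system and a rename within one file system; inode numbers are only unique per file system, so a real comparison would also look at `st_dev`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>      /* rename */
#include <sys/stat.h>   /* stat, struct stat */
#include <sys/types.h>  /* ino_t */

/* Fetch the inode number behind a path. Returns 0 on success, -1 on error. */
int inode_of(const char *path, ino_t *out) {
    struct stat st;
    assert(out != NULL);
    if (stat(path, &st) != 0) return -1;
    *out = st.st_ino;
    return 0;
}

/* Rename old_path to new_path and report whether the inode number survived.
 * Returns 1 if unchanged, 0 if it differs, -1 if any call failed. */
int inode_survives_rename(const char *old_path, const char *new_path) {
    ino_t before, after;
    if (inode_of(old_path, &before) != 0) return -1;
    if (rename(old_path, new_path) != 0) return -1;
    if (inode_of(new_path, &after) != 0) return -1;
    return before == after ? 1 : 0;
}
```

Note that the lookup still goes from path to number; there is no standard call that goes the other way, which is the limitation described above.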

Macintosh has had a lot more success with this idea, because it never implemented hard links the way UNIX has. Macintosh aliases (its equivalent of symlinks) adapt instantly to changes to their target because the OS can immediately look up where the file went. The contents of File > Open Recent menus also tend to be dynamic and track changes to their items. I can't explain the mechanism any further because the Macintosh alias record is seemingly very complex and I don't yet know how it works.

There is, however, a downside. If I install a new version of a program on my Macintosh where the old version used to live (with identical path and executable names) and put the old copy in the Trash, all aliases used for program launch (Apple menu, Dock etc.) will fail, telling me that 'The application "foobar" cannot be opened because it is in the Trash'. That's right, the Mac adapts so well that it tracks old copies right into the Trash. The solution is to replace the old version's executable with the new executable, so that the new copy inherits the file ID of the original and aliases point at it. This gets even more fun if you intend to preserve the original copy of the program intact...

I wonder whether a better approach would be for symlinks to look for their target by path first, which finds the latest copy of the program, and to fall back to the numeric ID only if nothing exists at the recorded path. The file could then still be located wherever it got moved.
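A rough sketch of that resolution order, assuming an alias records both the path and the numeric ID its target had when the alias was created. The `FileEntry` and `Alias` types, the flat array standing in for the file system, and the example paths are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one file: its permanent ID and its current path. */
typedef struct {
    uint32_t id;
    const char *path;
} FileEntry;

/* Hypothetical alias record: where the target was, and which file it was. */
typedef struct {
    const char *path;
    uint32_t id;
} Alias;

static const FileEntry *find_by_path(const FileEntry *files, size_t n,
                                     const char *path) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(files[i].path, path) == 0) return &files[i];
    return NULL;
}

static const FileEntry *find_by_id(const FileEntry *files, size_t n,
                                   uint32_t id) {
    for (size_t i = 0; i < n; i++)
        if (files[i].id == id) return &files[i];
    return NULL;
}

/* Path first, so a replacement installed at the same location wins;
 * numeric ID second, so a moved or renamed target is still found. */
const FileEntry *alias_resolve(const FileEntry *files, size_t n,
                               const Alias *alias) {
    assert(alias != NULL);
    const FileEntry *hit = find_by_path(files, n, alias->path);
    if (hit != NULL) return hit;
    return find_by_id(files, n, alias->id);
}
```

With the old program (ID 10) sitting in the Trash and a new copy (ID 11) at the original path, an alias recorded as {path, 10} resolves to the new copy. If nothing exists at the recorded path any more, the same alias falls back to ID 10 and finds the file wherever it now lives.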

Of course, anyone wanting hard links might not like this idea. I am curious as to whether anyone has a better one.

File References and the VFS

Unlike some OSes, which try to mould foreign file systems into behaving like their native file system and run them through the native API, Cy/VOS is supposedly going to have a virtual file system (VFS) as its upper level. Given that the string path is the lowest common denominator of most file systems, the Cy/VOS file system API needs to be expressed in terms of paths in order to address items on volumes of any supported file system.

The other reason for using paths is for the sake of human understanding. "/Daemons/WWW/root_site/" means more than, say, '5218'.

The question then becomes: how does a developer make use of the particular advantages of more advanced file systems, such as Macintosh-style numeric file references?

I don't know whether file handles could be seen as unions of handles for the various referencing types: say, a union of a character pointer (string path) and a UInt64 (for database and single-directory file systems). Or maybe not a union but a struct, so that the file system can return all valid referencing types at once. As long as a program stores the struct completely (along with all its pointed-at data, e.g. unlimited strings for paths), then on passing it back the API can choose the most appropriate file reference available, no matter what might have changed in the meantime (e.g. the file reference now pointing to a volume of a different file system mounted at the same mount point as before, such as a virtual volume of data copied off a previously-mounted CD-ROM).
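One possible shape for the struct variant, purely as a sketch: the `FileRef` layout, the `RefKind` flags and the `fileref_choose` policy are all hypothetical, not part of any Cy/VOS API. The reference carries every form the originating file system could supply plus a bitmask saying which fields are valid, and the API intersects that with what the currently mounted file system supports. Preferring the numeric ID over the path here is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags naming each way a file can be referenced. */
typedef enum {
    REF_NONE = 0,
    REF_PATH = 1 << 0,   /* string path, the lowest common denominator */
    REF_ID   = 1 << 1    /* numeric file ID, where the file system has one */
} RefKind;

/* Hypothetical file reference: a struct rather than a union, so every
 * valid form is kept side by side. */
typedef struct {
    unsigned valid;      /* bitmask of RefKind values that are filled in */
    const char *path;    /* owned by the caller; must be stored with the struct */
    uint64_t id;         /* meaningful only if REF_ID is set in valid */
} FileRef;

/* Pick the best reference type that is both stored in the struct and
 * supported by the file system now mounted at that location. */
RefKind fileref_choose(const FileRef *ref, unsigned fs_supports) {
    assert(ref != NULL);
    unsigned usable = ref->valid & fs_supports;
    if (usable & REF_ID) return REF_ID;
    if (usable & REF_PATH) return REF_PATH;
    return REF_NONE;
}
```

In the CD-ROM example, a reference created with both a path and an ID would resolve by ID on the original volume, and by path once a volume of a path-only file system is mounted at the same mount point, without the program having to change anything it stored.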

Page last modified on November 21, 2004, at 08:39 PM