Here are my current plans. Any comments?
Everything will be stored in files. It will be very portable. It will be
fast. It will be robust. You can edit all the text files safely.
A SmallWiki folder is a directory. It has one directory called resources
and one called pages. A page is stored as a text file in the "pages"
directory. A resource is stored as a binary file in the "resources"
directory. A folder will probably have other files in its directory. If we
make new kinds of structures, we can make new subdirectories.
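As a rough sketch, the layout above could be set up like this in Python. The post does not fix an implementation language for the on-disk side, and the helper names `create_folder`, `page_path`, and `resource_path` are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def create_folder(root: Path, name: str) -> Path:
    """Create a wiki folder as a directory with 'pages' and 'resources' inside it."""
    folder = root / name
    (folder / "pages").mkdir(parents=True, exist_ok=True)
    (folder / "resources").mkdir(parents=True, exist_ok=True)
    return folder


def page_path(folder: Path, page_name: str) -> Path:
    """Text file that holds every version of one page."""
    return folder / "pages" / page_name


def resource_path(folder: Path, resource_name: str) -> Path:
    """Binary file that holds one resource."""
    return folder / "resources" / resource_name
```

New kinds of structures would get their own subdirectory next to `pages` and `resources` in the same way.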
Each new version of a page is appended to the end of its file as a delta. Each delta
has a timestamp, the author, maybe the version number, and the data. A
timestamp line starts with T, an author line with A, the version number with
V, and the data lines with D. The delta ends with a line that starts with
E. Lines end with one of a set of end of line characters, including CR and
LF. Blank lines are ignored. This should make it so we don't care about
the end-of-line rules of the creator of the file, so it should be easy to
move from Unix to Windows.
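A minimal sketch of reading and writing this format, in Python. Several details are assumptions that the description leaves open: the tag letter is followed directly by its value with no separator, the timestamp is kept as an opaque string, and lines with an unknown tag are skipped. The names `Delta`, `format_delta`, and `parse_deltas` are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Delta:
    timestamp: str = ""
    author: str = ""
    version: str = ""  # optional, since the version number is only a "maybe"
    data: list = field(default_factory=list)  # one entry per line of page text


def format_delta(delta: Delta, eol: str = "\n") -> str:
    """Render one delta as tagged lines, ready to append to a page file."""
    lines = ["T" + delta.timestamp, "A" + delta.author]
    if delta.version:
        lines.append("V" + delta.version)
    lines.extend("D" + line for line in delta.data)
    lines.append("E")
    return eol.join(lines) + eol


def parse_deltas(text: str) -> list:
    """Parse a whole page file, accepting CR, LF, or CRLF line endings."""
    deltas = []
    current = None
    # Splitting on runs of CR/LF also discards blank lines between records.
    for line in re.split(r"[\r\n]+", text):
        if not line:
            continue
        if current is None:
            current = Delta()
        tag, value = line[0], line[1:]
        if tag == "T":
            current.timestamp = value
        elif tag == "A":
            current.author = value
        elif tag == "V":
            current.version = value
        elif tag == "D":
            current.data.append(value)
        elif tag == "E":
            deltas.append(current)
            current = None
    return deltas
```

Because every data line carries the D prefix, an empty line in the page text is written as a bare `D` and survives the rule that blank lines are ignored.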
Resources are only stored in the file system. Folders and pages are stored
in the image. The disk versions of the folders and pages are only read when
the image is starting up. Otherwise, they are only written to, not read.
It should be easy to write the storage manager to handle new pages, new
folders, page edits, and resources. However, I am worried about renames.
Renaming a file is easy. But don't we also have to change all the
references to it that are in existing pages?
Also, I said this is fast, but it has to open a file for each write, and it
might be opening files in huge directories.
First, there won't be huge directories. If directories get too big then
we'll split them. If a directory is "big" (for some definition of big) then
it will divide its contents into groups with the same first letter in their
name. If it is really big then it will divide them into groups with the
same first two letters in their name.
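The splitting rule could be sketched as follows in Python. The thresholds `BIG` and `REALLY_BIG` are placeholders, since "big" is deliberately left undefined here, and lowercasing the prefix is an added assumption so that group names do not collide on case-insensitive file systems. `prefix_length` and `sharded_path` are hypothetical names.

```python
from pathlib import Path

BIG = 1000          # placeholder threshold, not a measured value
REALLY_BIG = 10000  # placeholder threshold, not a measured value


def prefix_length(entry_count: int) -> int:
    """How many leading letters of a name select its group."""
    if entry_count > REALLY_BIG:
        return 2
    if entry_count > BIG:
        return 1
    return 0


def sharded_path(directory: Path, name: str, entry_count: int) -> Path:
    """Where an entry lives, given how many entries the directory holds."""
    n = prefix_length(entry_count)
    if n == 0:
        return directory / name
    return directory / name[:n].lower() / name
```

For example, with these placeholder thresholds a page named `FrontPage` stays at `pages/FrontPage` in a small directory, moves to `pages/f/FrontPage` in a big one, and to `pages/fr/FrontPage` in a really big one.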
Also, the storage manager could cache open files. It could try to reuse
open files and close them on a LRU basis. Since there is a lot of locality
of writing, this should reduce the number of file opens. But I will
measure the performance before implementing this, because I am not sure it
will be necessary. I am pretty sure splitting directories will be, though: 10K
files in one directory take a long time to search.
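If measurement does show that file opens are a bottleneck, the cache could look roughly like this Python sketch. The class name `OpenFileCache`, the default capacity of 32, and the choice to flush after every write are all assumptions made for illustration.

```python
from collections import OrderedDict


class OpenFileCache:
    """Keeps a bounded number of files open for appending, closing the least recently used."""

    def __init__(self, capacity: int = 32):
        self.capacity = capacity
        self._files = OrderedDict()  # path -> open file, least recently used first

    def append(self, path: str, text: str) -> None:
        f = self._files.pop(path, None)
        if f is None:
            if len(self._files) >= self.capacity:
                # Evict the file that has gone longest without a write.
                _, oldest = self._files.popitem(last=False)
                oldest.close()
            f = open(path, "a", encoding="utf-8", newline="")
        self._files[path] = f  # re-insert at the most recently used end
        f.write(text)
        f.flush()  # hand the delta to the OS right away; this is not an fsync

    def close_all(self) -> None:
        while self._files:
            _, f = self._files.popitem()
            f.close()
```

Repeated edits to the same few pages then reuse an already open file, which is the locality the cache is meant to exploit.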
In addition to writing a storage manager to update these files, I'll have to
write something to build up a wiki from a file system, and will have to make
proxies for resources so they don't have to be in the image.
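One possible shape for such a proxy, sketched in Python: the object kept in memory holds only the path, and the bytes are read from disk each time they are asked for. `ResourceProxy` and its methods are hypothetical names, and a real proxy would likely also carry whatever metadata the wiki needs (for example a MIME type).

```python
import os


class ResourceProxy:
    """Stands in for a resource; only the path is held in memory."""

    def __init__(self, path: str):
        self.path = path

    def size(self) -> int:
        """Size in bytes, taken from the file system without reading the file."""
        return os.path.getsize(self.path)

    def contents(self) -> bytes:
        """Read the resource from disk on demand."""
        with open(self.path, "rb") as f:
            return f.read()
```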
Please tell me what is wrong with this.
-Ralph Johnson