Is there a point?
26th December 2004, 07:28 AM
Is there a point to making directories as far as the OS is concerned? It obviously serves a cosmetic purpose for people, but since the OS sees directories as nothing more than files, does it really impact OS/HDD performance in any meaningful way?
26th December 2004, 11:51 AM
It makes things much easier for the user to find. I know that if you start an image search in WinXP it'll look in that user's 'My Pictures' directory first, as it's the logical place a user would put images. The same applies to sound files.
It makes sense to have files logically stored on the HD, as it improves access times when reading multiple files of the same type (if they're grouped together - think of the 'My Pictures' directory).
It also makes sense to have different directories for different users.
26th December 2004, 11:57 AM
Valid. Now, would simply pointing via references be a more efficient idea, since the OS would only need to look in one place regardless of file type? [Specifically when the user performs a search for an item.] I know that using a conventional system requires searching from the uppermost directory defined in the search, which I would think is harder on the OS.
26th December 2004, 12:08 PM
Is there a point to making directories as far as the OS is concerned?
What do you mean? What way of organizing files, other than a directory tree, do you propose? And don't come at me with the db-filesystem buzzword - the OS still needs its files somewhere. A db-filesystem is convenient for humans, who cannot organize their data... the OS and programs can do it themselves...
It obviously serves a cosmetic purpose for people, but since the OS sees directories as nothing more than files, does it really impact OS/HDD performance in any meaningful way?
I don't quite understand you... If you have a set of files divided into directories, the OS searches the given directory for a file - not all files on the filesystem, just the subset in that directory - and that affects performance... that is obvious.
26th December 2004, 12:21 PM
If the OS sees directories as nothing more than files, how is the use of directories actually affecting the performance of the OS and the drive itself? A DB would readily render the physical location of a file arbitrary really. So I don't know where you get off implying that using a system like that would place the files in a void... If the OS views directories as files, then is it not also true that to the OS everything is pretty much in the root directory, regardless of what pretty folders we use to make things more manageable for ourselves?
26th December 2004, 12:41 PM
If the OS sees directories as nothing more than files, how is the use of directories actually affecting the performance of the OS and the drive itself?
It does make a difference when you do ls on a directory with 10000 files versus one with 4, doesn't it? So when you have things arranged in a tree (assuming you know which directory contains a given file), you only search for that file within a subset of all files (the subset being the files in that directory)...
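To make that concrete, here is a minimal sketch (Python; the example path is made up) of what a path lookup amounts to: each component only scans one directory's entries, never the whole disk.

import os

def resolve(path):
    """Resolve a path one component at a time - each step scans only one directory."""
    current = "/"
    for part in path.strip("/").split("/"):
        entries = os.listdir(current)        # just this directory's entries,
        if part not in entries:              # never the whole filesystem
            raise FileNotFoundError(f"{part!r} not found in {current!r}")
        current = os.path.join(current, part)
    return current

# e.g. resolve("/home/user/pics/cat.jpg") touches four small directories;
# a flat namespace would force a search across every file on the disk.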
A DB would readily render the physical location of a file arbitrary really.
You still have to name these files... and you'll end up with a namespace similar to directories, plus the DB overhead... so what is the point? I understand putting your documents or photos or music in a DB - those types of data usually contain metadata and can also be indexed. But what is the point of storing every system file in a DB and doing lookups for it every time? E.g. what is the point of storing program libs in a DB? Programs can organize themselves; they don't need a DB to organize their data... people, on the other hand, always do. And since all innovations must be built on some base, I think when DB filesystems become common they will all operate in user space on top of a traditional filesystem... to be honest, there is no other way to do it.
So I don't know where you get off implying that using a system like that would place the files in a void... If the OS views directories as files, then is it not also true that to the OS everything is pretty much in the root directory, regardless of what pretty folders we use to make things more manageable for ourselves?
That is what I've said - such a filesystem can be convenient for the user, but not for the system - it will slow it down. So the good way to do it is to separate user data from system data... and use the right filesystem for each kind of data (DB for the user filesystem, traditional for system files) - see? There is still a place for the traditional model as well. Also keep in mind that computing is not only PCs; you also have gizmos like routers, portable mp3 players, cameras etc. - you simply cannot implement such filesystems on those devices, since they need a small footprint and usually already use FAT or something like that (ext2?).
26th December 2004, 01:19 PM
Well, in the case of *nix, storing program libs in a DB may help programs actually find their required dependencies, irrespective of where an install might have put them. [Even with standard paths it just doesn't happen all the time... sigh] What does actually naming the files have to do with anything? If the OS can put the system files where it likes and then catalogue those files, why would it slow it down? [Not necessary for anything except the OS handling those files on boot and shutdown specifically.] The OS itself wouldn't necessarily look up the core files at all [absolute path?]; you could probably get it to just perform OS file indexing on shutdown, then have it load the appropriate files on boot. I doubt the index for the OS files would ever become so large that it would really put a hit on start-up time. The main issue, aside from redundancy in the case of desktops, would be making sure the user file index is loaded without creating a dramatic boot-time increase.
Then you would have to make your rule-based browser CLI and GUI as sharp as possible. I understand what you are saying about user space, but system space isn't going to span more than a few conventional directories. Plus, if certain DAPs are microcosms of real PCs, mixing the two could get you some rather nasty slowdown from having to seek the file in the tree if you don't code it right.
26th December 2004, 01:59 PM
Well, in the case of *nix, storing program libs in a DB may help programs actually find their required dependencies, irrespective of where an install might have put them. [Even with standard paths it just doesn't happen all the time... sigh]
I don't need software that installs itself anywhere it wants... Linux is POSIX compatible and has standards - because of this you can take a piece of software and compile it under BSD, Linux, other unices and even Windows. I don't want to drop all of that just to allow obscure programs to install themselves wherever they wish, at a heavy cost in resources (DB lookups)... Look, if you have incorrectly written software, no DB filesystem or other buzzword gizmo is going to change that. If software is looking for a foo library that simply doesn't exist, a DB makes no difference here... We already have dependency-tracking systems (think apt, think yum) at the package level and they work OK for me... It would be a huge overhead to look up every lib in a relational database (even with caching etc.) - obviously a bigger cost than having the library path simply hardcoded as a filename... What makes you 100% sure that a DB will solve dependency problems? That makes no sense, since software will still need to specify (to the database) what it needs, and you stated that some software doesn't even know that :)
What does actually naming the files have to do with anything?
You have to refer somehow to what you are looking for, don't you? Whether it's a filename, e.g. foo.so.1, or simply foo - you still have to know exactly what it is. You cannot ask a database "give me that library, you know, this library, you know which one..." :)
If the OS can put the system files where it likes and then catalogue those files, why would it slow it down?
Because it has to search (do a lookup) for this file (call it an object in the DB model). With a hardcoded library name and the system providing the library directory, it doesn't need to search for anything and it won't get two matches - the file is simply either present or not.
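As a rough illustration (Python; libfoo is a made-up library name), an exact hardcoded path is a single present-or-absent check, while a lookup-style search can come back with zero or several candidates that still have to be sorted out:

import os
import glob

LIB_DIR = "/usr/lib"          # the system's library directory
WANTED = "libfoo.so.1"        # hypothetical library name

# Traditional model: the exact name is hardcoded, so the check is one stat.
exact_path = os.path.join(LIB_DIR, WANTED)
present = os.path.exists(exact_path)   # either it's there or it isn't

# DB-style lookup: ask for "something called foo" and sift through the matches.
candidates = glob.glob(os.path.join(LIB_DIR, "libfoo*"))
# could be [], or ['libfoo.so.1', 'libfoo.so.2', ...] - the caller still has to decide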
The OS itself wouldn't necessarily look up the core files at all [absolute path?]; you could probably get it to just perform OS file indexing on shutdown, then have it load the appropriate files on boot. I doubt the index for the OS files would ever become so large that it would really put a hit on start-up time. The main issue, aside from redundancy in the case of desktops, would be making sure the user file index is loaded without creating a dramatic boot-time increase.
What makes you think that such lookups will be 100% correct? It's fine for searching through photos or music, where you get a few matches and choose from them. It's not necessarily good for the system to get three shared-object matches of which only one is correct; the system will have to do the lookup again... and maybe again... and it will hit the performance.
I understand that even some kinds of system files (configs) can benefit from a database (not exactly relational), e.g. compiled binary registries such as gconf are usually way faster than reading a flat file over and over, but that is not relational...
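A tiny sketch of that flat-file-vs-compiled-cache idea (Python; the file names are hypothetical): parse the text config once, keep the result in a binary key-value store, and later reads skip the re-parsing.

import dbm
import pickle

CONFIG_TXT = "settings.conf"     # hypothetical flat config file, "key = value" lines
CONFIG_DB = "settings.cache"     # compiled binary form

def compile_config():
    """Parse the flat file once and store the result in a binary key-value cache."""
    settings = {}
    with open(CONFIG_TXT) as f:
        for line in f:
            if "=" in line:
                key, value = line.split("=", 1)
                settings[key.strip()] = value.strip()
    with dbm.open(CONFIG_DB, "n") as db:
        for key, value in settings.items():
            db[key] = pickle.dumps(value)

def lookup(key):
    """Later reads hit the binary cache directly instead of re-parsing the text."""
    with dbm.open(CONFIG_DB, "r") as db:
        return pickle.loads(db[key])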
I don't think that putting everything into a relational database makes much sense. Some data can be treated this way and will benefit; some certainly won't...
26th December 2004, 02:22 PM
Also look at it this way:
Relational databases have been around for some time already (a few decades), and filesystems have been needed almost as long. So why has nobody come up with a useful implementation of this idea? Well, because it made no sense... :)
Putting files (look - files) into an RDB started to make sense only after digital data evolved. Look at how PCs are used right now - people tend to have loads of music files, loads of digital photos, and soon there will be loads of movies; people have loads of email conversations, IM chats, and they browse a lot (I mean internet addresses)... A few years ago it wasn't this way, and the files you kept on your disk were usually documents, not much music or photos... Now it makes sense to store such data in an RDB, because you get a lot of it and most of it already has good metadata associated with it...
Each such file (photo, record) has metadata associated with it. A photo has, e.g.:
* file name - it often describes the order in which photos were taken
* the directory it resides in - it often indicates that a given file is part of a larger collection
* date - helps put the pictures on a timeline
* EXIF data - which device made the photo, its dimensions, camera settings etc. (probably not much help when searching)
* with an RDB-FS you could also assign comments to files (I don't know if EXIF already does that), like "grandpa"
* also you can do (limited and not too strict) analysis of files, e.g. checking the color balance to see if it was a night shot, or some face recognition to compare a face in one picture with faces in other pictures tagged "grandpa"
The last point is a bit advanced. Here I see clear advantages of accessing such files via an RDB (over a plain PATH) - I can select photos from a given time, or access photos with a "grandpa" association, or select photos taken in the dark, etc., and probably other things like whether a photo was printed, its quality, etc. So such an RDB-FS should be extremely configurable/modular to fit my needs - here I can sacrifice performance for my convenience.
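A minimal sketch of that kind of query (Python with sqlite3; the table layout, columns and file names are invented for illustration, not any real RDB-FS):

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE photos (
    path TEXT PRIMARY KEY,      -- where the file actually lives
    taken TEXT,                 -- date from EXIF
    comment TEXT,               -- user-assigned tag like 'grandpa'
    nightshot INTEGER           -- result of a simple color-balance analysis
)""")
con.executemany("INSERT INTO photos VALUES (?, ?, ?, ?)", [
    ("/photos/dscf0001.jpg", "2004-08-14", "grandpa", 0),
    ("/photos/dscf0002.jpg", "2004-08-14", "beach",   0),
    ("/photos/dscf0100.jpg", "2004-12-24", "grandpa", 1),
])

# "photos with the grandpa association, taken in the dark"
rows = con.execute(
    "SELECT path FROM photos WHERE comment = ? AND nightshot = 1", ("grandpa",)
).fetchall()
print(rows)    # -> [('/photos/dscf0100.jpg',)]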
The same applies to music files. A music file:
* usually has a name and resides in a directory, plus common things like file date etc.
* has a TAG describing its genre, album, year... etc.
* can have attributes such as my rating, play count, playlist parent etc.
* has technical data, quality etc.
Here also I can see a clear benefit over the traditional model... Also think this way about mail messages, documents, and other things (maybe even config files) that I've forgotten or haven't thought about. This makes sense to me: make this system modular/fast/configurable and probably networked with multi-user support (I recently installed Google Picasa on my friend's computer; too bad it indexed all users' files, giving everybody access to everyone's files, so her father found some rather brave photos ;) Windows is amazing - it doesn't even set the file permissions right), and I will use it with pleasure...
One concern here - for this to work well, it has to have very good support from the data files themselves - so each datatype (think Outlook email, OpenOffice document, Photoshop image, media player music etc.) should export standardized methods for accessing its metadata, to keep it consistent (there is no point in having an RDB-FS without metadata) and to allow the system to use it. Now, knowing the Windows/closed-software world, they will have a very hard time coming up with something standard :) So we (the OSS/free software world) have a clear advantage here - if somebody comes up with a useful implementation (I'm thinking something like freedesktop.org here), programs will just start to use it...
My point is that this is a great idea and will surely be implemented soon, but just for user data. I don't see any benefit for system files here (well, there may be something associated with packaging, but that's it, and it should probably also be handled at the userspace level). I mean it makes little sense and brings little benefit for things like binaries, drivers, libs and so on...
26th December 2004, 02:32 PM
Hah, hmm, I see libs installed in the directory they should be installed in... yet I still get magic error messages [files necessary to start X magically disappeared for no reason at all on my laptop recently], so I don't really see your point; it's not as if the conventional lookups or other methods always work. You say "if you have incorrectly written software" as if the people who write stuff for *nix and Windows always write their stuff correctly; it won't necessarily be any different with a relational DB. I don't remember implying that either, oh well. Why would the system necessarily need to re-query over and over again? That makes no sense. A one-time query for the OS itself on boot or shutdown [preferably] should be enough; then everything else can rely on a path variable like they do now. If we can't pull that off in a reasonable fashion, the main system files would be located in the only truly necessary conventional directory on the system.
[Why wouldn't the OS just be dead specific and refer to what it needs at a given time by primary key, which is guaranteed to be unique in any well-coded relational database? The system keys do not necessarily need to be interrelated for a specific query. In short, you would have the OS query its files with absolute precision so as to come up with only the one, or however many, files it actually needs at a given time. It's a matter of fine tuning.] The rest of the OS should do just fine after the boot has finished. I think, however, that the best solution would be to hardcode that path or the references for the OS; but I don't actually have to make folders to do this, now do I? Getting back to my original question: is there really a point to using conventional directories when I can use a system that will create virtual directories and I can just point the OS to its core files using an absolute path?
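As a rough sketch of that "query by primary key" idea (Python with sqlite3; the table, roles and locations are entirely made up, not how any real OS boots):

import sqlite3

# Hypothetical index built at shutdown: each core file gets a unique primary key.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE core_files (id INTEGER PRIMARY KEY, role TEXT, location TEXT)")
con.executemany("INSERT INTO core_files VALUES (?, ?, ?)", [
    (1, "kernel",  "/blocks/0001"),
    (2, "init",    "/blocks/0002"),
    (3, "libc",    "/blocks/0003"),
])

# At boot the OS asks for exactly one row by its primary key - no ambiguity,
# no repeated lookups, just one precise match per needed file.
row = con.execute("SELECT location FROM core_files WHERE id = ?", (2,)).fetchone()
print(row[0])   # -> /blocks/0002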
What will be more interesting is how to pull off suitable redundancy for the program/lib DB and the user-level [data files, like music etc.] DB. I think that is far more pressing than whether or not to have the kernel use a DB.
This is what I am thinking too; I have a boatload of video files and would love to display only certain ones given a certain criterion, and I type much faster than I use a mouse. A rule-based file browser front end for SQL-like queries over my data would be a godsend. [I want it and I want it now... 130+ gigs and exploding! I can't really keep up with it....]
26th December 2004, 02:41 PM
It is really a shame WinFS probably won't work half as well as what either of us can envision on this thread... Your nightmare scenarios with the DB and the OS probably would come true. :)