Title: Being Sensitive to Case
Author: Don McCall, Senior Support Engineer, Hewlett-Packard

The differences between PC and Unix filesystems are legion: how big they can be, what attributes they support, what characters are legal for file and directory names, etc. Samba does a truly heroic job of handling these differences. One way it does that is by recognizing and handling the difference between an operating system that is case sensitive (Unix), and one that simply preserves case (Microsoft Windows 9x, Windows NT, and Windows 2000). In this article I'd like to examine this difference.

Let us take for our case (no pun intended) study four filenames:

    FILEONE.TXT
    filetwo.txt
    Filethree.txt
    FiLeFoUr.TxT

The first file (FILEONE.TXT) is all uppercase. The second file (filetwo.txt) is all lowercase. The third file is a special 'case' (again forgive the pun) whose first letter ONLY is uppercase. The fourth file (FiLeFoUr.TxT) is a truly 'mixed' case, representative of the myriad permutations a file name could take on.

In Unix, if we used the touch command to create these four filenames, we would get exactly what we requested, i.e., a set of files in the directory whose names exactly matched the case we typed in:

    -rw-r--r-- 1 root sys 0 Dec 19 15:17 FILEONE.TXT
    -rw-r--r-- 1 root sys 0 Dec 19 15:18 FiLeFoUr.TxT
    -rw-r--r-- 1 root sys 0 Dec 19 15:18 Filethree.txt
    -rw-r--r-- 1 root sys 0 Dec 19 15:18 filetwo.txt

Note that the Unix command 'll' doesn't try to pretty this up - it shows it just as it sees it. I personally approve of commands that actually do what you tell them to, without embellishment. But that's me.

Now let's move on to another operating system, Windows NT. For this discussion, we will be using Windows NT 4.0 Workstation, with an NTFS filesystem. Let us again create our four example files, this time using the Windows Explorer graphical user interface (GUI), and the pulldown menu New/Text Document.
We type in (over the obtuse and chatty default name "New Text Document.txt" that Windows NT Explorer favors for text documents) "FILEONE.TXT" and hit return. Immediately we run into a snag - the file we just created appears as "Fileone.txt", NOT "FILEONE.TXT". This is the Windows NT Explorer interface being 'nice' to you, assuming that NO ONE would actually want to look at a filename all in uppercase. I mean, how pedestrian can one get? Be reassured however; if you go into the 'command prompt' window, and actually do a 'dir' in that directory, you will see that the file DOES actually show up as all uppercase - FILEONE.TXT. Hmmm - this command prompt window appears to be useful. Let's remember it for our next examples, shall we?

Again in the Windows NT Explorer GUI, we create our next example file, typing in "filetwo.txt" all lowercase and hitting return. Ah, that's much better. Explorer actually shows us the filename just as we typed it, "filetwo.txt". And the 'dir' command in the command prompt window agrees. We now begin to feel a bit more comfortable. (Fools that we are...)

Let's move on to our third example file, "Filethree.txt". Ok, I admit it - I have thrown in a curve here. This is not a standard 8.3 filename (8.3 being a curiosity from the days when Microsoft operating systems did not have a concept of 'long file names', and all files were required to be no more than 8 characters long, plus an optional .xxx, where xxx represented some file 'type' meaningful to specific applications). This filename contains NINE letters before the ".". Let's see how Explorer handles this: we type in (very carefully) "Filethree.txt". Wonderful! Explorer shows us just what we would expect, "Filethree.txt". Does the command prompt 'dir' command agree? Indeed it does. We appear to be getting the hang of this case thing.
[Apologies to all you MS wizards out there who know that I have glossed over the fact that Windows NT has ALSO created a special '8.3' name to go along with this filename - that's a topic for another day...]

Now our final example file - a ReAlLy mixed up case file. We again use Windows NT Explorer, and again, both the Explorer and the command prompt representations of the file are just as we would expect, "FiLeFoUr.TxT".

So, all in all, not a bad track record; except for our little 'helper' Explorer choosing to display filenames all in "UPPER" case as the more aesthetically pleasing "Upper" case, we get what we ask for. No, not so fast - we forgot that Windows treats files that conform to the "8.3" specification differently from 'long' filenames. Let's see what happens if we create an all UPPER case filename that does NOT conform to the "8.3" spec. We create (using Explorer) a file named "FILEFIFTY.TXT" (note the 9 characters before the ".") and lo and behold, Explorer and the command prompt dir command are in complete and hearty agreement - the file is represented just as we asked, "FILEFIFTY.TXT". I guess if you're long enough, you don't HAVE to be aesthetically pleasing...

Ok, I've laid all this groundwork not to make fun (well, not entirely) of Windows NT's case handling abilities, but to illustrate a POINT. Windows NT 'preserves' case. It is NOT 'case sensitive'. Internally, it matters not one whit to Windows NT whether you named the file FILE.txt, File.txt, file.txt, or FiLe.TxT. They're all the same name to Windows NT. Don't believe me? Try it - we have a directory with the file FILEFIFTY.TXT right here. We try to create (using Explorer) a file named FileFifty.txt - BUZZZZZ, wrong answer. Explorer responds that "A file with the name you specified already exists. Specify a different filename".

More proof? Ok, we like the 'command prompt' window - it has that nice, non-GUI, command line feel that we Unix heads are so comfortable with.
If we use the 'dir' command to list a specific file, for instance:

    dir FILEFIFTY.TXT

you will get the listing for FILEFIFTY.TXT, but if you type in

    dir FileFifty.txt

or

    dir FILEFifty.TXT

or

    dir filefifty.txt

or ... Well, you can work out all the permutations yourself - guess I should have stuck to a shorter filename. Bottom line is that ALL of these permutations will return the entry for the file FILEFIFTY.TXT. Getting the picture? As Sean Connery said in "Highlander": There can BE only ONE! Unlike Unix, where you can have in the same directory the files named:

    fileone.txt
    Fileone.txt
    FILEONE.txt
    FiLeOnE.TxT

all existing at once, in Windows this is an impossibility. That's the difference between a filesystem that PRESERVES case, and one that is actually CASE SENSITIVE.

And that brings me to what I really want to talk about, which is how CIFS/9000 Server (Samba, to the rest of the world) deals with this. As you would expect from an application that was grown to bridge the gap between the Unix and Windows worlds, it is very flexible. Good news, bad news - with flexibility comes responsibility, and sometimes not a little confusion. In the interest of keeping this article short enough so that someone may actually READ it, let's restrict our conversation to case preservation/sensitivity, and leave out the 'mangled names' permutations.

There are four configuration options that Samba provides to allow one to define its behavior when dealing with matters of 'case':

    preserve case = (yes/no)
    short preserve case = (yes/no)
    default case = (upper/lower)
    case sensitive = (yes/no)

The first three options define, in essence, how a filename will be written to the Unix filesystem underneath Samba. These options loosely correspond to how Samba will PRESERVE case. "preserve case" and "short preserve case" both do the same thing; the first in the case of NON 8.3 filenames, and the second specifically for filenames conforming to the older 8.3 DOS filenaming conventions.
If these options are set to "yes" (the default), then a file will be saved with the case as it is presented by the client. That is, if you create a file on a Samba share from Windows NT Explorer with the name "FiLeNaMe.TxT", a Unix 'll' of the file will show that its name is indeed "FiLeNaMe.TxT".

The "default case" option defines how a filename will be saved if either of the 'preserve case' options is set to "no". If "default case = lower" (which is the default), then when "preserve case = no" a file will always be saved using all lowercase letters. If "default case = upper", then when "preserve case = no" a file will always be saved using all UPPER case letters, regardless of how the client 'presents' it. That is, the file we create in Explorer named "FiLeNaMe.TxT" will actually be saved as "FILENAME.TXT" when we look at it with the Unix 'll' command.

Whew! That's a lot of words to explain something this simple. Let's go back to our four 'example' files, and look at what actually happens. Let's take the Samba defaults first:

    preserve case = yes
    short preserve case = yes
    default case = lower

Using the Windows NT 4.0 Workstation Explorer interface, let's create our four files on a Samba share. In Explorer we type in the four file names:

    FILEONE.TXT
    Filetwo.txt
    filethree.txt
    FiLeFoUr.TxT

A Unix 'll' command will show us:

    -rwxr--r-- 1 ddmc users 0 Dec 19 16:40 FILEONE.TXT
    -rwxr--r-- 1 ddmc users 0 Dec 19 16:41 FiLeFoUr.TxT
    -rwxr--r-- 1 ddmc users 0 Dec 19 16:40 Filetwo.txt
    -rwxr--r-- 1 ddmc users 0 Dec 19 16:41 filethree.txt

Lovely! Just what we asked for. Now, being the good little researchers we are, we change ONE thing at a time, and observe the results.

    preserve case = no
    short preserve case = yes
    default case = lower

We remove and recreate the four files in the same manner as above, and our trusty Unix 'll' command shows us:

    -rwxr--r-- 1 ddmc users 0 Dec 19 16:45 filefour.txt
    -rwxr--r-- 1 ddmc users 0 Dec 19 16:45 fileone.txt
    -rwxr--r-- 1 ddmc users 0 Dec 19 16:45 filethree.txt
    -rwxr--r-- 1 ddmc users 0 Dec 19 16:45 filetwo.txt

Ooooh - this doesn't look good! I can understand 'filethree.txt' being converted to lower case; after all it is not an 8.3 filename, so it should fall under the auspices of the 'preserve case = no' option. But what about the other three? They are all good little 8.3 filenames. Why was case NOT preserved? Apparently, 'short preserve case' is dependent on 'preserve case'. That is to say, in order for 'short preserve case = yes' to work, 'preserve case = yes' must be set.

Ok, let's move on to the case where we specify BOTH preserve case = yes and short preserve case = no, with default case = lower (the default). Aha! This is more like it. In Explorer, when we create (with the default name) "New Text Document (2).txt", ll shows us:

    -rwxr--r-- 1 ddmc users 0 Dec 19 17:10 New Text Document (2).txt

When we create FileWW.txt in Explorer, it becomes fileww.txt, as we would expect; since this file conforms to 8.3, the 'short preserve case' option is used, and the 'default case' of lower is used to create the filename on the Unix system.

Another test:

    preserve case = yes
    short preserve case = no
    default case = upper

Again, success - creating "HeresALongFileName.TxT" in Explorer yields "HeresALongFileName.TxT" in our Unix listing. Creating "ShrtFiLe.TxT" in Explorer yields "SHRTFILE.TXT" in our Unix listing. This is good - we told Samba to preserve case for non 8.3 filenames, and it did. We told Samba to use default case = upper for 8.3 filenames, and sure enough, it converted our mixed case "ShrtFiLe.TxT" to all uppercase. Are you getting happy yet? I sure am.

This goes a long way towards explaining how Samba 'preserves' case or not, depending on some pretty flexible configuration options. But how is this going to affect our clients, when they start LOOKING for files? Well, remember that all Windows OSes (WinNT, Win98, etc.) PRESERVE case but are not case SENSITIVE.
One result of this is a somewhat laissez-faire attitude in applications and the OS itself in trying to FIND a file of a specific name. A program could, for instance, CREATE a file named FileName.TXT, and then when it next opened the file, could refer to it as filename.txt, and expect to find it. This presents certain problems when you are a TRUE case-sensitive operating system like Unix. FileName.TXT and filename.txt could BOTH be present in the same directory; which one does the client really want?

The answer is found in the 'case sensitive' configuration parameter. The other parameters defined how we SAVE filenames when we create files through Samba. This parameter determines the rules we follow when we try to RESOLVE a filename given us by a client, and it is the crux of whether that client gets WHAT IT EXPECTS or not.

By default, 'case sensitive' = no. This means that no matter HOW the Windows OS or application passes the filename to us, it will match the first filename that we stumble across that matches the requested filename REGARDLESS of case. For instance, if you have five files in your Samba directory:

    Aa.txt contains
    aA.txt contains
    AA.txt contains
    aa.txt contains
    AA.TXT contains

When you look at the directory via Windows Explorer, all five files will appear, with the unfortunate confusion that there will be TWO entries for Aa.txt (remember, our friendly Windows Explorer is going to 'translate' an all UPPERCASE 8.3 name as if only the first character in the name were capitalized). Unfortunately, with 'case sensitive' = no, you have no way to tell Samba WHICH file you want to open; it doesn't matter which file you click on, Samba is going to match it to the first file that has a caseless match for the characters you provide it. In our example above, for instance, we will ALWAYS get the text "AA.TXT" when we double click on ANY of the above filenames in the Windows Explorer window. This is PROBABLY NOT the behavior you would desire.
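
The 'first caseless match wins' behavior described above can be pictured with a small Python sketch. This is only an illustration, not Samba's actual code: the function name resolve_name, the entries list, and the order in which the directory is scanned are all assumptions made for the example.

```python
def resolve_name(requested, directory_entries, case_sensitive=False):
    """Return the stored name a request would open, or None if nothing matches.

    Hypothetical model: walk the directory in order and stop at the first hit.
    """
    for entry in directory_entries:
        if case_sensitive:
            # Exact match only: 'Aa.txt' and 'AA.TXT' are different files.
            if entry == requested:
                return entry
        elif entry.lower() == requested.lower():
            # Caseless match: whichever entry is seen first wins.
            return entry
    return None


# Assumed scan order; a real directory may be read in any order.
entries = ["AA.TXT", "Aa.txt", "aA.txt", "AA.txt", "aa.txt"]

for wanted in ("Aa.txt", "aa.txt", "aA.txt"):
    print(wanted, "->", resolve_name(wanted, entries),
          "|", resolve_name(wanted, entries, case_sensitive=True))
```

With the caseless setting every request lands on the same entry, which is the confusion described above; with the exact setting each request finds its own file.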
Fortunately, changing 'case sensitive' to yes will actually do what you expect; Windows 'preserves' case, so when it sends the SMB request to 'open' the file, it will send the name with the correct case; more correct, in fact, than what Windows Explorer presents to YOU. That is, the file AA.TXT will be requested by the name AA.TXT, not what you see in the Explorer window, Aa.txt.

I trust you can see the potential for confusion and error here, but let me beat the dead horse a little (don't report me to the ASPCA - it's just an expression, ok?). With case sensitive = yes, expect to run into the following issues:

1. Sloppy programming: If your program CREATES a file named Initialize.INI, it had better always try to open it as Initialize.INI. Not initialize.ini. Not INITIALIZE.INI. You get the drift.

2. DOS and WfW clients will probably NOT be able to run programs using files created by later clients (Win9x, NT) with the same program. Let me illustrate by example. Say we have an application that has a version that runs on WfW, DOS, Win98 and WinNT. The Win98 and WinNT versions create and use a file named Startup.dat. The WfW and DOS versions use the same file, but of course they must refer to it as STARTUP.DAT (DOS and WfW understand 8.3 uppercase filenames only). If you install this program on your various PCs, and decide that you want them to ALL use the same initialization file (Startup.dat), you might have the installation program locate that file on a Samba share accessible to all clients. Won't work - if the file is named Startup.dat, the WfW and DOS clients won't see it. If the file is named STARTUP.DAT, the Win98/NT clients won't see it.

So what's the practical application of this lengthy diatribe? A simple rule of thumb emerges: If the files on your share are going to be mainly created and accessed by Windows clients, leave the defaults alone. If you HAVE to have multiple filenames in the same directory that differ only by case, change the 'case sensitive' option to yes.
And recognize that this may cause some inexplicable behavior on the part of Windows client applications accessing that directory.
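
For readers who prefer code to prose, the file-naming experiments earlier in this article can be summarized in a short Python sketch. It models only the results observed in those tests (including the finding that 'short preserve case' had no effect once 'preserve case' was set to no); it is not taken from the Samba source, and the helper names is_8_3 and stored_name, as well as the deliberately rough 8.3 check, are assumptions made for illustration.

```python
def is_8_3(name):
    """Rough, hypothetical 8.3 test: up to 8 characters, optional dot, up to 3 more."""
    base, _, ext = name.partition(".")
    return (0 < len(base) <= 8 and len(ext) <= 3
            and "." not in ext and " " not in name)


def stored_name(name, preserve_case=True, short_preserve_case=True,
                default_case="lower"):
    """Return the name as it would land on the Unix side in the article's tests."""
    if not preserve_case:
        # Observed in the tests: every name was forced to the default case,
        # even though short preserve case was still set to yes.
        keep = False
    elif is_8_3(name):
        keep = short_preserve_case
    else:
        keep = True
    if keep:
        return name
    return name.upper() if default_case == "upper" else name.lower()


# The combinations tried in the article:
print(stored_name("FiLeFoUr.TxT"))
print(stored_name("FiLeFoUr.TxT", preserve_case=False))
print(stored_name("FileWW.txt", short_preserve_case=False))
print(stored_name("ShrtFiLe.TxT", short_preserve_case=False, default_case="upper"))
print(stored_name("HeresALongFileName.TxT", short_preserve_case=False,
                  default_case="upper"))
```

The sketch says nothing about name mangling or about how a real server decides whether a name is 8.3; it only restates the outcomes listed in the test runs.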