How do case-insensitive filesystems display both upper and lower case file names?



























This question occurred to me the other day while I was working on a development project that relies on a framework that is opinionated about file names. The framework (irrelevant here) wanted to see upper-case-first filenames. This got me thinking.



On a case-insensitive file system, say exFAT or HFS+ (specifically the non-case-sensitive variant), how does the file system provide access to the same file through both upper- and lower-case versions of the filename?



For example:



$ cd ~/Documents
$ pwd
/home/derp/Documents

$ cd ../documents
$ pwd
/home/derp/documents

$ cd ../docuMents
$ pwd
/home/derp/docuMents

$ cd ../DOCUMENTS
$ pwd
/home/derp/DOCUMENTS

$ cd ../documentS
$ pwd
/home/derp/documentS


All of these commands resolve to the same directory. Is this behavior, specifically the output from pwd, just a function of bash showing me what it thinks I want to see?



Another example:



$ ls ~/Documents
Derp.txt another.txt whatThe.WORLD


The filesystem here reports the case of the original filename as created by the user or program.



At what point in the filesystem stack is the human-readable filename preserved as it was created (e.g., with its upper and lower case) so that it can be accessed by any combination of upper- and lower-case ASCII characters? Is this just a regex trick somewhere, or is there something else going on?



EDIT:
After some more research, it looks like the behavior I am curious about is found in case-preserving, case-insensitive filesystems.










filesystems filenames case-sensitivity














asked Apr 22 '15 at 20:18 by datUser
edited 35 mins ago by Rui F Ribeiro













  • Not writing this as an answer because I don't know for sure any more, but I believe that you cannot have both ~/Documents and ~/documents in that file system. When you cd ~/Documents or cd ~/documents you're going to the same place, and your shell is "playing nice" by remembering what you typed. The other side is that some filesystems store the name the way it was created in an auxiliary chunk of data, for example storing ~/Documents in a lookup table but writing to the FS as ~/documents. That basically creates an illusion that the file system cares about casing when it doesn't.

    – coteyr
    Apr 22 '15 at 20:47











  • From what I've observed, in the event that a directory contains two file names which are identical except for case, non-case-sensitive file systems may respond to a request for a given file by arbitrarily selecting one. Such situations can arise if the rules for upper/lowercase conversion change after a file gets created.

    – supercat
    Apr 22 '15 at 22:26











  • Cool information about NTFS's case preserving nature: superuser.com/questions/364057/why-is-ntfs-case-sensitive

    – Canadian Luke
    Apr 22 '15 at 23:18





























1 Answer














A case-insensitive filesystem just means that whenever the filesystem has to ask "does A refer to the same file/directory as B?" it compares the names of files/directories ignoring differences in upper/lowercase. Exactly which upper/lowercase differences count depends on the filesystem; it's non-obvious once you get beyond ASCII. A case-sensitive filesystem does not ignore those differences.
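To make the comparison rule concrete, here is a minimal sketch in plain sh. The helper names fold_case and same_name are made up for illustration, and the folding is ASCII-only; a real filesystem uses its own (often Unicode-aware) folding table, so treat this as a model of the idea rather than of any particular implementation.

```shell
# Hypothetical helper: lower-case ASCII letters only. Real filesystems use
# their own folding tables instead of tr.
fold_case() {
    printf '%s' "$1" | LC_ALL=C tr '[:upper:]' '[:lower:]'
}

# Succeeds (exit status 0) when the two names are equal after folding.
same_name() {
    [ "$(fold_case "$1")" = "$(fold_case "$2")" ]
}

for candidate in documents DOCUMENTS docuMents Documents2; do
    if same_name "Documents" "$candidate"; then
        echo "$candidate: same entry as Documents"
    else
        echo "$candidate: different entry"
    fi
done
```

The first three candidates are reported as the same entry and Documents2 as a different one, which is the "does A refer to the same thing as B?" check described above.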



A case-preserving filesystem stores file names as given. A non-case-preserving filesystem does not; it'll typically convert all letters to uppercase before storing them (theoretically, it could use lowercase, or RaNsOm NoTe case, or whatever, but AFAIK all real-world ones used uppercase).
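As a rough illustration of the difference, the two hypothetical functions below show what each kind of filesystem would end up recording for the same requested name. Both function names are invented for this sketch, and the upper-casing is ASCII-only.

```shell
# store_preserving NAME: record the name exactly as it was given.
store_preserving() {
    printf '%s\n' "$1"
}

# store_uppercasing NAME: discard the original case before recording,
# as a non-case-preserving filesystem would.
store_uppercasing() {
    printf '%s\n' "$1" | LC_ALL=C tr '[:lower:]' '[:upper:]'
}

store_preserving  "whatThe.WORLD"   # prints whatThe.WORLD
store_uppercasing "whatThe.WORLD"   # prints WHATTHE.WORLD
```

Once the second form has been stored, the original spelling is gone, so a later directory listing can only show the upper-case name.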



You can put those two attributes together in any combination. I'm not sure if you can find non-case-preserving case-sensitive filesystems, but you could certainly create one. All the other combinations exist or existed in real systems, though.



So a case-preserving, case-insensitive filesystem (the most common type of case-insensitive filesystem nowadays) will store and return file names in whatever capitalization you created them or last renamed them, but when comparing two file names (to check if one exists, to open one, to delete one, etc.) it'll ignore case differences.
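Putting the two properties together, a toy in-memory "directory" might look like the sketch below. It assumes a simple linear scan over stored names and ASCII-only folding; entries, fold_case, lookup and create are all names invented for the example, not part of any real filesystem or tool.

```shell
# One stored name per line, kept in the case given at creation time.
entries=""

fold_case() {
    printf '%s' "$1" | LC_ALL=C tr '[:upper:]' '[:lower:]'
}

# lookup NAME: print the stored spelling of the entry matching NAME
# ignoring case; fail if no entry matches.
lookup() {
    wanted=$(fold_case "$1")
    while IFS= read -r stored; do
        [ -n "$stored" ] || continue
        if [ "$(fold_case "$stored")" = "$wanted" ]; then
            printf '%s\n' "$stored"
            return 0
        fi
    done <<EOF
$entries
EOF
    return 1
}

# create NAME: add NAME exactly as typed, unless an entry that differs
# only in case already exists.
create() {
    if lookup "$1" >/dev/null; then
        echo "create: '$1' already exists" >&2
        return 1
    fi
    entries="${entries}${1}
"
}

create "Derp.txt"
create "whatThe.WORLD"
create "DERP.TXT" || echo "refused duplicate"
lookup "derp.txt"        # prints Derp.txt
lookup "WHATTHE.world"   # prints whatThe.WORLD
```

The stored spelling is what comes back from a lookup (and what a listing would show), while the comparison itself never looks at case. So there is no regex trick involved, only a case-folded comparison at lookup time.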



When you use a case-insensitive filesystem on a Unix box, various utilities will do weird things because Unix traditionally uses case-sensitive filesystems—so they're not expecting Document1 and document1 to be the same file.



In the pwd case, what you're seeing is that by default it just prints the path you actually used to get to the directory. If you got there via cd DirName, the output uses DirName; if you got there via cd DiRnAmE, you'll see DiRnAmE. Bash does this by keeping track of how you reached your current directory in the $PWD variable. This exists mainly for symlinks (if you cd into a symlink, you'll see the symlink in your pwd output, even though it isn't actually part of the path to your current directory), but it also produces the somewhat weird behavior you observe on case-insensitive filesystems. I suspect that pwd -P will give you the directory name using the case stored on disk, but I haven't tested it.
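The symlink half of this can be tried on any filesystem. The sketch below uses a throwaway directory from mktemp (the names RealDir and alias are arbitrary) and compares the logical path, pwd -L, with the physical one, pwd -P. Whether -P also reports the on-disk case on a case-insensitive filesystem is the untested suspicion from the paragraph above; this example does not settle it.

```shell
# Work in a throwaway directory so nothing outside it is touched.
scratch=$(mktemp -d)
# Resolve any symlinks in the scratch path itself before building on it.
scratch=$(cd "$scratch" && pwd -P)

mkdir "$scratch/RealDir"
ln -s RealDir "$scratch/alias"

cd "$scratch/alias"
logical=$(pwd -L)    # the path as typed, taken from $PWD
physical=$(pwd -P)   # the path with symlinks resolved

echo "logical:  $logical"
echo "physical: $physical"

# Clean up.
cd /
rm -rf "$scratch"
```

The logical path ends in alias and the physical one in RealDir: the shell remembers the spelling used to get there, and only -P goes back to the filesystem for the real names.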
















answered Apr 22 '15 at 20:50 by derobert













  • I might have known you beat me to this one! (upvoted)

    – Fabby
    Apr 22 '15 at 20:56


















