How do case-insensitive filesystems display both upper and lower case file names?
This question occurred to me the other day when I was working on a development project that relies on an opinionated framework with regard to file names. The framework (irrelevant here) wanted to see upper-case-first filenames. This got me thinking.
On a case-insensitive file system, say exFAT or HFS+ (specifically the non-case-sensitive variant), how does the file system provide access to the same file with both upper- and lowercase versions of the filename?
For example:
$ cd ~/Documents
$ pwd
/home/derp/Documents
$ cd ../documents
$ pwd
/home/derp/documents
$ cd ../docuMents
$ pwd
/home/derp/docuMents
$ cd ../DOCUMENTS
$ pwd
/home/derp/DOCUMENTS
$ cd ../documentS
$ pwd
/home/derp/documentS
All of these commands will resolve to the same directory. Is this behavior, specifically the output from pwd, just a function of bash showing me what it thinks I want to see?
Another example:
$ ls ~/Documents
Derp.txt another.txt whatThe.WORLD
The filesystem here reports the case of the original filename as created by the user or program.
At what point in the filesystem stack is the human-readable filename preserved as it was created (e.g. upper and lower case) so that it can be accessed by any combination of the correct upper- and lowercase ASCII characters? Is this just a regex trick somewhere or is there something else going on?
EDIT:
After some more research, it looks like the behavior I am curious about is found in case-preserving, case-insensitive filesystems.
filesystems filenames case-sensitivity
Not writing this as an answer because I don't know for sure any more, but I believe that you cannot have both ~/Documents and ~/documents on that file system. When you cd to ~/Documents or ~/documents you're going to the same place, and your shell is "playing nice" by remembering what you typed. The other side is that some filesystems store the name as it was created in an auxiliary chunk of data, for example storing ~/Documents in a lookup table but writing it to the FS as ~/documents. That basically creates an illusion that the file system cares about casing when it doesn't.
– coteyr
Apr 22 '15 at 20:47
From what I've observed, in the event that a directory contains two file names which are identical except for case, non-case-sensitive file systems may respond to a request for a given file by arbitrarily selecting one. Such situations can arise if the rules for upper/lowercase conversion change after a file gets created.
– supercat
Apr 22 '15 at 22:26
Cool information about NTFS's case preserving nature: superuser.com/questions/364057/why-is-ntfs-case-sensitive
– Canadian Luke
Apr 22 '15 at 23:18
edited 35 mins ago by Rui F Ribeiro
asked Apr 22 '15 at 20:18 by datUser
1 Answer
A case-insensitive filesystem just means that whenever the filesystem has to ask "does A refer to the same file/directory as B?" it compares the names of files/directories ignoring differences in upper/lowercase (exactly what upper/lowercase differences count depends on the filesystem—it's non-obvious once you get beyond ASCII). A case-sensitive filesystem does not ignore those differences.
A case-preserving filesystem stores file names as given. A non-case-preserving filesystem does not; it'll typically convert all letters to uppercase before storing them (theoretically, it could use lowercase, or RaNsOm NoTe case, or whatever, but AFAIK all real-world ones used uppercase).
You can put those two attributes together in any combination. I'm not sure if you can find non-case-preserving case-sensitive filesystems, but you could certainly create one. All the other combinations exist or existed in real systems, though.
So a case-preserving, case-insensitive filesystem (the most common type of case-insensitive filesystem nowadays) will store and return file names in whatever capitalization you created them or last renamed them, but when comparing two file names (to check if one exists, to open one, to delete one, etc.) it'll ignore case differences.
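To make that concrete, here is a minimal sketch in bash (4 or later, for associative arrays) of how a single case-preserving, case-insensitive directory could behave. It is a toy model, not how any real filesystem is implemented: the names `dir_entries`, `fold_name`, `create_entry` and `lookup_entry` are made up for illustration, and the folding is ASCII-only, whereas real filesystems use their own folding rules.

```shell
#!/usr/bin/env bash
# Toy model of one case-preserving, case-insensitive directory (bash 4+).
# Key:   case-folded name, used only for comparisons.
# Value: the name exactly as it was given at creation time.
declare -A dir_entries

# Hypothetical helper: ASCII-only case folding. Real filesystems use their
# own (often Unicode-aware) folding tables.
fold_name() {
    printf '%s' "$1" | LC_ALL=C tr '[:upper:]' '[:lower:]'
}

# Create an entry; fail if a name differing only in case already exists.
create_entry() {
    local key
    key=$(fold_name "$1")
    if [[ -n ${dir_entries[$key]+x} ]]; then
        return 1
    fi
    dir_entries[$key]=$1
}

# Look up a name ignoring case; print the stored (preserved) spelling.
lookup_entry() {
    local key
    key=$(fold_name "$1")
    [[ -n ${dir_entries[$key]+x} ]] || return 1
    printf '%s\n' "${dir_entries[$key]}"
}

create_entry "Derp.txt"
create_entry "whatThe.WORLD"

lookup_entry "DERP.TXT"        # prints: Derp.txt
lookup_entry "whatthe.world"   # prints: whatThe.WORLD

# A second name that differs only in case collides with the first one.
create_entry "derp.txt" || echo "derp.txt already exists as $(lookup_entry derp.txt)"
```

The point of the sketch is that the comparison key and the displayed name are two separate things: lookups go through the folded key, while listings return the stored spelling.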
When you use a case-insensitive filesystem on a Unix box, various utilities will do weird things because Unix traditionally uses case-sensitive filesystems—so they're not expecting Document1 and document1 to be the same file.
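If you want to check which behavior a given location has, a rough probe along these lines should work. It is only a sketch: it assumes `mktemp -d` is available, it tests whatever filesystem the temporary directory happens to live on, and the variable names `probe_dir` and `fs_kind` are mine.

```shell
#!/usr/bin/env bash
# Rough probe: create a mixed-case name, then ask for it in lowercase.
probe_dir=$(mktemp -d) || exit 1

touch "$probe_dir/Document1"

if [ -e "$probe_dir/document1" ]; then
    fs_kind="case-insensitive"   # the lowercase spelling found the same file
else
    fs_kind="case-sensitive"     # the lowercase spelling is a different name
fi

echo "filesystem under $probe_dir looks $fs_kind"

# The listing shows the name as stored, which tells you whether the
# filesystem preserved the case it was given.
ls "$probe_dir"

rm -rf "$probe_dir"
```

To probe a specific mount instead, create the probe directory there rather than in the default temporary location.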
In the pwd case, what you're seeing is that it by default just outputs the path you actually used to get to the directory. So if you got there via cd DirName, it'll use DirName in the output. If you got there via DiRnAmE, you'll see DiRnAmE in the output. Bash does this by keeping track of how you got to your current directory in the $PWD environment variable. Mainly this is for symlinks (if you cd into a symlink, you'll see the symlink in your pwd, even though it's actually not part of the path to your current directory). But it also gives the somewhat weird behavior you observe on case-insensitive filesystems. I suspect that pwd -P will give you the directory name using the case stored on disk, but haven't tested.
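The symlink behavior described above can be reproduced on any filesystem with a short sketch like the following. It uses only the standard `-L` (logical) and `-P` (physical) options of `pwd`; the directory names `RealDir` and `link` and the variable names are made up for the demo. It does not show what `pwd -P` prints on a case-insensitive filesystem, which is still untested here.

```shell
#!/usr/bin/env bash
# Show the difference between the logical path (how you got there)
# and the physical path (what is actually on disk), using a symlink.
demo_dir=$(mktemp -d) || exit 1
mkdir "$demo_dir/RealDir"
ln -s RealDir "$demo_dir/link"

cd "$demo_dir/link" || exit 1

logical=$(pwd -L)    # taken from $PWD: ends in "link"
physical=$(pwd -P)   # symlinks resolved: ends in "RealDir"

echo "logical:  $logical"
echo "physical: $physical"

# Go back to where we started and clean up.
cd - >/dev/null || exit 1
rm -rf "$demo_dir"
```

Plain `pwd` in bash behaves like `pwd -L` by default, which is why the output follows whatever spelling was passed to `cd`.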
I might have known you beat me to this one! (upvoted)
– Fabby
Apr 22 '15 at 20:56
answered Apr 22 '15 at 20:50 by derobert