Line 1: |
Line 1: |
− | OK, let's imagine we have an information about a file we want to open.
| + | Let's imagine we have some information about a file we want to open. |
− | What should it contain? Apparently access mode and path | + | What should it contain? Apparently, access mode and path: |
| | | |
− | <pre> | + | <source lang="C"> |
| struct file { | | struct file { |
| char *path; | | char *path; |
| unsigned mode; | | unsigned mode; |
| } *f; | | } *f; |
− | </pre> | + | </source> |
| | | |
| and we'd like a process to open that path. We would | | and we'd like a process to open that path. We would |
| do it like below: | | do it like below: |
| | | |
− | <pre> | + | <source lang="C"> |
| int fd; | | int fd; |
| | | |
| fd = open(f->path, f->mode); | | fd = open(f->path, f->mode); |
− | </pre> | + | </source> |
| | | |
− | right? Right, but it's not all. We all know, that not only regular
| + | Right? Right, but that's not all of it. We all know that not only regular |
− | files might be opened via paths, but also such things as FIFO-s. And | + | files might be opened via paths, but also such things as FIFOs. And |
− | plain open with the flags we want it to have may just hang. So we need | + | plain <code>open()</code> with the flags we want it to have may just hang. So we need |
| to change that code to look like this: | | to change that code to look like this: |
| | | |
− | <pre> | + | <source lang="C"> |
| int fd, tfd = -1; | | int fd, tfd = -1; |
| | | |
Line 33: |
Line 33: |
| if (tfd >= 0) | | if (tfd >= 0) |
| close(tfd); | | close(tfd); |
− | </pre> | + | </source> |
| | | |
| The tfd keeps the FIFO opened read-write while we open it with any flags | | The tfd keeps the FIFO opened read-write while we open it with any flags |
| we want. Then we close it. | | we want. Then we close it. |
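
Since the hunk above elides the middle of that snippet, here is a minimal sketch of the idea, not the article's actual code. The helper name <code>open_maybe_fifo</code> and the <code>stat()</code>-based FIFO check are assumptions made for illustration; the only behaviour relied upon is that on Linux opening a FIFO with <code>O_RDWR</code> does not block.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a path that may be a FIFO without hanging.
 * A temporary O_RDWR descriptor guarantees that both a reader and a
 * writer exist while the real open() with the wanted flags runs.
 */
static int open_maybe_fifo(const char *path, int flags)
{
	struct stat st;
	int fd, tfd = -1;

	assert(path != NULL);

	if (stat(path, &st) == 0 && S_ISFIFO(st.st_mode)) {
		/* On Linux, opening a FIFO with O_RDWR does not block. */
		tfd = open(path, O_RDWR);
		if (tfd < 0)
			return -1;
	}

	/* With tfd held, O_RDONLY or O_WRONLY on the FIFO returns at once. */
	fd = open(path, flags);

	if (tfd >= 0)
		close(tfd);

	return fd;
}
```

Without the temporary descriptor, <code>open(path, O_WRONLY)</code> on a FIFO with no reader would wait until one appears.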
| | | |
− | Now this seems to be OK, but it's actually not. In Linux file can be | + | Now this seems to be OK, but it's actually not. In Linux, a file can be |
| unlinked while being opened (these [[invisible files]] are treated carefully | | unlinked while being opened (these [[invisible files]] are treated carefully |
| on dump). In that case what was formerly pointed to by | | on dump). In that case what was formerly pointed to by |
− | path may be kept in some temporary location. And we have to create a | + | path may be kept in some temporary location. We have to create a |
− | temp name for it and unlink one afterwards. So we need to extend the
| + | temporary name for it, and unlink it afterwards. So, we need to extend the |
− | info about file | + | info about a file: |
| | | |
− | <pre> | + | <source lang="C"> |
| struct file { | | struct file { |
| char *path; | | char *path; |
Line 51: |
Line 51: |
| char *temp_path; | | char *temp_path; |
| } *f; | | } *f; |
− | </pre> | + | </source> |
| | | |
− | and the opening code to take care of that temorary location | + | and the opening code to take care of that temporary location |
| | | |
− | <pre> | + | <source lang="C"> |
| int fd, tfd = -1; | | int fd, tfd = -1; |
| | | |
Line 71: |
Line 71: |
| if (f->temp_path) | | if (f->temp_path) |
| unlink(f->path); | | unlink(f->path); |
− | </pre> | + | </source> |
| | | |
| And we haven't seen all the code we need to manage what is pointed by | | And we haven't seen all the code we need to manage what is pointed by |
− | the temp_path, but let's proceed. | + | the <code>temp_path</code>, but let's proceed. |
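
As a rough illustration of the temporary-name step described above (the real code is elided in this diff), the sketch below assumes the removed file's contents were saved under <code>temp_path</code> on the same filesystem. The helper name <code>open_via_temp_name</code> is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: give the saved file its original name back
 * for a moment, open it, then unlink the name again so the file
 * is once more opened-but-removed.
 */
static int open_via_temp_name(const char *path, const char *temp_path, int flags)
{
	int fd;

	assert(path != NULL && temp_path != NULL);

	/* Hard-link the temporary copy under the original name. */
	if (link(temp_path, path) < 0)
		return -1;

	fd = open(path, flags);

	/* Drop the original name whether or not open() worked. */
	unlink(path);

	return fd;
}
```

The temporary name itself is left in place here; deciding when it can go is a separate question.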
| | | |
− | We have forgotten, that opened and unl^w removed can also be a | + | We have forgotten, that opened and <s>unlinked</s> removed can also be a |
− | directory. On directories link and unlink do not work and we have to | + | directory. For directories, link and unlink do not work, and we have to |
− | slightly fix the code to at least try to make things work OK:
| + | append the code to at least try to make things work OK: |
| | | |
− | <pre> | + | <source lang="C"> |
| int fd, tfd = -1; | | int fd, tfd = -1; |
| | | |
Line 104: |
Line 104: |
| unlink(f->path); | | unlink(f->path); |
| } | | } |
− | </pre> | + | </source> |
| | | |
− | Done. But, we also should take care of hard links. If a file has such | + | Done. Oh wait, we also should take care of hard links! If a file has any, |
− | and both were opened and removed, we cannot after opening just go | + | and both were opened and removed, we cannot just go |
− | ahead and kill the temp_path -- it can be waiting for some other | + | ahead and kill the <code>temp_path</code> after opening, as |
− | struct file to open one. A little bit more information should be added | + | it can be waiting for some other |
− | to the struct file | + | <code>struct file</code> to open one. A little bit more information should be added |
| + | to the <code>struct file</code>. |
| | | |
− | <pre> | + | <source lang="C"> |
| struct temp_file { | | struct temp_file { |
| char *path; | | char *path; |
Line 123: |
Line 124: |
| struct temp_file *temp; | | struct temp_file *temp; |
| } *f; | | } *f; |
− | </pre> | + | </source> |
| | | |
| and the code that opens one now looks like this: | | and the code that opens one now looks like this: |
| | | |
− | <pre> | + | <source lang="C"> |
| int fd, tfd = -1; | | int fd, tfd = -1; |
| | | |
Line 153: |
Line 154: |
| } | | } |
| } | | } |
− | </pre> | + | </source> |
| | | |
| By the way, we've left behind the scenes all the code required to make | | By the way, we've left behind the scenes all the code required to make |
− | the temp_file data be shared between processes that need one and to | + | the <code>temp_file</code> data be shared between processes that need one and to |
− | make the decrement of f->temp->users be smp-safe. | + | make the decrementing of <code>f->temp->users</code> be SMP-safe. |
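
For the SMP-safe decrement mentioned above, one possible approach (an assumption, not necessarily what the real code does) is a C11 atomic counter. The struct and function names below are made up for the example, and the sketch leaves out how the structure gets shared between processes, which would need something like a <code>MAP_SHARED</code> mapping.

```c
#include <assert.h>
#include <stdatomic.h>
#include <unistd.h>

/* Illustrative stand-in for the article's struct temp_file. */
struct temp_file_sketch {
	const char *path;
	atomic_int users;
};

/*
 * Drop one user. Returns 1 if the caller was the last user and the
 * temporary name was unlinked, 0 otherwise.
 */
static int temp_file_put(struct temp_file_sketch *t)
{
	assert(t != NULL);

	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&t->users, 1) == 1) {
		unlink(t->path);
		return 1;
	}

	return 0;
}
```

Because the read and the decrement happen as one atomic step, two racing callers cannot both believe they were the last user.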
| | | |
| Also note, that we don't handle the case when the file/directory is | | Also note, that we don't handle the case when the file/directory is |
| removed and some other file/directory is created under the same name. | | removed and some other file/directory is created under the same name. |
− | It's rare case. | + | It's a rare case. |
| | | |
| Now, is that all? No, sorry. A couple of things left. First, Linux has | | Now, is that all? No, sorry. A couple of things left. First, Linux has |
Line 168: |
Line 169: |
| info about what mount point the file belongs to like this: | | info about what mount point the file belongs to like this: |
| | | |
− | <pre> | + | <source lang="C"> |
| struct file { | | struct file { |
| char *path; | | char *path; |
Line 175: |
Line 176: |
| unsigned mnt_id; | | unsigned mnt_id; |
| } *f; | | } *f; |
− | </pre> | + | </source> |
| | | |
| and the code to open the file would now look like | | and the code to open the file would now look like |
| | | |
− | <pre> | + | <source lang="C"> |
| int fd, tfd = -1, ns_fd; | | int fd, tfd = -1, ns_fd; |
| char *rel_path = f->path + 1; | | char *rel_path = f->path + 1; |
Line 210: |
Line 211: |
| | | |
| close(ns_fd); | | close(ns_fd); |
− | </pre> | + | </source> |
| | | |
− | Let me not dive into the details of how the open_ns_root looks like. | + | Let's not dive into the details of what <code>open_ns_root</code> looks like. |
| Just know that it opens a file descriptor that refers to the root | | Just know that it opens a file descriptor that refers to the root |
− | of the mount namespace that contains a mount point with the id mnt_id | + | of the mount namespace that contains a mount point with the id <code>mnt_id</code> |
| (they cannot be shared, and that's great). | | (they cannot be shared, and that's great). |
| | | |
Line 220: |
Line 221: |
| First, opened files typically have a position. Flags we get need to be | | First, opened files typically have a position. Flags we get need to be |
| sanitized so as not to contain those that only make sense during open, | | sanitized so as not to contain those that only make sense during open, |
− | like O_TRUNC or O_CREAT. And file may have a thing called fown managed | + | like <code>O_TRUNC</code> or <code>O_CREAT</code>. And file may have a thing called <code>fown</code> managed |
− | by the F_SETSIG and F_SETOWN fcntls. All this results in | + | by the <code>F_SETSIG</code> and <code>F_SETOWN</code> fcntls. All this results in |
| | | |
− | <pre> | + | <source lang="C"> |
| struct file { | | struct file { |
| char *path; | | char *path; |
Line 232: |
Line 233: |
| struct fown fown; | | struct fown fown; |
| } *f; | | } *f; |
− | </pre> | + | </source> |
| | | |
| and | | and |
| | | |
− | <pre> | + | <source lang="C"> |
| int fd, tfd = -1, ns_fd, open_flags; | | int fd, tfd = -1, ns_fd, open_flags; |
| char *rel_path = f->path + 1; | | char *rel_path = f->path + 1; |
Line 272: |
Line 273: |
| fcntl(fd, F_SETOWN, f->fown.owner); | | fcntl(fd, F_SETOWN, f->fown.owner); |
| lseek(fd, f->pos, SEEK_SET); | | lseek(fd, f->pos, SEEK_SET); |
− | </pre> | + | </source> |
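
The flag sanitizing and position restore mentioned above could be sketched as follows. The set of stripped flags is an assumption: the text names only <code>O_TRUNC</code> and <code>O_CREAT</code>, and <code>O_EXCL</code> is added here because it is meaningless without <code>O_CREAT</code>. The helper names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Flags that only make sense at the moment a file is first opened. */
#define OPEN_ONLY_FLAGS (O_CREAT | O_TRUNC | O_EXCL)

static int sanitize_open_flags(int flags)
{
	return flags & ~OPEN_ONLY_FLAGS;
}

/* Hypothetical helper: reopen path and put the file position back. */
static int reopen_at(const char *path, int flags, off_t pos)
{
	int fd;

	assert(path != NULL);

	fd = open(path, sanitize_open_flags(flags));
	if (fd < 0)
		return -1;

	/* lseek() takes the offset first and the whence value last. */
	if (lseek(fd, pos, SEEK_SET) == (off_t)-1) {
		close(fd);
		return -1;
	}

	return fd;
}
```

Leaving <code>O_TRUNC</code> in place would silently wipe the file contents on reopen, which is why the flags are filtered before calling <code>open()</code>.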
| | | |
− | And don't ask for details of the f->fown thing. It's tricky, but just | + | And don't ask for details of the <code>f->fown</code> thing. It's tricky, but just |
− | follows the ABI and thus boring. | + | follows the ABI and therefore boring. |
| | | |
− | OK, we've finished with the top of the iceberg -- opening a file. Why | + | OK, we've finished with the top of the iceberg — opening a file. Why |
| top? Because, once opened, the file should be planted into a process' file | | top? Because, once opened, the file should be planted into a process' file |
− | descriptors table under desired number. You might thing, that it's as | + | descriptors table under the desired number. You might think that it should be |
− | simple as | + | as simple as: |
− | | + | <source lang="C"> |
− | <pre> | |
| dup2(fd, desired_fd); | | dup2(fd, desired_fd); |
− | </pre> | + | </source> |
| | | |
− | but it's not. Here's [[how to assign needed file descriptor to a file]] :) | + | but it's not. Here's [[how to assign needed file descriptor to a file]]. |
| | | |
| [[Category:Under the hood]] | | [[Category:Under the hood]] |
| [[Category:Files]] | | [[Category:Files]] |