* Add small paragraph to explaine the bitwise operation issue
* Mention that heap merging is also called coalesing
* Fix formula syntax in tar fs chapter
* Minor fix on the literals chapter
* fix typo in tar header table
* Fix typos in tar chapter
* Update tar header section
* Add introduction to coalescing term
* Minor typo fixes on tar chapter
* Fix typos in memory management and tar chapters
* Changes requested
04_Memory_Management/01_Overview.md
Lines changed: 2 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -2,9 +2,9 @@
2
2
3
3
Welcome to the first challenge of our osdev adventure! Memory management in a kernel is a big area, and it can easily get very complex. This chapter aims to break down the various layers you might use in your kernel, and explain how each of them is useful.
4
4
5
-
The design and complexity of a memory manger can vary greatly, a lot depends on what the operating system is designed, and its specific goals. For example if only want mono-tasking os, with paging disabled and no memory protection, it will probably be fairly simple to implement.
5
+
The design and complexity of a memory manager can vary greatly; a lot depends on how the operating system is designed and on its specific goals. For example, if we only want a mono-tasking OS with paging disabled and no memory protection, it will probably be fairly simple to implement.
6
6
7
-
In this part we will try to cover a more common use case that is probably what nearly all modern operating system uses, that is a 32/64 operating system with paging enabled, and various forms of memory allocators for the kernel and one for user space.
7
+
In this part we will try to cover a more common use case, one that nearly all modern operating systems share: a 32/64-bit operating system with paging enabled, various forms of memory allocators for the kernel, and one for user space.
8
8
9
9
In the appendices there is also an additional section on memory protection features available in some CPUs.
04_Memory_Management/05_Heap_Allocation.md
Lines changed: 5 additions & 3 deletions
Original file line number
Diff line number
Diff line change
@@ -87,7 +87,6 @@ So now we have the following situation:
87
87
Now the third `alloc()` call will work similarly to the others, and we can imagine the results.
88
88
89
89
What we have so far is already an allocation algorithm, that's easy to implement and very fast!
90
-
Its implementation is very simple:
91
90
92
91
```c
93
92
uint8_t *cur_heap_position = 0; //Just an example, in the real world you would use
@@ -308,11 +307,14 @@ struct {
308
307
309
308
That's it! That's what we need to clean up the code and replace the pointers in the latest version of the code with the new struct reference. Since it is just a matter of replacing a few variables, implementing this part is left to the reader.
310
309
311
-
### Part 5: Merging
310
+
### Part 5: Coalescing (Merging)
312
311
313
312
So now we have a basic memory allocator (woo hoo), and we are nearing the end of our memory journey.
314
313
315
314
In this part we'll see how to help mitigate the *fragmentation* problem. It is not a definitive solution, but this lets us reuse memory in a more efficient way. Before proceeding let's recap what we've done so far.
315
+
316
+
This technique is known as _coalescing_: it is simply an algorithm that merges contiguous smaller blocks of free memory into a bigger one.
317
+
316
318
We started from a simple pointer to the latest allocated location, and added information in order to keep track of what was previously allocated and how big it was, needed to reuse the freed memory.
317
319
318
320
We've basically created a list of memory regions that we can traverse to find the next/prev region.
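As a rough illustration of coalescing on such a list, here is a minimal sketch. The `Heap_Node` layout, its field names, and the two helper functions are assumptions made for this example, not the chapter's actual definitions. It also assumes the list is kept in address order, so that list neighbours are physically adjacent in the heap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical header stored in front of every heap region. */
typedef struct Heap_Node {
    size_t size;              /* usable bytes that follow this header */
    bool is_free;             /* true when the region can be reused */
    struct Heap_Node *next;   /* next region (higher address) */
    struct Heap_Node *prev;   /* previous region (lower address) */
} Heap_Node;

/* Absorb node->next into node when both are free. Returns true on a merge. */
static bool merge_with_next(Heap_Node *node)
{
    if (node == NULL || node->next == NULL) {
        return false;
    }
    Heap_Node *next = node->next;
    if (!node->is_free || !next->is_free) {
        return false;
    }
    /* The absorbed header becomes usable space as well. */
    node->size += sizeof(Heap_Node) + next->size;
    node->next = next->next;
    if (next->next != NULL) {
        next->next->prev = node;
    }
    return true;
}

/* Mark a region as free, then try to merge it with both neighbours.
 * Returns the node that now describes the (possibly larger) free region. */
static Heap_Node *free_and_coalesce(Heap_Node *node)
{
    assert(node != NULL);
    node->is_free = true;
    merge_with_next(node);            /* merge to the right, if possible */
    Heap_Node *prev = node->prev;
    if (prev != NULL && merge_with_next(prev)) {
        return prev;                  /* node was absorbed by its left neighbour */
    }
    return node;
}
```

Calling `free_and_coalesce()` on a used region that sits between two free regions leaves a single free node whose size is the three payloads plus the two absorbed headers.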
@@ -335,7 +337,7 @@ What the heap will look like after the code above?
335
337
| 6 | F | X | .. | X | 6 | F | X | .. | X | 6 | F | .. | X | | |
336
338
337
339
338
-
Now, all of the memory in the heap is available to allocate (except for the overhead used to store the status of each chunk), and everything looks perfectly fine. But now the code keeps executing, and it will arrive at the following instruction:
340
+
Now, all of the memory in the heap is available to allocate (except for the overhead used to store the status of each chunk), and everything looks perfectly fine. But the code keeps executing, and it will arrive at the following instruction:
08_VirtualFileSystem/03_TarFileSystem.md
Lines changed: 15 additions & 14 deletions
Original file line number
Diff line number
Diff line change
@@ -33,14 +33,15 @@ As anticipated above, the header structure is a fixed size struct of 512 bytes.
33
33
| 148 | 8 | Checksum for header record |
34
34
| 156 | 1 | Type flag |
35
35
| 157 | 100 | Name of linked file |
36
-
|57| 6 | UStar indicator, "ustar", then NULL |
36
+
|257| 6 | UStar indicator, "ustar", then NULL |
37
37
| 263 | 2 | UStar version, "00" (it is a string) |
38
38
| 265 | 32 | Owner user name |
39
39
| 297 | 32 | Owner group name |
40
40
| 329 | 8 | Device major number |
41
41
| 337 | 8 | Device minor number |
42
42
| 345 | 155 | Filename prefix |
43
43
44
+
The sizes above add up to only 500 bytes rather than 512, so the remaining space is filled with zeros.
44
45
To ensure portability, all the information in the header is encoded in `ASCII`, so we can use the `char` type to store the information in those fields. Every record has a `type` flag that says what kind of resource it represents. The possible values depend on the type of tar we are supporting; for the `ustar` format the possible values are:
45
46
46
47
| Value | Meaning |
@@ -57,9 +58,9 @@ The _name of linked file_ field refers to symbolic links in the unix world, when
57
58
58
59
The UStar indicator (containing the string `ustar` followed by NULL) and the version field are used to identify the format being used; the version field value is "00".
59
60
60
-
The `filename prefix` field, present only in the `ustar`, this format allows for longer file names, but it is splitted into two parts the `file name` field ( 100 bytes) and the `filename prefix` field (155 bytes)
61
+
The `filename prefix` field is present only in the `ustar` format. This format allows for longer file names, but the name is split into two parts: the `file name` field (100 bytes) and the `filename prefix` field (155 bytes).
61
62
62
-
The other fields are either self-explanatory (like uid/gid) or can be left as 0 (TO BE CHECKED) the only one that needs more explanation is the `file size` field because it is expressed as an octal number encoded in ASCII. This means we need to convert an ascii octal into a decimal integer. Just to remind, an `octal` number is a number represetend in base 8, we can use digits from 0 to 7 to represent it, similar to how binary (base 2) only have 0 and 1, and hexadecimal (base 16) has 0 to F. So for example:
63
+
The other fields are either self-explanatory (like uid/gid) or can be left as 0. The only one that needs more explanation is the `file size` field, because it is expressed as an octal number encoded in ASCII. This means we need to convert an ASCII octal string into an integer, skipping the last (12th) byte, which is historically left as `NULL` (0). As a reminder, an `octal` number is a number represented in base 8: it uses the digits 0 to 7, similar to how binary (base 2) only has 0 and 1, and hexadecimal (base 16) has 0 to F. So for example:
63
64
64
65
```
65
66
octal 12 = hex A = bin 1010
@@ -69,7 +70,7 @@ In C an octal number is represented adding a `0` in front of the number, so for
69
70
70
71
But that's not all: the number is also stored as `ascii` characters, so to get the decimal number we need to:
71
72
72
-
1. Convert each ascii digit into decimal, this should be pretty easy to do, since in the ascii table the digits are placed in ascending order starting from 0x30 ( `´0'` ), to get the digit we need just to subtract the `ascii` code for the 0 to the char supplied
73
+
1. Convert each ascii digit into decimal. This is pretty easy, since in the ascii table the digits are placed in ascending order starting from 0x30 (`'0'`): to get the digit we just need to subtract the `ascii` code of `'0'` from the char supplied.
73
74
2. To obtain the decimal number from an octal one, multiply each digit by `8^i`, where `i` is the digit position (the rightmost digit is position 0), and sum the results. For example 37 in octal is:
74
75
75
76
```c
@@ -97,9 +98,9 @@ The picture below show how data is stored into a tar archive.
97
98
98
99
To move from the first header to the next we simply need to use the following formula:
The lookup function then will be in the form of a loop. The first thing we'll need to know is when we've reached the end of the archive. As mentioned above, if there are two or more zero-filled records, it indicated the end. So while searching, we need to make sure that we keep track of the number of zeroed records. The main lookup loop should be similar to the following pseudo-code:
103
+
The lookup function then will be in the form of a loop. The first thing we'll need to know is when we've reached the end of the archive. As mentioned above, if there are two or more zero-filled records, it indicates the end. So while searching, we need to make sure that we keep track of the number of zeroed records. The main lookup loop should be similar to the following pseudo-code:
103
104
104
105
```c
105
106
int zero_counter = 0;
@@ -189,18 +190,18 @@ In our scenario there is no really need to close a file from a fs driver point o
189
190
190
191
## And Now from A VFS Point Of View
191
192
192
-
Now that we have a basic implementation of the tar file system we need to make it accessible to the VFS layer. To do we need to do two things: load the filesystem into memory and populate at least one mountpoint_t item. Since technically there are no fs loaded yet we can add it as the first item in our list/array. We have seent the `mountpoint_t` type already in the previous chapter, but let's review what are the fields available in this data structure:
193
+
Now that we have a basic implementation of the tar file system we need to make it accessible to the VFS layer. To do it we need to do two things: load the filesystem into memory and populate at least one `mountpoint_t` item. Since technically there is no fs loaded yet, we can add it as the first item in our list/array. We have already seen the `mountpoint_t` type in the previous chapter, but let's review the fields available in this data structure:
193
194
194
195
* The file system name (it can be whatever we want).
195
196
* The mountpoint (the folder where we want to mount the filesystem). In our case, since we have no mountpoints loaded, a good idea is to mount it at "/".
196
-
* The file_operations field, that will contain the pointer to the fs functions to open/read/close/write files, in this field we are going to place the fs driver function we just created..
197
+
* The `file_operations` field, which contains the pointers to the fs functions to open/read/close/write files. In this field we are going to place the fs driver functions we just created.
197
198
198
-
The file_operation field will be loaded as follows (this is according to our current implementation):
199
+
The `file_operations` field will be loaded as follows (this is according to our current implementation):
199
200
200
-
* The open function will be the ustar_open function.
201
-
* The read function will be the ustar_read function.
202
-
* We don't need a close function since we can handle it directly in the VFS, so we will set it to NULL.
203
-
* As well as we don't need a write function since our fs will be read only, so it can be set to NULL.
201
+
* The `open` function will be the `ustar_open` function.
202
+
* The `read` function will be the `ustar_read` function.
203
+
* We don't need a `close` function since we can handle it directly in the VFS, so we will set it to NULL.
204
+
* Likewise, we don't need a `write` function since our fs will be read only, so it can be set to NULL.
204
205
205
206
Loading the fs into memory, instead, depends on the booting method we have chosen. Since every boot manager/loader has its own approach, this is left to the documentation of the boot manager in use.
206
207
@@ -258,7 +259,7 @@ struct tar_list_item {
258
259
259
260
And using the new datatype initialize the list accordingly.
260
261
261
-
Now when the file system is accessed for the first time we can initialize this list, and use it to search for the files, saving a lot of time and resources, and it can makes things easier to for the lookup and read function.
262
+
Now when the file system is accessed for the first time we can initialize this list, and use it to search for the files, saving a lot of time and resources, and it can make things easier for the lookup and read functions.
262
263
263
264
Another limitation of our driver is that it expects the tar to be fully loaded into memory, while we know that the file system will probably be stored on an external device, so a good idea is to make the driver aware of all possible scenarios.
99_Appendices/C_Language_Info.md
Lines changed: 25 additions & 0 deletions
Original file line number
Diff line number
Diff line change
@@ -96,6 +96,31 @@ It is worth mentioning that inline assembly syntax is the At&t syntax, so the us
96
96
asm("movq $5, %rcx;");
97
97
```
98
98
99
+
## Dealing With Literals and Bitwise Operation
100
+
101
+
There are some subtle bugs that can be encountered when using immediate values in C, due to operator precedence and integer promotion rules.
102
+
103
+
Let's imagine we have a 64 bit variable and we need to do a bitwise operation like setting the bit at position `x`. This is easily achieved using the _left shift_ (`<<`) operator combined with an _or_ (`|=`), like in the following example:
104
+
105
+
```
106
+
example_var |= (1 << x); // example_var is a uint64_t declared elsewhere
107
+
```
108
+
109
+
We run a few tests, for `x=1, 2, 10, 20, 30`, and everything works fine, so what is the issue? With a 32-bit `int`, the issue appears when the shift is 31 or above, because of the type C gives to the literal and the _integer promotion rules_.
110
+
111
+
In the above example, `1` is a literal, and by default C gives it the type `int`. The shift is executed using the (promoted) type of the left operand, so we are trying to shift a value of a smaller type by a number of positions that its type cannot represent, which is undefined behavior.
112
+
113
+
Then what are the solutions? Below are a few examples of how to potentially fix it:
114
+
115
+
```c
116
+
#define ONE 1ULL
117
+
const uint64_t one = 1;
118
+
119
+
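example_one |= one << 42;    // example_one is a uint64_t declared elsewhere
120
+
example_two |= ONE << 42;    // example_two is a uint64_t declared elsewhere
121
+
example_three |= 1ULL << 42; // example_three is a uint64_t declared elsewhere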
122
+
```
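To make the fix easy to check, the shift can be wrapped in a small helper. This is only a sketch: `set_bit64` is a hypothetical name chosen for this example, not a function from the book. It relies on `1ULL` being at least 64 bits wide, so any position from 0 to 63 is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns `value` with the bit at `pos` set.
 * Using 1ULL makes the left operand of the shift (at least) 64 bits wide,
 * so positions up to 63 do not overflow the type being shifted. */
static uint64_t set_bit64(uint64_t value, unsigned int pos)
{
    assert(pos < 64); /* shifting by 64 or more would still be undefined */
    return value | (1ULL << pos);
}
```

For instance, `set_bit64(0, 42)` yields `0x40000000000`, a value that cannot be produced by shifting a plain 32-bit `int`.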
123
+
99
124
## C +(+) assembly together - Calling Conventions
100
125
101
126
Different C compilers feature a number of [calling conventions](https://en.wikipedia.org/wiki/X86_calling_conventions),