AHCI Post complete.

This commit is contained in:
Drew Galbraith 2024-01-09 23:35:13 -08:00
parent 1387b297ac
commit de351f1301
12 changed files with 720 additions and 1 deletions

View File

@ -83,4 +83,6 @@ figcaption {
font-style: italic;
}
.chroma {
padding: 10px;
}

86
assets/css/syntax.css Normal file
View File

@ -0,0 +1,86 @@
/* Background */ .bg { color: #d0d0d0; background-color: #202020; }
/* PreWrapper */ .chroma { color: #d0d0d0; background-color: #202020; }
/* Other */ .chroma .x { }
/* Error */ .chroma .err { color: #a61717; background-color: #e3d2d2 }
/* CodeLine */ .chroma .cl { }
/* LineLink */ .chroma .lnlinks { outline: none; text-decoration: none; color: inherit }
/* LineTableTD */ .chroma .lntd { vertical-align: top; padding: 0; margin: 0; border: 0; }
/* LineTable */ .chroma .lntable { border-spacing: 0; padding: 0; margin: 0; border: 0; }
/* LineHighlight */ .chroma .hl { background-color: #363636 }
/* LineNumbersTable */ .chroma .lnt { white-space: pre; -webkit-user-select: none; user-select: none; margin-right: 0.4em; padding: 0 0.4em 0 0.4em;color: #686868 }
/* LineNumbers */ .chroma .ln { white-space: pre; -webkit-user-select: none; user-select: none; margin-right: 0.4em; padding: 0 0.4em 0 0.4em;color: #686868 }
/* Line */ .chroma .line { display: flex; }
/* Keyword */ .chroma .k { color: #6ab825; font-weight: bold }
/* KeywordConstant */ .chroma .kc { color: #6ab825; font-weight: bold }
/* KeywordDeclaration */ .chroma .kd { color: #6ab825; font-weight: bold }
/* KeywordNamespace */ .chroma .kn { color: #6ab825; font-weight: bold }
/* KeywordPseudo */ .chroma .kp { color: #6ab825 }
/* KeywordReserved */ .chroma .kr { color: #6ab825; font-weight: bold }
/* KeywordType */ .chroma .kt { color: #6ab825; font-weight: bold }
/* Name */ .chroma .n { }
/* NameAttribute */ .chroma .na { color: #bbbbbb }
/* NameBuiltin */ .chroma .nb { color: #24909d }
/* NameBuiltinPseudo */ .chroma .bp { }
/* NameClass */ .chroma .nc { color: #447fcf; text-decoration: underline }
/* NameConstant */ .chroma .no { color: #40ffff }
/* NameDecorator */ .chroma .nd { color: #ffa500 }
/* NameEntity */ .chroma .ni { }
/* NameException */ .chroma .ne { color: #bbbbbb }
/* NameFunction */ .chroma .nf { color: #447fcf }
/* NameFunctionMagic */ .chroma .fm { }
/* NameLabel */ .chroma .nl { }
/* NameNamespace */ .chroma .nn { color: #447fcf; text-decoration: underline }
/* NameOther */ .chroma .nx { }
/* NameProperty */ .chroma .py { }
/* NameTag */ .chroma .nt { color: #6ab825; font-weight: bold }
/* NameVariable */ .chroma .nv { color: #40ffff }
/* NameVariableClass */ .chroma .vc { }
/* NameVariableGlobal */ .chroma .vg { }
/* NameVariableInstance */ .chroma .vi { }
/* NameVariableMagic */ .chroma .vm { }
/* Literal */ .chroma .l { }
/* LiteralDate */ .chroma .ld { }
/* LiteralString */ .chroma .s { color: #ed9d13 }
/* LiteralStringAffix */ .chroma .sa { color: #ed9d13 }
/* LiteralStringBacktick */ .chroma .sb { color: #ed9d13 }
/* LiteralStringChar */ .chroma .sc { color: #ed9d13 }
/* LiteralStringDelimiter */ .chroma .dl { color: #ed9d13 }
/* LiteralStringDoc */ .chroma .sd { color: #ed9d13 }
/* LiteralStringDouble */ .chroma .s2 { color: #ed9d13 }
/* LiteralStringEscape */ .chroma .se { color: #ed9d13 }
/* LiteralStringHeredoc */ .chroma .sh { color: #ed9d13 }
/* LiteralStringInterpol */ .chroma .si { color: #ed9d13 }
/* LiteralStringOther */ .chroma .sx { color: #ffa500 }
/* LiteralStringRegex */ .chroma .sr { color: #ed9d13 }
/* LiteralStringSingle */ .chroma .s1 { color: #ed9d13 }
/* LiteralStringSymbol */ .chroma .ss { color: #ed9d13 }
/* LiteralNumber */ .chroma .m { color: #3677a9 }
/* LiteralNumberBin */ .chroma .mb { color: #3677a9 }
/* LiteralNumberFloat */ .chroma .mf { color: #3677a9 }
/* LiteralNumberHex */ .chroma .mh { color: #3677a9 }
/* LiteralNumberInteger */ .chroma .mi { color: #3677a9 }
/* LiteralNumberIntegerLong */ .chroma .il { color: #3677a9 }
/* LiteralNumberOct */ .chroma .mo { color: #3677a9 }
/* Operator */ .chroma .o { }
/* OperatorWord */ .chroma .ow { color: #6ab825; font-weight: bold }
/* Punctuation */ .chroma .p { }
/* Comment */ .chroma .c { color: #999999; font-style: italic }
/* CommentHashbang */ .chroma .ch { color: #999999; font-style: italic }
/* CommentMultiline */ .chroma .cm { color: #999999; font-style: italic }
/* CommentSingle */ .chroma .c1 { color: #999999; font-style: italic }
/* CommentSpecial */ .chroma .cs { color: #e50808; background-color: #520000; font-weight: bold }
/* CommentPreproc */ .chroma .cp { color: #cd2828; font-weight: bold }
/* CommentPreprocFile */ .chroma .cpf { color: #cd2828; font-weight: bold }
/* Generic */ .chroma .g { }
/* GenericDeleted */ .chroma .gd { color: #d22323 }
/* GenericEmph */ .chroma .ge { font-style: italic }
/* GenericError */ .chroma .gr { color: #d22323 }
/* GenericHeading */ .chroma .gh { color: #ffffff; font-weight: bold }
/* GenericInserted */ .chroma .gi { color: #589819 }
/* GenericOutput */ .chroma .go { color: #cccccc }
/* GenericPrompt */ .chroma .gp { color: #aaaaaa }
/* GenericStrong */ .chroma .gs { font-weight: bold }
/* GenericSubheading */ .chroma .gu { color: #ffffff; text-decoration: underline }
/* GenericTraceback */ .chroma .gt { color: #d22323 }
/* GenericUnderline */ .chroma .gl { text-decoration: underline }
/* TextWhitespace */ .chroma .w { color: #666666 }

View File

@ -0,0 +1,617 @@
---
title: "Writing an AHCI Driver"
date: 2024-01-08
draft: true
tags: ['osdev']
---
Now that I've wrapped up the [0.1.0 Release](/blog/2023/12/acadia-0.1.0) of
AcadiaOS, I'm looking to clean up some of the "just get it working" hacks that
exist in the codebase. First up on that list is the AHCI Driver.
## What is AHCI
AHCI stands for Advanced Host Controller Interface and if you like acronyms boy
are you in for a treat. AHCI is a way to interface with SATA (which replaced
PATA (a.k.a. IDE)) via its HBA. AHCI has since been superseded by NVMe but is
simpler to implement (or so I've been told) so I've started here.
To try to explain it without acronym soup, AHCI allows you to access disk
drives and optical drives (SATA devices) by writing relevant ATA commands to
memory addresses that are backed by hardware firmware. There are a wide variety
of commands available but best I can tell the main ones used these days are to
identify the device and read/write via direct memory access (DMA).
Essentially you give the device an offset to read from as well as physical
memory address to write to. The device firmware copies the amount of data you
requested to the physical address then triggers an interrupt to indicate that
the operation is complete. Likewise writing via DMA is the same but in reverse.
Disclaimer, all of the above is basically just summarizing Wikipedia and the
OSDev wiki and I don't really know what I'm talking about.
## Current State
The current AHCI implementation in Denali is straightforward but very brittle.
It relies on everything following the happy path and is cobbled together more
based on trial and error of what worked rather than following the specification
closely.
As a part of this article we're going to dive into the related specs and look at
how they relate to each other. The trickiest part of writing the driver is the
fact that the necessary information is spread across several different specs
rather than contained in one place. The specs I reference in this post are:
- [AHCI 1.3.1](https://www.intel.com/content/www/us/en/io/serial-ata/serial-ata-ahci-spec-rev1-3-1.html)
- SATA 3.2
- ATA/ATAPI Command Set 3 (ACS-3)
The SATA and ACS specs cost money so I can't link them directly but it isn't
hard to find drafts of them available online.
## How AHCI Works
AHCI allows you to control SATA devices by writing commands to memory. The
layout of these structures is nicely shown in the AHCI Spec Figure 4:
![AHCI Memory](images/HBA_Memory_Annotated.png)
There are several pieces here that I've annotated:
1. The Generic Host Control (GHC) is a set of registers that allow you to manage the
whole controller and get its status. These registers are referred to in the
spec using GHC.RegisterName so the interrupt status register for instance is
"GHC.IS" for short.
2. Each device (hard disk or disc drive) that is attached to the controller is
exposed as a "port" with a set of registers to control it individually.
These registers are referred to as PxRegisterName so for instance the command
issue register is PxCI.
3. For each port it has a separately allocated piece of memory that can accept
up to 32 "commands" to execute.
4. When the controller is finished executing a command for a device it will
write a Frame Information Structure (FIS) and raise an interrupt. (The GHC
has a register to show which devices have a pending interrupt GHC.IS).
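
To make the register naming concrete, here is a minimal sketch of how the two register blocks could be described as C++ structs. The offsets follow section 3 of the AHCI 1.3.1 spec, but the struct and field names are invented for illustration and are not necessarily the ones Denali uses (for example, this sketch splits each 64 bit base address into its lower and upper halves).

```c++
#include <cstddef>
#include <cstdint>

// Per-port registers. Each port occupies 0x80 bytes of the HBA memory space.
struct HbaPortRegisters {
  uint32_t command_list_base;        // 0x00 PxCLB
  uint32_t command_list_base_upper;  // 0x04 PxCLBU
  uint32_t fis_base;                 // 0x08 PxFB
  uint32_t fis_base_upper;           // 0x0C PxFBU
  uint32_t interrupt_status;         // 0x10 PxIS
  uint32_t interrupt_enable;         // 0x14 PxIE
  uint32_t command;                  // 0x18 PxCMD
  uint32_t reserved0;                // 0x1C
  uint32_t task_file_data;           // 0x20 PxTFD
  uint32_t signature;                // 0x24 PxSIG
  uint32_t sata_status;              // 0x28 PxSSTS
  uint32_t sata_control;             // 0x2C PxSCTL
  uint32_t sata_error;               // 0x30 PxSERR
  uint32_t sata_active;              // 0x34 PxSACT
  uint32_t command_issue;            // 0x38 PxCI
  uint32_t sata_notification;        // 0x3C PxSNTF
  uint32_t fis_switch_control;       // 0x40 PxFBS
  uint32_t device_sleep;             // 0x44 PxDEVSLP
  uint8_t reserved1[0x70 - 0x48];    // 0x48-0x6F
  uint8_t vendor[0x80 - 0x70];       // 0x70-0x7F
};

// Generic Host Control registers followed by the port register blocks.
struct HbaRegisters {
  uint32_t capabilities;           // 0x00 CAP
  uint32_t global_host_control;    // 0x04 GHC
  uint32_t interrupt_status;       // 0x08 IS
  uint32_t ports_implemented;      // 0x0C PI
  uint32_t version;                // 0x10 VS
  uint32_t ccc_control;            // 0x14 CCC_CTL
  uint32_t ccc_ports;              // 0x18 CCC_PORTS
  uint32_t em_location;            // 0x1C EM_LOC
  uint32_t em_control;             // 0x20 EM_CTL
  uint32_t capabilities_extended;  // 0x24 CAP2
  uint32_t bios_handoff;           // 0x28 BOHC
  uint8_t reserved[0x100 - 0x2C];  // 0x2C-0xFF (reserved and vendor specific)
  HbaPortRegisters ports[32];      // 0x100 onwards, 0x80 bytes per port
};

static_assert(sizeof(HbaPortRegisters) == 0x80, "port block must be 0x80 bytes");
static_assert(offsetof(HbaRegisters, ports) == 0x100, "ports start at 0x100");
```

In a real driver the structure would be accessed through a `volatile` pointer to the mapped ABAR region so that the compiler does not cache or reorder register accesses.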
Each device can support up to 32 pending commands (but only receive one FIS at a
time). The memory structure is as follows:
![Port Memory](images/HBA_Port_Memory.png)
To issue a command we can:
1. Select a command slot that isn't currently in use.
2. Write the command to that command table (the specifics of this are explained
later) and then set that command as active in the commands issued register
for that port (PxCI).
3. Wait for an interrupt indicating that this command has finished and read the
resulting FIS. We mostly just care about the status byte. The controller
will unset the bit for the completed command in the PxCI register when it
raises the interrupt so we can disambiguate which command has finished.
Although this sounds straightforward there are a few moving parts here that we
need to get up and running.
- We need to find where the HBA structure is in memory.
- We need to allocate some space for the command tables and received FIS
structures.
- We need to set up interrupt handling for each port.
- Before this we will issue a hardware reset command to the controller to get
everything in a clean state.
## Finding the HBA structure
We find the HBA structure by iterating through the PCI configuration space and
finding the AHCI device. I'm not going to delve too deeply into this because PCI
could be a whole separate post and it is a little trickier to explain because
it is harder to access the specifications (the good people at the PCI "Special
Interest Group" are happy to let you download the PDF for the low low price of
$4500 if you are not a member).
The short story is that we are looking for the device with the right [class
code](https://wiki.osdev.org/PCI#Class_Codes) - Class Code 0x1 (Storage Device),
Subclass 0x6 (SATA Controller), Subtype 0x1 (AHCI).
Once we have the correct configuration space we can read the address at offset
0x24 (called the ABAR for AHCI Base Address) which points to the start of the
GHC registers.
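
As a rough sketch of those two checks: the class information lives in the dword at offset 0x08 of the configuration space (class, subclass, programming interface, revision from the top byte down), and the low four bits of a memory BAR are flag bits rather than address bits. The helper names below are hypothetical, and how the configuration space itself is read is left out.

```c++
#include <cstdint>

// `class_reg` is the dword at configuration space offset 0x08:
//   bits 31:24 class, 23:16 subclass, 15:8 programming interface, 7:0 revision.
constexpr bool IsAhciController(uint32_t class_reg) {
  // Class 0x01 (storage), subclass 0x06 (SATA), programming interface 0x01 (AHCI).
  return (class_reg >> 8) == 0x010601;
}

// `bar5` is the dword at configuration space offset 0x24 (the ABAR).
// The low four bits of a memory BAR describe its type, so mask them off.
constexpr uint32_t AbarAddress(uint32_t bar5) {
  return bar5 & ~uint32_t{0xF};
}
```

The revision byte is ignored on purpose, so any revision of an AHCI controller matches.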
We can mostly ignore the other information in the configuration space for now as
we aren't dealing with Message Signaled Interrupts yet.
## Hardware Reset
Now that we have found the Generic Host Control registers we're going to
initiate a hardware reset of the AHCI Controller. The advantage of this is we
will know the exact state that the controller and its ports are in. Other than
that it ensures that we aren't dependent on the specific way limine uses the
AHCI controller.
I suspect that this is **not** how most production operating systems handle
things but this should give us a clean slate for now.
From the AHCI spec:
> 10.4.3 HBA Reset
>
> If the HBA becomes unusable for multiple ports, and a software reset or port
> reset does not correct the problem, software may reset the entire HBA by
> setting GHC.HR to 1. When software sets the GHC.HR bit to 1, the HBA
> shall perform an internal reset action. The bit shall be cleared to 0 by
> the HBA when the reset is complete. A software write of 0 to GHC.HR shall
> have no effect. To perform the HBA reset, software sets GHC.HR to 1 and may
> poll until this bit is read to be 0, at which point software knows that the
> HBA reset has completed.
>
> If the HBA has not cleared GHC.HR to 0 within 1 second of software setting
> GHC.HR to 1, the HBA is in a hung or locked state.
>
> When GHC.HR is set to 1, GHC.AE, GHC.IE, the IS register, and all port
> register fields (except PxFB/PxFBU/PxCLB/PxCLBU) that are not HwInit in the
> HBAs register memory space are reset. The HBAs configuration space and all
> other global registers/bits are not affected by setting GHC.HR to 1. Any
> HwInit bits in the port specific registers are not affected by setting GHC.HR
> to 1. The port specific registers PxFB, PxFBU, PxCLB, and PxCLBU are not
> affected by setting GHC.HR to 1. If the HBA supports staggered spin-up, the
> PxCMD.SUD bit will be reset to 0; software is responsible for setting the
> PxCMD.SUD and PxSCTL.DET fields appropriately such that communication can be
> established on the Serial ATA link. If the HBA does not support staggered
> spin-up, the HBA reset shall cause a COMRESET to be sent on the port.
Despite the long text, this process is fairly straightforward. We set the
Hardware Reset bit and then poll for it to be set to 0. We then set the AHCI
enable bit. For now we can leave interrupts disabled until we have reset the
ports. Once this is done we sleep for a few milliseconds to allow the ports time
to spin up. For now we are just using 50ms because that is the smallest
resolution we support sleeping for (1 scheduling tick) but I think theoretically
we could sleep for only a millisecond or two.
```c++
ahci_hba_->global_host_control |= kGlobalHostControl_HW_Reset;
while (ahci_hba_->global_host_control & kGlobalHostControl_HW_Reset) {
  continue;
}
ahci_hba_->global_host_control |= kGlobalHostControl_AHCI_Enable;
return static_cast<glcr::ErrorCode>(ZThreadSleep(50));
```
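The quoted section also says that a controller which has not cleared GHC.HR within one second should be treated as hung. A minimal sketch of a bounded version of the polling loop follows. It is not the code Denali uses: `ResetHba` and the `sleep_ms` callback are hypothetical names, and the bit positions (HR is bit 0, AE is bit 31) come from the GHC register definition in the AHCI spec.

```c++
#include <cassert>
#include <cstdint>

constexpr uint32_t kGhcHbaReset = 1u << 0;     // GHC.HR
constexpr uint32_t kGhcAhciEnable = 1u << 31;  // GHC.AE

// Sets GHC.HR and waits for the controller to clear it again.
// `sleep_ms` is a hypothetical callback that blocks for the given number of
// milliseconds. Returns false if the bit is still set after `timeout_ms`.
template <typename SleepFn>
bool ResetHba(volatile uint32_t& ghc, SleepFn sleep_ms,
              uint32_t timeout_ms = 1000, uint32_t step_ms = 50) {
  assert(step_ms > 0);
  ghc = ghc | kGhcHbaReset;
  for (uint32_t waited = 0; (ghc & kGhcHbaReset) != 0; waited += step_ms) {
    if (waited >= timeout_ms) {
      return false;  // Section 10.4.3 calls this a hung or locked state.
    }
    sleep_ms(step_ms);
  }
  // The reset clears GHC.AE, so AHCI mode has to be enabled again.
  ghc = ghc | kGhcAhciEnable;
  return true;
}
```

The caller would pass a reference to the mapped GHC register and something like a thread sleep as the callback, then give up on the controller if the function returns false.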
## Port Initialization
Now we can initialize each port that is implemented. There are two cases we
need to handle. Either the port has received a COMRESET and is running, or
staggered spin up is supported and we need to enable the port. As our VM doesn't
require staggered spin up, we will skip it for now and come back to it in the
future.
Before initializing each port we need to check if it has a device attached. We
can do that by checking the PxSSTS register described in the AHCI spec section
3.3.10.
![AHCI 1.3.1 Section 3.3.10](images/PxSSTS.png)
We are looking for a value of 0x103: 0x100 (the IPM field) indicates that the
interface is active and 0x3 (the DET field) indicates that a device is present
and communication is established. For each port where we detect this value we
continue the initialization process.
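
A small sketch of that check, assuming we want to ignore the negotiated speed field in bits 7:4 (which is non-zero once a link is up) rather than compare the whole register against one constant. The constant and function names are made up for this example.

```c++
#include <cstdint>

constexpr uint32_t kSstsDetMask = 0x00F;  // bits 3:0, device detection (DET)
constexpr uint32_t kSstsIpmMask = 0xF00;  // bits 11:8, interface power management (IPM)
constexpr uint32_t kSstsDetPresentWithComm = 0x003;
constexpr uint32_t kSstsIpmActive = 0x100;

// True if a device is attached, communication is established and the
// interface is in the active power state. The speed field is ignored.
constexpr bool HasActiveDevice(uint32_t sata_status) {
  return (sata_status & kSstsDetMask) == kSstsDetPresentWithComm &&
         (sata_status & kSstsIpmMask) == kSstsIpmActive;
}
```

With the masks in place a value such as 0x123 (same state, with a link speed reported) is accepted as well as 0x103.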
### Memory Structures
We need to initialize the memory structures for each active port as shown in the
image before (under How AHCI works).
We need a command list structure of length 0x400 (technically it need not be
that long if fewer than 32 commands are supported but it doesn't use much
additional memory). Additionally a spot is needed for received FIS structure of
length 0x100. Finally each of the 32 commands in the command list must point to
a command table. Technically these can be quite large because each can hold up
to 2^16 physical region descriptors (using ~1 MiB of memory). I've opted
to limit it to just 8 16-byte descriptors so each command table would be length
0x100 as well. For now we don't support scatter gather buffers and just allocate
one contiguous memory section for each read.
In total all of these memory structures take 0x2500 bytes (3 pages of RAM). We
allocate them all in one block and manually set up the pointers to their
physical addresses in the HBA port control.
```c++
// 0x0-0x400 -> Command List
// 0x400-0x500 -> Received FIS
// 0x500-0x2500 -> Command Tables (0x100 each) (Max PRDT Length is 8 for now)
uint64_t paddr;
command_structures_ =
    mmth::OwnedMemoryRegion::ContiguousPhysical(0x2500, &paddr);
command_list_ = reinterpret_cast<CommandList*>(command_structures_.vaddr());
port_struct_->command_list_base = paddr;
received_fis_ =
    reinterpret_cast<ReceivedFis*>(command_structures_.vaddr() + 0x400);
port_struct_->fis_base = paddr + 0x400;
port_struct_->command |= kCommand_FIS_Receive_Enable;
command_tables_ = glcr::ArrayView(
    reinterpret_cast<CommandTable*>(command_structures_.vaddr() + 0x500), 32);
for (uint64_t i = 0; i < 32; i++) {
  // Each 0x100 byte command table has a 0x80 byte header, which leaves space
  // for 8 PRDT entries.
  command_list_->command_headers[i].command_table_base_addr =
      (paddr + 0x500) + (0x100 * i);
  commands_[i] = nullptr;
}
port_struct_->interrupt_enable =
    kInterrupt_D2H_FIS | kInterrupt_PIO_FIS | kInterrupt_DMA_FIS |
    kInterrupt_DeviceBits_FIS | kInterrupt_Unknown_FIS;
port_struct_->sata_error = -1;
port_struct_->command |= kCommand_Start;
```
There are a few other things going on here. Once we allocate the space to
receive FIS structures we let the port know that it can send FISes using the
PxCMD register.
Additionally at the end we enable interrupts, clear the error register, and
tell the port it can start processing commands.
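
The sizes and offsets used for this block can be derived rather than hard-coded. The sketch below spells out the arithmetic with compile-time constants; the constant names are invented for this example, while the sizes (0x20 bytes per command header, a 0x80 byte command table header, 0x10 bytes per PRDT entry) and the alignment requirements come from the AHCI spec.

```c++
#include <cstdint>

constexpr uint64_t kNumSlots = 32;
constexpr uint64_t kCommandHeaderSize = 0x20;        // one entry in the command list
constexpr uint64_t kReceivedFisSize = 0x100;
constexpr uint64_t kCommandTableHeaderSize = 0x80;   // command FIS, ATAPI command, reserved
constexpr uint64_t kPrdtEntrySize = 0x10;
constexpr uint64_t kPrdtEntries = 8;
constexpr uint64_t kPageSize = 0x1000;

constexpr uint64_t kCommandListSize = kNumSlots * kCommandHeaderSize;
constexpr uint64_t kCommandTableSize =
    kCommandTableHeaderSize + kPrdtEntries * kPrdtEntrySize;

constexpr uint64_t kReceivedFisOffset = kCommandListSize;
constexpr uint64_t kCommandTablesOffset = kReceivedFisOffset + kReceivedFisSize;
constexpr uint64_t kTotalSize = kCommandTablesOffset + kNumSlots * kCommandTableSize;

// Offset of the command table for a given slot within the allocation.
constexpr uint64_t CommandTableOffset(uint64_t slot) {
  return kCommandTablesOffset + slot * kCommandTableSize;
}

// Number of whole pages needed to hold `bytes`.
constexpr uint64_t PagesNeeded(uint64_t bytes) {
  return (bytes + kPageSize - 1) / kPageSize;
}

// Alignment required by the spec, relative to a page-aligned allocation:
// received FIS on 256 bytes, command tables on 128 bytes.
static_assert(kReceivedFisOffset % 0x100 == 0, "received FIS must be 256 byte aligned");
static_assert(kCommandTablesOffset % 0x80 == 0 && kCommandTableSize % 0x80 == 0,
              "command tables must be 128 byte aligned");
```

Evaluating these gives 0x400 for the command list, 0x100 per command table, and 0x2500 bytes (3 pages) in total, matching the numbers above.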
## Interrupt Handling
Now that the device is initialized we can actually begin to send it commands.
To do so we need to register an interrupt handler with the correct PCI
interrupt line (for now we will use the direct interrupt line rather than
Message Signaled Interrupts). Registering interrupt handlers is a whole other
beast so for this post we will just focus on their implementation.
The first step is to de-multiplex the interrupt in the controller by checking
the interrupt status register. Each port that has an interrupt pending will
raise its corresponding bit in the Interrupt Status register. We can delegate
to each port the handling of an interrupt, then clear the interrupt bit once it
is done. The relevant code in this case looks like this:
```c++
for (uint64_t i = 0; i < num_ports_; i++) {
  if (!ports_[i].empty() && (ahci_hba_->interrupt_status & (1 << i))) {
    ports_[i]->HandleIrq();
    // IS is write-1-to-clear, so write only this port's bit to acknowledge it.
    ahci_hba_->interrupt_status = (1 << i);
  }
}
```
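One detail worth keeping in mind when clearing the bit: the AHCI spec marks the bits of the IS register as RWC, meaning a bit is cleared by writing a 1 to it and writing a 0 has no effect. The toy model below (not real hardware access, all names invented) shows the difference between writing a single bit and doing a read-modify-write on such a register.

```c++
#include <cstdint>

// Toy model of a write-1-to-clear register, for illustration only.
struct RwcRegister {
  uint32_t bits;
  constexpr uint32_t Read() const { return bits; }
  // Every 1 in `value` clears the matching bit; zeros leave bits untouched.
  constexpr void Write(uint32_t value) { bits &= ~value; }
};

// Acknowledge one port by writing only that port's bit.
constexpr uint32_t AckSinglePort(uint32_t pending, uint32_t port) {
  RwcRegister is{pending};
  is.Write(uint32_t{1} << port);
  return is.Read();
}

// What `reg &= ~(1 << port)` does on such a register: it writes back every
// pending bit except the one for `port`.
constexpr uint32_t AckWithReadModifyWrite(uint32_t pending, uint32_t port) {
  RwcRegister is{pending};
  is.Write(is.Read() & ~(uint32_t{1} << port));
  return is.Read();
}
```

With ports 0 and 2 pending (0x5), acknowledging port 0 by writing its bit leaves 0x4, while the read-modify-write leaves 0x1: port 0 is still pending and the interrupt for port 2 has been dropped.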
Then on the port side we can handle the interrupt. This requires determining
what kind of interrupt was generated using the port's Interrupt Status register
(PxIS). Each of the 17 defined bits in this register corresponds to a different
interrupt type and can be individually enabled and disabled using the port's
Interrupt Enable register (PxIE). For now as we registered when setting up the
port we will only handle the interrupts related to receiving FISes from the
device.
```c++
void AhciDevice::HandleIrq() {
  uint32_t int_status = port_struct_->interrupt_status;
  port_struct_->interrupt_status = int_status;
  bool has_error = false;
  if (int_status & kInterrupt_D2H_FIS) {
    dbgln("D2H Received");
    // Device to host.
    volatile DeviceToHostRegisterFis& fis =
        received_fis_->device_to_host_register_fis;
    if (!CheckFisType(FIS_TYPE_REG_D2H, fis.fis_type)) {
      return;
    }
    if (fis.error) {
      dbgln("D2H err: {x}", fis.error);
      dbgln("status: {x}", fis.status);
      has_error = true;
    }
  }
  if (int_status & kInterrupt_PIO_FIS) {
    // Like above ...
  }
  if (int_status & kInterrupt_DMA_FIS) {
    // Like above ...
  }
  // ...
}
```
To handle the interrupt we read the raised interrupts from the PxIS register and
write the values back to it to clear them. Then we can specify how to handle
each type of interrupt that we receive. For now we will just debug print the
type and any errors from the interrupt since we aren't sending any commands.
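
For reference, here is a sketch of some of the PxIS bit positions as defined in the AHCI spec, split into the "a FIS arrived" bits handled above and the error bits that a more complete driver would also want to look at. The constant names are hypothetical (the `kInterrupt_*` constants used in this post presumably correspond to the low five bits, but that is an assumption), and the error bits only raise an interrupt if they are also enabled in PxIE.

```c++
#include <cstdint>

// FIS reception bits.
constexpr uint32_t kPxIsD2HRegisterFis = 1u << 0;  // DHRS
constexpr uint32_t kPxIsPioSetupFis = 1u << 1;     // PSS
constexpr uint32_t kPxIsDmaSetupFis = 1u << 2;     // DSS
constexpr uint32_t kPxIsSetDeviceBits = 1u << 3;   // SDBS
constexpr uint32_t kPxIsUnknownFis = 1u << 4;      // UFS

// Error bits.
constexpr uint32_t kPxIsInterfaceFatal = 1u << 27;  // IFS
constexpr uint32_t kPxIsHostBusData = 1u << 28;     // HBDS
constexpr uint32_t kPxIsHostBusFatal = 1u << 29;    // HBFS
constexpr uint32_t kPxIsTaskFileError = 1u << 30;   // TFES

constexpr uint32_t kPxIsFisMask = kPxIsD2HRegisterFis | kPxIsPioSetupFis |
                                  kPxIsDmaSetupFis | kPxIsSetDeviceBits |
                                  kPxIsUnknownFis;
constexpr uint32_t kPxIsErrorMask = kPxIsInterfaceFatal | kPxIsHostBusData |
                                    kPxIsHostBusFatal | kPxIsTaskFileError;

constexpr bool HasReceivedFis(uint32_t px_is) {
  return (px_is & kPxIsFisMask) != 0;
}

constexpr bool HasPortError(uint32_t px_is) {
  return (px_is & kPxIsErrorMask) != 0;
}
```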
Something I'm not sure about is that as soon as we enable interrupts we seem to
receive a FIS from the device with an error bit set. Both the hard drive and the
optical drive on qemu send a FIS with error bit 0x1 set. Additionally the status
field is set to 0x30 for the hard drive and 0x70 for the optical drive.
I was able to find an [OSDev Forum
post](https://forum.osdev.org/viewtopic.php?f=1&t=56462&start=15#p342163)
referencing that this behavior is caused by the reset sending an EXECUTE DEVICE
DIAGNOSTIC command (0x90) to the device. It notes that this is largely
undocumented behavior but at least this information offers some clarity on the
outputs. Reading the ATA Command Set section 7.9.4 we can see that the command
outputs code 0x01 to the error bits when `Device 0 passed, Device 1 passed or not
present`. According to a footnote we can "See the appropriate transport standard
for the definition of device 0 and device 1." I really thought I was already
looking at the "appropriate transport standard" but alas. All that to say we'll
just ignore this interrupt for now.
## Sending a Command
Now that the AHCI ports are initialized and can handle an interrupt, we can send
commands to them. To start with let's send the IDENTIFY DEVICE command to each
device. This command asks the device to send 512 bytes of information about
itself back to us. These bytes contain 40 years of certified-crufty backwards
compatibility. I mean just feast your eyes on the number of retired and obsolete
fields in just the first page of the spec.
![IDENTIFY DEVICE Response](images/IDENTIFY_DEVICE.png)
We'll ignore almost all of this information and just try to get the sector size
and sector count from the drive. To do so we need to figure out how to send a
command to the device. To be honest I feel like the specs fall down here in
actually explaining this. The trick is to send a Register Host to Device FIS in one
of the command slots. This FIS type has a field for the command as well as some
common parameters such as lba and count. In retrospect it is fairly clear once
you are aware of it, but if you are just reading the SATA spec and looking at
the possible commands, making the logical jump to the Register Host To Device
FIS feels damn near impossible.
First up we choose an empty command slot to use:
```c++
uint64_t slot;
for (slot = 0; slot < 32; slot++) {
  if (!(commands_issued_ & (1 << slot))) {
    break;
  }
}
if (slot == 32) {
  dbgln("All slots full");
  return glcr::INTERNAL;
}
```
The `commands_issued_` variable is just for our own accounting of which slots
are currently in use by another command.
Next we can populate the FIS for that slot. The spec for the Register Host to
Device FIS is as follows:
![Register Host to Device FIS Layout](images/RegisterHostToDeviceFIS.png)
We don't need to initialize most of the fields here because the IDENTIFY_DEVICE
call doesn't rely on an lba or sector count. One of the keys is setting the high
bit "C" in the byte that contains PM Port which indicates to the HBA that this
FIS contains a new command (I spent a while trying to figure out why this wasn't
working without that). The code for this is relatively straightforward.
```c++
auto* fis = reinterpret_cast<HostToDeviceRegisterFis*>(
    command_tables_[slot].command_fis);
*fis = HostToDeviceRegisterFis{
    .fis_type = FIS_TYPE_REG_H2D,
    .pmp_and_c = 0x80,
    .command = kIdentifyDevice,  // 0xEC
};
```
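For readers who want to see the layout from the figure as code, the following is a sketch of a 20 byte Register Host to Device FIS. The field order follows the SATA spec, but the struct name and field names here are invented and may not match the `HostToDeviceRegisterFis` definition in Denali.

```c++
#include <cstddef>
#include <cstdint>

constexpr uint8_t kFisTypeRegisterH2D = 0x27;
constexpr uint8_t kFisCommandBit = 0x80;  // "C" bit: this FIS carries a new command

struct RegisterH2DFis {
  uint8_t fis_type;       // byte 0: 0x27 for Register Host to Device
  uint8_t pmp_and_c;      // byte 1: bits 3:0 PM port, bit 7 the C bit
  uint8_t command;        // byte 2: ATA command code
  uint8_t features_low;   // byte 3
  uint8_t lba0;           // byte 4: LBA bits 7:0
  uint8_t lba1;           // byte 5: LBA bits 15:8
  uint8_t lba2;           // byte 6: LBA bits 23:16
  uint8_t device;         // byte 7
  uint8_t lba3;           // byte 8: LBA bits 31:24
  uint8_t lba4;           // byte 9: LBA bits 39:32
  uint8_t lba5;           // byte 10: LBA bits 47:40
  uint8_t features_high;  // byte 11
  uint16_t count;         // bytes 12-13: sector count
  uint8_t icc;            // byte 14
  uint8_t control;        // byte 15
  uint32_t reserved;      // bytes 16-19
};

static_assert(sizeof(RegisterH2DFis) == 20, "Register H2D FIS is 5 dwords");

// Value for the command FIS length field of the command header, in dwords.
constexpr uint32_t kCommandFisLengthDwords = sizeof(RegisterH2DFis) / 4;
```

Every field sits on its natural alignment, so no packing attribute is needed for the struct to come out at exactly 20 bytes.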
We also need to let the HBA know where it can put the result in memory. For this
we use the physical region descriptor table corresponding to this command slot.
As described before, for simplicity now we are only using a single entry to do
this. We allocate a 512 byte memory region and set its physical address and
size in the first slot of the command slot's PRDT.
```c++
uint64_t paddr;
auto region =
    mmth::OwnedMemoryRegion::ContiguousPhysical(0x200, &paddr);
command_tables_[slot].prdt[0].region_address = paddr;
command_tables_[slot].prdt[0].byte_count = 0x200;  // 512 bytes
command_list_->command_headers[slot].prd_table_length = 1;
```
All that is left to do is to issue the command! We set the size of the command
FIS (in double words for some reason?) as well as let the HBA know it can
prefetch the data from memory. Then we set the bit for this command slot in the
PxCI register which will cause the device to start processing it.
```c++
// Set the command FIS length (in double words).
command_list_->command_headers[slot].command =
(sizeof(HostToDeviceRegisterFis) / 4) & 0x1F;
// Set prefetch bit.
command_list_->command_headers[slot].command |= (1 << 7);
// TODO: Synchronization-wise we need to ensure this is set in the same
// critical section as where we select a slot.
commands_issued_ |= (1 << slot);
port_struct_->command_issue |= (1 << slot);
```
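To show where these bits sit, here is a sketch that packs the first dword of a command header in one go. According to the AHCI spec, bits 4:0 hold the command FIS length in dwords, bit 6 marks a write to the device, bit 7 is the prefetch bit, and bits 31:16 hold the PRDT length. The function name is made up, and treating the first dword as a single 32 bit value is an assumption of this sketch (the code above uses separate `command` and `prd_table_length` fields).

```c++
#include <cstdint>

constexpr uint32_t MakeCommandHeaderDw0(uint32_t fis_dwords, bool is_write,
                                        bool prefetch, uint32_t prdt_entries) {
  return (fis_dwords & 0x1F) |               // CFL, bits 4:0
         (is_write ? (1u << 6) : 0u) |       // W, bit 6
         (prefetch ? (1u << 7) : 0u) |       // P, bit 7
         ((prdt_entries & 0xFFFF) << 16);    // PRDTL, bits 31:16
}
```

For the IDENTIFY DEVICE command above (a 5 dword FIS, a read, prefetch enabled, one PRDT entry) this evaluates to 0x00010085.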
But wait! How will we know when this command has completed? We somehow need to
wait until we receive an interrupt for this command to process the data it
sent. To handle this we can add a semaphore for each port command slot to allow
signalling when we receive a completion interrupt for that command. I think it
might make sense to have some sort of callback instead so we can pass errors
back to the caller instead of just a completion signal. However I'm not sure
what type of errors exist that are resolvable by the caller so for now this
works.
```c++
void IdentifyDevice() {
  ...
  // Issue command.
  commands_issued_ |= (1 << slot);
  port_struct_->command_issue |= (1 << slot);
  command_signals_[slot].Wait();
  // Continue processing.
  ...
}
void AhciPort::HandleIrq() {
  uint32_t int_status = port_struct_->interrupt_status;
  port_struct_->interrupt_status = int_status;
  ...
  // Parse received FIS.
  ...
  uint32_t commands_finished = commands_issued_ & ~port_struct_->command_issue;
  for (uint64_t i = 0; i < 32; i++) {
    if (commands_finished & (1 << i)) {
      command_signals_[i].Signal();
      commands_issued_ &= ~(1 << i);
    }
  }
}
```
Ok now that we have retrieved the information from the drive we can parse it.
For the sector size, the default is 512 bytes which we will use unless the
`LOGICAL SECTOR SIZE SUPPORTED` bit is set in word 106, bit 12. If that
is set we can check the words at 117 and 118 to get the 32 bit sector
size value. For the sector count, we need to check if the device supports 48 bit
addressing using word 83 bit 10. If it does we can get the number of
sectors from the 4 words starting at 100. Otherwise we read the number of
sectors from the 2 words starting at index 60. (The IDENTIFY DEVICE data is
indexed in 16 bit words, not double words, which is why the code uses a
`uint16_t` pointer.)
```c++
uint16_t* ident = reinterpret_cast<uint16_t*>(region.vaddr());
if (ident[106] & (1 << 12)) {
  sector_size_ = *reinterpret_cast<uint32_t*>(ident + 117);
} else {
  sector_size_ = 512;
}
if (ident[83] & (1 << 10)) {
  lba_count_ = *reinterpret_cast<uint64_t*>(ident + 100);
} else {
  lba_count_ = *reinterpret_cast<uint32_t*>(ident + 60);
}
dbgln("Sector size: {x}", sector_size_);
dbgln("LBA Count: {x}", lba_count_);
is_init_ = true;
}
```
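A slightly more defensive variant of the same parsing is sketched below. It is a sketch under a few assumptions rather than the code Denali uses: the fields are assembled word by word instead of casting to wider pointer types, word 106 is only trusted when its validity bits (bit 14 set, bit 15 clear) look right, and, as far as I can tell from ACS-3, the logical sector size in words 117 and 118 is expressed in 16 bit words, so the sketch doubles it to get bytes. `DiskGeometry`, `ReadWords` and `ParseIdentify` are hypothetical names.

```c++
#include <cassert>
#include <cstdint>

struct DiskGeometry {
  uint64_t sector_size_bytes;
  uint64_t sector_count;
};

// Combines `n` consecutive little-endian 16 bit words starting at `first`.
inline uint64_t ReadWords(const uint16_t* ident, int first, int n) {
  uint64_t value = 0;
  for (int i = n - 1; i >= 0; --i) {
    value = (value << 16) | ident[first + i];
  }
  return value;
}

// `ident` points at the 256 words returned by IDENTIFY DEVICE.
inline DiskGeometry ParseIdentify(const uint16_t* ident) {
  assert(ident != nullptr);
  DiskGeometry geometry{512, 0};

  // Word 106 is valid when bit 14 is set and bit 15 is clear. Bit 12 then
  // says that the logical sector is longer than 256 words.
  const bool word106_valid = (ident[106] & 0xC000) == 0x4000;
  if (word106_valid && (ident[106] & (1 << 12))) {
    // Assumption: words 117-118 count 16 bit words, so double for bytes.
    geometry.sector_size_bytes = 2 * ReadWords(ident, 117, 2);
  }

  if (ident[83] & (1 << 10)) {
    geometry.sector_count = ReadWords(ident, 100, 4);  // 48 bit addressing
  } else {
    geometry.sector_count = ReadWords(ident, 60, 2);   // 28 bit addressing
  }
  return geometry;
}
```

For example, a device that reports 2048 in words 117 and 118 would come out with 4096 byte logical sectors under this interpretation.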
You might be rightfully thinking that it would be less brittle to make a struct
definition that we could point at this address which would implicitly contain
these offsets - and you would be correct. But to be honest, I can't be bothered
to create a 256 entry struct definition just to get these values. Maybe in the
future.
## Reading Data
Now that we have the ability to read the IDENTIFY DEVICE data we are only a
short hop, skip, and jump away from reading data from the drive. The main
differences when reading data are (a) the command number, (b) we must specify
the Logical Block Address (LBA) we want to read from and the number of sectors
to read, and (c) we need to dynamically size the entry in the Physical Region
Descriptor Table (we will still use only one entry for now).
Because much of this is similar we can fairly easily create a shared struct with
the necessary information and construct the requests in parallel.
```c++
struct CommandInfo {
  uint8_t command;
  uint64_t lba;
  uint32_t sectors;
  uint64_t paddr;
};
```
Then from that we can create an IssueCommand function that constructs the
Register Host to Device FIS in a similar way for both. Before that I'd like to
take this opportunity to point out how the LBA in this FIS is stored in a way
that truly only a mother could love:
![Register Host to Device Layout LBA](images/RegisterHostToDeviceFISLBA.png)
That aside, we simply update the FIS construction to set the command, LBA, and
sector count. Following that we set the PRDT values (although we still only use
one slot).
```c++
auto* fis = reinterpret_cast<HostToDeviceRegisterFis*>(
    command_tables_[slot].command_fis);
*fis = HostToDeviceRegisterFis{
    .fis_type = FIS_TYPE_REG_H2D,
    .pmp_and_c = 0x80,
    .command = command.command,
    .lba0 = static_cast<uint8_t>(command.lba & 0xFF),
    .lba1 = static_cast<uint8_t>((command.lba >> 8) & 0xFF),
    .lba2 = static_cast<uint8_t>((command.lba >> 16) & 0xFF),
    .device = (1 << 6),  // ATA LBA Mode
    .lba3 = static_cast<uint8_t>((command.lba >> 24) & 0xFF),
    .lba4 = static_cast<uint8_t>((command.lba >> 32) & 0xFF),
    .lba5 = static_cast<uint8_t>((command.lba >> 40) & 0xFF),
    .count = command.sectors,
};
command_tables_[slot].prdt[0].region_address = command.paddr;
command_tables_[slot].prdt[0].byte_count = 512 * command.sectors;
```
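The LBA splitting can be checked in isolation. The helpers below are a small sketch (the names are invented) of how a 48 bit LBA maps onto the six LBA bytes of the FIS, including the gap at byte 7 where the Device field sits between the low and high halves.

```c++
#include <cstdint>

// Byte `index` (0 = least significant) of the LBA.
constexpr uint8_t LbaByte(uint64_t lba, unsigned index) {
  return static_cast<uint8_t>((lba >> (8 * index)) & 0xFF);
}

// Offset within the FIS where LBA byte `index` is stored: bytes 4-6 hold
// LBA bytes 0-2, byte 7 is the Device field, bytes 8-10 hold LBA bytes 3-5.
constexpr unsigned LbaFisOffset(unsigned index) {
  return index < 3 ? 4 + index : 5 + index;
}

// A 48 bit command cannot address anything beyond bit 47.
constexpr bool FitsIn48Bits(uint64_t lba) {
  return (lba >> 48) == 0;
}
```

A caller could use `FitsIn48Bits` to reject a request before building the FIS instead of silently truncating the address.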
Then issuing either the identify device command or the read command is
relatively straightforward:
```c++
// IDENTIFY DEVICE
CommandInfo identify{
    .command = kIdentifyDevice,
    .lba = 0,
    .sectors = 1,
    .paddr = 0,
};
auto region =
    mmth::OwnedMemoryRegion::ContiguousPhysical(0x200, &identify.paddr);
ASSIGN_OR_RETURN(auto* sem, IssueCommand(identify));
sem->Wait();
// DMA READ
CommandInfo dma_read{
    .command = kDmaReadExt,
    .lba = lba,
    .sectors = sector_cnt,
    .paddr = 0,
};
auto region =
    mmth::OwnedMemoryRegion::ContiguousPhysical(0x200 * sector_cnt, &dma_read.paddr);
ASSIGN_OR_RETURN(auto* sem, IssueCommand(dma_read));
sem->Wait();
```
From here the world is our oyster and we can read any arbitrary data from the
disk. The bulk of this code isn't actually all that long (~200 LOC in the [AHCI
Port implementation](https://gitea.tiramisu.one/drew/acadia/src/commit/21265e76edf4fa93b8ec1795da4bdd2fc70b79d9/sys/denali/ahci/ahci_port.cpp)
). However, I probably added and deleted several times that much while trying to
get everything working and refactored down to a nice interface.
## Coming next
This is nowhere near a full implementation. Among the things we
skipped that I plan to come back to at some point are:
- **Staggered spin up:** In controllers that support this, each device is
powered down after RESET and must be started individually.
- **Message Signaled Interrupts:** The hot new way to handle PCI device
interrupts. Has only been available since 1998 so support may vary.
- **Port Multiplier Support:** Something that gets mentioned all over the specs
but I've avoided even looking into until this moment. But it looks like it
allows several devices behind a single port.
- **Scatter Gather buffers:** For big files we may not always be able to find a
sufficient contiguous chunk of physical memory. This means we may have to use
more than one entry in the PRDT!
- **Error Handling & Retry:** Even though QEMU may succeed in executing commands
100% of the time, real hardware may not and we should probably handle that.
- **Fewer than 32 commands supported:** We kinda always assume that the device
can handle 32 commands even though it may not (how many it does is exposed in
the GHC registers).

View File

@ -12,3 +12,7 @@ title = "Drew's Site"
url = 'https://www.linkedin.com/in/drew-galbraith/'
weight = 30
[markup]
[markup.highlight]
noClasses = false

View File

@ -8,6 +8,16 @@
{{- end }}
{{- end }}
{{- with resources.Get "css/syntax.css" }}
{{- if eq hugo.Environment "development" }}
<link rel="stylesheet" href="{{ .RelPermalink }}">
{{- else }}
{{- with . | minify | fingerprint }}
<link rel="stylesheet" href="{{ .RelPermalink }}" integrity="{{ .Data.Integrity }}" crossorigin="anonymous">
{{- end }}
{{- end }}
{{- end }}
{{- range .Resources.Match "css/*.css" }}
{{- if eq hugo.Environment "development" }}
<link rel="stylesheet" href="{{ .RelPermalink }}">