617 lines
25 KiB
Markdown
617 lines
25 KiB
Markdown
---
|
||
title: "Writing an AHCI Driver"
|
||
date: 2024-01-09
|
||
tags: ['osdev']
|
||
---
|
||
|
||
Now that I've wrapped up the [0.1.0 Release](/blog/2023/12/acadia-0.1.0) of
|
||
AcadiaOS I'm looking to cleanup some of the "just get it working" hacks that
|
||
exist in the codebase. First up on that list is the AHCI Driver.
|
||
|
||
## What is AHCI
|
||
|
||
AHCI stands for Advanced Host Controller Interface and if you like acronyms boy
|
||
are you in for a treat. AHCI is a way to interface with SATA (which replaced
|
||
PATA (a.k.a. IDE)) via its HBA. AHCI has since been superseded by NVMe but is
|
||
simpler to implement (or so I've been told) so I've started here.
|
||
|
||
To try to explain it without acronym soup, AHCI allows you to access disk
|
||
drives and optical drives (SATA devices) by writing relevant ATA commands to
|
||
memory addresses that are backed by hardware firmware. There are a wide variety
|
||
of commands available but best I can tell the main ones used these days are to
|
||
identify the device and read/write via direct memory access (DMA).
|
||
|
||
Essentially you give the device an offset to read from as well as physical
|
||
memory address to write to. The device firmware copies the amount of data you
|
||
requested to the physical address then triggers an interrupt to indicate that
|
||
the operation is complete. Likewise writing via DMA is the same but in reverse.
|
||
|
||
Disclaimer, all of the above is basically just summarizing Wikipedia and the
|
||
OSDev wiki and I don't really know what I'm talking about.
|
||
|
||
## Current State
|
||
|
||
The current AHCI implementation in Denali is straightforward but very brittle.
|
||
It relies on everything following the happy path and is cobbled together more
|
||
based on trial an error of what worked rather than following the specification
|
||
closely.
|
||
|
||
As a part of this article we're going to dive into the related specs and look at
|
||
how they relate to each other. The trickiest part of writing the driver is the
|
||
fact that the necessary information is spread across several different specs
|
||
rather than contained in one place. The specs I reference in this post are:
|
||
|
||
- [AHCI 1.3.1](https://www.intel.com/content/www/us/en/io/serial-ata/serial-ata-ahci-spec-rev1-3-1.html)
|
||
- SATA 3.2
|
||
- ATA/ATAPI Command Set 3 (ACS-3)
|
||
|
||
The SATA and ACS specs cost money so I can't link them directly but it isn't
|
||
hard to find drafts of them available online.
|
||
|
||
## How AHCI Works
|
||
|
||
AHCI allows you to control SATA devices by writing commands to memory. The
|
||
layout of these structures is nicely shown in the AHCI Spec Figure 4:
|
||
|
||

|
||
|
||
There are several pieces here that I've annotated:
|
||
|
||
1. The Generic Host Control (GHC) is a set of registers that allow you to manage the
|
||
whole controller and get its status. These registers are referred to in the
|
||
spec using GHC.RegisterName so the interrupt status register for instance is
|
||
"GHC.IS" for short.
|
||
2. Each device (hard disk or disc drive) that is attached to the controller is
|
||
exposed as a "port" with a set of registers to control it individually.
|
||
These registers are referred to as PxRegisterName so for instance the command
|
||
issue register is PxCI.
|
||
3. For each port it has a separately allocated piece of memory that can accept
|
||
up to 32 "commands" to execute.
|
||
4. When the controller is finished executing a command for a device it will
|
||
write a Frame Information Structure (FIS) and raise an interrupt. (The GHC
|
||
has a register to show which devices have a pending interrupt GHC.IS).
|
||
|
||
Each device can support up to 32 pending commands (but only receive one FIS at a
|
||
time). The memory structure is as follows:
|
||
|
||

|
||
|
||
To issue a command we can:
|
||
|
||
1. Select a command slot that isn't currently in use.
|
||
2. Write the command to that command table (the specifics of this are explained
|
||
later) and then set that command as active in the commands issued register
|
||
for that port (PxCI).
|
||
3. Wait an interrupt indicating that this command has finished and read the
|
||
resulting FIS. We mostly just care about the status byte. The controller
|
||
will unset the bit for the completed command in the PxCI register when it
|
||
raises the interrupt so we can disambiguate which command has finished.
|
||
|
||
Although this sounds straightforward there are a few moving parts here that we
|
||
need to get up and running.
|
||
|
||
- We need to find where the HBA structure is in memory.
|
||
- We need to allocate some space for the command tables and received FIS
|
||
structures.
|
||
- We need to set up interrupt handling for each port.
|
||
- Before this we will issue a hardware reset command to the controller to get
|
||
everything in a clean state.
|
||
|
||
## Finding the HBA structure
|
||
|
||
We find the HBA structure by iterating through the PCI configuration space and
|
||
finding AHCI device. I'm not going to delve too deeply into this because PCI
|
||
could be a whole separate post and it is a little trickier to explain because
|
||
it is harder to access the specifications (the good people at the PCI "Special
|
||
Interest Group" are happy to let you download the PDF for the low low price of
|
||
$4500 if you are not a member).
|
||
|
||
The short story is that we are looking for the device with the right [class
|
||
code](https://wiki.osdev.org/PCI#Class_Codes) - Class Code 0x1 (Storage Device),
|
||
Subclass 0x6 (SATA Controller), Subtype 0x1 (AHCI).
|
||
|
||
Once we have the correct configuration space we can read the address at offset
|
||
0x24 (called the ABAR for AHCI Base Address) which points to the start of the
|
||
GHC registers.
|
||
|
||
We can mostly ignore the other information in the configuration space for now as
|
||
we aren't dealing with Message Signaled Interrupts yet.
|
||
|
||
## Hardware Reset
|
||
|
||
Now that we have found the Global Host Controller registers we're going to
|
||
initiate a hardware reset of the AHCI Controller. The advantage of this is we
|
||
will know the exact state that the controller and its ports are in. Other than
|
||
that it ensures that we aren't dependent on the specific way limine uses the
|
||
AHCI controller.
|
||
|
||
I suspect that this is **not** how most production operating systems handle
|
||
things but this should give us a clean slate for now.
|
||
|
||
From the AHCI spec:
|
||
|
||
> 10.4.3 HBA Reset
|
||
>
|
||
> If the HBA becomes unusable for multiple ports, and a software reset or port
|
||
> reset does not correct the problem, software may reset the entire HBA by
|
||
> setting GHC.HR to ‘1’. When software sets the GHC.HR bit to ‘1’, the HBA
|
||
> shall perform an internal reset action. The bit shall be cleared to ‘0’ by
|
||
> the HBA when the reset is complete. A software write of ‘0’ to GHC.HR shall
|
||
> have no effect. To perform the HBA reset, software sets GHC.HR to ‘1’ and may
|
||
> poll until this bit is read to be ‘0’, at which point software knows that the
|
||
> HBA reset has completed.
|
||
>
|
||
> If the HBA has not cleared GHC.HR to ‘0’ within 1 second of software setting
|
||
> GHC.HR to ‘1’, the HBA is in a hung or locked state.
|
||
>
|
||
> When GHC.HR is set to ‘1’, GHC.AE, GHC.IE, the IS register, and all port
|
||
> register fields (except PxFB/PxFBU/PxCLB/PxCLBU) that are not HwInit in the
|
||
> HBA’s register memory space are reset. The HBA’s configuration space and all
|
||
> other global registers/bits are not affected by setting GHC.HR to ‘1’. Any
|
||
> HwInit bits in the port specific registers are not affected by setting GHC.HR
|
||
> to ‘1’. The port specific registers PxFB, PxFBU, PxCLB, and PxCLBU are not
|
||
> affected by setting GHC.HR to ‘1’. If the HBA supports staggered spin-up, the
|
||
> PxCMD.SUD bit will be reset to ‘0’; software is responsible for setting the
|
||
> PxCMD.SUD and PxSCTL.DET fields appropriately such that communication can be
|
||
> established on the Serial ATA link. If the HBA does not support staggered
|
||
> spin-up, the HBA reset shall cause a COMRESET to be sent on the port.
|
||
|
||
Despite the long text, this process is fairly straightforward. We set the
|
||
Hardware Reset bit and then poll for it to be set to 0. We then set the AHCI
|
||
enable bit. For now we can leave interrupts disabled until we have reset the
|
||
ports. Once this is done we sleep for a few milliseconds to allow the ports time
|
||
to spin up. For now we are just using 50ms because that is the smallest
|
||
resolution we support sleeping for (1 scheduling tick) but I think theoretically
|
||
we could sleep for only a millisecond or two.
|
||
|
||
```c++
|
||
ahci_hba_->global_host_control |= kGlobalHostControl_HW_Reset;
|
||
|
||
while (ahci_hba_->global_host_control & kGlobalHostControl_HW_Reset) {
|
||
continue;
|
||
}
|
||
|
||
ahci_hba_->global_host_control |= kGlobalHostControl_AHCI_Enable;
|
||
|
||
return static_cast<glcr::ErrorCode>(ZThreadSleep(50));
|
||
```
|
||
|
||
## Port Initialization
|
||
|
||
Now we can initialize each port that is implemented. There are two cases we
|
||
need to handle. Either the port has received a COMRESET and is running, or
|
||
staggered spin up is supported and we need to enable the port. As our VM doesn't
|
||
require staggered spin up, we will skip it for now and come back to it in the
|
||
future.
|
||
|
||
Before initializing each port we need to check if it has a device attached. We
|
||
can do that by checking the PxSSTS register described in the AHCI spec section
|
||
3.3.10.
|
||
|
||

|
||
|
||
We are looking for a value 0x103, 0x100 indicating that the device is active
|
||
and 0x3 indicating that communication is established. For each port where we
|
||
detect this value we continue the initialization process.
|
||
|
||
### Memory Structures
|
||
|
||
We need to initialize the memory structures for each active port as shown in the
|
||
image before (under How AHCI works).
|
||
|
||
We need a command list structure of length 0x400 (technically it need not be
|
||
that long if fewer than 32 commands are supported but it doesn't use much
|
||
additional memory). Additionally a spot is needed for received FIS structure of
|
||
length 0x100. Finally each of the 32 commands in the command list must point to
|
||
a command table. Technically these can be quite large because each can hold up
|
||
to 2^16 physical region descriptors (using ~1 MiB of memory). I've opted
|
||
to limit it to just 8 16-byte descriptors so each command table would be length
|
||
0x100 as well. For now we don't support scatter gather buffers and just allocate
|
||
one contiguous memory section for each read.
|
||
|
||
In total all of these memory structures takes 0x2500 bytes (3 pages of RAM). We
|
||
allocate them all in one block and manually set up the pointers to their
|
||
physical addresses in the HBA port control.
|
||
|
||
```c++
|
||
// 0x0-0x400 -> Command List
|
||
// 0x400-0x500 -> Received FIS
|
||
// 0x500-0x2500 -> Command Tables (0x100 each) (Max PRDT Length is 8 for now)
|
||
uint64_t paddr;
|
||
command_structures_ =
|
||
mmth::OwnedMemoryRegion::ContiguousPhysical(0x2500, &paddr);
|
||
command_list_ = reinterpret_cast<CommandList*>(command_structures_.vaddr());
|
||
port_struct_->command_list_base = paddr;
|
||
|
||
received_fis_ =
|
||
reinterpret_cast<ReceivedFis*>(command_structures_.vaddr() + 0x400);
|
||
port_struct_->fis_base = paddr + 0x400;
|
||
port_struct_->command |= kCommand_FIS_Receive_Enable;
|
||
|
||
command_tables_ = glcr::ArrayView(
|
||
reinterpret_cast<CommandTable*>(command_structures_.vaddr() + 0x500), 32);
|
||
|
||
for (uint64_t i = 0; i < 32; i++) {
|
||
// This leaves space for 2 prdt entries.
|
||
command_list_->command_headers[i].command_table_base_addr =
|
||
(paddr + 0x500) + (0x100 * i);
|
||
commands_[i] = nullptr;
|
||
}
|
||
port_struct_->interrupt_enable =
|
||
kInterrupt_D2H_FIS | kInterrupt_PIO_FIS | kInterrupt_DMA_FIS |
|
||
kInterrupt_DeviceBits_FIS | kInterrupt_Unknown_FIS;
|
||
port_struct_->sata_error = -1;
|
||
port_struct_->command |= kCommand_Start;
|
||
```
|
||
|
||
There are a few other things going on here. Once we allocate the space to
|
||
receive FIS structures we let the port know that it can send FISes using the
|
||
PxCMD register.
|
||
|
||
Additionally at the end we enable interrupts, clear the error register, and
|
||
tell the port it can start processing commands.
|
||
|
||
## Interrupt Handling
|
||
|
||
Now that the device is initialized we can actually begin to send it commands.
|
||
To do so we need to register an interrupt handler with the correct PCI
|
||
interrupt line (for now we will use the direct interrupt line rather than
|
||
Message Signaled Interrupts). Registering interrupt handlers is a whole other
|
||
beast so for this post we will just focus on their implementation.
|
||
|
||
The first step is to de-multiplex the interrupt in the controller by checking
|
||
the interrupt status register. Each port that has an interrupt pending will
|
||
raise it's corresponding bit in the Interrupt Status register. We can delegate
|
||
to each port the handling of an interrupt, then clear the interrupt bit once it
|
||
is done. The relevant code in this case looks like this:
|
||
|
||
```c++
|
||
for (uint64_t i = 0; i < num_ports_; i++) {
|
||
if (!ports_[i].empty() && (ahci_hba_->interrupt_status & (1 << i))) {
|
||
ports_[i]->HandleIrq();
|
||
ahci_hba_->interrupt_status &= ~(1 << i);
|
||
}
|
||
}
|
||
```
|
||
|
||
Then on the port side we can handle the interrupt. This requires determining
|
||
what kind of interrupt was generated using the port's Interrupt Status register
|
||
(PxIS). Each of the 17 defined bits in this register correspond to a different
|
||
interrupt type and can be individual enabled and disabled using the port's
|
||
Interrupt Enable register (PxIE). For now as we registered when setting up the
|
||
port we will only handle the interrupts related to receiving FISes from the
|
||
device.
|
||
|
||
```c++
|
||
void AhciDevice::HandleIrq() {
|
||
uint32_t int_status = port_struct_->interrupt_status;
|
||
port_struct_->interrupt_status = int_status;
|
||
|
||
bool has_error = false;
|
||
if (int_status & kInterrupt_D2H_FIS) {
|
||
dbgln("D2H Received");
|
||
// Device to host.
|
||
volatile DeviceToHostRegisterFis& fis =
|
||
received_fis_->device_to_host_register_fis;
|
||
if (!CheckFisType(FIS_TYPE_REG_D2H, fis.fis_type)) {
|
||
return;
|
||
}
|
||
if (fis.error) {
|
||
dbgln("D2H err: {x}", fis.error);
|
||
dbgln("status: {x}", fis.status);
|
||
has_error = true;
|
||
}
|
||
}
|
||
if (int_status & kInterrupt_PIO_FIS) {
|
||
// Like above ...
|
||
}
|
||
if (int_status & kInterrupt_DMA_FIS) {
|
||
// Like above ...
|
||
}
|
||
// ...
|
||
}
|
||
```
|
||
|
||
To handle the interrupt we read the raised interrupts from the PxIS register and
|
||
write the values back to it to clear them. Then we can specify how to handle
|
||
each type of interrupt that we receive. For now we will just debug print the
|
||
type and any errors from the interrupt since we aren't sending any commands.
|
||
|
||
Something I'm not sure about is that as soon as we enable interrupts we seem to
|
||
receive a FIS from the device with an error bit set. Both the hard drive and the
|
||
optical drive on QEMU send a FIS with error bit 0x1 set. Additionally the status
|
||
field is set to 0x30 for the hard drive and 0x70 for the optical drive.
|
||
|
||
I was able to find a [OSDev Forum
|
||
post](https://forum.osdev.org/viewtopic.php?f=1&t=56462&start=15#p342163)
|
||
referencing that this behavior is caused by the reset sending an EXECUTE DEVICE
|
||
DIAGNOSTIC command (0x90) to the device. It notes that this is largely
|
||
undocumented behavior but at least this information offers some clarity on the
|
||
outputs. Reading the ATA Command Set section 7.9.4 we can see that the command
|
||
outputs code 0x01 to the error bits when `Device 0 passed, Device 1 passed or not
|
||
present`. According a footnote we can "See the appropriate transport standard
|
||
for the definition of device 0 and device 1." I really thought I was already
|
||
looking at the "appropriate transport standard" but alas. All that to say we'll
|
||
just ignore this interrupt for now.
|
||
|
||
## Sending a Command
|
||
|
||
Now that the AHCI ports are initialized and can handle an interrupt, we can send
|
||
commands to them. To start with lets send the IDENTIFY DEVICE command to each
|
||
device. This command asks the device to send 512 bytes of information about
|
||
itself back to us. These bytes contain 40 years of certified-crufty backwards
|
||
compatibility. I mean just feast your eyes on the number of retired and obsolete
|
||
fields in just the first page of the spec.
|
||
|
||

|
||
|
||
We'll ignore almost all of this information and just try to get the sector size
|
||
and sector count from the drive. To do so we need to figure out how to send a
|
||
command to the device. To be honest I feel like the specs fall down here in
|
||
actually explaining this. The trick is to send a Register Host to Device FIS in one
|
||
of the command slots. This FIS type has a field for the command as well as some
|
||
common parameters such as LBA and count. In retrospect it is fairly clear once
|
||
you are aware of it, but if you are just reading the SATA spec and looking at
|
||
the possible commands, making the logical jump to the Register Host To Device
|
||
FIS feels damn near impossible.
|
||
|
||
First up we chose an empty command slot to use:
|
||
|
||
```c++
|
||
uint64_t slot;
|
||
for (slot = 0; slot < 32; slot++) {
|
||
if (!(commands_issued_ & (1 << slot))) {
|
||
break;
|
||
}
|
||
}
|
||
if (slot == 32) {
|
||
dbgln("All slots full");
|
||
return glcr::INTERNAL;
|
||
}
|
||
```
|
||
|
||
The `commands_issued_` variable is just for our own accounting of which slots
|
||
are currently in use by another command.
|
||
|
||
Next we can populate the FIS for that slot. The spec for the Register Host to
|
||
Device FIS is as follows:
|
||
|
||

|
||
|
||
We don't need to initialize most of the fields here because the IDENTIFY_DEVICE
|
||
call doesn't rely on an LBA or sector count. One of the keys is setting the high
|
||
bit "C" in the byte that contains PM Port which indicates to the HBA that this
|
||
FIS contains a new command (I spent a while trying to figure out why this wasn't
|
||
working without that). The code for this is relatively straightforward.
|
||
|
||
```c++
|
||
auto* fis = reinterpret_cast<HostToDeviceRegisterFis*>(
|
||
command_tables_[slot].command_fis);
|
||
*fis = HostToDeviceRegisterFis{
|
||
.fis_type = FIS_TYPE_REG_H2D,
|
||
.pmp_and_c = 0x80,
|
||
.command = kIdentifyDevice, // 0xEC
|
||
};
|
||
```
|
||
|
||
We also need to let the HBA know where it can put the result in memory. For this
|
||
we use the physical region descriptor table corresponding to this command slot.
|
||
As described before, for simplicity now we are only using a single entry to do
|
||
this. We allocate a 512 byte memory region and set it's physical address and
|
||
size in the first slot of the command slots PRDT.
|
||
|
||
```c++
|
||
uint64_t paddr;
|
||
auto region =
|
||
mmth::OwnedMemoryRegion::ContiguousPhysical(0x200, &paddr);
|
||
command_tables_[slot].prdt[0].region_address = command.paddr;
|
||
command_tables_[slot].prdt[0].byte_count = 0x200; // 512 bytes
|
||
command_list_->command_headers[slot].prd_table_length = 1;
|
||
```
|
||
|
||
All that is left to do is to issue the command! We set the size of the command
|
||
FIS (in double words for some reason?) as well as let the HBA know it can
|
||
prefetch the data from memory. Then we set the bit for this command slot in the
|
||
PxCI register which will cause the device to start processing it.
|
||
|
||
```c++
|
||
// Set the command FIS length (in double words).
|
||
command_list_->command_headers[slot].command =
|
||
(sizeof(HostToDeviceRegisterFis) / 4) & 0x1F;
|
||
|
||
// Set prefetch bit.
|
||
command_list_->command_headers[slot].command |= (1 << 7);
|
||
|
||
// TODO: Synchronization-wise we need to ensure this is set in the same
|
||
// critical section as where we select a slot.
|
||
commands_issued_ |= (1 << slot);
|
||
port_struct_->command_issue |= (1 << slot);
|
||
```
|
||
|
||
But wait! How will we know when this command has completed? We somehow need to
|
||
wait until we receive an interrupt for this command to process the data it
|
||
sent. To handle this we can add a semaphore for each port command slot to allow
|
||
signalling when we receive a completion interrupt for that command. I think it
|
||
might make sense to have some sort of callback instead so we can pass errors
|
||
back to the caller instead of just a completion signal. However I'm not sure
|
||
what type of errors exist that are resolvable by the caller so for now this
|
||
works.
|
||
|
||
```c++
|
||
void IdentifyDevice() {
|
||
...
|
||
// Issue command.
|
||
commands_issued_ |= (1 << slot);
|
||
port_struct_->command_issue |= (1 << slot);
|
||
|
||
command_signals_[slot].Wait();
|
||
|
||
// Continue processing.
|
||
...
|
||
}
|
||
|
||
void AhciPort::HandleIrq() {
|
||
uint32_t int_status = port_struct_->interrupt_status;
|
||
port_struct_->interrupt_status = int_status;
|
||
|
||
...
|
||
// Parse received FIS.
|
||
...
|
||
|
||
uint32_t commands_finished = commands_issued_ & ~port_struct_->command_issue;
|
||
|
||
for (uint64_t i = 0; i < 32; i++) {
|
||
if (commands_finished & (1 << i)) {
|
||
command_signals_[i].Signal();
|
||
commands_issued_ &= ~(1 << i);
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
OK now that we have retrieved the information from the drive we can parse it.
|
||
For the sector size, the default is 512 bytes which we will use unless the
|
||
`LOGICAL SECTOR SIZE SUPPORTED` bit is set in double word 106, bit 12. If that
|
||
is set we can check the double words at 117 and 118 to get the 32 bit sector
|
||
size value. For the sector count, we need to check if the device supports 48 bit
|
||
addressing using double word 83 bit 10. If it is used we can get the number of
|
||
sectors from the 4 double words starting at 100. Otherwise we read the number of
|
||
sectors from the 2 double words starting at index 60.
|
||
|
||
```c++
|
||
uint16_t* ident = reinterpret_cast<uint16_t*>(region.vaddr());
|
||
if (ident[106] & (1 << 12)) {
|
||
sector_size_ = *reinterpret_cast<uint32_t*>(ident + 117);
|
||
} else {
|
||
sector_size_ = 512;
|
||
}
|
||
|
||
if (ident[83] & (1 << 10)) {
|
||
lba_count_ = *reinterpret_cast<uint64_t*>(ident + 100);
|
||
} else {
|
||
lba_count_ = *reinterpret_cast<uint32_t*>(ident + 60);
|
||
}
|
||
dbgln("Sector size: {x}", sector_size_);
|
||
dbgln("LBA Count: {x}", lba_count_);
|
||
is_init_ = true;
|
||
}
|
||
```
|
||
|
||
You might be rightfully thinking that it would be less brittle to make a struct
|
||
definition that we could point at this address which would implicitly contain
|
||
these offsets - and you would be correct. But to be honest, I can't be bothered
|
||
to create a 256 entry struct definition just to get these values. Maybe in the
|
||
future.
|
||
|
||
## Reading Data
|
||
|
||
Now that we have the ability to read the IDENTIFY DEVICE data we are only a
|
||
short hop, skip, and jump away from reading data from the drive. The main
|
||
differences when reading data are (a) the command number, (b) we must specify
|
||
the Logical Block Address (LBA) we want to read from and the number of sectors
|
||
to read, and (c) we need to dynamically size the entry in the Physical Region
|
||
Descriptor Table (we will still use only one entry for now).
|
||
|
||
Because much of this is similar we can fairly easily create a shared struct with
|
||
the necessary information and construct the requests in parallel.
|
||
|
||
```c++
|
||
struct Command {
|
||
uint8_t command;
|
||
uint64_t lba;
|
||
uint32_t sectors;
|
||
uint64_t paddr;
|
||
};
|
||
```
|
||
|
||
Then from that we can create an IssueCommand function that constructs the
|
||
Register Host to Device FIS in a similar way for both. Before that I'd like to
|
||
take this opportunity to point out how the LBA in this FIS is stored in a way
|
||
that truly only a mother could love:
|
||
|
||

|
||
|
||
That aside we simply update the FIS construction to set the command, LBA, and
|
||
sector count. Following that we set the PRDT values (although we still only use
|
||
one slot).
|
||
|
||
```c++
|
||
auto* fis = reinterpret_cast<HostToDeviceRegisterFis*>(
|
||
command_tables_[slot].command_fis);
|
||
*fis = HostToDeviceRegisterFis{
|
||
.fis_type = FIS_TYPE_REG_H2D,
|
||
.pmp_and_c = 0x80,
|
||
.command = command.command,
|
||
|
||
.lba0 = static_cast<uint8_t>(command.lba & 0xFF),
|
||
.lba1 = static_cast<uint8_t>((command.lba >> 8) & 0xFF),
|
||
.lba2 = static_cast<uint8_t>((command.lba >> 16) & 0xFF),
|
||
.device = (1 << 6), // ATA LBA Mode
|
||
|
||
.lba3 = static_cast<uint8_t>((command.lba >> 24) & 0xFF),
|
||
.lba4 = static_cast<uint8_t>((command.lba >> 32) & 0xFF),
|
||
.lba5 = static_cast<uint8_t>((command.lba >> 40) & 0xFF),
|
||
|
||
.count = command.sectors,
|
||
};
|
||
|
||
command_tables_[slot].prdt[0].region_address = command.paddr;
|
||
command_tables_[slot].prdt[0].byte_count = 512 * command.sectors;
|
||
```
|
||
|
||
Then issuing either the identify device command or the read command is
|
||
relatively straightforward:
|
||
|
||
```c++
|
||
// IDENTIFY DEVICE
|
||
CommandInfo identify{
|
||
.command = kIdentifyDevice,
|
||
.lba = 0,
|
||
.sectors = 1,
|
||
.paddr = 0,
|
||
};
|
||
auto region =
|
||
mmth::OwnedMemoryRegion::ContiguousPhysical(0x200, &identify.paddr);
|
||
ASSIGN_OR_RETURN(auto* sem, IssueCommand(identify));
|
||
sem->Wait();
|
||
|
||
// DMA READ
|
||
CommandInfo dma_read{
|
||
.command = kDmaReadExt,
|
||
.lba = lba,
|
||
.sectors = sector_cnt,
|
||
.paddr = 0,
|
||
};
|
||
auto region =
|
||
mmth::OwnedMemoryRegion::ContiguousPhysical(0x200 * sector_cnt, &read.paddr);
|
||
ASSIGN_OR_RETURN(auto* sem, IssueCommand(dma_read));
|
||
sem->Wait();
|
||
```
|
||
|
||
From here the world is our oyster and we can read any arbitrary data from the
|
||
disk. The bulk of this code isn't actually all that long (~200 LOC in the [AHCI
|
||
Port implementation](https://gitea.tiramisu.one/drew/acadia/src/commit/21265e76edf4fa93b8ec1795da4bdd2fc70b79d9/sys/denali/ahci/ahci_port.cpp)
|
||
). However I probably added and deleted several times that trying to get
|
||
everything working and refactored down to a nice interface.
|
||
|
||
## Coming next
|
||
|
||
This is nowhere near a full implementation. Among the things we
|
||
skipped that I plan to come back to at some point are:
|
||
|
||
- **Staggered spin up:** In controllers that support this, each device is
|
||
powered down after RESET and must be started individually.
|
||
- **Message Signaled Interrupts:** The hot new way to handle PCI device
|
||
interrupts. Has only been available since 1998 so support may vary.
|
||
- **Port Multiplier Support:** Something that gets mentioned all over the specs
|
||
but I've avoided evening looking into until this moment. But it looks like it
|
||
allows several devices behind a single port.
|
||
- **Scatter Gather buffers:** For big files we may not always be able to find a
|
||
sufficient contiguous chunk of physical memory. This means we may have to use
|
||
more than one entry in the PRDT!
|
||
- **Error Handling & Retry:** Even though QEMU may succeed in executing commands
|
||
100% of the time, real hardware may not and we should probably handle that.
|
||
- **Less that 32 commands supported:** We kinda always assume that the device
|
||
can handle 32 commands even though it may not (how many it does is exposed in
|
||
the GHC registers).
|