Comments (10)

vonericsen commented on July 1, 2024

Hi @danderson,
Thanks for testing those additional commands. It seems the 85h opcode itself is being filtered, not the contents of the command it encapsulates.

Based on that document you linked, it sounds like it should be possible to issue whatever commands the guest wants to the device, but that is apparently not the case. The design probably focused only on basic capabilities (drive information, reading, and writing) to optimize compatibility, without providing full functionality.

is there any risk in allowing all 85h commands to go to a LUN? Or would qemu need to do additional filtering to only allow "safe" commands to leave the VM?

I had to think about this over the weekend. I think the answer depends on what is considered a risk.
It is possible that certain commands are filtered to keep the way the host and guest understand and access the drive compatible with each other. The concern could be that the guest might issue a command like sanitize or format, which could erase the whole device, or change a configuration setting on the drive that causes a compatibility issue. This is all speculation at this point, but those are some things that could be considered a risk. A fast format or sector size change would fall into the latter category: changing something about the drive that causes incompatibility between host and guest (if that is a real problem or concern).

Because this is a LUN passthrough instead of a device passthrough, the concern would be larger in the SCSI world than in SATA. In SATA, a physical device only has one logical unit (LUN), so this keeps it simple.
In SCSI/SAS, a device can have multiple logical units and can be accessed through multiple ports. Many SAS drives have 2 ports, although most have 1 LUN.
Each LUN on each port gets a device handle in the OS (at least in every OS I've played with so far), so a dual-port, single-LUN drive shows up with 2 handles. It stands to reason that additional ports or LUNs would get their own handles as well. SATA drives are single port and single LUN because the SATA specifications don't allow more than this, so they are easy: only a single instance will show up in the system. A multi-port SAS drive will only show each port if they are all connected; standard off-the-shelf cabling will only pick up a single port. I've only seen dual ports exposed with special backplanes that connect to each port, and not everyone uses this feature.

This is not the case with some implementations of multi-actuator products. Depending on the configuration, a single physical drive may show multiple logical units, one for each actuator. The risk in LUN passthrough is that on a device like this, reads and writes will only affect one LUN as expected, but other commands that can change caching, error recovery, or do something like erase the drive or change the sector size may affect ALL logical units.
So if the first LUN is passed to guest 1 and the second LUN is passed to guest 2, and guest 2 decides to reformat the drive, it is possible that all LUNs will get changed and erased, depending on what the device's firmware supports. That could destroy guest 1's data, which would not be good in this use case.
There are new fields to help describe when this will happen in the T10 specifications, but as far as I know, current implementations affect the whole physical device. So the risk can be much greater when this kind of change is made in configurations like this.
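To make the risk categories above concrete, here is a minimal sketch of how a passthrough layer could classify a SAT passthrough CDB by the ATA command it encapsulates. This is purely illustrative and is not how qemu behaves: the `classify` function, the `Verdict` type, and the choice of a denylist are all hypothetical, and the ATA opcodes are taken from my reading of the ACS specifications.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"


# ATA commands that can erase or reconfigure the whole physical device.
# Opcodes per my reading of the ACS specifications (assumption, not from qemu).
DEVICE_WIDE_COMMANDS = {
    0xB2: "SET SECTOR CONFIGURATION EXT",
    0xB4: "SANITIZE DEVICE",
    0xF4: "SECURITY ERASE UNIT",
}


def encapsulated_ata_command(cdb: bytes) -> int:
    """Return the ATA command byte carried inside a SAT passthrough CDB."""
    if len(cdb) == 12 and cdb[0] == 0xA1:
        return cdb[9]   # command field of ATA PASS-THROUGH (12)
    if len(cdb) == 16 and cdb[0] == 0x85:
        return cdb[14]  # command field of ATA PASS-THROUGH (16)
    raise ValueError("not an ATA PASS-THROUGH (12) or (16) CDB")


def classify(cdb: bytes) -> Verdict:
    """Hypothetical policy: deny commands known to affect the whole device."""
    if encapsulated_ata_command(cdb) in DEVICE_WIDE_COMMANDS:
        return Verdict.DENY
    return Verdict.ALLOW


# IDENTIFY DEVICE (ECh) in a 16-byte CDB is harmless and would be allowed.
identify = bytes.fromhex("85 08 0E 00 00 00 01 00 00 00 00 00 00 A0 EC 00")
print(classify(identify))
```

A real filter would more likely be an allowlist, since a denylist silently lets through any destructive command it does not know about. It would also need to cover native SCSI commands such as FORMAT UNIT and SANITIZE, which this sketch ignores.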

Since this is all presented as a "SCSI" device to the host OS (this is the general design of Linux with libata, and not every OS attempts ATA passthrough to determine the "child" device, since that may not be useful for normal day-to-day operations), the host may not, or may choose not to, differentiate between SCSI and SATA like our tool does. So it may just filter all commands that it doesn't want potentially affecting multiple logical units.

from openseachest.

danderson commented on July 1, 2024

I never revisited this issue, sorry for leaving things hanging :( I wasn't able to find any obvious reason why qemu would be preventing these commands from proceeding, and I worked around qemu's weirdness by getting my server host to run the sector reconfiguration from the bare metal, outside of my VM. So, I think this can be closed since "diagnose and fix qemu LUN passthrough" is definitely way outside the scope of this project.

Thank you for all your debugging efforts and insights!

danderson commented on July 1, 2024

In the spirit of curiosity, I also tried the non-OSS SeaChest binary from https://github.com/Seagate/ToolBin, and it reports the same information, and also refuses to change the sector size.

vonericsen commented on July 1, 2024

Hi @danderson,

This is very strange. The code to do the sector size conversion is in the tool, so there shouldn't be anything missing in software.
The bit that indicates support of this feature exists in the identify device data log, and I'm wondering if there was some error reading it, which can happen in weird scenarios with certain controllers.

I noticed that the adapter information says that this drive is attached to some kind of Red Hat Virtio device, but the product ID doesn't match anything I can find online, so it's possible that this is causing some kind of incompatibility. I have some suspicions since the SMART status is also reporting as "unknown or not supported", which is an indicator that the return task file registers don't come back properly or at all, but I would not expect this to affect reading this log page.

The verbose output from the tool should help identify if that is the case or not. It will be very verbose outputting command results and data, which I will need to review to see if there is possibly some other issue going on.
Can you attach the verbose output of the -i output?
./openSeaChest_Configure -d /dev/sda -i -v 4 | tee verboseIdentify.txt

danderson commented on July 1, 2024

Thanks for the detailed response!

The drives are attached to a https://zfs.rent VM using qemu's LUN passthrough feature, which theoretically is supposed to let the VM send commands directly to the drives. I don't have full details on the underlying hardware, but what I do know is that the chain of custody is roughly openSeaChest -> virtio_scsi -> JMicron SATA card -> drive. If that looks to be the problem, I can get more details and diagnostics from the provider.

Here is the verbose identity output: verboseIdentify.txt

vonericsen commented on July 1, 2024

Thanks for the information @danderson!

When I reviewed the log, it seems that this passthrough feature is supporting the A1h SAT passthrough command, which is why some drive information is retrieved in the -i output that looks like it matches the drive you are expecting to see.

Whenever the 85h SAT passthrough command is issued, an error is returned. The difference between them is that the A1h opcode only allows 28-bit commands (so you can get basic identify data and some SMART data), whereas the 85h opcode also allows 48-bit commands. Those are required to read the GPL (General Purpose Logging) logs, where newer drive information, such as support for changing the sector size, is reported. The command to change the sector size is a 48-bit command as well, so this opcode is required to make this feature work.
It seems that these 85h commands are blocked or aborted no matter which command is issued: the non-data command to read this drive's accessible max (or native max) address is blocked too, so the filtering isn't limited to reading these other logs as far as I can tell.
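For illustration, here is a rough sketch of why a GPL log read needs the 16-byte CDB: READ LOG EXT is a 48-bit command, so it relies on the extend bit and the wider field layout that only ATA PASS-THROUGH (16) carries. The function name `read_log_ext_cdb16` is made up for this example, and the field layout, the log address (30h, IDENTIFY DEVICE data log) and the page number (03h, supported capabilities) are taken from my reading of the SAT and ACS specifications rather than from the verbose output in this thread.

```python
ATA_READ_LOG_EXT = 0x2F      # 48-bit GPL read command
PROTOCOL_PIO_DATA_IN = 4     # SAT protocol field value
EXTEND = 1                   # marks a 48-bit command; only exists in the 85h CDB


def read_log_ext_cdb16(log_address: int, page: int = 0, page_count: int = 1) -> bytes:
    """Build an ATA PASS-THROUGH (16) CDB carrying READ LOG EXT (sketch)."""
    if not 0 <= log_address <= 0xFF:
        raise ValueError("log address must fit in one byte")
    if not 0 <= page <= 0xFFFF:
        raise ValueError("page number must fit in two bytes")
    if not 1 <= page_count <= 0xFFFF:
        raise ValueError("page count must be 1..65535")
    return bytes([
        0x85,                                  # ATA PASS-THROUGH (16)
        (PROTOCOL_PIO_DATA_IN << 1) | EXTEND,  # protocol + extend bit
        0x0E,                                  # from device, count in 512-byte blocks
        0x00, 0x00,                            # features (15:8), (7:0)
        page_count >> 8, page_count & 0xFF,    # count (15:8), (7:0)
        0x00, log_address,                     # LBA (31:24), LBA (7:0) = log address
        page >> 8, page & 0xFF,                # LBA (39:32), LBA (15:8) = page number
        0x00, 0x00,                            # LBA (47:40), LBA (23:16)
        0xA0,                                  # device
        ATA_READ_LOG_EXT,                      # command
        0x00,                                  # control
    ])


cdb = read_log_ext_cdb16(0x30, page=0x03)
print(" ".join(f"{b:02X}" for b in cdb))
```

The 12-byte A1h CDB has no extend bit and only single-byte feature, count, and LBA fields, which is why a request like this cannot be expressed with it.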

It is possible that the virtio_scsi implementation is not really set up to support SAT passthrough beyond the basics inside the A1h command, but does support translating other SCSI requests instead, as it seemed that many of those were translated without a problem (I only saw one or two that were aborted with the same sense error code). The problem with this is that there is currently no implemented translation in libata for switching the sector size using SCSI commands, and the SAT5 specification that defines these translations is still very new; I don't think it has been finalized either. SAT5 currently lists this translation as "may be supported", which essentially means "optional", so it may not even be implemented by libata or other translators (like USB bridges or SAS HBAs).

I do not think there is a way for openSeaChest to get these commands through, but if you find any other information, I would be happy to dive deeper and try some additional changes.
Right now the only thing I could do is add a "rule" or "hack" that says this HBA (the virtio_scsi that is reported) only allows 28-bit commands, but that doesn't do much other than help the tool understand that it is running in a limited mode. If I add this, I can look into a way to report this and other known limitations (if any) in the -i output, as we don't currently have a method to report known limitations to the user.

danderson commented on July 1, 2024

Thanks for the diagnostic! It does indeed look like qemu is preventing 85h commands from reaching the drive. My server host set up a testbench, and captured the following dumps from within a VM, and on the bare host:
vm_v4.txt
host_v4.txt

Diffing the two, you'll see that on the host, 48-bit commands work fine, and the drive reports additional information and capabilities. And indeed, running on the bare host I was able to switch to a 4k sector size with no issues.

I'm going to try and dig into qemu and see if there's an obvious place where LUN passthrough could be enhanced, but it sounds like the best openSeaChest could do would be to detect this failure case and print a warning about it.

Thanks!

vonericsen commented on July 1, 2024

@danderson,
thanks for testing some more and letting me know!
Glad you were able to do it from the host!

I will look into what I can add to inform about these kinds of limitations.

Would you mind testing one more thing for me to make sure we understand the limitations of the virtio SCSI HBA properly?
To determine whether the filter is on the 85h opcode itself or on what is encapsulated in it, I want to see if issuing an ATA identify using the 85h command completes the same way or not.
For this, you'll need to use sg_raw from the sg3_utils package. Unless your user is in the disk group, you'll need to run these with sudo or as root, as openSeaChest also requires.

First, make sure the A1h goes through:
sg_raw -r 512 /dev/YourHandleHere A1 08 0E 00 01 00 00 00 A0 EC 00 00 2>&1 | tee sgRawA1.txt

Now try again with the 85h:
sg_raw -r 512 /dev/YourHandleHere 85 08 0E 00 00 00 01 00 00 00 00 00 00 A0 EC 00 2>&1 | tee sgRaw85.txt

Those both send the ATA identify command, one just uses a larger CDB, so if they both work, then that means the LUN passthrough is filtering all but certain encapsulated commands like identify and SMART. If the 85h fails here too, then it's filtering on the opcode, which is useful to know when setting up some of the known limitations in the tool.
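As a small sketch of what those two sg_raw invocations contain, the snippet below rebuilds both CDBs field by field and encodes the decision logic described above. The function names are hypothetical, and the field labels follow my reading of the SAT specification; only the raw bytes come from the commands above.

```python
ATA_IDENTIFY_DEVICE = 0xEC
PROTOCOL_PIO_DATA_IN = 4   # SAT protocol field value
FLAGS = 0x0E               # from device, transfer length in count field, 512-byte blocks


def identify_cdb12() -> bytes:
    """ATA PASS-THROUGH (12), opcode A1h, carrying IDENTIFY DEVICE."""
    return bytes([
        0xA1, PROTOCOL_PIO_DATA_IN << 1, FLAGS,
        0x00,              # features
        0x01,              # count: one 512-byte block
        0x00, 0x00, 0x00,  # LBA low, mid, high
        0xA0,              # device
        ATA_IDENTIFY_DEVICE,
        0x00,              # reserved
        0x00,              # control
    ])


def identify_cdb16() -> bytes:
    """ATA PASS-THROUGH (16), opcode 85h, carrying the same command."""
    return bytes([
        0x85, PROTOCOL_PIO_DATA_IN << 1, FLAGS,  # extend bit left at 0
        0x00, 0x00,                              # features (15:8), (7:0)
        0x00, 0x01,                              # count (15:8), (7:0)
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00,      # LBA bytes
        0xA0,                                    # device
        ATA_IDENTIFY_DEVICE,
        0x00,                                    # control
    ])


def as_sg_raw_args(cdb: bytes) -> str:
    """Format a CDB the way it is typed on the sg_raw command line."""
    return " ".join(f"{b:02X}" for b in cdb)


def interpret(a1_ok: bool, x85_ok: bool) -> str:
    """Map the two test outcomes to the conclusion described above."""
    if a1_ok and x85_ok:
        return "filtering by encapsulated ATA command"
    if a1_ok and not x85_ok:
        return "filtering by SCSI opcode (85h blocked)"
    return "inconclusive"


print(as_sg_raw_args(identify_cdb12()))
print(as_sg_raw_args(identify_cdb16()))
```

Because both CDBs carry the identical ATA command, any difference in outcome can only come from how the passthrough layer treats the outer SCSI opcode.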

Also, I don't have a lot of experience setting up qemu, let alone this LUN passthrough functionality. If you know of a guide or instructions to configure this, I would be happy to review them so I can do some additional testing without bugging you any further 😃 If you don't know of one, I will poke around until I figure out how to do it.
Thanks!

danderson commented on July 1, 2024

I ran both identify commands you provided. The 85h version fails, the A1h version works. So, it's looking like something in the I/O chain is only passing specific known opcodes, and just doesn't know about 85h. Outputs: sgRawA1.txt sgRaw85.txt

Weirdly, qemu's LUN passthrough feature is very poorly documented. I can only find references to it in Red Hat presentations about the development of virtio_scsi, and some forum posts from users trying to figure out how to enable it. So, I don't have a good recipe for you to set up that environment. The closest I could find to documentation is https://wiki.qemu.org/images/c/c2/Virtio-scsi.pdf , which explains how to enable LUN passthrough in libvirtd configurations.

I'm happy to run commands for you, or even give you access to this VM once it's back in the datacenter next week. I'll continue exploring qemu to see if I can find the code that handles LUN passthrough. I'm hoping there's a really simple if (opcode != 0xA1) that I can fix :).

I know almost nothing about ATA, so maybe you could help me out as well: is there any risk in allowing all 85h commands to go to a LUN? Or would qemu need to do additional filtering to only allow "safe" commands to leave the VM? My default assumption is that as long as you're only talking to the LUN that you're allowed to, the contents of the requests shouldn't matter. Does that sound right?

vonericsen commented on July 1, 2024

Per @danderson I will close this issue.
I did push a small change so that if this "controller" is detected, the --llInfo output will now report that it is limited to 28-bit ATA passthrough commands. This is far from perfect, but it can help debug this further if we run into it again in the future.
We can add additional "hacks" or "workarounds" for this configuration down the road, but I'm not sure what else will really be needed at this point.
