I have observed some very concerning USB timeout issues while testing an application that uses PC/SC on Rocky Linux 9. The following is typical debug log output from pcscd:
Jan 25 03:31:50 pcscd[1529330]: 00000962 winscard_svc.c:361:ContextThread() Received command: TRANSMIT from client 15
Jan 25 03:31:50 pcscd[1529330]: 00000022 readerfactory.c: 852:RFReaderInfoById() RefReader() count was: 1
Jan 25 03:31:50 pcscd[1529330]: 00000004 winscard.c: 1595:SCardTransmit() Send Protocol: T=1
Jan 25 03:31:50 pcscd[1529330]: 00000004 ifdhandler.c: 1398:IFDHTransmitToICC() usb:0c4b/9102:libudev:0:/dev/bus/usb/002/072 (lun: 0)
Jan 25 03:44:10 pcscd[1529330]: 99999999 ccid_usb.c:1006:ReadUSB() read failed (2/72): LIBUSB_ERROR_TIMEOUT
Jan 25 03:44:10 pcscd[1529330]: 00000611 ifdwrapper.c:543:IFDTransmit() Card not transacted: 612
Jan 25 03:44:10 pcscd[1529330]: 00000079 winscard.c:1620:SCardTransmit() Card not transacted: 0x80100016
Jan 25 03:44:10 pcscd[1529330]: 00000045 winscard.c: 1648:SCardTransmit() UnrefReader() count was: 2
Jan 25 03:44:10 pcscd[1529330]: 00000067 winscard_svc.c:691:ContextThread() TRANSMIT rv=0x80100016 for client 15
The time between the last USB send log entry and the LIBUSB_ERROR_TIMEOUT entry shows that a little over 13 minutes passed before the USB read call ran into a timeout. During this time, pcscd is entirely unresponsive with regard to this particular card. The original call is frozen (probably waiting for an answer), and new processes that try to interact with the card also end up blocked. Unplugging the USB reader (which ends the blocking wait with LIBUSB_ERROR_NO_DEVICE) or restarting pcscd appear to be the only ways to shorten the period of unresponsiveness. Removing the card from the reader does not seem to cancel the wait.
This happens randomly during communication with an NFC card via libfido2. I have tried two different types of cards, a YubiKey with NFC and an RFID/FIDO2 hybrid card, and sooner or later it happens with both. With the YubiKey the issue appears much earlier (within approx. 30 minutes of constant communication every few seconds) than with the hybrid card, which sometimes works for hours but eventually runs into the same problem. I used a Reiner SCT cyberJack RFID basis card reader, but as far as my diagnosis goes, the issue is probably independent of the card reader.
I have traced this behavior back to CCID and its calculation of the maximum time to wait for USB data, found in the "T1_card_timeout" function. For both of my NFC devices this function returns a whopping 13.3-minute timeout, matching the time observed in the log. As far as I understand the code, the function implements a maximally pessimistic calculation: the longest time a single card transaction can take before it can no longer produce a valid response. I got myself a copy of the ISO 7816-3 standard that the code refers to and double-checked all the calculations, expecting to find some kind of error, but didn't find one. The code appears to implement the formulas exactly as written in the spec, and the default values (which matter, because most of the values in my case are just the defaults) also conform to the spec.
However, at least with the reader and cards that I have, the calculated maximum timeout is just insane. It consists of:
260 * EGT (Extra Guard Time): EGT is 4.46 ms at the default TC1 of 0 and a clock frequency of 1 MHz, which contributes 1.16 seconds to the timeout
1 * BWT (Block Waiting Time): BWT is 5.7 seconds at the default BWI of 4
260 * CWT (Character Waiting Time): CWT is 3051 ms (!) at the default CWI of 13, which contributes 793 seconds (!!) to the timeout
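The three components above can be re-derived with a few lines of arithmetic. The following is a minimal sketch, not the CCID driver's code: the function and parameter names are hypothetical, and it assumes the default transmission factors F = 372 and D = 1 together with the 1 MHz clock, TC1 = 0, BWI = 4 and CWI = 13 mentioned in this report. The formulas are the ISO 7816-3 definitions of etu, EGT, BWT and CWT as I understand them.

```python
def t1_timeout_breakdown(F=372, D=1, f_khz=1000, TC1=0, BWI=4, CWI=13, n_chars=260):
    """Worst-case T=1 exchange time in milliseconds (hypothetical helper).

    f_khz is the card clock in kHz, so every result is in milliseconds.
    """
    etu = F / D / f_khz                               # elementary time unit
    egt = 12 * etu + (F / D) * TC1 / f_khz            # extra guard time per character
    bwt = 11 * etu + (2 ** BWI) * 960 * 372 / f_khz   # block waiting time
    cwt = (11 + 2 ** CWI) * etu                       # character waiting time
    total = n_chars * egt + bwt + n_chars * cwt
    return {
        "etu_ms": etu,
        "egt_ms": egt,
        "bwt_ms": bwt,
        "cwt_ms": cwt,
        "total_ms": total,
    }


parts = t1_timeout_breakdown()
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
print(f"total: {parts['total_ms'] / 60000:.1f} min")
```

With these assumed inputs the CWT term alone accounts for roughly 99% of the total, which is why lowering CWI (if the card advertised a smaller value) would shrink the bound far more than any change to BWI or TC1.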
The last part in particular seems extreme to me, and it is clearly what causes the unresponsiveness, probably in cases where no answer is received from the card at all. Since the calculations themselves do not seem to contain an error, my concern is with the approach: the timeout allows for the most pessimistic case possible, a card that waits the maximum permitted time between individual characters and keeps doing so for the maximum number of characters in a transmission. Waiting that long might not be the best idea, considering that the cards (at least the two I have) normally respond to any request in less than a second. The long CWT seems acceptable if you are the reader hardware and "see" every individual character, because you can then reset the timer after each one and limit each character's time individually. But with the multiplication by 260 done in the CCID driver, assuming the worst case for every character of the response at once becomes impractical IMHO.
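To illustrate the difference between a per-character timer and a single aggregate deadline, here is a small simulation. It is only a sketch under stated assumptions: the function name and the gap-list representation are hypothetical, the BWT and CWT values are the rounded figures quoted in this report, and it does not model how any real reader firmware or the CCID driver behaves.

```python
BWT_MS = 5700.0   # block waiting time quoted in this report (rounded)
CWT_MS = 3051.0   # character waiting time quoted in this report (rounded)
N_CHARS = 260     # maximum block length assumed by the aggregate bound


def per_character_give_up_ms(gaps_ms, bwt_ms=BWT_MS, cwt_ms=CWT_MS):
    """Return when a per-character watchdog would give up, or None on success.

    gaps_ms[0] is the delay before the first response character (limited by
    BWT); every later entry is the gap to the next character (limited by CWT).
    A gap of None means that character never arrives.
    """
    elapsed = 0.0
    for index, gap in enumerate(gaps_ms):
        limit = bwt_ms if index == 0 else cwt_ms
        if gap is None or gap > limit:
            return elapsed + limit   # the timer expires while waiting here
        elapsed += gap               # character arrived, timer is reset
    return None


# Aggregate bound: one deadline covering the worst case for every character.
aggregate_bound_ms = BWT_MS + N_CHARS * CWT_MS

mute_card = per_character_give_up_ms([None])               # never answers
stalled_card = per_character_give_up_ms([100, 1, 1, None]) # stops after 3 chars

print(f"mute card detected after    {mute_card / 1000:.1f} s")
print(f"stalled card detected after {stalled_card / 1000:.1f} s")
print(f"aggregate deadline          {aggregate_bound_ms / 1000:.1f} s")
```

In this model a timer that is reset per character notices a silent card within one BWT or one CWT, while a host that can only set a single deadline for the whole exchange has to wait for the full worst-case sum.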
Have I misunderstood anything about this timeout calculation logic?
Is there anything that could be done to handle the described failure mode more gracefully, without changing the CCID code to simply shorten the timeout to a more manageable value?
Are there other users observing this particular problem?
Your understanding is correct.
I am very impressed you even fetched ISO 7816-3 to read it and check my code is correct. Good job.
Your problem is that the USB timeout should NOT happen. The reader should answer before this timeout.
Your reader is the "Reiner SCT cyberJack RFID basis". Do you have another reader from a different manufacturer?
I would suspect a problem with the reader firmware.
So even in case of a "lost" transmission to the card, to which the card did not react or reacted erroneously, a properly operating reader should still notify the CCID driver of the (erroneous) completion of that communication?
I don't currently have another reader myself, but a colleague of mine will try to replicate the problems I observed with his own hardware, which includes an entirely different reader. So hopefully I'll soon have an answer to whether this occurs with other readers as well.
So even in case of a "lost" transmission to the card, to which the card did not react or reacted erroneously, a properly operating reader should still notify the CCID driver of the (erroneous) completion of that communication?
Yes.
The reader should answer that the card is mute or has been removed.
But for a contactless card it can be complex for the reader to know what happened to the card.