#########################
Byzantine dataset servers
#########################

*******
Summary
*******

DAP servers use dataset servers to download and pre-filter requested datasets.
Dataset servers `may` (i.e. `will`) fail to transmit data correctly (memory or
disk corruption, a malicious user, etc.). This document describes a way to
protect PyCTOH from byzantine dataset servers.

*********
Rationale
*********

.. graphviz::

   digraph datastream {
       rankdir=LR;
       "Store" -> "DatasetServer" -> "DAP Server";
   }

Storage servers provide intrinsic error checking. Dataset servers can trust
them without compromising data integrity: if an undetected error occurs while
retrieving data, it will be detected by the DAP server at the next stage.

The critical part of the path from the data store to the DAP server is the
transmission from dataset server to DAP server: the dataset server may run on
an untrusted host and return fake, though syntactically valid, data.

A byzantine server detection system should validate data received from
untrusted hosts and be able to score those hosts with some "trust level".

******
Design
******

Dataset requests are deterministic: whichever server handles a request, the
response should be the same, bit for bit.

When a DAP server requests a dataset, it may [*]_ ask *another* dataset server
to perform the same request and to return only a checksum of the response [*]_.
The DAP server checksums the dataset received from the first server and
compares it with the value returned by the second one.

If the values don't match, both servers are tagged with a byzantine warning
flag, and the same request is re-issued to other hosts. Once a validated
response is available, the original hosts that returned a valid response can
be unflagged.

.. rubric:: Footnotes

.. [*] Whether to perform a byzantine check can be determined by some
   ``byzantine_check_rate`` config value.
.. [*] This saves bandwidth. A strong checksum is used so that a malicious
   server can't forge a fake response with a correct checksum. A
   cryptographic hash such as SHA-256 is a good choice; MD5 is no longer
   considered collision-resistant.
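
The cross-check described in the Design section could be sketched roughly as
follows. This is only an illustration under stated assumptions, not PyCTOH
code: ``DatasetServer``, ``fetch``, ``fetch_checksum`` and ``checked_fetch``
are hypothetical names, the servers are in-memory stand-ins for remote hosts,
``check_rate`` plays the role of ``byzantine_check_rate``, and SHA-256 is
assumed as the checksum.

.. code-block:: python

   import hashlib
   import random


   def checksum(payload: bytes) -> str:
       """Hex digest used to compare responses (SHA-256 is an assumption)."""
       return hashlib.sha256(payload).hexdigest()


   class DatasetServer:
       """Hypothetical in-memory stand-in for a remote dataset server."""

       def __init__(self, name, store, corrupt=False):
           self.name = name
           self._store = store
           self._corrupt = corrupt

       def fetch(self, request):
           data = self._store[request]
           # A byzantine server returns well-formed but wrong bytes.
           return b"tampered:" + data if self._corrupt else data

       def fetch_checksum(self, request):
           # Only the digest travels back, which saves bandwidth.
           return checksum(self.fetch(request))


   def checked_fetch(request, servers, flagged, check_rate=1.0, rng=random):
       """Fetch ``request`` and cross-check it; ``flagged`` is updated in place."""
       order = rng.sample(servers, len(servers))
       primary, witnesses = order[0], order[1:]
       data = primary.fetch(request)
       if not witnesses or rng.random() >= check_rate:
           return data  # check skipped for this request

       # Server name -> digest of the response that server vouched for.
       votes = {primary.name: checksum(data)}
       witness = witnesses[0]
       votes[witness.name] = witness.fetch_checksum(request)
       if votes[witness.name] == votes[primary.name]:
           return data

       # Mismatch: we cannot tell yet who is wrong, so flag both servers.
       flagged.update(votes.keys())
       for server in witnesses[1:]:
           candidate = server.fetch(request)
           digest = checksum(candidate)
           agreeing = [name for name, seen in votes.items() if seen == digest]
           votes[server.name] = digest
           if agreeing:
               # Two independent servers agree: clear the hosts that were right.
               flagged.difference_update(agreeing)
               return candidate
           flagged.add(server.name)
       raise RuntimeError("no two dataset servers agree on %r" % (request,))

In this sketch a response counts as validated as soon as two distinct servers
report the same digest, so two colluding hosts would defeat it; a real
deployment might require more agreeing hosts or weight them by trust level.
The ``flagged`` set is passed in by the caller so that flags can persist
across requests.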

**************
Implementation
**************