For a matrix of size 2Kx2K they achieved 11.05 Gflop/s, which is around 75% of the double precision peak. They have also implemented a single precision version of the code, which achieved 155 Gflop/s ...
Processor makers are pushing down the precision for a range of new and forthcoming devices, driven by a need that balances accuracy with energy-efficient performance for an emerging set of workloads.