Approximate computing is an emerging paradigm for error-tolerant applications. By introducing a reasonable amount of inaccuracy, both the area and delay of a circuit can be reduced significantly. To produce approximate circuits automatically, many approximate logic synthesis (ALS) algorithms have been proposed. However, they mainly focus on area reduction and are not optimal for reducing circuit delay. In this paper, we propose HEDALS, a highly efficient delay-driven ALS framework that supports various types of local approximate changes (LACs), circuit representations, and average error metrics. To reduce delay, HEDALS builds a critical error graph (CEG), which consists of the nodes on the critical paths together with their error information, and finds an optimized set of LACs in the CEG using either a maximum flow-based method or a priority cut-based method. The resulting set of LACs is applied to shorten all critical paths simultaneously, thereby reducing the circuit delay. Moreover, applying multiple LACs simultaneously makes HEDALS extremely fast. Compared to a state-of-the-art method, HEDALS reduces the circuit delay by 32.3% on average while being 167× faster. The code of HEDALS is open-source.
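
To illustrate the idea behind a maximum flow-based selection, the following is a minimal sketch and not the HEDALS implementation. It assumes a simplified model in which each node on a critical path has one candidate LAC with a non-negative error cost, and it looks for a minimum-error set of nodes that intersects every critical path, i.e., a minimum-weight node cut computed by maximum flow with node splitting. The function name `min_error_lac_cut`, the graph encoding, and the error costs are hypothetical and chosen only for this example.

```python
from collections import defaultdict, deque


def min_error_lac_cut(edges, error_cost, sources, sinks):
    """Return (total_error, nodes) for a minimum-weight node cut.

    Assumption (not from the paper): every node appears in `error_cost`
    with a non-negative cost, and cutting a node means applying its LAC.
    """
    big = sum(error_cost.values()) + 1  # stands in for "infinite" capacity
    cap = defaultdict(lambda: defaultdict(int))  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] += c
        cap[v][u] += 0  # make sure the reverse residual edge exists

    # Node splitting: v becomes (v, "in") -> (v, "out") with capacity cost[v].
    for v, cost in error_cost.items():
        add_edge((v, "in"), (v, "out"), cost)
    for u, v in edges:
        add_edge((u, "out"), (v, "in"), big)
    S, T = "S", "T"
    for s in sources:
        add_edge(S, (s, "in"), big)
    for t in sinks:
        add_edge((t, "out"), T, big)

    def bfs():
        """Breadth-first search over edges with positive residual capacity."""
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    total = 0
    while True:
        parent = bfs()
        if T not in parent:
            break  # no augmenting path left: the flow is maximum
        # Find the bottleneck along the augmenting path, then push flow.
        bottleneck, v = big, T
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = T
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck

    # A node is in the cut if its "in" half is still reachable from S
    # in the residual graph but its "out" half is not.
    chosen = sorted(
        v for v in error_cost
        if (v, "in") in parent and (v, "out") not in parent
    )
    return total, chosen


# Toy critical graph: a and b feed c, which feeds the outputs d and e.
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("c", "e")]
error_cost = {"a": 3, "b": 4, "c": 5, "d": 2, "e": 2}
result = min_error_lac_cut(edges, error_cost, ["a", "b"], ["d", "e"])
# result == (4, ["d", "e"])
```

In this toy graph, cutting `c` alone costs 5 and cutting `a` and `b` costs 7, so the cheapest way to touch every path is to select `d` and `e` at a total cost of 4. Whether the real CEG uses node or edge capacities, and how error estimates are combined, is not stated in the abstract; those choices here are assumptions.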