As the size and ubiquity of artificial intelligence and computational machine learning (ML) models grow, the energy required to train and use them is rapidly becoming economically and environmentally unsustainable. Recent laboratory prototypes of self-learning electronic circuits, examples of ``physical learning machines,'' open the door to analog hardware that directly employs physics to learn desired functions from examples at low energy cost. In this work, we show that this hardware platform allows for further reductions in energy consumption through good initial conditions and a new learning algorithm. Using analytical calculations, simulation, and experiment, we show that a trade-off emerges when the learning dynamics attempt to minimize both the error and the power consumption of the solution: greater power reductions can be achieved at the cost of decreased solution accuracy. Finally, we demonstrate a practical procedure for weighing the relative importance of error and power minimization, improving power efficiency given a specific tolerance to error.