Tips to optimize your psycopg2 queries

Some years ago, the authors of a new PostgreSQL driver, asyncpg, claimed that it outperformed psycopg even in common, non-async/await patterns. Well, fine; the rational reaction is: let's wait for the psycopg2 developers to incorporate the same improvements.

Three years later, and with a new major version in the wild, it is time to compare. If you find, as I did, that your calls to psycopg2 are slower than the calls to asyncpg, consider the following improvements, sketched in code after the list:

  • Avoid dict cursors. Use the default cursor or NamedTupleCursor (note that this cursor received fixes in versions 2.8 and 2.8.3).
  • Consider avoiding the psycopg2 JSON parser: use SELECT json::text and parse with the native Python parser, or do not parse at all if you do not need to.
  • Consider server-side cursors. The lack of reusability, even if it were an issue, is more than compensated by huge speed improvements. Plus, any named cursor implicitly activates a “server-side” fetchmany via the itersize parameter.
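A minimal sketch of the three tips (the DSN, table, and column names here are placeholders):

import json
import psycopg2
from psycopg2.extras import NamedTupleCursor

conn = psycopg2.connect("dbname=test")  # placeholder DSN

# Tip 1: the default cursor or NamedTupleCursor instead of a dict cursor.
with conn.cursor(cursor_factory=NamedTupleCursor) as cur:
    cur.execute("SELECT id, payload FROM bar")
    rows = cur.fetchall()

# Tip 2: bypass the psycopg2 JSON parser; cast to text and parse natively.
with conn.cursor() as cur:
    cur.execute("SELECT payload::text FROM bar")
    docs = [json.loads(r[0]) for r in cur.fetchall()]

# Tip 3: a named (server-side) cursor; iterating fetches itersize
# rows per round trip instead of the whole result set at once.
with conn.cursor(name="big_fetch") as cur:
    cur.itersize = 2000
    cur.execute("SELECT id, payload FROM bar")
    for row in cur:
        pass  # process each row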

By itself, client-side fetchmany vs fetchall is only a marginal gain. It could still be valuable in async frameworks if you adapt the request size to balance the IO wait, but that requires a lot of fine-tuning.
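For reference, the client-side batching loop looks like this (the batch size is an arbitrary starting point):

cur.execute("SELECT id, payload FROM bar")
while True:
    batch = cur.fetchmany(2000)  # tune the size to balance the IO wait
    if not batch:
        break
    for row in batch:
        pass  # process each row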

Now, if you have a huge query to be distributed to other threads elsewhere, you could consider not parsing at all and just dumping the query with copy_to. For instance, a dump to /dev/null is usually faster (say 20% for 30,000 rows) than any other fetch that tries to parse the result, so consider it a baseline:

import os

# Baseline: stream the raw result to /dev/null, with no Python-side parsing.
with open(os.devnull, "w") as f:
    cursor.copy_to(f, "(SELECT foo FROM bar)")

If the query is not so huge, say less than five thousand lines, this approach does not beat a server-side cursor. You can optimize by creating a temporary table before fetching, but remember to commit the transaction to release the locks or you will crash your session. Of course, beyond this we enter the wide, server-side world of query optimisation for PostgreSQL, and this note is only about Python interfaces.
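A sketch of the temporary-table variant, under the same placeholder names:

# Materialize the result once, then dump it without re-running the query.
cursor.execute("CREATE TEMPORARY TABLE bar_snapshot AS SELECT foo FROM bar")
conn.commit()  # release the locks taken while building the table
with open(os.devnull, "w") as f:
    cursor.copy_to(f, "bar_snapshot")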

An issue with copy_to is that it does not support set_wait_callback, so the aiopg overlay does not support it either. So yes, in this case you have a functional reason to use asyncpg, which supports the same operation with similar performance under a slightly misleading name: copy_from_query.
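A minimal asyncpg sketch (the DSN is a placeholder):

import asyncio
import asyncpg

async def baseline_dump():
    conn = await asyncpg.connect("postgresql://user@localhost/test")
    try:
        # Despite the name, this copies *from* the query *to* the output file.
        await conn.copy_from_query("SELECT foo FROM bar", output="/dev/null")
    finally:
        await conn.close()

asyncio.run(baseline_dump())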

Anyway, query plus JSON processing will exhaust one core at a speed between 50,000 and 100,000 lines per second, one tenth of the baseline performance. It is not easy to go multiprocessing without incurring communication overhead; piping into a farm of three extra cores you can parse up to 200,000 lines per second, but a lot of fine-tuning is implied here. One could hope for some combination of async and concurrent.futures farms to tune the best allocation of tasks automagically, but I think we are still far from it. Perhaps with Python 4…
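A sketch of such a farm with concurrent.futures (parse_chunk is a hypothetical helper; chunk size and worker count are arbitrary starting points):

import json
from concurrent.futures import ProcessPoolExecutor

def parse_chunk(lines):
    # Each worker process parses its own batch of json::text rows.
    return [json.loads(line) for line in lines]

def parse_in_parallel(raw_lines, chunk_size=10000, workers=3):
    chunks = [raw_lines[i:i + chunk_size]
              for i in range(0, len(raw_lines), chunk_size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map pickles every chunk out to a worker: this is the
        # communication overhead mentioned above.
        return [doc for batch in pool.map(parse_chunk, chunks)
                for doc in batch]

On platforms that spawn workers (Windows, macOS), remember to guard the top-level call with if __name__ == "__main__":.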
