Limitations of classic Python threads
One of the main problems with the classic implementation of Python threads is that their execution is not truly parallel. As a result, adding more threads to a CPU-bound task rarely reduces its execution time and, because of the overhead of switching between threads, can even increase it.
Thread execution in Python is controlled by the GIL (Global Interpreter Lock), which allows only one thread to execute Python bytecode at any given moment, regardless of the number of processors available on the machine.
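The effect of the GIL on CPU-bound work can be observed with a small timing experiment. The following is a minimal sketch, assuming a standard CPython build with the GIL enabled; the function names count_down, run_sequential and run_threaded are illustrative and do not come from any library. On such a build the threaded version usually takes about as long as the sequential one, or longer, although the exact figures depend on the machine and the Python version.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; the thread running it needs the GIL."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n, repeats):
    """Run the loop `repeats` times, one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats):
    """Run the loop in `repeats` threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(repeats)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary workload size chosen for the illustration
    print("sequential: %.3f s" % run_sequential(n, 4))
    print("threaded:   %.3f s" % run_threaded(n, 4))
```

Because only one thread can run bytecode at a time, the four threads take turns instead of running side by side, so the threaded timing is not expected to approach a quarter of the sequential one.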
The GIL makes it much easier to write C extensions for Python, but it severely limits the performance of multithreaded, CPU-bound code. For this reason, in Python it is sometimes preferable to use processes rather than threads, since each process has its own interpreter and its own GIL and therefore does not suffer from this limitation.
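In Python 2, the interpreter considers switching threads every 100 bytecode instructions by default, a value that can be changed with the sys.setcheckinterval function. Since Python 3.2 the switch is time-based instead: a running thread is asked to release the GIL after 5 milliseconds by default, a value that can be changed with sys.setswitchinterval (sys.setcheckinterval was deprecated in Python 3.2 and removed in Python 3.9). The interpreter also switches threads when the current thread is put to sleep with time.sleep or when it starts an input/output operation, which can take a long time to finish; without the switch, the CPU would sit idle for a long time, waiting for the I/O operation to complete.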
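Both behaviours can be checked from the interpreter. The sketch below assumes Python 3.2 or later, where the switching interval is expressed in seconds and exposed through sys.getswitchinterval and sys.setswitchinterval; the helper names sleeper and timed_sleepers are illustrative. Since time.sleep releases the GIL, several sleeping threads should finish in roughly the time of a single sleep rather than the sum of all of them.

```python
import sys
import threading
import time


def sleeper(seconds):
    """Blocking call: the GIL is released while the thread sleeps."""
    time.sleep(seconds)


def timed_sleepers(count, seconds):
    """Start `count` sleeping threads at once; return total elapsed seconds."""
    threads = [threading.Thread(target=sleeper, args=(seconds,))
               for _ in range(count)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    previous = sys.getswitchinterval()
    print("current switch interval (s):", previous)

    # A longer interval means fewer forced switches between busy threads.
    sys.setswitchinterval(0.01)
    print("new switch interval (s):", sys.getswitchinterval())
    sys.setswitchinterval(previous)  # restore the original value

    # Five threads sleeping 0.2 s each overlap instead of adding up.
    print("elapsed: %.2f s" % timed_sleepers(5, 0.2))
```

Changing the interval is a process-wide setting, so the sketch restores the previous value once it has been displayed.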
To reduce the effect of the GIL on the performance of our application, we can call the interpreter with the -O flag, which generates optimized bytecode by removing assert statements and code that depends on __debug__. With fewer instructions to execute there may be slightly fewer context switches, although the gain is usually small. A more effective option, as we discussed, is to use processes instead of threads, for example through the ProcessPoolExecutor class of the concurrent.futures module.
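As an illustration of the process-based alternative, the following minimal sketch distributes a CPU-bound task across worker processes with ProcessPoolExecutor from the standard concurrent.futures module. The functions count_primes and count_primes_parallel and the chosen limits are hypothetical examples, not part of any library.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Count the primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, max_workers=None):
    """Evaluate count_primes for each limit in a separate worker process."""
    # Each worker is a full Python process with its own interpreter and GIL.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The guard is needed because worker processes may re-import this module.
    limits = [20_000, 30_000, 40_000, 50_000]
    for limit, primes in zip(limits, count_primes_parallel(limits)):
        print("primes below %d: %d" % (limit, primes))
```

Arguments and results travel between processes by pickling, so this approach pays off when each task does enough computation to outweigh that communication cost.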