My old (probably outdated) answer, which was posted a long time ago:
There are other ways to overcome this problem:
1. Use the TimeoutSauce internal class
From: https://github.com/kennethreitz/requests/issues/1928#issuecomment-35811896
import requests
from requests.adapters import TimeoutSauce

class MyTimeout(TimeoutSauce):
    def __init__(self, *args, **kwargs):
        connect = kwargs.get('connect', 5)
        read = kwargs.get('read', connect)
        super(MyTimeout, self).__init__(connect=connect, read=read)

requests.adapters.TimeoutSauce = MyTimeout
This code should cause us to set the read timeout as equal to the connect timeout, which is the timeout value you pass on your Session.get() call. (Note that I haven't actually tested this code, so it may need some quick debugging, I just wrote it straight into the GitHub window.)
2. Use a fork of requests from kevinburke: https://github.com/kevinburke/requests/tree/connect-timeout
From its documentation: https://github.com/kevinburke/requests/blob/connect-timeout/docs/user/advanced.rst
If you specify a single value for the timeout, like this:
r = requests.get('https://github.com', timeout=5)
The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
kevinburke has requested it to be merged into the main requests project, but it hasn't been accepted yet.
Since requests >= 2.4.0, you can use the timeout argument, i.e.:
requests.get('https://duckduckgo.com/', timeout=10)
You can also provide a tuple to specify the connect and the read timeouts separately:
requests.get('https://duckduckgo.com/', timeout=(5, 8.5))
A None timeout will wait forever (not recommended).
Note: timeout is not a time limit on the entire response download; rather, an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do not time out.
To create a timeout you can use signals.
The best way to solve this case is probably to use a try-except-finally block. Here is some example code:
import signal
from time import sleep

class TimeoutException(Exception):
    """ Simple Exception to be called on timeouts. """
    pass

def _timeout(signum, frame):
    """ Raise a TimeoutException.

    This is intended for use as a signal handler.
    The signum and frame arguments passed to this are ignored.
    """
    # Raise TimeoutException with system default timeout message
    raise TimeoutException()

# Set the handler for the SIGALRM signal:
signal.signal(signal.SIGALRM, _timeout)
# Send the SIGALRM signal in 10 seconds:
signal.alarm(10)

try:
    # Do our code:
    print('This will take 11 seconds...')
    sleep(11)
    print('done!')
except TimeoutException:
    print('It timed out!')
finally:
    # Abort the sending of the SIGALRM signal:
    signal.alarm(0)
There are some caveats to this: it relies on SIGALRM, which is only available on Unix-like systems (so it won't work on Windows), and signal handlers only run in the main thread, so it can't be used from worker threads.
But it's all in the standard Python library! Except for the sleep function import, it's only one import. If you are going to use timeouts in many places, you can easily put the TimeoutException, _timeout and the signaling in a function and just call that. Or you can make a decorator and put it on functions, see the answer linked below.
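For illustration, here is a minimal sketch of such a decorator, reusing the SIGALRM idea from the example above (the with_timeout name and the demo function are made up for this sketch, not part of the original answer):

import signal
from functools import wraps
from time import sleep

class TimeoutException(Exception):
    """ Same simple exception as in the example above. """
    pass

def with_timeout(seconds=10):
    """ Hypothetical decorator: run the wrapped function under a SIGALRM-based timeout. """
    def decorator(func):
        def _handler(signum, frame):
            raise TimeoutException()

        @wraps(func)
        def wrapper(*args, **kwargs):
            signal.signal(signal.SIGALRM, _handler)
            signal.alarm(seconds)
            try:
                return func(*args, **kwargs)
            finally:
                signal.alarm(0)  # always cancel the pending alarm
        return wrapper
    return decorator

@with_timeout(5)
def slow():
    sleep(10)  # raises TimeoutException after 5 seconds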
You can also set this up as a "context manager" so you can use it with the with statement:
import signal

class Timeout():
    """ Timeout for use with the `with` statement. """

    class TimeoutException(Exception):
        """ Simple Exception to be called on timeouts. """
        pass

    def _timeout(signum, frame):
        """ Raise a TimeoutException.

        This is intended for use as a signal handler.
        The signum and frame arguments passed to this are ignored.
        """
        raise Timeout.TimeoutException()

    def __init__(self, timeout=10):
        self.timeout = timeout
        signal.signal(signal.SIGALRM, Timeout._timeout)

    def __enter__(self):
        signal.alarm(self.timeout)

    def __exit__(self, exc_type, exc_value, traceback):
        signal.alarm(0)
        return exc_type is Timeout.TimeoutException

# Demonstration:
from time import sleep

print('This is going to take maximum 10 seconds...')
with Timeout(10):
    sleep(15)
    print('No timeout?')
print('Done')
One possible downside of this context manager approach is that you can't know if the code actually timed out or not.
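If you need that information, one possible workaround (a sketch, not from the original answer; the subclass and the timed_out attribute are made up) is to record in __exit__ whether the timeout exception was the one being suppressed:

from time import sleep

class TimeoutWithFlag(Timeout):
    """ Hypothetical variant of the Timeout context manager above that
    remembers whether it actually timed out. """
    def __enter__(self):
        self.timed_out = False
        return super().__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        self.timed_out = exc_type is Timeout.TimeoutException
        return super().__exit__(exc_type, exc_value, traceback)

t = TimeoutWithFlag(10)
with t:
    sleep(15)
if t.timed_out:
    print('It timed out!')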
Sources and recommended reading:
Despite all the answers, I believe that this thread still lacks a proper solution and no existing answer presents a reasonable way to do something which should be simple and obvious.
Let's start by saying that as of 2023, there is still absolutely no way to do it properly with requests alone. It is a conscious design decision by the library's developers.
Solutions utilizing the timeout parameter simply do not accomplish what they intend to do. The fact that it "seems" to work at first glance is purely incidental:
The timeout parameter has absolutely nothing to do with the total execution time of the request. It merely controls the maximum amount of time that can pass before the underlying socket receives any data. With an example timeout of 5 seconds, the server can just as well send 1 byte of data every 4 seconds and that will be perfectly okay, but it won't help you very much.
Answers with stream and iter_content are somewhat better, but they still do not cover everything in a request. You do not actually receive anything from iter_content until after the response headers are sent, which falls under the same issue: even if you use 1 byte as a chunk size for iter_content, reading the full response headers could take a totally arbitrary amount of time and you can never actually get to the point at which you read any response body from iter_content.
Here are some examples that completely break both the timeout- and stream-based approaches. Try them all. They all hang indefinitely, no matter which method you use.
server.py
import socket
import time

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, True)
server.bind(('127.0.0.1', 8080))
server.listen()

while True:
    try:
        sock, addr = server.accept()
        print('Connection from', addr)
        sock.send(b'HTTP/1.1 200 OK\r\n')
        # Send some garbage headers very slowly but steadily.
        # Never actually complete the response.
        while True:
            sock.send(b'a')
            time.sleep(1)
    except:
        pass
demo1.py
import requests

requests.get('http://localhost:8080')
demo2.py
import requests

requests.get('http://localhost:8080', timeout=5)
demo3.py
import requests

requests.get('http://localhost:8080', timeout=(5, 5))
demo4.py
import requests

with requests.get('http://localhost:8080', timeout=(5, 5), stream=True) as res:
    for chunk in res.iter_content(1):
        break
My approach utilizes Python's sys.settrace function. It is dead simple. You do not need to use any external libraries or turn your code upside down. Unlike most other answers, this actually guarantees that the code executes in the specified time. Be aware that you still need to specify the timeout parameter, as settrace only concerns Python code. Actual socket reads are external syscalls which are not covered by settrace, but are covered by the timeout parameter. Due to this fact, the exact time limit is not TOTAL_TIMEOUT, but a value which is explained in the comments below.
import requests
import sys
import time

# This function serves as a "hook" that executes for each Python statement
# down the road. There may be some performance penalty, but as downloading
# a webpage is mostly I/O bound, it's not going to be significant.
def trace_function(frame, event, arg):
    if time.time() - start > TOTAL_TIMEOUT:
        raise Exception('Timed out!')  # Use whatever exception you consider appropriate.
    return trace_function

# The following code will terminate at most after TOTAL_TIMEOUT + the highest
# value specified in the `timeout` parameter of `requests.get`.
# In this case 10 + 6 = 16 seconds.
# For most cases though, it's gonna terminate no later than TOTAL_TIMEOUT.
TOTAL_TIMEOUT = 10

start = time.time()
sys.settrace(trace_function)
try:
    res = requests.get('http://localhost:8080', timeout=(3, 6))  # Use whatever timeout values you consider appropriate.
except:
    raise
finally:
    sys.settrace(None)  # Remove the time constraint and continue normally.

# Do something with the response
That's it!
Try this request with timeout & error handling:
import requests

try:
    url = "http://google.com"
    r = requests.get(url, timeout=10)
except requests.exceptions.Timeout as e:
    print(e)
The connect timeout is the number of seconds Requests will wait for your client to establish a connection to a remote machine (corresponding to the connect() call on the socket). It's a good practice to set connect timeouts to slightly larger than a multiple of 3, which is the default TCP packet retransmission window.
Once your client has connected to the server and sent the HTTP request, the read timeout starts. It is the number of seconds the client will wait for the server to send a response. (Specifically, it's the number of seconds that the client will wait between bytes sent from the server. In 99.9% of cases, this is the time before the server sends the first byte.)
If you specify a single value for the timeout, it will be applied to both the connect and the read timeouts, like below:
r = requests.get('https://github.com', timeout=5)
Specify a tuple if you would like to set the values separately for connect and read:
r = requests.get('https://github.com', timeout=(3.05, 27))
If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.
r = requests.get('https://github.com', timeout=None)
https://docs.python-requests.org/en/latest/user/advanced/#timeouts
Set stream=True and use r.iter_content(1024). Yes, eventlet.Timeout just somehow doesn't work for me.
from time import time
from requests import get, exceptions

# `config` is assumed to be defined elsewhere in the program.
try:
    start = time()
    timeout = 5
    with get(config['source']['online'], stream=True, timeout=timeout) as r:
        r.raise_for_status()
        content = bytes()
        content_gen = r.iter_content(1024)
        while True:
            if time() - start > timeout:
                raise TimeoutError('Time out! ({} seconds)'.format(timeout))
            try:
                content += next(content_gen)
            except StopIteration:
                break
        data = content.decode().split('\n')
        if len(data) in [0, 1]:
            raise ValueError('Bad requests data')
except (exceptions.RequestException, ValueError, IndexError, KeyboardInterrupt,
        TimeoutError) as e:
    print(e)
    with open(config['source']['local']) as f:
        data = [line.strip() for line in f.readlines()]
The discussion is here https://redd.it/80kp1h
This may be overkill, but the Celery distributed task queue has good support for timeouts.
In particular, you can define a soft time limit that just raises an exception in your process (so you can clean up) and/or a hard time limit that terminates the task when the time limit has been exceeded.
Under the covers, this uses the same signals approach as referenced in your "before" post, but in a more usable and manageable way. And if the list of web sites you are monitoring is long, you might benefit from its primary feature -- all kinds of ways to manage the execution of a large number of tasks.
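As a rough sketch of what that can look like (not from the original answer; the app name, broker URL and task body are placeholders), a task with both a soft and a hard limit might be defined like this:

import requests
from celery import Celery
from celery.exceptions import SoftTimeLimitExceeded

app = Celery('monitor', broker='redis://localhost:6379/0')  # placeholder broker URL

@app.task(soft_time_limit=10, time_limit=15)
def check_site(url):
    try:
        return requests.get(url, timeout=(3, 6)).status_code
    except SoftTimeLimitExceeded:
        # The soft limit raises inside the task so we get a chance to clean up;
        # the hard limit (time_limit) would terminate the task outright.
        return None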
I believe you can use multiprocessing and not depend on a 3rd party package:
import multiprocessing
import requests

def call_with_timeout(func, args, kwargs, timeout):
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    # define a wrapper of `return_dict` to store the result.
    def function(return_dict):
        return_dict['value'] = func(*args, **kwargs)

    p = multiprocessing.Process(target=function, args=(return_dict,))
    p.start()

    # Force a max. `timeout` or wait for the process to finish
    p.join(timeout)

    # If the process is still alive, it didn't finish: raise TimeoutError
    if p.is_alive():
        p.terminate()
        p.join()
        raise TimeoutError
    else:
        return return_dict['value']

call_with_timeout(requests.get, args=(url,), kwargs={'timeout': 10}, timeout=60)  # `url` is assumed to be defined elsewhere
The timeout passed to kwargs is the timeout to get any response from the server; the argument timeout is the timeout to get the complete response.
Despite the question being about requests, I find this very easy to do with pycurl CURLOPT_TIMEOUT or CURLOPT_TIMEOUT_MS.
No threading or signaling required:
import pycurl
import traceback
from io import BytesIO

url = 'http://www.example.com/example.zip'
timeout_ms = 1000
raw = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.TIMEOUT_MS, timeout_ms)  # total timeout in milliseconds
c.setopt(pycurl.WRITEFUNCTION, raw.write)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.URL, url)
c.setopt(pycurl.HTTPGET, 1)
try:
    c.perform()
except pycurl.error:
    traceback.print_exc()  # error generated on timeout
    pass  # or just pass if you don't want to print the error
In case you're using the option stream=True, you can do this:
import time
import requests

r = requests.get(
    'http://url_to_large_file',
    timeout=1,  # relevant only for underlying socket
    stream=True)

with open('/tmp/out_file.txt', 'wb') as f:
    start_time = time.time()
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # filter out keep-alive new chunks
            f.write(chunk)
        if time.time() - start_time > 8:
            raise Exception('Request took longer than 8s')
The solution does not need signals or multiprocessing.
Just another solution (taken from http://docs.python-requests.org/en/master/user/advanced/#streaming-uploads).
Before downloading the full content, you can find out its size:
import requests

TOO_LONG = 10*1024*1024  # 10 Mb
big_url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

r = requests.get(big_url, stream=True)
print(r.headers['content-length'])
# 1073741824

if int(r.headers['content-length']) < TOO_LONG:
    # download content:
    content = r.content
But be careful, a sender can set an incorrect value in the 'content-length' response field.
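If you want to guard against that, a possible sketch (not part of the original answer) is to count the bytes actually received while streaming and abort once the limit is exceeded:

import requests

TOO_LONG = 10*1024*1024  # 10 Mb
big_url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

r = requests.get(big_url, stream=True)
received = 0
chunks = []
for chunk in r.iter_content(chunk_size=8192):
    received += len(chunk)
    if received > TOO_LONG:
        r.close()  # stop downloading; the body is larger than allowed
        raise ValueError('Response exceeded {} bytes'.format(TOO_LONG))
    chunks.append(chunk)
content = b''.join(chunks)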
timeout = (connection timeout, data read timeout), or give a single value (timeout=1):
import requests

try:
    req = requests.request('GET', 'https://www.google.com', timeout=(1, 1))
    print(req)
except requests.ReadTimeout:
    print("READ TIME OUT")
This code works for socket errors 11004 and 10060:
# -*- encoding:UTF-8 -*-
__author__ = 'ACE'

import requests
from PyQt4.QtCore import *
from PyQt4.QtGui import *


class TimeOutModel(QThread):
    Existed = pyqtSignal(bool)
    TimeOut = pyqtSignal()

    def __init__(self, fun, timeout=500, parent=None):
        """
        @param fun: function or lambda
        @param timeout: ms
        """
        super(TimeOutModel, self).__init__(parent)
        self.fun = fun

        self.timeer = QTimer(self)
        self.timeer.setInterval(timeout)
        self.timeer.timeout.connect(self.time_timeout)
        self.Existed.connect(self.timeer.stop)
        self.timeer.start()

        self.setTerminationEnabled(True)

    def time_timeout(self):
        self.timeer.stop()
        self.TimeOut.emit()
        self.quit()
        self.terminate()

    def run(self):
        self.fun()


bb = lambda: requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip")

a = QApplication([])
z = TimeOutModel(bb, 500)
print('timeout')
a.exec_()
Well, I tried many solutions on this page and still faced instabilities, random hangs, and poor connection performance.
I'm now using curl and I'm really happy about its "max time" functionality and about the overall performance, even with such a poor implementation:
import subprocess

content = subprocess.getoutput('curl -m6 -Ss "http://mywebsite.xyz"')
Here, I defined a max time parameter of 6 seconds, covering both connection and transfer time.
I'm sure Curl has a nice python binding, if you prefer to stick to the pythonic syntax :)
There is a package called timeout-decorator that you can use to time out any python function.
import time
import timeout_decorator

@timeout_decorator.timeout(5)
def mytest():
    print("Start")
    for i in range(1, 10):
        time.sleep(1)
        print("{} seconds have passed".format(i))
It uses the signals approach that some answers here suggest. Alternatively, you can tell it to use multiprocessing instead of signals (e.g. if you are in a multi-thread environment).
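For example, the multiprocessing variant is selected with the use_signals flag (per the timeout-decorator documentation):

import time
import timeout_decorator

@timeout_decorator.timeout(5, use_signals=False)  # use multiprocessing instead of SIGALRM
def mytest():
    print("Start")
    for i in range(1, 10):
        time.sleep(1)
        print("{} seconds have passed".format(i))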
The biggest problem is that if the connection can't be established, the requests package waits too long and blocks the rest of the program.
There are several ways to tackle the problem, but when I looked for a one-liner similar to requests, I couldn't find anything. That's why I built a wrapper around requests called reqto ("requests timeout"), which supports proper timeouts for all standard methods from requests.
pip install reqto
The syntax is identical to requests
import reqto

response = reqto.get(f'https://pypi.org/pypi/reqto/json', timeout=1)
# Will raise an exception on Timeout
print(response)
Moreover, you can set up a custom timeout function
def custom_function(parameter):
    print(parameter)

response = reqto.get(f'https://pypi.org/pypi/reqto/json', timeout=5,
                     timeout_function=custom_function,
                     timeout_args="Timeout custom function called")
# Will call timeout_function instead of raising an exception on Timeout
print(response)
An important note is that the import line import reqto needs to come before all other imports working with requests, threading, etc., due to the monkey patching which runs in the background.
If it comes to that, create a watchdog thread that messes up requests' internal state after 10 seconds, e.g.:
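A crude sketch of that idea (not the original author's code; it assumes urllib3 creates its sockets via socket.socket, and that shutting a socket down from another thread unblocks the pending read, which holds for plain HTTP on common platforms):

import socket
import threading
import requests

_live_sockets = []
_original_socket = socket.socket

class TrackingSocket(_original_socket):
    # Remember every socket that gets created so the watchdog can reach it later.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        _live_sockets.append(self)

socket.socket = TrackingSocket  # crude process-wide monkey patch

def _watchdog():
    # Mess with the in-flight connection: shutting it down makes the blocked
    # read inside requests fail with a connection error.
    for sock in _live_sockets:
        try:
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass

timer = threading.Timer(10, _watchdog)
timer.daemon = True
timer.start()
try:
    res = requests.get('http://localhost:8080', timeout=(3, 6))
finally:
    timer.cancel()
    socket.socket = _original_socket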
Note that depending on the system libraries, you may be unable to set a deadline on DNS resolution.
I'm using requests 2.2.1 and eventlet didn't work for me. Instead I was able to use the gevent timeout, since gevent is used in my service for gunicorn.
import gevent
import gevent.monkey
gevent.monkey.patch_all(subprocess=True)

import requests

try:
    with gevent.Timeout(5):
        ret = requests.get(url)  # `url` is assumed to be defined elsewhere
        print(ret.status_code, ret.content)
except gevent.timeout.Timeout as e:
    print("timeout: {}".format(e))
Please note that gevent.timeout.Timeout is not caught by general Exception handling. So either explicitly catch gevent.timeout.Timeout, or pass in a different exception to be used, like so: with gevent.Timeout(5, requests.exceptions.Timeout): (although no message is passed when this exception is raised).
I came up with a more direct solution that is admittedly ugly but fixes the real problem. It goes a bit like this:
resp = requests.get(some_url, stream=True)
resp.raw._fp.fp._sock.settimeout(read_timeout)
# This will load the entire response even though stream is set
content = resp.content
You can read the full explanation here