Proxy Generator

Manages SOCKS5 proxy connections. For setup instructions and recommended workflows, see the user guide.

exception scholarly2._proxy_generator.DOSException

DOS attack was detected.

exception scholarly2._proxy_generator.MaxTriesExceededException

The maximum number of tries by scholarly was reached.

class scholarly2._proxy_generator.ProxyGenerator
FreeProxies(timeout=1, wait_time=120)

Deprecated. Set up continuously rotating proxies from the free-proxy library.

Parameters:
  • timeout (float) – Timeout for a single proxy in seconds, optional

  • wait_time (float) – Maximum time (in seconds) to wait until a newer set of proxies becomes available at https://sslproxies.org/

Returns:

whether or not the proxy was set up successfully

Return type:

bool

Example::
>>> pg = ProxyGenerator()
>>> success = pg.FreeProxies()

Luminati(usr, passwd, proxy_port)

Deprecated. Set up a Luminati/Bright Data proxy.

Parameters:
  • usr (string) – Luminati username

  • passwd (string) – Luminati password

  • proxy_port (integer) – port for the proxy

Returns:

whether or not the proxy was set up successfully

Return type:

bool

Example::
>>> pg = ProxyGenerator()
>>> success = pg.Luminati(usr="foo", passwd="bar", proxy_port=1200)

ScraperAPI(API_KEY, country_code=None, premium=False, render=False)

Deprecated. Set up a proxy using ScraperAPI.

The optional parameters are available only on ScraperAPI's Business and Enterprise plans. For more details, see https://www.scraperapi.com/documentation/

Example::
>>> pg = ProxyGenerator()
>>> success = pg.ScraperAPI(API_KEY)

Parameters:

API_KEY (string) – ScraperAPI API Key value.

Returns:

whether or not the proxy was set up successfully

Return type:

bool

SingleProxy(http=None, https=None)

Deprecated. Use an arbitrary HTTP(S) proxy configuration.

Parameters:
  • http (string) – HTTP proxy address

  • https (string) – HTTPS proxy address

Returns:

whether or not the proxy was set up successfully

Return type:

bool

Example::
>>> pg = ProxyGenerator()
>>> success = pg.SingleProxy(http="<http proxy address>", https="<https proxy address>")
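As a rough illustration of what an HTTP(S) proxy configuration looks like, the sketch below builds the scheme-to-address mapping that HTTP clients such as requests accept. build_proxy_mapping is a hypothetical helper, not part of scholarly2, and the fallback from one scheme to the other is an assumption of this sketch rather than documented behavior of SingleProxy.

```python
from typing import Dict, Optional


def build_proxy_mapping(http: Optional[str] = None,
                        https: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: map URL schemes to proxy addresses."""
    if http is None and https is None:
        raise ValueError("provide at least one of http or https")
    # Assumption: when only one address is given, reuse it for both schemes.
    return {"http": http or https, "https": https or http}


mapping = build_proxy_mapping(http="http://127.0.0.1:8080")
```

A mapping of this shape can be passed as the proxies argument of a requests call; how ProxyGenerator stores it internally is not described here.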
Socks5Proxies(proxies)

Set up continuously rotating SOCKS5 proxies from a configured list.

Parameters:

proxies (list) – iterable of SOCKS5 proxies in USER:PASS@HOST:PORT format

Returns:

whether or not a working proxy was found

Return type:

bool

Example::
>>> pg = ProxyGenerator()
>>> success = pg.Socks5Proxies(["alice:secret@127.0.0.1:1080"])
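To make the USER:PASS@HOST:PORT format concrete, here is a minimal sketch of how one entry could be split into its parts and how a pool could be rotated round-robin. Socks5Entry, parse_socks5_entry, and to_socks5_url are hypothetical names introduced for illustration, and the socks5:// URL scheme is an assumption; none of this is claimed to be how Socks5Proxies is implemented.

```python
from itertools import cycle
from typing import NamedTuple


class Socks5Entry(NamedTuple):
    """Hypothetical container for one parsed proxy entry."""
    user: str
    password: str
    host: str
    port: int


def parse_socks5_entry(entry: str) -> Socks5Entry:
    """Split a USER:PASS@HOST:PORT string into its four parts."""
    # Split on the last "@" so an "@" inside the password is tolerated.
    credentials, _, address = entry.strip().rpartition("@")
    user, _, password = credentials.partition(":")
    host, _, port = address.rpartition(":")
    if not (user and password and host and port.isdigit()):
        raise ValueError(f"expected USER:PASS@HOST:PORT, got {entry!r}")
    return Socks5Entry(user, password, host, int(port))


def to_socks5_url(entry: Socks5Entry) -> str:
    """Render an entry as a URL (assumed scheme; no percent-encoding)."""
    return f"socks5://{entry.user}:{entry.password}@{entry.host}:{entry.port}"


# Round-robin rotation over a small pool, purely for illustration.
raw_pool = ["alice:secret@127.0.0.1:1080", "bob:hunter2@10.0.0.2:1081"]
pool = cycle([parse_socks5_entry(p) for p in raw_pool])
first, second, third = next(pool), next(pool), next(pool)
```

After two draws the cycle wraps around, so third is the same entry as first. Rejecting malformed entries early with ValueError is a design choice of this sketch.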
Socks5ProxyFile(proxy_file: str)

Load SOCKS5 proxies from a file and enable them as a rotating pool.

Parameters:

proxy_file (str) – path to a file containing one USER:PASS@HOST:PORT per line

Returns:

whether or not a working proxy was found

Return type:

bool
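The file layout described above (one USER:PASS@HOST:PORT entry per line) can be read with a few lines of standard-library code. The sketch below is only an illustration: read_proxy_file is a hypothetical helper, and skipping blank lines and lines starting with "#" is an assumption, not documented behavior of Socks5ProxyFile.

```python
import tempfile
from pathlib import Path
from typing import List


def read_proxy_file(proxy_file: str) -> List[str]:
    """Hypothetical helper: return the proxy entries listed in proxy_file."""
    entries = []
    for raw in Path(proxy_file).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Assumption: blank lines and "#" comments are ignored.
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


# Demonstrate with a throwaway file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "proxies.txt"
    path.write_text(
        "alice:secret@127.0.0.1:1080\n\n# spare\nbob:hunter2@10.0.0.2:1081\n",
        encoding="utf-8",
    )
    entries = read_proxy_file(str(path))
```

The resulting list has the same shape as the argument that Socks5Proxies accepts.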

Tor_External(tor_sock_port: int, tor_control_port: int, tor_password: str)

Set up a Tor proxy using a Tor service that is already running on the system. If no such service is running, consider using Tor_Internal instead.

Parameters:
  • tor_sock_port (int) – the port where the Tor sock proxy is running

  • tor_control_port (int) – the port where the Tor control server is running

  • tor_password (str) – the password for the Tor control server

Example::

>>> pg = ProxyGenerator()
>>> pg.Tor_External(tor_sock_port=9050, tor_control_port=9051, tor_password="scholarly_password")

Note: This method has been deprecated since v1.5.

Tor_Internal(tor_cmd=None, tor_sock_port=None, tor_control_port=None)

Starts a Tor client on a scholarly-specific port, together with a scholarly-specific control port. If tor_sock_port and tor_control_port are not passed, they are generated automatically in the following ranges:
  • tor_sock_port: (9000, 9500)

  • tor_control_port: (9500, 9999)
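The automatic port selection described above might look roughly like the following sketch. pick_tor_ports is a hypothetical helper, and treating the ranges as half-open (upper bound excluded) is an assumption; the text does not say whether the endpoints are inclusive.

```python
import random

# Ranges taken from the description above.
SOCK_PORT_RANGE = (9000, 9500)
CONTROL_PORT_RANGE = (9500, 9999)


def pick_tor_ports(tor_sock_port=None, tor_control_port=None, rng=random):
    """Hypothetical helper: fill in any port that was not supplied."""
    if tor_sock_port is None:
        # randrange excludes the upper bound (assumed half-open range).
        tor_sock_port = rng.randrange(*SOCK_PORT_RANGE)
    if tor_control_port is None:
        tor_control_port = rng.randrange(*CONTROL_PORT_RANGE)
    return tor_sock_port, tor_control_port


sock_port, control_port = pick_tor_ports()
```

With half-open ranges the two generated ports can never coincide, since 9500 is excluded from the socket range.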

Parameters:
  • tor_cmd (string) – location of the tor executable (an absolute path if it is not on PATH)

  • tor_sock_port (int) – tor socket port

  • tor_control_port (int) – tor control port

Example::

>>> pg = ProxyGenerator()
>>> pg.Tor_Internal(tor_cmd="tor")

Note: This method has been deprecated since v1.5.

get_next_proxy(num_tries=None, old_timeout=3, old_proxy=None)

get_session()

has_proxy() → bool