Support fetching external URLs #88

Open
pdawyndt opened this issue Dec 22, 2021 · 5 comments
Labels
wontfix This will not be worked on
Comments

@pdawyndt
Contributor

pdawyndt commented Dec 22, 2021

This code should output a random haiku (scraped from a web page), but access to external URLs does not seem to work. That is a strange limitation for a Python runtime that is already running in a browser (and has BeautifulSoup on board).

from urllib.request import urlopen

# download web page with random haiku
url = 'http://haikuguy.com/issa/random.php'
reader = urlopen(url)

# parse page until start of haiku is found
marker = '<p class="english">'
line = reader.readline().decode('utf-8')
while line and not line.startswith(marker):
    line = reader.readline().decode('utf-8')

# read the three haiku lines and display them
if line.startswith(marker):
    # first line: drop the marker prefix and the trailing '<br />' (6 chars)
    print(line[len(marker):].strip()[:-6])
    line = reader.readline().decode('utf-8')
    print(line.strip()[:-6])
    line = reader.readline().decode('utf-8')
    # last line ends in '</p>' (4 chars) instead of '<br />'
    print(line.strip()[:-4])
@alexmojaki
Contributor

See pyodide/pyodide#662 and pyodide/pyodide#375. You can read a URL with pyodide.open_url or js.fetch. I don't know why urllib isn't patched to use these in Pyodide.

In any case, the browser imposes security restrictions that cause trouble for most URLs. Your example isn't served over https, so it gets blocked as mixed content; other URLs are typically blocked by CORS.
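For example, a rough sketch of reading a page with pyodide.open_url (the fallback stub is invented here so the snippet also runs outside a browser; inside Pyodide the real function fetches synchronously and returns an io.StringIO):

```python
# Hedged sketch, not a definitive recipe.
try:
    from pyodide import open_url  # only available inside a Pyodide runtime
except ImportError:
    import io

    def open_url(url):
        # Stand-in for demonstration outside the browser.
        return io.StringIO(f"<stub page for {url}>")

# Only works for https URLs whose server allows CORS (or same-origin pages).
page = open_url("https://example.com/")
print(page.read())
```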

@winniederidder
Contributor

@alexmojaki Is providing an override of urllib something you are doing in your own packages, or should I make a Papyros-specific part? It would work like the input/matplotlib handling: make the urllib functions point to e.g. pyodide.open_url calls. The most common calls could easily be covered this way.
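A rough sketch of what such an override could look like (all helper names below are made up for illustration; inside Pyodide, _fetch_via_browser would simply delegate to pyodide.open_url, whereas this standalone version returns a canned page so the sketch runs anywhere):

```python
import io
import urllib.request

def _fetch_via_browser(url):
    # Inside Pyodide this would be: return pyodide.open_url(url)
    return io.StringIO('<p class="english">a stub haiku line<br /></p>')

def _patched_urlopen(url, *args, **kwargs):
    # urlopen yields bytes, while open_url yields text, so re-encode.
    return io.BytesIO(_fetch_via_browser(url).read().encode("utf-8"))

# Repoint urllib at the browser-backed fetcher.
urllib.request.urlopen = _patched_urlopen

# The original example's read loop now works unchanged:
line = urllib.request.urlopen("http://haikuguy.com/issa/random.php").readline()
print(line.decode("utf-8"))
```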

@alexmojaki
Contributor

I have no plans for this, go ahead.

@bmesuere
Member

bmesuere commented Feb 9, 2022

@winniederidder you may add this to papyros, but note that external connections will be blocked by the CSP (Content Security Policy) on Dodona.
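For illustration, a policy along these lines (not necessarily Dodona's exact header) restricts scripts to connecting only back to the serving origin, so fetch/XMLHttpRequest calls to other hosts are refused by the browser regardless of what the Python side does:

```
Content-Security-Policy: connect-src 'self'
```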

@winniederidder
Contributor

After some studying, this issue would affect at least the following libraries: urllib, urllib3, http.client, requests, ssl, websocket-client, websockets, and more. Doing this in a maintainable way on our side is therefore not feasible; it has to be fixed upstream and is thus a limitation of Pyodide. As @bmesuere mentioned, projects integrating Papyros will have protection mechanisms limiting the accessible URLs anyway.
I will open a new issue to discuss limitations of Pyodide for things like this.

@winniederidder added the wontfix label ("This will not be worked on") May 9, 2022
@winniederidder added this to the later milestone May 9, 2022
@winniederidder changed the title from "Support online file access" to "Support accessing external URLs" May 16, 2022
@winniederidder changed the title from "Support accessing external URLs" to "Support fetching external URLs" May 16, 2022
@winniederidder modified the milestones: later, Unplanned May 16, 2022