mechanize._mechanize.FormNotFoundError: no form matching name 'form' #28

schenkd opened this issue Jul 14, 2024 · 5 comments

schenkd commented Jul 14, 2024

Hey,

I've tried the example from the README.md and installed the app as described.
Unfortunately it fails because the returned HTML document contains no HTML form named 'form'.

send: b'GET / HTTP/1.1\r\nAccept-Encoding: gzip\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15\r\nAccept-Language: en-GB,en;q=0.9\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nConnection: close\r\nHost: www.handelsregister.de\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
header: Date: Sun, 14 Jul 2024 09:27:36 GMT
header: Server: Apache
header: Strict-Transport-Security: max-age=31536000; preload
header: Referrer-Policy: origin-when-cross-origin
header: Location: https://www.handelsregister.de/rp_web/welcome.xhtml
header: Cache-Control: max-age=15
header: Expires: Sun, 14 Jul 2024 09:27:51 GMT
header: Content-Length: 235
header: Connection: close
header: Content-Type: text/html; charset=iso-8859-1
b'<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>302 Found</title>\n</head><body>\n<h1>Found</h1>\n<p>The document has moved <a href="https://www.handelsregister.de/rp_web/welcome.xhtml">here</a>.</p>\n</body></html>\n'
*****************************************************
send: b'GET /rp_web/welcome.xhtml HTTP/1.1\r\nAccept-Encoding: gzip\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15\r\nAccept-Language: en-GB,en;q=0.9\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nConnection: close\r\nHost: www.handelsregister.de\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Sun, 14 Jul 2024 09:27:37 GMT
header: Server: Apache-Coyote/1.1
header: Strict-Transport-Security: max-age=31536000; preload
header: Referrer-Policy: origin-when-cross-origin
header: Referrer-Policy: origin-when-cross-origin
header: Expires: Thu, 01 Jan 1970 01:00:00 GMT
header: Pragma: no-cache
header: Expires: Tue, 08 Aug 2006 10:00:00 GMT
header: Content-Type: text/html;charset=UTF-8
header: Content-Length: 41571
header: Vary: Accept-Encoding
header: X-Content-Type-Options: nosniff
header: Cache-Control: must-revalidate, proxy-revalidate, no-store, no-cache, s-max-age=0, max-age=0
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1; mode=block
header: X-Permitted-Cross-Domain-Policies: master-only
header: Set-Cookie: JSESSIONID=53FBAE13BD34A6F4DF932C52BC83C2F5.tc05n02; Path=/; HttpOnly
header: Connection: close
b'<!DOCTYPE html>...too much for github...</html>'
*****************************************************
send: b'GET /rp_web/welcome.xhtml HTTP/1.1\r\nAccept-Encoding: gzip\r\nReferer: https://www.handelsregister.de\r\nCookie: JSESSIONID=53FBAE13BD34A6F4DF932C52BC83C2F5.tc05n02\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15\r\nAccept-Language: en-GB,en;q=0.9\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nConnection: close\r\nHost: www.handelsregister.de\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Sun, 14 Jul 2024 09:27:37 GMT
header: Server: Apache-Coyote/1.1
header: Strict-Transport-Security: max-age=31536000; preload
header: Referrer-Policy: origin-when-cross-origin
header: Referrer-Policy: origin-when-cross-origin
header: Expires: Thu, 01 Jan 1970 01:00:00 GMT
header: Pragma: no-cache
header: Expires: Tue, 08 Aug 2006 10:00:00 GMT
header: Content-Type: text/html;charset=UTF-8
header: Content-Length: 40427
header: Vary: Accept-Encoding
header: X-Content-Type-Options: nosniff
header: Cache-Control: must-revalidate, proxy-revalidate, no-store, no-cache, s-max-age=0, max-age=0
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1; mode=block
header: X-Permitted-Cross-Domain-Policies: master-only
header: Connection: close
b'<!DOCTYPE html>...too much for github...</html>'
*****************************************************
Registerportal | Homepage
Traceback (most recent call last):
  File "/Users/DASchenk/Projects/private/handelsregister/handelsregister.py", line 185, in <module>
    companies = h.search_company()
                ^^^^^^^^^^^^^^^^^^
  File "/Users/DASchenk/Projects/private/handelsregister/handelsregister.py", line 76, in search_company
    self.browser.select_form(name="form")
  File "/Users/DASchenk/Projects/private/handelsregister/.venv/lib/python3.11/site-packages/mechanize/_mechanize.py", line 681, in select_form
    raise FormNotFoundError("no form matching " + description)
mechanize._mechanize.FormNotFoundError: no form matching name 'form'

What I have found so far is that follow_link (on line 72) with the text "Advanced search" returns the main page again instead of the rendered "Erweiterte Suche" (advanced search) page.

response_search = self.browser.follow_link(text="Advanced search")

This is the link element with the title "Advanced search" on the main page, as it appears in the response:

<a tabindex="-1" title="Advanced search" class="ui-menuitem-link ui-corner-all  rpNavMainMenuItem" href="#" onclick="PF('sidebar1').hide();PrimeFaces.ab({s:&quot;j_idt27&quot;,f:&quot;headerForm&quot;});return false;">
  <span class="ui-menuitem-icon ui-icon fa fa-search-plus" aria-hidden="true"></span>
  <span class="ui-menuitem-text">Advanced search</span>
</a>
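
A minimal sketch of why follow_link ends up on the main page again, using only the Python standard library rather than mechanize itself. JsOnlyLinkFinder and resolved_target are hypothetical names made up for illustration, and the anchor markup is shortened from the element above. The assumption is that a client without a JavaScript engine only sees href="#", which resolves back to the current page, while the real navigation lives in the onclick handler.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class JsOnlyLinkFinder(HTMLParser):
    """Collect <a> elements that navigate via onclick rather than href."""

    def __init__(self):
        super().__init__()
        self.js_only_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = (attributes.get("href") or "").strip()
        # href="#" (or empty) plus an onclick handler means the navigation
        # happens in JavaScript, which a plain HTTP client never runs.
        if href in ("", "#") and attributes.get("onclick"):
            self.js_only_links.append(attributes.get("title") or "")


def resolved_target(page_url, href):
    """URL a non-JavaScript client would request when following href."""
    return urldefrag(urljoin(page_url, href)).url


page_url = "https://www.handelsregister.de/rp_web/welcome.xhtml"
snippet = (
    '<a title="Advanced search" href="#" '
    'onclick="PrimeFaces.ab({s:&quot;j_idt27&quot;,f:&quot;headerForm&quot;});'
    'return false;">Advanced search</a>'
)

finder = JsOnlyLinkFinder()
finder.feed(snippet)
print(finder.js_only_links)            # titles of links that need JavaScript
print(resolved_target(page_url, "#"))  # where a plain client ends up
```

This matches the third request in the log above, which is a second GET for /rp_web/welcome.xhtml.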

Could someone with more knowledge of this library check whether it ever worked with the current Handelsregister website, or whether a major change on the site broke it?
I'm glad to help solve this issue, but before I try to work around the jQuery rendering with Selenium, I'd like to know whether mechanize was previously able to handle this behaviour.

Greetings David

schenkd commented Jul 14, 2024

From my research, I don't see how mechanize could handle those asynchronous jQuery requests combined with re-rendering parts of the page. Let me know if I'm missing something here.

Apart from that, I rewrote the relevant part of the Handelsregister Python SDK to work with Selenium to get around this issue. If there is interest, I can open a PR.

Best David

@kanadagermane

Same issue over here! Can you share your selenium implementation @schenkd?

@fibofant

@schenkd @kanadagermane

In open_startpage, change the URL that is opened to https://www.handelsregister.de/rp_web/erweitertesuche.xhtml

    def open_startpage(self):
        self.browser.open("https://www.handelsregister.de/rp_web/erweitertesuche.xhtml", timeout=10)

This URL opens the advanced search page directly, so the JavaScript-driven "Advanced search" menu link is no longer needed.
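
A hedged sketch of how this workaround could be wrapped with a sanity check before the form is selected. open_advanced_search and ADVANCED_SEARCH_URL are hypothetical names, not part of the project. The browser argument is assumed to behave like mechanize.Browser (open, forms, select_form), and the form name 'form' is taken from the traceback in the original report; whether the page still exposes a form under that name is an assumption.

```python
ADVANCED_SEARCH_URL = "https://www.handelsregister.de/rp_web/erweitertesuche.xhtml"


def open_advanced_search(browser, form_name="form", timeout=10):
    """Open the advanced search page directly and select its form.

    `browser` is expected to behave like mechanize.Browser.
    Returns the names of all forms found on the page.
    """
    browser.open(ADVANCED_SEARCH_URL, timeout=timeout)

    # List the form names first so that a failure reports what the page
    # actually contains instead of only "no form matching name 'form'".
    available = [form.name for form in browser.forms()]
    if form_name not in available:
        raise RuntimeError(
            f"no form named {form_name!r} on {ADVANCED_SEARCH_URL}; "
            f"found: {available}"
        )

    browser.select_form(name=form_name)
    return available
```

With mechanize installed, this would be called as open_advanced_search(mechanize.Browser()). If the site changes again, the error message at least shows which form names are present.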

c3paul commented Nov 21, 2024

thank you @fibofant - wonder when this will be fixed?

coezbek commented Nov 22, 2024

I submitted a pull request for this #30
