Posters for IAFD #190
Comments
Cosy will have the code in the next 10 minutes.... |
It looks like the blog sites (fagalicious) aren't pulling in posters now either, but it could be that our URL has expired. |
(Sorry for the duplicate post.) Hope this finds you well! @JPH71 wanted to say THANKS once again! Your quick solution above will allow him to add an enhancement to IAFD. For films, IAFD provides links to index sites that have the film's cover artwork, so Jason will be working on an enhancement that should allow the IAFD agent to crawl to those sites for film covers, since IAFD itself doesn't contain artwork other than actor headshots. THANKS! |
Not a problem. Glad I can help. Let me know if you need any more information. |
Here is the issue, which happens when using IAFD as the scraping agent.
IAFD has links to AEBN, GayHotMovies, GayDVDEmpire, and CD Universe, just like GEVI does.
With this in mind, I have put in code to scrape these external sites and get any data that is missing in IAFD, especially posters and background art.
This is the section of the log file:
2022-08-19 02:34:25,484 (21f8) : INFO (logkit:16) - IAFD - UTILS :: Access External Links in IAFD: Skip Current Agent Links: IAFD
2022-08-19 02:34:25,484 (21f8) : INFO (logkit:16) - IAFD - UTILS :: External Sites Found 1 - AdultEmpire - https://www.iafd.com/shopclick.asp?sku=22956990
2022-08-19 02:34:25,500 (21f8) : INFO (logkit:16) - IAFD - UTILS :: 2 - HotMovies - https://www.iafd.com/shopclick.asp?sku=9344975
2022-08-19 02:34:25,500 (21f8) : INFO (logkit:16) - IAFD - UTILS :: 3 - HotMovies - https://www.iafd.com/shopclick.asp?sku=8390429
2022-08-19 02:34:25,500 (21f8) : INFO (logkit:16) - IAFD - UTILS :: 4 - AdultEmpire - https://www.iafd.com/shopclick.asp?sku=22956383
2022-08-19 02:34:25,500 (21f8) : INFO (logkit:16) - IAFD - UTILS :: Valid Sites Left 2 - ['AdultEmpire', 'HotMovies']
2022-08-19 02:34:25,516 (21f8) : DEBUG (networking:143) - Requesting 'https://www.iafd.com/shopclick.asp?sku=8390429'
2022-08-19 02:34:25,625 (21f8) : ERROR (networking:196) - Error opening URL 'https://www.iafd.com/shopclick.asp?sku=8390429'
2022-08-19 02:34:25,625 (21f8) : ERROR (logkit:22) - IAFD - UTILS :: Error reading External HotMovies URL Link: HTTP Error 403: Forbidden
2022-08-19 02:34:25,641 (21f8) : DEBUG (networking:143) - Requesting 'https://www.iafd.com/shopclick.asp?sku=22956383'
2022-08-19 02:34:25,755 (21f8) : ERROR (networking:196) - Error opening URL 'https://www.iafd.com/shopclick.asp?sku=22956383'
2022-08-19 02:34:25,755 (21f8) : ERROR (logkit:22) - IAFD - UTILS :: Error reading External AdultEmpire URL Link: HTTP Error 403: Forbidden
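Reading the log, four links collapse to two sites, and the URLs requested afterwards are the last ones listed for each site. A minimal sketch of that behaviour, assuming a plain dictionary keyed by site name; the function name `collapse_external_links` and the skip rule are illustrative guesses inferred from the log, not the code in utils.py:

```python
def collapse_external_links(found, current_agent="IAFD"):
    """Keep one URL per external site; a later link replaces an earlier one."""
    links = {}
    for site, url in found:
        if site == current_agent:
            # Assumption: links back to the agent's own site are skipped.
            continue
        links[site] = url
    return links


# The four links reported in the log, in the order they were found.
found = [
    ("AdultEmpire", "https://www.iafd.com/shopclick.asp?sku=22956990"),
    ("HotMovies", "https://www.iafd.com/shopclick.asp?sku=9344975"),
    ("HotMovies", "https://www.iafd.com/shopclick.asp?sku=8390429"),
    ("AdultEmpire", "https://www.iafd.com/shopclick.asp?sku=22956383"),
]
valid = collapse_external_links(found)
print(sorted(valid))
```

If the first link per site is preferred instead, `links.setdefault(site, url)` would keep it.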
I need to be able to get from:
https://www.iafd.com/shopclick.asp?sku=9344975
to the page it redirects to. In the screenshot, (1) is the link I entered into the address bar, which changes to GayHotMovies, as shown by the header at (2).
[image: image.png]
Inside utils.py, the code is in the function *getFilmOnIAFD* (line 210), and the error is caused by line 356:
HTML.ElementFromURL(value, timeout=60, errors='ignore', sleep=DELAY)
This is a built-in Plex function. I have some old documentation that explains it if you need it, but it works like the Python requests library.
If you look at the GEVI __init__.py file, you will see how we implemented your previous suggestion to get it working.
Many thanks
Jason
…On Fri, 19 Aug 2022 at 02:13, fivedays555 ***@***.***> wrote:
Not a problem. Glad I can help. Let me know if you need any more
information.
|
I tried the following, and it should work:
I think the direct request fails because IAFD uses Cloudflare to block unwanted requests. |
You are a star!
Is the get_scraper_request code already in the Plex agent?
Cheers
…On Fri, 19 Aug 2022 at 04:11, fivedays555 ***@***.***> wrote:
I tried the following, and it should work:
from lxml import html
url = 'https://www.iafd.com/shopclick.asp?sku=9344975'
response = get_scraper_request(url)
res = html.fromstring(response.text)
***@***.***="title"]')[0].text
>>> 'Fire Watch 2'
I think the direct request fails because IAFD uses Cloudflare to block unwanted requests.
|
Should be. Otherwise, you won't be able to scrape IAFD. |
I just searched through the utils.py file, and there is no function named get_scraper_request.
Cheers, and sorry to be a nuisance.
…On Fri, 19 Aug 2022 at 07:34, fivedays555 ***@***.***> wrote:
Should be. Otherwise, you won't be able to scrape IAFD.
|
No Problem. I will put the function call below.
|
Cheers Man.....
I have been up all night - sorting out duplicate cast entries....
Thanks for all the help!
Jason
…On Fri, 19 Aug 2022 at 08:06, fivedays555 ***@***.***> wrote:
No Problem. I will put the function call below.
import logging

import cloudscraper

# One shared session, so headers and cookies are reused across calls.
scraper = cloudscraper.create_scraper()

def get_scraper_request(url, **kwargs):
    """GET url through the shared session; return the response, or None on error."""
    logging.info("Requesting: " + url)
    headers = kwargs.pop('headers', {})
    cookies = kwargs.pop('cookies', {})
    timeout = kwargs.pop('timeout', 30)
    proxies = {}
    if 'User-Agent' not in headers:
        # headers['User-Agent'] = (fake_useragent.UserAgent(fallback='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15')).random
        headers['User-Agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15'
    scraper.headers.update(headers)
    scraper.cookies.update(cookies)
    # Start from None so a failed request cannot leave the name unbound.
    scraper_request = None
    try:
        scraper_request = scraper.request(
            'GET', url, timeout=timeout, proxies=proxies)
    except Exception:
        logging.exception('CloudScraper Failed.')
    # A response is falsy when its status is an error, so compare with None explicitly.
    if scraper_request is not None and not scraper_request.ok:
        msg = ('< CloudScraper Failed Request Status Code: ' +
               str(scraper_request.status_code) + '>')
        logging.error(msg)
    return scraper_request
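A hedged sketch of how a caller might consume the result of get_scraper_request. The helper `fetch_text`, the `FakeResponse` class, and `fake_fetch` are hypothetical stand-ins so the example runs without cloudscraper or network access; in the agent, `fetch` would be get_scraper_request itself.

```python
import logging


class FakeResponse(object):
    """Stand-in for a requests-style response, used only for this sketch."""

    def __init__(self, status_code, text=""):
        self.status_code = status_code
        self.text = text
        self.ok = status_code < 400


def fetch_text(url, fetch):
    """Return the page body, or None when the request failed.

    `fetch` is any callable that takes a URL and returns a response-like
    object or None, for example get_scraper_request from this thread.
    """
    response = fetch(url)
    if response is None:
        logging.error("No response for %s", url)
        return None
    if not response.ok:
        logging.error("HTTP %s for %s", response.status_code, url)
        return None
    return response.text


def fake_fetch(url):
    # Hypothetical behaviour: pretend one URL is blocked and the other is fine.
    if url.endswith("blocked"):
        return FakeResponse(403)
    return FakeResponse(200, "<html><title>Fire Watch 2</title></html>")


print(fetch_text("https://example.invalid/ok", fake_fetch))
print(fetch_text("https://example.invalid/blocked", fake_fetch))
```

The explicit `is None` test matters with real requests-style responses, which evaluate as false whenever the status code signals an error.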
|
Glad I can help. Cheers! |
One last thing: Adult Film Database.
I don't know what they changed, but the code to scrape it now fails. If you have the time, send me a few pointers so I can get this agent working again.
Your help has been much appreciated.
I will implement the changes you have sent in getFilmOnIAFD today and get back to you with the results as soon as possible.
I think I better have a date with Morpheus now... been up all night...
Jason xxx
…On Fri, 19 Aug 2022 at 08:29, fivedays555 ***@***.***> wrote:
Glad I can help. Cheers!
|
Not sure what you need. But IAFD has a very sensitive request rate limit. To be safe, I put a delay before each IAFD request: time.sleep(randint(100, 200)/10). All IAFD requests also need the cloudscraper function. Let me know if you need more information. |
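The `time.sleep(randint(100, 200)/10)` call quoted in this thread pauses for 10 to 20 seconds per request. A small sketch of the same idea as a reusable helper; the name `polite_delay` and the injectable `sleep` argument are assumptions made for illustration, and `random.uniform` is swapped in for `randint(...)/10` so the result does not depend on Python 2 versus Python 3 division.

```python
import random
import time


def polite_delay(min_seconds=10.0, max_seconds=20.0, sleep=time.sleep):
    """Sleep for a random interval and return the number of seconds chosen."""
    seconds = random.uniform(min_seconds, max_seconds)
    sleep(seconds)
    return seconds


# Demonstration with a recording stand-in so the example finishes instantly.
waits = []
chosen = polite_delay(sleep=waits.append)
print(round(chosen, 1))
```

In the agent, calling `polite_delay()` with the default `sleep` just before each IAFD request would reproduce the behaviour described above.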
The last request has to do with another agent, Adult Film Database, not IAFD.
Rather than just building a search string, one has to create form data and headers and perform a POST request.
A right pain in the nethers when it stops working.
I will put that random sleep into the IAFD code, in the cloudscraper section.
Thanks once again...
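Submitting form data with custom headers usually means a POST request. A minimal sketch of building one with the standard library; the URL, the `search` field name, and the function `build_search_request` are placeholders invented for this example, since the thread does not show the real Adult Film Database form.

```python
try:  # Python 3
    from urllib.parse import urlencode
    from urllib.request import Request
except ImportError:  # Python 2
    from urllib import urlencode
    from urllib2 import Request


def build_search_request(url, title, referer):
    """Build (but do not send) a form-encoded POST request."""
    # Hypothetical field name; the real form may use different keys.
    form = urlencode({"search": title}).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Referer": referer,
        "User-Agent": "Mozilla/5.0",
    }
    # Supplying a request body makes the request a POST.
    return Request(url, data=form, headers=headers)


request = build_search_request(
    "https://example.invalid/search", "Fire Watch 2", "https://example.invalid/")
print(request.get_method())
print(request.data)
```

When a site changes its form, the field names and required headers are the first things to compare against what the browser's developer tools show for a working search.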
…On Fri, 19 Aug 2022 at 08:50, fivedays555 ***@***.***> wrote:
Not sure what you need. But IAFD has a very sensitive request rate limit.
To be safe, I put delay for each IAFD request as
time.sleep(randint(100, 200)/10)
And all IAFD requests would need the cloudscraper function.
Let me know if you need more information.
|
Oh, I did not realize it was for Adult Film Database. I never touch or use the Adult Film Database agent, so I don't really know. Mostly, I am using Waybig, Fagalicious, Queerclick, and IAFD. They cover almost everything I need. I took a look at Adult Film Database (https://www.adultfilmdatabase.com/), and I think there are very few gay titles there. |
@JPH71 Can this be closed? |
Yes it can...
…On Thu, 29 Dec 2022, 04:13 Cody Berenson, ***@***.***> wrote:
@JPH71 <https://github.com/JPH71> Can this be closed?
|
No don't... I haven't sorted out IAFD posters yet
…On Thu, 29 Dec 2022, 04:23 Jason Hudson, ***@***.***> wrote:
Yes it can...
On Thu, 29 Dec 2022, 04:13 Cody Berenson, ***@***.***>
wrote:
> @JPH71 <https://github.com/JPH71> Can this be closed?
|
IAFD has links to AEBN, GayHotMovies, GayDVDEmpire, and CD Universe, just like GEVI does.
With this in mind, I have put in code to scrape these external sites and get any data that is missing in IAFD, especially posters and background art.
Unfortunately, when the agent follows the asp link that points to the shop, I get a 403 Forbidden result.
In Chrome developer tools, when I pick the link, I can see a Location entry in the response header that points to the web page, as it does in GEVI. I need to find out how to access this.
One of you helped with the GEVI issues some months ago by setting up a referer header, which saved my bacon in more ways than one.
Could you give some suggestions in regard to this? The offending code is in utils.py, in the getFilmOnIAFD function.
Cheers
Jason xx
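The Location entry mentioned above can be read without following the redirect. A minimal Python 3 sketch using only the standard library; `read_redirect_target` is a hypothetical helper, and the small local server merely stands in for a shop-click link so the example runs offline. It does not reproduce IAFD's real behaviour, and a site that blocks the request outright (for example with a 403) will return no Location header at all.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


def read_redirect_target(url, headers=None):
    """Request `url` without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == "https"
                  else http.client.HTTPConnection)
    conn = conn_class(parts.netloc, timeout=30)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers=headers or {})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


class DemoRedirect(BaseHTTPRequestHandler):
    """Local stand-in that answers every GET with a 302 redirect."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", "https://shop.example.invalid/film/123")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), DemoRedirect)
threading.Thread(target=server.serve_forever, daemon=True).start()
demo_url = "http://127.0.0.1:%d/shopclick.asp?sku=1" % server.server_port
status, target = read_redirect_target(demo_url)
server.shutdown()
server.server_close()
print(status, target)
```

A `headers` dictionary (for instance with `Referer` and `User-Agent` entries) can be passed as the second argument when the site expects browser-like requests.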