Sourcery Starbot ⭐ refactored Liebmann5/Web_Scraper #2
base: main
-if not all(field in data for field in expected_data):
+if any(field not in data for field in expected_data):
     return False, 'Invalid data format'

 if data['Employment Type'] not in allowed_employment_types:
     return False, 'Invalid Employment Type'

 if data['Experience Level'] not in allowed_experience_levels:
     return False, 'Invalid Experience Level'

 #TODO: Add more checks like insurance it's within users country!!!
Function validate_job_data refactored with the following changes:
- Invert any/all to simplify comparisons (invert-any-all)
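The invert-any-all change rests on De Morgan's law: "not every field is present" means the same as "some field is missing". A minimal sketch of that equivalence, assuming a hypothetical `expected_data` list and sample dicts (the real field list is not shown in this diff):

```python
# Hypothetical field list; the real expected_data is not shown in the diff.
expected_data = ["Job Title", "Employment Type", "Experience Level"]


def missing_old(data):
    """Original form: negated all()."""
    return not all(field in data for field in expected_data)


def missing_new(data):
    """Refactored form: any() over the negated condition."""
    return any(field not in data for field in expected_data)


complete = {
    "Job Title": "Developer",
    "Employment Type": "Full-time",
    "Experience Level": "Entry",
}
partial = {"Job Title": "Developer"}

for sample in (complete, partial, {}):
    # Both forms agree on every input, including the empty dict.
    assert missing_old(sample) == missing_new(sample)
```

Both versions short-circuit on the first missing field, so the change is about readability rather than behavior.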
-# Create a signature
-signature = private_key.sign(
+return private_key.sign(
     data,
     padding.PSS(
         mgf=padding.MGF1(hashes.SHA256()),
-        salt_length=padding.PSS.MAX_LENGTH
+        salt_length=padding.PSS.MAX_LENGTH,
     ),
-    hashes.SHA256()
-)
-
-# return signature.hex() #used this when I didn't have any default_backend code!!!
-return signature
\ No newline at end of file
+    hashes.SHA256(),
+)
Function sign_data refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
This removes the following comments ( why? ):
# return signature.hex() #used this when I didn't have any default_backend code!!!
# Create a signature
Function CompanyWorkflow.company_workflow refactored with the following changes:
- Merge else clause's nested if statement into elif (merge-else-if-into-elif)
This removes the following comments ( why? ):
# NEW NEW NEW NEW
#TODO: refactor this!
#! FAILS: If "Internal-Job-Listings" is the initial webpage this ruins
-language_of_webpage = predictions[0][0].replace('__label__', '')
-#TODO: Determine whether this should go here or somewhere else!!
-# if language_of_webpage == 'en':
-# return True
-# else:
-# return False
-# = = = =
-# return language_of_webpage == 'en'
-return language_of_webpage
+return predictions[0][0].replace('__label__', '')
Function CompanyWorkflow.check_language_of_webpage refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
This removes the following comments ( why? ):
#TODO: Determine whether this should go here or somewhere else!!
# = = = =
# return False
# if language_of_webpage == 'en':
# else:
# return language_of_webpage == 'en'
# return True
-elements = {
+return {
Function CompanyWorkflow.url_parser refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
-if arg == 'greenhouse':
-    print(method_name)
-    print(arg)
-    for key, value in kwargs.items():
-        print(key + ": " + str(value))
-elif arg == 'lever':
+if arg in ['greenhouse', 'lever']:
     print(method_name)
     print(arg)
     for key, value in kwargs.items():
-        print(key + ": " + str(value))
+        print(f"{key}: {str(value)}")
Function CompanyWorkflow.print_companies_internal_job_opening refactored with the following changes:
- Simplify conditional into switch-like form (switch)
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
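The two branches did exactly the same work, which is why they collapse into one membership test. A small sketch of the merged branch, assuming a hypothetical `describe` helper that collects lines instead of printing so the result can be checked:

```python
def describe(method_name, arg, **kwargs):
    """Hypothetical helper: return the lines the merged branch would print."""
    lines = []
    # One membership test replaces the duplicated if/elif bodies.
    if arg in ("greenhouse", "lever"):
        lines.append(method_name)
        lines.append(arg)
        for key, value in kwargs.items():
            lines.append(f"{key}: {value}")
    return lines


print(describe("apply", "lever", openings=3))
```

Any other `arg` value yields an empty list, matching the original code, which printed nothing outside those two cases.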
-if re.search(experience_needed, everything_about_job):
-    return False
-else:
-    return True
+return not re.search(experience_needed, everything_about_job)
Function CompanyWorkflow.should_user_apply refactored with the following changes:
- Replace if statement with if expression (assign-if-exp)
- Simplify boolean if expression (boolean-if-exp-identity)
- Remove unnecessary casts to int, str, float or bool (remove-unnecessary-cast)
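The one-line form works because `re.search` returns a match object (truthy) or `None` (falsy), and `not` always yields a real `bool`. A minimal sketch, assuming a hypothetical experience pattern and job texts (the real `experience_needed` pattern is not shown in this diff):

```python
import re


def should_user_apply(experience_needed, everything_about_job):
    # `not` converts the Match-or-None result into True/False.
    return not re.search(experience_needed, everything_about_job)


# Hypothetical pattern: a number, optional plus sign, then "years".
pattern = r"\d+\+? years"

print(should_user_apply(pattern, "Requires 5+ years of Python"))
print(should_user_apply(pattern, "Entry level role, training provided"))
```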
-print("Result #" + str(self.job_links_counter) + " from Google Seaech")
+print(f"Result #{self.job_links_counter} from Google Seaech")
Function scraperGoogle.print_google_search_results refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
- Remove unnecessary calls to str() from formatted values in f-strings (remove-str-from-fstring)
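Dropping `str()` is safe because an f-string already formats each value to text. A short sketch showing that both spellings produce the same string; `result_header` is a hypothetical helper, and the message text is only illustrative:

```python
def result_header(index):
    """Hypothetical helper; index is zero-based, like a loop variable."""
    return f"Result #{index + 1} from Google Search"


# The explicit str() call adds nothing inside an f-string.
with_str = f"Result #{str(0 + 1)} from Google Search"
without_str = result_header(0)

print(without_str)
```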
-print("Result #" + str(i+1) + " from Google Seaech")
+print(f"Result #{str(i + 1)} from Google Seaech")
Function scraperGoogle.new_print_google_search_results refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
-print("Result #" + str(i+1) + " from Google Seaech")
+print(f"Result #{str(i + 1)} from Google Seaech")
Function scraperGoogle.new_new_print_google_search_results refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
 print("When you are done, type ONLY the number of your preferred web browser then press ENTER")
 print(f"\t1) FireFox")
 print(f"\t2) Safari")
 print(f"\t3) Chrome")
 print(f"\t4) Edge")
 while True:
     user_jobs = input()
     user_jobs.strip()

     if user_jobs == "1":
         users_browser_choice = " FireFox "
         break
     elif user_jobs == "2":
         users_browser_choice = " Safari "
         break
     elif user_jobs == "3":
         users_browser_choice = " Chrome "
         break
     elif user_jobs == "4":
         users_browser_choice = " Edge "
         break
     else:
         print("That's kinda messed up dog... I give you an opportunity to pick and you pick nothing.")
         print("You've squandered any further opportunities to decide stuff. I hope you are happy with yourself.")
         print("Don't worry, the council shall discuss and provide a pick for you!")
-        #TODO: Make else just check OS and return number of that OS's web browser!!!
-        #! THIS IS A while loop.... so it runs until false
 return users_browser_choice, browser_name
Function Workflow.users_browser_choice refactored with the following changes:
- Remove unreachable code (remove-unreachable-code)
This removes the following comments ( why? ):
#TODO: Make else just check OS and return number of that OS's web browser!!!
#! THIS IS A while loop.... so it runs until false
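As the removed note says, the loop only ends through a `break`, so an invalid answer simply prompts again. Note also that `user_jobs.strip()` in the hunk discards its result, because `str.strip()` returns a new string. A hedged sketch of the same menu as a dictionary lookup, assuming hypothetical `read` and `write` parameters (added here only so the loop can be exercised without a terminal) and returning just the browser name without the padding spaces used in the original:

```python
# Menu numbers mapped to the browser names shown in the prompt.
BROWSERS = {"1": "FireFox", "2": "Safari", "3": "Chrome", "4": "Edge"}


def choose_browser(read=input, write=print):
    """Prompt until a valid menu number is entered (hypothetical rewrite)."""
    write("Type ONLY the number of your preferred web browser then press ENTER")
    for number, name in BROWSERS.items():
        write(f"\t{number}) {name}")
    while True:
        # Keep the stripped value; strip() does not modify the string in place.
        choice = read().strip()
        if choice in BROWSERS:
            return BROWSERS[choice]
        write("Please enter a number from 1 to 4.")


# Scripted answers stand in for keyboard input: one invalid, then one valid.
answers = iter(["9", " 3 "])
messages = []
picked = choose_browser(read=lambda: next(answers), write=messages.append)
```

Returning from inside the loop removes the need for the chain of `break` statements and for any code after the loop.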
-def show_warning(message, category, filename, lineno, file=None, line=None):
-    print(f"Warning: {message}")
+def show_warning(self, category, filename, lineno, file=None, line=None):
+    print(f"Warning: {self}")
Function Workflow.show_warning refactored with the following changes:
- The first argument to instance methods should be self (instance-method-first-arg-name)
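The original parameter list matches the standard library hook `warnings.showwarning(message, category, filename, lineno, file=None, line=None)`. If the function is meant to serve as that hook rather than as a regular instance method (an assumption; the diff does not show how it is called), one possible alternative to the rename is a static method that keeps the descriptive `message` name. The `captured_warning_output` helper below is hypothetical and exists only to demonstrate the hook:

```python
import contextlib
import io
import warnings


class Workflow:
    @staticmethod
    def show_warning(message, category, filename, lineno, file=None, line=None):
        # Same parameters as warnings.showwarning, so it can replace the hook.
        print(f"Warning: {message}")


def captured_warning_output(text):
    """Emit one warning through the custom hook and return what was printed."""
    buffer = io.StringIO()
    # catch_warnings() restores the previous hook and filters on exit.
    with warnings.catch_warnings():
        warnings.simplefilter("always")
        warnings.showwarning = Workflow.show_warning
        with contextlib.redirect_stdout(buffer):
            warnings.warn(text)
    return buffer.getvalue()


print(captured_warning_output("disk almost full"), end="")
```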
-cleaned_text = clean(text,
-    fix_unicode=True, # fix various unicode errors
-    to_ascii=True, # transliterate to closest ASCII representation
-    lower=False, # lowercase text
-    no_line_breaks=remove_breaks, # fully strip line breaks as opposed to only normalizing them
-    no_urls=True, # replace all URLs with a special token
-    no_emails=True, # replace all email addresses with a special token
-    no_phone_numbers=True, # replace all phone numbers with a special token
-    no_numbers=False, # replace all numbers with a special token
-    no_digits=False, # replace all digits with a special token
-    no_currency_symbols=True, # replace all currency symbols with a special token
-    no_punct=False, # remove punctuations
-    replace_with_punct="", # instead of removing punctuations you may replace them
-    replace_with_url="",
-    replace_with_email="",
-    replace_with_phone_number="",
-    replace_with_number="",
-    replace_with_digit="0",
-    replace_with_currency_symbol="",
-    lang="en" # set to 'de' for German special handling
-)
-return cleaned_text
+return clean(
+    text,
+    fix_unicode=True, # fix various unicode errors
+    to_ascii=True, # transliterate to closest ASCII representation
+    lower=False, # lowercase text
+    no_line_breaks=remove_breaks, # fully strip line breaks as opposed to only normalizing them
+    no_urls=True, # replace all URLs with a special token
+    no_emails=True, # replace all email addresses with a special token
+    no_phone_numbers=True, # replace all phone numbers with a special token
+    no_numbers=False, # replace all numbers with a special token
+    no_digits=False, # replace all digits with a special token
+    no_currency_symbols=True, # replace all currency symbols with a special token
+    no_punct=False, # remove punctuations
+    replace_with_punct="", # instead of removing punctuations you may replace them
+    replace_with_url="",
+    replace_with_email="",
+    replace_with_phone_number="",
+    replace_with_number="",
+    replace_with_digit="0",
+    replace_with_currency_symbol="",
+    lang="en", # set to 'de' for German special handling
+)
Function Workflow.clean_gpt_out refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
 device = 0 if torch.cuda.is_available() else -1
 generator = pipeline("text-generation", model=model_name, device=device)
Function Workflow.test_gpt_neo refactored with the following changes:
- Remove unnecessary call to str() within print() (remove-str-from-print)
-print(f"Alright the next big setup is SpaCy!")
+print("Alright the next big setup is SpaCy!")
 print("\t1) en_core_web_sm => 12 MB")
 print("\t2) en_core_web_md => 40 MB")
 print("\t3) en_core_web_lg => 560 MB")
-#If this is chosen you want to run => 'python -m spacy download en_core_web_lg'
Function UntouchedUser.set_spacy refactored with the following changes:
- Replace f-string with no interpolated values with string (remove-redundant-fstring)
This removes the following comments ( why? ):
#If this is chosen you want to run => 'python -m spacy download en_core_web_lg'
 if input_data['label'] is None:
     print("Dang so -> == None ...straight-up")
     continue
Function process_form_inputs refactored with the following changes:
- Remove redundant conditional (remove-redundant-if)
- Use named expression to simplify assignment and conditional [×2] (use-named-expression)
This removes the following comments ( why? ):
#self.fill_form(label, answer)
#! .get_matching_keys() does all the comaparing to get the right answer!!!!! ssooo there do special case check -> .env chack -> long q>a ... a>a check!!!
-success = self.troubleshoot_form_filling(element, value)
-if not success:
-    print("Failed to fill in the form. See the error messages above for details.")
-else:
+if success := self.troubleshoot_form_filling(element, value):
     print("Successfully filled in the form.")
Function fill_that_form refactored with the following changes:
- Use named expression to simplify assignment and conditional [×9] (use-named-expression)
- Swap if/else branches (swap-if-else-branches)
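A named expression (the `:=` operator, Python 3.8+) assigns a value and tests it in the same `if`. A minimal sketch of the pattern, assuming a hypothetical stand-in for `troubleshoot_form_filling` and a helper that returns its message instead of printing it:

```python
def troubleshoot_form_filling(element, value):
    """Hypothetical stand-in: succeed only when there is a value to enter."""
    return bool(value)


def fill_field(element, value):
    # Assign and test in one step; `success` stays available afterwards.
    if success := troubleshoot_form_filling(element, value):
        message = "Successfully filled in the form."
    else:
        message = "Failed to fill in the form. See the error messages above for details."
    return success, message


print(fill_field("email", "user@example.com"))
```

The name is only worth binding when it is used again after the test, as it is here in the return value; otherwise a plain `if troubleshoot_form_filling(...)` reads more simply.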
-if users_app_current_version < app_current_version:
-    return True
-return False
+return users_app_current_version < app_current_version
Function check_for_update refactored with the following changes:
- Lift code into else after jump in control flow (reintroduce-else)
- Replace if statement with if expression (assign-if-exp)
- Simplify boolean if expression (boolean-if-exp-identity)
- Remove unnecessary casts to int, str, float or bool (remove-unnecessary-cast)
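Returning the comparison directly is equivalent to the original if/return pair. One caveat applies only if the versions are stored as dotted strings (an assumption; the diff does not show their type): string comparison is lexicographic, so "1.9.0" sorts after "1.10.0". A hedged sketch that compares integer tuples instead, using a hypothetical `parse_version` helper:

```python
def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in text.split("."))


def check_for_update(users_app_current_version, app_current_version):
    # Return the comparison itself, as in the refactored code.
    return parse_version(users_app_current_version) < parse_version(app_current_version)


print(check_for_update("1.9.0", "1.10.0"))
```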
-#Validate and add job data
-result = add_job(job)
-return result
+return add_job(job)
Function add_job_endpoint refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
This removes the following comments ( why? ):
#Validate and add job data
-#Validate and add user data
-result = add_user(user)
-return result
+return add_user(user)
Function add_user_endpoint refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
This removes the following comments ( why? ):
#Validate and add user data
Thanks for starring sourcery-ai/sourcery ✨ 🌟 ✨
Here's your pull request refactoring your most popular Python repo.
If you want Sourcery to refactor all your Python repos and incoming pull requests install our bot.
Review changes via command line
To manually merge these changes, make sure you're on the main branch, then run: