Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit data URLs in css-variables #145

Merged
merged 2 commits into from
Sep 25, 2024
Merged

Limit data URLs in css-variables #145

merged 2 commits into from
Sep 25, 2024

Conversation

@max-ostapenko
Copy link
Contributor

It seems a summary part of the object is a duplicate (same content, different structure) of data available in parsed_css metric.

I'm able to find the variables/selectors/values there:

SELECT
  url, css
FROM `httparchive.all.parsed_css`
WHERE date = "2024-09-01"
AND client = 'mobile'
AND NOT is_root_page
AND rank <= 10000
and page = 'https://www.morgenpost.de/politik/article407241629/game-changer-fuer-die-ukraine-bahnt-sich-an-putin-droht.html'
AND css LIKE "%--playkit-icon-quality-HD-active-url%"

I can't give a guarantee, but it should be possible to reuse it for various metrics calculation.

The other part of the object - custom_metrics.css-variables.computed seems more unique.

So a quick optimization could be something as follows:

  1. double-check if anybody is using custom_metrics.css-variables.summary (last 90 days of logs don't show any queries),
  2. drop it in favor of parsed_css,
  3. BTW URLs are not shortened in parsed_css either. Apply this change there instead.

@tunetheweb
Copy link
Member Author

@rviscomi do you know the difference between these two custom metrics and whether we need both?

@rviscomi
Copy link
Member

css-variables gives us runtime data about custom property usage

parsed_css gives us the raw stylesheet AST

They're similar in that they're both used for Web Almanac CSS analysis, but both are needed

@tunetheweb
Copy link
Member Author

OK, then I'm less worried about triming parsed_css data URLs since it's in it's own table and massive anyway. Plus I'm not 100% confident of what change to make and where.

I do think there's an argument to move both css_variables and parsed_css into the one table (and thus reduce the size of custom_metrics column in the main all.pages table some more) but let's leave that until the after the Web Almanac is finished.

So in meantime I think this is good to merge?

Copy link

Custom metrics for https://almanac.httparchive.org/en/2022/

WPT test run results: http://webpagetest.httparchive.org/results.php?test=240925_03_1W
Changed custom metrics values:

{
  "_css-variables": {
    "summary": {}
  }
}

Webpage test results that are too big for a comment are available as the action's artifact.

@max-ostapenko
Copy link
Contributor

We definitely need to split css_variables due to the size, added to the list.

And yeah, good to merge.

@tunetheweb tunetheweb merged commit 4b6f43b into main Sep 25, 2024
5 checks passed
@tunetheweb tunetheweb deleted the limit-data-urls-css branch September 25, 2024 18:07
max-ostapenko pushed a commit that referenced this pull request Oct 14, 2024
max-ostapenko added a commit that referenced this pull request Oct 14, 2024
* commenting

* test metric change

* formatting

* typo

* formatting

* comment restructured

* options fix

* Limit data URLs in css-variables (#145)

* link updates

---------

Co-authored-by: Barry Pollard <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants