Stats about HTTP response security headers usage mentioned by the OSHP.

oshp/oshp-stats

OWASP Secure Headers Project statistics

📊 Statistics about HTTP response security headers usage mentioned by the OWASP Secure Headers Project (OSHP).

💾 This project gathers data about the usage of HTTP response security headers into a SQLite database, so that statistics can be generated in a later step.

💡 See this issue for details.

Data source

Tip

💡 The MAJESTIC Top 1 million sites CSV file was used instead of the CISCO one because it contains fewer malware domains.

# Download the MAJESTIC Top 1 million sites CSV file
$ wget http://downloads.majestic.com/majestic_million.csv
# Transform the downloaded file into an input source that uses the same format
# as the CISCO Top 1 million sites CSV file (ranking,domain)
$ cat majestic_million.csv | awk -F  "," 'NR>1 {print $1 "," $3}' > data/input.csv
$ rm majestic_million.csv
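
For readers who prefer to do the transformation in Python, the awk step above can be sketched as follows. This is a minimal sketch, not the project's own code: the function name `majestic_to_input` is hypothetical, and the sample column names are illustrative. The only assumption taken from the command above is that the ranking is in column 1 and the domain in column 3.

```python
import csv
import io


def majestic_to_input(src, dst):
    """Write 'ranking,domain' lines to dst from a MAJESTIC-style CSV in src.

    Hypothetical helper: mirrors awk's NR>1 {print $1 "," $3}.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    next(reader, None)  # skip the header row, like NR>1 in awk
    for row in reader:
        if len(row) >= 3:  # ignore short or empty lines
            writer.writerow([row[0], row[2]])


# Tiny in-memory sample; the header names are illustrative only.
sample = io.StringIO("GlobalRank,TldRank,Domain\n1,1,example.com\n2,2,example.org\n")
out = io.StringIO()
majestic_to_input(sample, out)
print(out.getvalue(), end="")
```

With real files, `src` and `dst` would be file objects opened with `newline=""`, as the `csv` module documentation recommends.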

Scripts

Note

📦 All scripts are stored in the scripts folder and are written in Python 3.x.

Important

⚠️ The script generate_stats_md_file is no longer used: it was replaced by a workflow on the main OSHP site.

💻 Visual Studio Code is used for script development. A Visual Studio Code workspace file, with recommended extensions, is provided for the project.

📑 Files:

  • gather_data: Script that gathers information about HTTP security headers usage into a SQLite database, based on the "MAJESTIC Top 1 million sites CSV file" data source.
  • generate_stats_md_file: Script that uses the gathered data to generate/update the markdown file stats, with mermaid pie charts showing different statistics about HTTP security headers usage (⚠️ not used anymore).
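
To make the gathering step more concrete, here is a minimal sketch of how security headers from one HTTP response could be filtered and stored in SQLite. It is not the actual gather_data script: the table name `header_usage`, its columns, the helper names, and the header list (an illustrative subset of headers discussed by the OSHP) are all assumptions made for this example.

```python
import sqlite3

# Illustrative subset only; the real script may track a different list.
SECURITY_HEADERS = (
    "strict-transport-security",
    "content-security-policy",
    "x-frame-options",
    "x-content-type-options",
    "referrer-policy",
    "permissions-policy",
)


def extract_security_headers(headers):
    """Return only the security headers, with lower-cased names."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in SECURITY_HEADERS if name in lowered}


def store_headers(conn, domain, headers):
    """Insert one row per security header found; hypothetical schema."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS header_usage (domain TEXT, header TEXT, value TEXT)"
    )
    rows = [(domain, n, v) for n, v in extract_security_headers(headers).items()]
    conn.executemany("INSERT INTO header_usage VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# In-memory database and hand-written headers, so the sketch needs no network.
conn = sqlite3.connect(":memory:")
response_headers = {
    "Content-Type": "text/html",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "no-referrer",
}
stored = store_headers(conn, "example.com", response_headers)
print(stored)
```

In a real run the `headers` mapping would come from an HTTP response, for example the `headers` attribute of the object returned by `urllib.request.urlopen`.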

Data

Note

📦 All data files are stored in the data folder.

📑 Files:

  • input.csv: MAJESTIC Top 1 million sites list, formatted as one ranking,domain entry per line.
  • data.db: SQLite database with information about HTTP security headers usage.
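
As an illustration of the kind of statistic such a database allows, the sketch below computes the share of sites that send a given header. The schema is hypothetical (a `header_usage` table with `domain` and `header` columns) and does not describe the real layout of data.db; the function name `usage_share` is also made up for this example.

```python
import sqlite3


def usage_share(conn, header, total_sites):
    """Percentage of the scanned sites that returned the given header."""
    (count,) = conn.execute(
        "SELECT COUNT(DISTINCT domain) FROM header_usage WHERE header = ?",
        (header,),
    ).fetchone()
    return round(100.0 * count / total_sites, 2)


# In-memory stand-in for data.db with a hypothetical schema and sample rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE header_usage (domain TEXT, header TEXT)")
db.executemany(
    "INSERT INTO header_usage VALUES (?, ?)",
    [
        ("a.example", "x-frame-options"),
        ("b.example", "x-frame-options"),
        ("b.example", "referrer-policy"),
    ],
)
share = usage_share(db, "x-frame-options", total_sites=4)
print(share)
```

Here `total_sites` is passed in explicitly because sites that send no security header at all would not appear in this hypothetical table.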

Data and statistics update

Note

💡 Only the first 150000 entries of the CSV data source are used, so that processing fits within the time limit allowed for GitHub Actions workflows on the free tier.
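
Limiting the input to the first N entries can be done without loading the whole file, as in the following minimal sketch. The function name `first_entries` is hypothetical; only the limit of 150000 and the ranking,domain line format come from this README.

```python
import csv
import io
from itertools import islice


def first_entries(lines, limit=150000):
    """Yield (ranking, domain) tuples for at most `limit` CSV lines."""
    for row in islice(csv.reader(lines), limit):
        yield int(row[0]), row[1]


# Small in-memory sample in the input.csv format, with a reduced limit.
sample = io.StringIO("1,example.com\n2,example.org\n3,example.net\n")
entries = list(first_entries(sample, limit=2))
print(entries)
```

Because `islice` stops reading once the limit is reached, the remaining lines of a one-million-entry file are never parsed.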

💻 The update is scheduled in the following way:

  • On the first day of every month, the database is updated via this workflow.
  • On the fifth day of every month, the statistics are updated via this workflow.

Note