Methodology
How we collect, classify, and analyze hiring market data, including what the data covers and where it falls short.
Data Sources
We aggregate job postings from 6 sources: 5 applicant tracking systems (ATS), which we read directly from employer career pages, and 1 job board aggregator that adds broader market coverage.
ATS Sources (Direct)
- Greenhouse -- 450+ company career pages via browser automation
- Lever -- 180+ companies via public API
- Ashby -- 170+ companies via API (best salary data)
- Workable -- 135+ companies via API (workplace type)
- SmartRecruiters -- 35+ companies via API (experience level)
Aggregator Sources
- Adzuna -- Job board aggregator covering broad market activity
ATS sources provide higher-quality structured data (salary, location type, working arrangement). Adzuna provides broader coverage but with less structured metadata.
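One way to picture the difference between the two source types is a single record shape in which the structured fields are optional. The sketch below is illustrative only: the `Posting` class, its field names, and the `field_coverage` helper are assumptions made for this example, not the schema used in production.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Posting:
    # Fields every source is assumed to supply.
    source: str
    title: str
    employer: str
    url: str
    # Structured fields that may be missing, especially for aggregator rows.
    salary_min: Optional[int] = None
    salary_max: Optional[int] = None
    working_arrangement: Optional[str] = None


def field_coverage(postings, field_name):
    """Return the share of postings where the given field is populated."""
    if not postings:
        return 0.0
    filled = sum(1 for p in postings if getattr(p, field_name) is not None)
    return filled / len(postings)
```

Reporting a metric such as salary range alongside `field_coverage(postings, "salary_min")` makes it clear how many postings the figure actually rests on.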
Geographic Coverage
We currently track 5 major hiring markets. Coverage varies by city: London and NYC have the deepest coverage, and Singapore was added most recently.
Job Families
Reports cover three professional families:
- Data & Analytics: data engineers, analysts, scientists, ML engineers, analytics engineers, BI specialists
- Product Management: product managers, product owners, growth PMs, platform PMs, technical PMs
- Project & Delivery: project managers, delivery managers, programme managers, scrum masters, agile coaches
Classification
Every job posting passes through a multi-stage classification pipeline:
1. Pre-filtering: Title and location pattern matching reduces volume by ~95% before LLM processing, focusing only on relevant roles.
2. Agency detection: Known recruitment agencies are filtered out using a maintained blocklist, removing 10-15% of raw postings.
3. LLM classification: Gemini 2.5 Flash extracts structured fields: role subfamily, seniority level, skills, working arrangement, and track (IC/Management).
4. Deduplication: URL-based deduplication prevents the same posting from being counted multiple times across scraping runs.
5. Skill normalization: Raw skill mentions are mapped to a curated taxonomy using exact match and fuzzy normalization.
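The non-LLM stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production pipeline: the title pattern, the blocklist entry, the taxonomy, the 0.85 fuzzy cutoff, and the URL canonicalization rule are all hypothetical, the LLM step is omitted, and the raw `skills` list stands in for what that step would extract. Fuzzy matching here uses `difflib.get_close_matches` from the standard library as one possible technique.

```python
import re
from difflib import get_close_matches
from urllib.parse import urlsplit, urlunsplit

# Hypothetical pattern, blocklist, and taxonomy, for illustration only.
TITLE_PATTERN = re.compile(r"\b(data|analytics|product|project|delivery)\b", re.IGNORECASE)
AGENCY_BLOCKLIST = {"acme recruiting"}
SKILL_TAXONOMY = {"python": "Python", "sql": "SQL", "tableau": "Tableau", "kubernetes": "Kubernetes"}


def passes_prefilter(title):
    """Stage 1: keep only titles that match a relevant-role pattern."""
    return TITLE_PATTERN.search(title) is not None


def is_agency(employer):
    """Stage 2: case-insensitive lookup against the agency blocklist."""
    return employer.strip().lower() in AGENCY_BLOCKLIST


def canonical_url(url):
    """Stage 4 helper: lowercase scheme and host, drop trailing slash and fragment.

    The query string is kept, since some job URLs carry the job ID there.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/"), parts.query, ""))


def normalize_skill(raw, cutoff=0.85):
    """Stage 5: exact taxonomy match first, then a fuzzy fallback; None if no match."""
    key = raw.strip().lower()
    if key in SKILL_TAXONOMY:
        return SKILL_TAXONOMY[key]
    close = get_close_matches(key, list(SKILL_TAXONOMY), n=1, cutoff=cutoff)
    return SKILL_TAXONOMY[close[0]] if close else None


def run_pipeline(postings):
    """Apply pre-filter, agency filter, URL dedup, and skill normalization in order."""
    seen_urls = set()
    kept = []
    for posting in postings:
        if not passes_prefilter(posting["title"]) or is_agency(posting["employer"]):
            continue
        url = canonical_url(posting["url"])
        if url in seen_urls:
            continue
        seen_urls.add(url)
        normalized = {normalize_skill(s) for s in posting.get("skills", [])}
        skills = sorted(s for s in normalized if s is not None)
        kept.append({**posting, "url": url, "skills": skills})
    return kept
```

Running the cheap filters before anything else mirrors the ordering in the list: pattern matching and blocklist lookups cost almost nothing, so they shrink the volume that reaches the more expensive classification step.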
Quality Controls
Limitations
1. Coverage bias: Our ATS sources skew toward tech-forward, VC-backed, and mid-size companies. Large enterprises using Workday, Taleo, or internal systems are underrepresented.
2. Compensation gaps: Salary data depends on employer disclosure. Markets without pay transparency legislation (most of Europe, Singapore) have much lower disclosure rates.
3. Working arrangement data: Only available from ATS-sourced roles with structured fields. Adzuna-sourced roles lack this metadata.
4. Classification accuracy: LLM classification is not perfect. Edge cases (hybrid PM/engineer roles, ambiguous seniority) may be misclassified.
5. Temporal coverage: Each report reflects a snapshot of a single month. A single month-over-month change should be interpreted cautiously.
Current Dataset
- Jobs tracked: 7,944+
- Employers: 4,453+
- Cities: 5
- Reports: 19