Methodology
Sources
- SAHA İstanbul — 1,213 firms
- OSSA / OSTİM Savunma — 306 firms
- BASDEC — 96 firms
- HUKD — 48 firms
- DASSAD — 19 firms
- SİBER KÜME — 216 firms
Deduplication
Across the six clusters, raw records total 1,898. Cross-cluster duplicates are merged using a two-pass pipeline:
- Domain match: firms sharing a normalized website domain (eTLD+1) are treated as the same entity.
- Fuzzy name match: firms whose names have a token-set similarity score of 95 or higher, and whose website domains do not conflict, are merged.
Merged result: 1,751 unique firms, 147 fewer than the 1,898 raw records. A further 13 borderline pairs are held in a manual review queue.
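The two passes above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the record fields (`name`, `website`), the helper names, the tiny hard-coded suffix list (a real eTLD+1 lookup would use the full Public Suffix List), and the reading of "domain conflict" as "both firms have a domain and the domains differ" are all hypothetical. The similarity score is a standard-library approximation of the token-set ratio found in fuzzy-matching libraries.

```python
from difflib import SequenceMatcher
from itertools import combinations
from urllib.parse import urlparse

# Illustrative subset only; a real pipeline would consult the Public Suffix List.
TWO_LEVEL_SUFFIXES = {"com.tr", "org.tr", "net.tr", "gov.tr", "edu.tr"}

def registrable_domain(url):
    """Approximate eTLD+1 for a website URL, or None if it cannot be derived."""
    if not url:
        return None
    host = urlparse(url if "//" in url else "//" + url).hostname
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    suffix_len = 2 if ".".join(labels[-2:]) in TWO_LEVEL_SUFFIXES else 1
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])

def token_set_ratio(a, b):
    """Token-set similarity on a 0-100 scale (stdlib approximation)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    common = " ".join(sorted(ta & tb))
    s1 = (common + " " + " ".join(sorted(ta - tb))).strip()
    s2 = (common + " " + " ".join(sorted(tb - ta))).strip()
    pairs = [(common, s1), (common, s2), (s1, s2)]
    return max(100 * SequenceMatcher(None, x, y).ratio() for x, y in pairs)

def dedupe(firms, threshold=95):
    """Group records that refer to the same firm; returns a list of groups."""
    parent = list(range(len(firms)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(j)] = find(i)

    domains = [registrable_domain(f.get("website")) for f in firms]

    # Pass 1: records sharing a normalized domain are the same entity.
    first_seen = {}
    for i, dom in enumerate(domains):
        if dom is None:
            continue
        if dom in first_seen:
            union(first_seen[dom], i)
        else:
            first_seen[dom] = i

    # Pass 2: near-identical names merge unless their domains conflict.
    # Assumption: a conflict means both domains are known and they differ.
    for i, j in combinations(range(len(firms)), 2):
        conflict = (domains[i] is not None and domains[j] is not None
                    and domains[i] != domains[j])
        if not conflict and token_set_ratio(firms[i]["name"], firms[j]["name"]) >= threshold:
            union(i, j)

    groups = {}
    for i, firm in enumerate(firms):
        groups.setdefault(find(i), []).append(firm)
    return list(groups.values())
```

The pairwise comparison in the second pass is quadratic in the number of records, which is acceptable at roughly 1,900 records but would need blocking on a larger dataset. Because merges are transitive, a record without a domain can link two records whose domains differ, which is one reason a manual review queue for borderline pairs is useful.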
Geocoding
Addresses are geocoded with OpenStreetMap Nominatim. Street-level coordinates are used when available; otherwise the firm is placed at the province centroid.
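The fallback rule above might look like the following Python sketch. It is an illustration under assumptions, not the code used for this dataset: the function names, the `CENTROIDS` table and its approximate coordinates, the User-Agent string, and the choice to treat a Nominatim `place_rank` of 26 or higher as "street-level" are all hypothetical. The lookup is passed in as a parameter so the placement logic can be exercised without network access.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: results ranked 26 or higher are treated as street-level or finer.
STREET_RANK = 26

# Illustrative approximate values only, not the project's centroid table.
CENTROIDS = {"Ankara": (39.93, 32.86), "İstanbul": (41.01, 28.98)}

def nominatim_lookup(address):
    """Return (lat, lon) for a street-level hit from Nominatim, else None."""
    query = urllib.parse.urlencode({"q": address, "format": "jsonv2", "limit": 1})
    request = urllib.request.Request(
        "https://nominatim.openstreetmap.org/search?" + query,
        # The public instance expects an identifying User-Agent (placeholder here).
        headers={"User-Agent": "firm-map-sketch/0.1 (contact@example.org)"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        hits = json.load(response)
    time.sleep(1)  # stay within one request per second on the public instance
    if hits and int(hits[0].get("place_rank", 0)) >= STREET_RANK:
        return float(hits[0]["lat"]), float(hits[0]["lon"])
    return None

def place_firm(address, province, lookup=nominatim_lookup, centroids=CENTROIDS):
    """Use street-level coordinates when available, else the province centroid."""
    hit = lookup(address)
    if hit is not None:
        return (*hit, "street")
    lat, lon = centroids[province]
    return (lat, lon, "province_centroid")
```

Recording the precision label alongside the coordinates lets a map distinguish firms placed at a real address from those stacked on a province centroid.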
Known limitations
- TSSK (Ankara / ODTÜ Teknokent, ~114 firms) is excluded because its public directory returned HTTP 404 at scrape time.