In my latest research endeavor, I embarked on an exploration of the vast landscape of web3 companies to unveil the macro trends shaping the industry. Join me on this captivating journey as I reveal my unique process and methodology.
To conduct this analysis, I utilized publicly available data from LinkedIn as my primary source. This treasure trove of information served as the foundation of my research, enabling me to delve into the web3 ecosystem and unlock its secrets.
🔎 "Training set"
I manually collected "True Positives": around 25 blockchain startups and big companies to create a diverse sample for analysis. This included startups at different stages to get a broader perspective.
You can also google something like "best defi protocols" for inspiration.
Let's dive into the metadata to find similarities between these companies: they will be really helpful for finding more blockchain companies.
📍 web3 startups: Industries & Countries
"Information technology & services" is the most popular industry in the collected sample, but its definition is too broad and it will produce a lot of false positives if we use only that filter. Less popular industries to consider: financial services; venture capital & private equity; health, wellness & fitness; online media.
The same logic applies when analysing countries: most of the companies are from the United States, but some web3 companies are also from Singapore, Germany, the British Virgin Islands, or the Cayman Islands.
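The industry and country tallies described above can be sketched with a plain counter. This is a minimal illustration, not the exact code behind the analysis: the records, the field names (`industry`, `country`) and the helper `field_distribution` are all hypothetical.

```python
from collections import Counter

# Hypothetical sample records; the field names and values are made up
# for illustration and are not the data used in this article.
companies = [
    {"name": "Alpha", "industry": "information technology & services", "country": "United States"},
    {"name": "Beta", "industry": "financial services", "country": "Singapore"},
    {"name": "Gamma", "industry": "information technology & services", "country": "United States"},
    {"name": "Delta", "industry": "venture capital & private equity", "country": "Cayman Islands"},
    {"name": "Epsilon", "industry": "information technology & services", "country": "Germany"},
]


def field_distribution(records, field):
    """Count how often each value of `field` occurs, most common first."""
    counts = Counter(r[field] for r in records if r.get(field))
    return counts.most_common()


industry_counts = field_distribution(companies, "industry")
country_counts = field_distribution(companies, "country")

# Print each industry with its share of the sample.
for industry, n in industry_counts:
    share = n / len(companies)
    print(f"{industry}: {n} ({share:.0%})")
```

A large share for one very generic industry is a hint that this field alone is a weak filter.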
These stats may seem boring for now: our goal is to find as many crypto and web3 startups as possible so we can analyse them at scale.
💭 web3 startups: keywords
Since one company may have several keywords, let's build a "word cloud".
We can see that companies on LinkedIn like to use web3-specific keywords such as blockchain, cryptocurrency, ethereum, smart contracts, bitcoin, defi, web3, decentralized finance. Other keywords may be misleading.
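The counting step behind a word cloud can be sketched like this. It is only an illustration under assumptions: the sample keyword lists are invented, and `keyword_frequencies` is a hypothetical helper name.

```python
from collections import Counter


def keyword_frequencies(keyword_lists):
    """Count in how many companies each keyword appears."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalise case and whitespace; using a set means a keyword
        # is counted at most once per company.
        normalised = {k.strip().lower() for k in keywords if k.strip()}
        counts.update(normalised)
    return counts


# Hypothetical keyword lists, one per company profile.
sample = [
    ["Blockchain", "DeFi", "Ethereum"],
    ["blockchain", "Bitcoin", "payments"],
    ["Web3", "blockchain ", "defi"],
]

frequencies = keyword_frequencies(sample)
print(frequencies.most_common(3))
```

The resulting keyword-to-count mapping is the kind of input a word cloud renderer typically accepts, for example `WordCloud().generate_from_frequencies(...)` from the `wordcloud` package, if that is the tool you use.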
The same can be done for the "description" field. Don't forget to tokenise the text properly. I'm feeling lazy now, so I'll continue to use only proper keywords.
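For anyone less lazy, here is a rough sketch of what tokenising a description could look like. It assumes a tiny hand-written stopword list and a simple regex split, both of which are placeholders; a real pipeline would likely use a proper NLP tokenizer. The function name `terms` is hypothetical.

```python
import re

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "and", "of", "for", "to", "in", "on", "is", "we", "our", "with"}


def terms(text):
    """Split a description into lowercase unigrams and bigrams.

    Bigrams are kept so that multi-word keywords such as
    "smart contracts" survive tokenisation.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    unigrams = [w for w in words if w not in STOPWORDS]
    bigrams = [
        f"{a} {b}"
        for a, b in zip(words, words[1:])
        if a not in STOPWORDS and b not in STOPWORDS
    ]
    return unigrams + bigrams


print(terms("We build smart contracts for DeFi."))
```

The output of `terms` can be counted across all descriptions in the same way as the keyword field.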
🕷️ More keywords after data enrichment
Since we found keywords that can help us find more true positives for our web3 startup list, let's parse as many startups as possible and extend our keyword list with new ideas.
After scraping 10k companies with the keywords mentioned above, let's calculate a new list of top keywords to find more candidates:
Now it's clear: we should add these keywords to get the largest crypto list possible: nft, ico, digital assets, and some variants: nfts, smart contracts, cryptocurrencies.
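The enrichment step can be sketched as follows: keep only companies that match at least one seed keyword, then count the other keywords they use. This is an illustration under assumptions, not the actual scraping pipeline: the scraped lists are invented, `candidate_keywords` is a hypothetical helper, and the seed set is the keyword list from the previous section (with the "smart contracts" spelling assumed).

```python
from collections import Counter

# Seed keywords taken from the first keyword list in this article.
SEED_KEYWORDS = {
    "blockchain", "cryptocurrency", "ethereum", "smart contracts",
    "bitcoin", "defi", "web3", "decentralized finance",
}


def candidate_keywords(keyword_lists, seeds, top_n=10):
    """Rank non-seed keywords used by companies that match a seed keyword."""
    counts = Counter()
    for keywords in keyword_lists:
        normalised = {k.strip().lower() for k in keywords if k.strip()}
        # Only companies sharing at least one seed keyword contribute.
        if normalised & seeds:
            counts.update(normalised - seeds)
    return counts.most_common(top_n)


# Hypothetical scraped keyword lists, one per company.
scraped = [
    ["blockchain", "NFT", "gaming"],
    ["defi", "nft", "digital assets"],
    ["web3", "ICO", "nft"],
    ["marketing", "seo"],  # no seed keyword, so it is ignored
]

candidates = candidate_keywords(scraped, SEED_KEYWORDS, top_n=3)
print(candidates)
```

The top candidates still need a manual look before they join the seed list, because frequent co-occurring keywords are not always web3-specific.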
After filtering and enriching the data, I created a comprehensive dataset of web3 startups. This dataset will serve as the foundation for my future analysis, where I will explore the trends, patterns, and opportunities within the web3 and blockchain startup ecosystem. Follow me on Twitter and LinkedIn to get the report when it's done.
Join me as I uncover the exciting world of web3 companies and shed light on the future of this revolutionary technology 👋
Countries with most blockchain startups
After collecting around 10k web3 startups, I got this country distribution ⬆️
And below is the distribution of keywords:
That's it. Reach out to me if you have any questions.