Research Methodology

Data Extraction

Conventions and Practical Difficulties in Extracting Data on ICOs

At present, BVH is quoting a higher “$ Amount Raised” for the total ICO market than most other commentators. This is purely a by-product of its data collection methodology. Indeed, BVH is agnostic as to the overall level of funds raised. There is no “good” or “bad” level. Nor is it seeking to have the biggest database or the highest amount raised. After all, in the fullness of time, ICOs may well cease to be the funding mechanism of choice for blockchain entrepreneurs. Already, there is a suspicion that the amount raised via Venture Capital funding has been seriously under-estimated in recent years, and there is also a palpably increased interest in Securitised Token Offerings. BVH has chosen to focus on the $ Amount Raised primarily because it considers money and capital flows to be one of the best bellwethers of blockchain’s potential impact in a macro context. (Coincidentally, it also helps to prevent double counting.)

The higher total “$ Amount Raised” is explained largely by the fact that BVH extracts its ICO data from a basket of ICO Listing Websites and aggregates the amount raised at the individual ICO level. The choice of Listing Websites may change over time but the basket currently includes ICO Bench, Coin Schedule, Smith & Crown, CoinDesk, Token Data, ICO Data, ICO Drops and ICO Watch List. The amount raised via ICO is extracted for each unique ICO name and the highest of the 8 reported values is then taken as the “BVH Amount Raised” for that ICO. This value is then summed across the 6,000 or so different ICOs in the BVH database, resulting in a current total amount raised in excess of $33 billion. In order to achieve a reasonable degree of data consistency, BVH plans to adhere to this data collection convention for as long as is practically possible. While this might appear simple at first glance, a number of practical difficulties complicate data collation and give rise to unforeseen errors. These include:

  • Difficulty in Identifying Unique ICOs: The entire BVH blockchain database contains upwards of 10,000 entities (including ICOs, Coins, Companies, etc.) and this number is currently growing by two or three hundred a month. As the same entity can be entered on a number of ICO Listing Websites, there is a risk it could be counted twice unless there is an exact match in name. Even the most trivial differences, like an additional “space” or the “Ltd” annotation at the end of the name, can result in double counting. Equally, an entity might list under its full formal name on one Listing Website but use an abbreviated version on another. Normally, the inclusion of a unique identifier or ticker might help to isolate these problems but unfortunately tickers in blockchain are far from unique! Luckily, the double-counting problem is more prevalent among upcoming ICOs. Much of the duplication gets corrected by the time an ICO reaches the end of its funding period, so the over-estimation of the total $ Amount Raised is reasonably well contained.

  • Differentiating Between Pre-Sale and Main Sale Funding: There is a lack of uniformity in the way the ICO Listing Websites account for funds raised at each stage of the ICO cycle. Some add the Pre-Sale and Main Sale amounts together while others don’t, so the amounts reported for the same ICO are not always directly comparable.

  • Attributing the Same Funds to Different Funding Mechanisms: Most ICOs appear to be meticulous in differentiating between different forms of funding. Unfortunately, a more opportunistic approach seems to have emerged recently whereby funds raised via early-stage VC rounds show up again under ICO funding. Presumably this is done for optics but it could hardly be described as best practice.

  • Inclusion: To be included in the BVH database, an ICO must first be listed on one of the 8 nominated Listing Websites. There is no direct entry mechanism. Once included in the database, it is difficult for an ICO to be excluded, though the amount raised will be adjusted, upwards or downwards, to reflect the values most recently reported by the ICO Listing Websites. This means that data collected for so-called “scams” is aggregated with all other data. No adjustment to historical data is made if an ICO Listing Website falls out of the basket that is being used. The decision as to which ICO Listing Websites to include in the basket will be at the sole discretion of BVH. Relevant criteria include the extent of each website’s market coverage and the accuracy, timeliness and ease with which its data can be collated.

  • ICO Start and End Dates: The Start and End Dates attributed to individual ICOs differ across the various ICO Listing Websites. Some quote the Pre-Sale dates, others quote the Main Sale dates, while some quote both. Equally, some Listing Websites assign the amount raised via ICO to the month when the funds were actually committed. This differentiation becomes statistically very significant for large ICOs such as EOS and for those whose End Dates have been deferred far into the future.

  • Country: Much ambiguity surrounds the country of origin for ICOs.

  • Classifying ICOs by Activity: The process of classifying ICOs by Activity can be highly subjective at times.
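
The aggregation convention described above (take the highest value reported for each unique ICO name, then sum across ICOs) can be illustrated with a short Python sketch. This is not BVH's actual tooling. The function names, the list of legal suffixes and the sample figures are all hypothetical and chosen only for illustration. The name normalisation addresses only the simplest causes of double counting mentioned above (extra spaces, letter case and a trailing “Ltd”); it does not resolve abbreviated names or clashing tickers.

```python
import re
from collections import defaultdict

# Hypothetical list of trailing legal suffixes to ignore when matching names.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "llc"}


def normalize_name(name):
    """Build a matching key: lower-case, punctuation removed,
    whitespace collapsed, trailing legal suffixes dropped."""
    words = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def bvh_amount_raised(reports):
    """Map each normalised ICO name to the highest amount reported for it.

    `reports` is an iterable of (website, ico_name, amount_usd) tuples.
    """
    best = defaultdict(float)
    for _website, ico_name, amount_usd in reports:
        key = normalize_name(ico_name)
        best[key] = max(best[key], amount_usd)
    return dict(best)


def total_amount_raised(reports):
    """Sum the per-ICO maxima to get the overall total."""
    return sum(bvh_amount_raised(reports).values())


# Made-up figures, for illustration only.
sample_reports = [
    ("ICO Bench", "Example Coin Ltd", 1_000_000.0),
    ("Coin Schedule", "example  coin", 1_200_000.0),
    ("Token Data", "Other Token", 500_000.0),
]

per_ico = bvh_amount_raised(sample_reports)
total = total_amount_raised(sample_reports)
```

Here the first two rows collapse to a single ICO after normalisation, so only the higher of the two reported amounts contributes to the total. Without the normalisation step, the same entity would be counted twice.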

Conventions and Practical Difficulties in Extracting VC Data

The recent hype attached to ICOs has diverted attention from traditional forms of funding. However, what we lack in the variety of sources covering funds raised via venture capital, we make up for in quality. Crunchbase seems to be very comprehensive and the most reliable source available.