Cryptheory

Stylometric Analysis: Satoshi Nakamoto

by Roman B.
January 2, 2021
in Attractions

Table of Contents

  • Problem Statement
  • Data
  • Methodology
  • Results
  • Satoshi Nakamoto was a group
  • Discussion
  • Future Work

Natural language processing tools were applied to Satoshi Nakamoto’s Bitcoin paper, comparing it with numerous cryptocurrency-related papers in an attempt to identify who the pseudonymous Satoshi Nakamoto really is.

The paper has two parts. The first is a stylometric analysis of the linguistic features and n-grams generated from each document in a corpus of the relevant literature listed on the Satoshi Nakamoto Institute. Machine learning models trained on those linguistic features are then used to predict which author or authors most resemble Satoshi Nakamoto’s Bitcoin paper and his personal email texts.

The second part is a semantic similarity analysis, in which the content of each document in the corpus is compared using similarity scores computed with the built-in functions of spaCy and gensim. Together, the results of the two parts suggest which author or authors in the corpus are linguistically and semantically similar to Satoshi Nakamoto.

Problem Statement

Bitcoin is a long-lived peer-to-peer digital currency that appeals to people who are skeptical of a monetary system heavily controlled by third parties such as central and commercial banks. Bitcoin has come to the world’s attention not just because of the currency itself, but also because of the technology behind it, which is called blockchain. The real identity of Satoshi Nakamoto, known as the creator of Bitcoin and blockchain, has been intensely debated in the Bitcoin community. Because Satoshi Nakamoto and the people involved in the early stage of the Bitcoin project interacted only via email, nobody has met him in person, and his identity is still unknown. Satoshi Nakamoto, who refused to reveal himself to the public out of concern for his privacy, left only a few writings: one is the paper “Bitcoin: A Peer-to-Peer Electronic Cash System”, which describes how Bitcoin works using blockchain, and another is a set of email exchanges between Satoshi and the people involved in the early stage of Bitcoin.

The paper sets out to answer one question about Satoshi Nakamoto: “Who is linguistically and semantically similar to Satoshi Nakamoto?” It applies stylometric and semantic similarity analyses to the relevant literature listed on the Satoshi Nakamoto Institute, the Bitcoin paper, and Satoshi’s email exchanges to find out which author or authors are closest to him in style and content. Stylometric analysis studies linguistic style to suggest whether a text belongs to a certain author based on linguistic features. Semantic similarity analysis indicates whether the content or meaning of one text is similar to that of another.

The true identity of Satoshi Nakamoto is important to the Bitcoin community. Satoshi is believed to hold approximately 1 million bitcoins, or about 5 percent of the 21 million maximum supply. That gives him the strongest influence on the Bitcoin economy: if he decided to sell some of his bitcoins, the market could respond by devaluing all existing bitcoins. Furthermore, revealing the true identity of Satoshi Nakamoto could lead to upgrades to blockchain and to new applications of blockchain in fields other than finance.

Data

The data were gathered using a Python module called Article. Only the literature available in HTML format on the Satoshi Nakamoto Institute was collected. The authors of the collected literature are those regarded in the Bitcoin community as possible candidates for Satoshi Nakamoto: Hal Finney, Ian Grigg, Nick Szabo, Timothy C. May, and Wei Dai. A total of 29 documents were collected: 6 Hal Finney texts, 2 Ian Grigg texts, 16 Nick Szabo texts, 2 Timothy C. May texts, 1 Wei Dai text, and 2 Satoshi Nakamoto texts, one being the Bitcoin paper and the other his email exchanges with others. For every author except Satoshi Nakamoto, the texts were combined into a single text file per author. The training corpus therefore contains one combined text for each of the five candidate authors, and the test corpus contains the two individual Satoshi Nakamoto texts.

Methodology

The stylometric analysis has three components: linguistic features, classification algorithms, and n-grams. A total of 10 linguistic features were generated and compared from author to author across the corpora. These features were generated using the sent_tokenize function and the stopwords list built into the nltk module. The descriptions of the features are provided in Table 1. Three classification algorithms (Support Vector Machine, Random Forest, and Gaussian Naive Bayes) were used to classify Satoshi Nakamoto’s Bitcoin paper and his email exchanges as belonging to one of the authors in the training corpus. The algorithms were trained only on the features of the candidate authors, not on those of Satoshi Nakamoto, and were all implemented in Python using the scikit-learn module.
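To make the idea of linguistic features concrete, here is a minimal sketch of how a few such features might be computed. It is not the study's code: the five feature names are hypothetical (Table 1 is not reproduced here), the regex sentence splitter is a crude stand-in for nltk's sent_tokenize, and the short stopword set stands in for the nltk stopwords list.

```python
import re

# Tiny illustrative stopword set (assumption); the study used nltk's list.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that"}


def split_sentences(text):
    """Naive split after ., ! or ? followed by whitespace.

    Stand-in for nltk.sent_tokenize, which handles abbreviations etc.
    """
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def stylometric_features(text):
    """Return a dict of simple style features for one document."""
    sentences = split_sentences(text)
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_sent = max(len(sentences), 1)  # guard against empty input
    n_words = max(len(words), 1)
    return {
        "avg_sentence_length": len(words) / n_sent,
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "stopword_ratio": sum(w in STOPWORDS for w in words) / n_words,
        "type_token_ratio": len(set(words)) / n_words,
        "commas_per_sentence": text.count(",") / n_sent,
    }


sample = ("The network timestamps transactions. "
          "It hashes them into a chain, and the chain grows.")
features = stylometric_features(sample)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Each author's combined text would yield one such feature vector, which is the kind of row a scikit-learn classifier could then be trained on.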

In addition, n-grams for n from 1 to 4 were produced for each document in the corpora using the nltk module. The tokens in each document were lemmatized with nltk.WordNetLemmatizer so that singular and plural forms of a word are not counted as different words. First, 1-grams (unigrams) were generated with and without stopwords. Then bigrams, trigrams, and 4-grams were created, compared, and analyzed to see whether Satoshi Nakamoto repeats certain word sequences and whether other authors use the same sequences in their writing.
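The n-gram comparison can be sketched with the standard library alone. This is an illustration under simplifying assumptions, not the nltk pipeline used in the study: tokens are not lemmatized, the tokenizer is a simple regex that keeps hyphenated terms together, n-grams are allowed to span sentence boundaries, and the two example texts are invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; hyphenated terms such as proof-of-work stay whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Invented example sentences, not quotations from the corpus.
satoshi = tokenize(
    "The proof-of-work chain is the longest chain. The longest chain wins."
)
other = tokenize("Nodes always extend the longest chain they know.")

satoshi_bigrams = ngram_counts(satoshi, 2)
other_bigrams = ngram_counts(other, 2)

# Bigrams that appear in both texts hint at shared phrasing.
shared = set(satoshi_bigrams) & set(other_bigrams)
print(satoshi_bigrams.most_common(2))
print(sorted(shared))
```

Running the same function with n set to 1, 3, or 4 gives the unigram, trigram, and 4-gram counts.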

The semantic similarity analysis used the built-in similarity functions of spaCy and gensim in Python. In spaCy, the .similarity() method was used to compare the content of one document with that of another, giving a score between 0 and 1, where 0 means the two documents are unrelated and 1 means their contents are identical. The model loaded with spacy.load('en_core_web_lg') was used; it provides 300-dimensional GloVe word vectors trained on Common Crawl, with 1.1m keys and 1.1m unique vectors. In gensim, similarities.MatrixSimilarity() was used to compute the cosine similarity, a number between -1 and 1, where the closer the number is to 1, the more similar the two documents are in content.
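Both libraries ultimately report a cosine similarity between vector representations of the documents. The sketch below shows only that measure, applied to plain word counts; it is not the spaCy or gensim code path, and the three short documents are invented for illustration. Because word counts are never negative, the score here falls between 0 and 1.

```python
import math
from collections import Counter


def cosine_similarity(counts_a, counts_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    shared_words = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared_words)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as unrelated
    return dot / (norm_a * norm_b)


# Invented mini-documents, not texts from the corpus.
doc_a = Counter("electronic cash without a trusted third party".split())
doc_b = Counter("digital cash without a central party".split())
doc_c = Counter("smart contracts and property titles".split())

print(round(cosine_similarity(doc_a, doc_b), 3))  # overlapping vocabulary
print(cosine_similarity(doc_a, doc_c))            # no shared words
```

Word-vector models such as en_core_web_lg replace the raw counts with dense vectors, so texts can score as similar even when they share few exact words.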

Results

All the classification algorithms in Table 3 predicted that Nick Szabo is linguistically similar to the Satoshi who wrote the Bitcoin paper and that Ian Grigg is linguistically similar to the Satoshi who wrote the emails. In Table 4, two unigrams stand out in Satoshi’s email exchanges: (‘would’, 31) and (‘one’, 29). Hal Finney uses the word ‘would’ 28 times and Nick Szabo uses the word ‘one’ 199 times. One unigram, the word ‘contract’, is commonly used by both Ian Grigg and Nick Szabo.

According to spaCy (Table 5), Wei Dai has the highest similarity score to the Bitcoin paper and Hal Finney has the highest similarity score to Satoshi’s email exchanges. According to gensim (Table 6), Timothy C. May has the highest similarity score to the Bitcoin paper and Ian Grigg has the highest similarity score to Satoshi’s email exchanges. One unusual result is that Ian Grigg has a similarity score of 0.99996 to Satoshi’s email exchanges (rounded up to 1.0 in the table).

Satoshi Nakamoto was a group

Based on the results, the Satoshi who wrote the Bitcoin paper may not be the same Satoshi who wrote the emails. Satoshi Nakamoto may be more than one person: the name could be a pseudonym for a team of computer scientists and cryptographers who were involved in creating Bitcoin and blockchain. Nick Szabo and Ian Grigg are the two authors who are linguistically similar to Satoshi Nakamoto in the Bitcoin paper and in his email texts, respectively. In addition, Wei Dai and Timothy C. May are two potential candidates for the Bitcoin paper in terms of semantic similarity, while Hal Finney and Ian Grigg are two possible candidates for Satoshi’s email exchanges. However, Hal Finney is known to have corresponded with Satoshi Nakamoto by email, so he should not be counted among the possible candidates for the Satoshi who wrote the emails; that leaves Ian Grigg, who is both linguistically and semantically similar to Satoshi Nakamoto. Therefore, the possible candidates for Satoshi Nakamoto are Nick Szabo, Ian Grigg, Wei Dai, and Timothy C. May.

Discussion

Satoshi Nakamoto used the phrase “proof-of-work” repeatedly throughout the Bitcoin paper, and Nick Szabo is the only author in the training corpus who used exactly the same phrase, in his blog post Bit gold. This supports the theory that Nick Szabo is very close to Satoshi in linguistic style. The document distances between the authors in the corpus were visualized in two dimensions using multidimensional scaling (MDS) from sklearn. In Pic 1, the distance between Ian Grigg and Nick Szabo is the shortest, suggesting that their texts are closely related, which might not be a coincidence. Wei Dai and Timothy C. May are far from each other and from both Nick Szabo and Ian Grigg, possibly suggesting that they are weaker candidates for Satoshi Nakamoto than Nick Szabo and Ian Grigg.

Future Work

It is worrisome that the classification algorithms were trained on only 5 data points with 10 features each. Ideally, machine learning models would be trained on a larger sample and evaluated with cross-validation. Alongside the absolute frequency of n-grams, the relative frequency could be added to the study. In addition, Craig Steven Wright, who has claimed to be Satoshi Nakamoto, could be added as one of the authors of the training corpus if suitable texts are available, since the classifiers and similarity measures would allow him to be compared with the current candidates. It would be interesting to see whether he would outperform Nick Szabo and Ian Grigg, the two strongest candidates for Satoshi Nakamoto.
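A short sketch shows why relative frequency matters when the texts differ greatly in length. The two unigram counts for the word ‘one’ are taken from the Results section, but the total token counts are HYPOTHETICAL placeholders chosen only to make the arithmetic visible; the study does not report corpus sizes.

```python
def relative_frequency(count, total_tokens):
    """Share of all tokens in a text taken up by one n-gram."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return count / total_tokens


# Counts of the unigram 'one' reported in the Results section.
one_in_satoshi_emails = 29
one_in_szabo_texts = 199

# HYPOTHETICAL corpus sizes (assumptions, not figures from the study).
satoshi_email_tokens = 5_800
szabo_tokens = 99_500

satoshi_rate = relative_frequency(one_in_satoshi_emails, satoshi_email_tokens)
szabo_rate = relative_frequency(one_in_szabo_texts, szabo_tokens)

print(f"Satoshi emails: {satoshi_rate:.4f}")
print(f"Nick Szabo:     {szabo_rate:.4f}")
```

Under these assumed sizes, the author with the much larger absolute count uses the word less often per token, which is the kind of reversal that absolute frequencies alone would hide.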
