Research Scholar, Delhi Private School, Dubai, UAE
sujaykatoch100@gmail.com
Keywords: Short-term volatility, Sentiment
analysis, Regression, Market prediction, Stocks
Over the past decade, social media
has changed how people talk about and interact with financial markets. Instead
of relying only on professional news outlets or financial reports, many
investors now get information and ideas from platforms such as X, Reddit, and
StockTwits, where opinions and rumors can spread to thousands of people in
seconds. This shift has made it easier for individual, or “retail,” investors
to share strategies, coordinate trades, and react quickly to new information.
At the same time, it has raised important questions about how these online
conversations might affect stock prices and the overall stability of markets.
Some of the clearest examples of
social media’s impact on markets are the meme stock events, such as the
dramatic rise of GameStop and AMC in 2021. Research on these episodes shows
that intense online discussions, especially in communities like Reddit’s
WallStreetBets, helped drive unusual
price movements and periods of extremely high volatility. In these cases, many
investors appeared to buy and hold stocks based on viral posts and shared
emotions, rather than on traditional measures like earnings or cash flow. This
behavior challenges the idea that markets always move mainly because of new
fundamental information and suggests that crowd sentiment itself can sometimes
become a powerful force.
More broadly, academic studies and
preprints have begun to examine how social media sentiment relates to both
stock returns and volatility. For example, some work finds that sentiment
extracted from X, Reddit, and StockTwits can be correlated with or even help
predict changes in market volatility indices and individual stock movements.
Other research using large datasets of tweets reports that negative sentiment
can have a stronger effect on price swings than positive sentiment, and that
smaller or more speculative stocks tend to be especially sensitive to online
mood. These findings fit with ideas from behavioral finance, which argue that
investor emotions, herd behavior, and attention can all influence trading
decisions and market outcomes.
However, most of this existing
research uses advanced methods such as natural language processing,
large‑scale data collection, and complex econometric models that are not
easily accessible to younger students or individual investors. There is room
for simpler studies that focus on a small number of stocks, use basic tools
like spreadsheets, and still explore the same core question: is there a visible
connection between what people say online about a stock and how risky or
volatile that stock is in the short term? A high‑school‑level
project can contribute by showing how even basic sentiment coding and simple
statistics can reveal patterns that are consistent with the more advanced
literature.
This paper focuses on the role of
social media sentiment in driving short‑term stock market volatility for
a small selection of popular, highly discussed stocks. The main research
question is: Do days with stronger or more negative social media sentiment
about a stock tend to coincide with higher short‑term volatility in that
stock’s price? Based on prior studies, I expect that higher posting activity
and more extreme sentiment—especially negative sentiment—will be associated with
larger daily price swings. To investigate this, I collect daily closing prices
and trading volumes from free online finance sources, and I manually record and
label a sample of social media posts for each stock and each day in a chosen
time period.
From these data, I construct simple
measures of volatility (using absolute daily returns), attention (the number of
posts), and sentiment (the average of positive, neutral, and negative labels).
I then use line graphs to compare how these measures change over time, looking
for days when spikes in posting or strong sentiment appear alongside spikes in
volatility. Next, I calculate correlations between volatility and the social
media variables, and, if possible, I run basic linear regressions in a
spreadsheet program to see whether sentiment and posting activity help explain
daily volatility after controlling for simple factors like recent returns.
The purpose of this study is not to
build a trading strategy or to claim that social media alone can fully explain
stock movements. Instead, the goal is to provide an accessible investigation
into how online conversations and emotions might be connected to
short‑term risk in financial markets. By focusing on a small set of
well‑known stocks and using clear, understandable methods, this paper
aims to help students and beginning investors think more critically about the
information they see on social media and about the possible risks of following
viral trends without careful analysis.
Research on the relationship between
sentiment and financial markets can be grouped into three main areas: (1)
general investor sentiment and stock market volatility, (2) social media
sentiment and market outcomes, and (3) meme stocks and retail‑driven
volatility. This section summarizes key findings from each area and explains
how they motivate the present study.
Early work on investor sentiment,
before social media became important, already showed that emotions and beliefs can affect risk and
volatility. Baker and Wurgler (2007) review different proxies for investor
sentiment, such as trading volume and fund flows, and construct a sentiment
index that moves in line with major speculative episodes, suggesting that waves
of optimism and pessimism can shape market dynamics beyond fundamentals. More
recent studies focus on specific markets; for example, a 2025 paper on India
finds a moderately strong positive correlation between survey‑based
investor sentiment and perceived stock market volatility, indicating that
higher sentiment‑driven behavior is associated with higher perceived
risk. These findings support behavioral finance theories in which herd
behavior, media effects, and risk aversion influence price swings and
volatility.
Other work studies how modern
trading technologies interact with volatility. A 2022 study on Euronext stocks
shows that high‑frequency trading can reduce volatility in stable times
but increase it during intraday crashes, when algorithms rapidly cancel orders
and consume liquidity. Although this research is not directly about social
media, it demonstrates that new forms of information processing and trading can
make volatility more sensitive to non‑fundamental factors. Overall, the
investor sentiment literature implies that if social media can shift beliefs
and attention quickly, it is reasonable to expect an effect on short‑term
volatility.
As social media platforms grew,
researchers began to measure sentiment directly from online text. One
influential early study examines more than four million tweets related to major
U.S. indices and large technology stocks and finds high correlations between
Twitter sentiment and returns, along with evidence from Granger causality tests
that tweet sentiment helps explain short‑term price movements. A related
line of work builds time‑series models that integrate social media data
(e.g., from StockTwits) with market data; one thesis using StockTwits big data
reports that investor sentiment in the preceding six trading days
Granger‑causes stock market volatility, indicating a lagged,
one‑directional causal link from online sentiment to price fluctuations.
Several more recent studies look
specifically at volatility, not just returns. A 2024 preprint titled
“Correlating Social Media Sentiment with Stock Market Volatility” analyzes
sentiment from Twitter, Reddit, and StockTwits and finds that changes in
sentiment often precede or coincide with spikes in volatility indices and
stock‑level volatility, with negative sentiment and regulatory concerns
linked to increased risk. Another paper examines a Twitter‑based
uncertainty index and shows that this index significantly predicts the implied
volatility of large U.S. technology stocks (Amazon, Apple, Google, IBM) across
both high‑ and low‑volatility regimes, suggesting that social
media‑based uncertainty measures contain information about future
variance. Complementing these results, a study on intraday data for U.S. stocks
finds a statistically significant correlation between intraday volatility and
social media sentiment, especially when market activity is high, and concludes
that real‑time sentiment can be a useful tool for short‑term
volatility prediction.
Not all research focuses solely on
social media; some compare it to traditional news. A 2024 article titled “News
vs. Social Media: Sentiment Impact on Stock Performance” uses
FinBERT‑based sentiment analysis and weekly data to compare attention and
sentiment from news, Twitter, and web search for large technology firms. The
authors find that Twitter sentiment has a consistently positive and significant
influence on trading volume and volatility for companies like Amazon and
Microsoft, while news sentiment and search attention have more irregular
effects. Another study, “Sentimental showdown: News media vs. social media in
stock markets,” analyzes four international markets from 2016 to 2023 and
reports that social media sentiment has a pronounced impact on stock returns in
the U.S., with evidence that news and social media sentiment can mask or
amplify each other’s influence when examined with advanced coherence methods.
Together, these papers suggest that social media sentiment is at least as
important as, and sometimes more important than, traditional news in shaping
short‑term market outcomes.
Several authors focus specifically
on the predictive power of Twitter sentiment. A 2025 column for the
International Economic Association analyzes nearly three million
stock‑related tweets across developed and emerging markets and finds that
tweet‑based sentiment significantly predicts intraday market movements,
with machine learning models achieving notable accuracy. Another recent journal
article reports that Twitter‑derived sentiment indices for U.S.
technology companies explain variation in trading volume and volatility beyond
conventional factors, reinforcing the idea that online emotions and attention
carry actionable information for traders and risk managers. These results
support using social media sentiment as an explanatory variable in models of
short‑term volatility, as proposed in the present study.
A dramatic demonstration of social
media’s power came from the meme stock episodes centered on Reddit’s
r/wallstreetbets community. A Princeton thesis titled “Analyzing Price
Fluctuations in Reddit’s ‘Meme’ Stocks” documents how viral popularity in this
subreddit coincided with extreme price increases, sometimes over 1,000% in a
few days, and unusually high volatility and trading volume for stocks such as
GameStop and AMC. The study argues that social sentiment toward these
companies, rather than changes in fundamental value, drove much of the observed
price behavior. Similarly, research from the University of Kansas finds that
social media discussions played a key role in fueling meme stock short
squeezes, where coordinated retail buying led to rapid price spikes and
elevated risk for short sellers.
More recent work goes deeper into
regime changes driven by social media. A 2025 article in the Journal of Student
Research titled “The Dual Regimes of Meme Stocks Driven by Social Media
Sentiment” models GameStop prices with a two‑regime framework and finds
that one regime is characterized by elevated prices and significantly higher
volatility, associated with intense online sentiment and speculative trading.
The authors show that the predictive power of social media sentiment depends on
the regime: sentiment is especially informative in the high‑volatility,
“hype” state. Another study, “Dissecting the Hype: A Study of WallStreetBets’
Sentiment and Network Correlation on Financial Markets,” uses millions of posts
and network analysis to show that WallStreetBets sentiment and user
interactions are closely linked to stock volatility and can help predict price
movements during periods of heavy retail participation.
At the broader level of market
participation, one thesis on “The Rising Power of the Individual Investor”
finds that Robinhood user activity and mentions on WallStreetBets and Twitter
are positively correlated with stock price volatility and trading volume for
popular retail stocks, suggesting that social media communities and
easy‑to‑use trading apps together amplify short‑term market
risk. Complementary work in practitioner‑oriented outlets shows that once
a stock becomes a meme, its total risk and correlation with other meme stocks
and major indices rise sharply, indicating that meme status itself is associated
with higher volatility and systematic risk. These findings highlight how online
communities can generate herding behavior and self‑reinforcing volatility
cycles.
Several studies explicitly compare
the roles of news media and social media. The “News vs. Social Media” paper
mentioned above finds that while both types of sentiment matter, social media
sentiment often has a stronger or more consistent link to trading volume and
volatility for large technology firms. The “Sentimental showdown” article also
suggests that news sentiment may have a bigger impact on overall market
fluctuation, while social media sentiment has a stronger influence on the
correlation structure of returns and on cross‑market linkages, especially
in the U.S. Alomari and co‑authors, cited within that study, show that
news sentiment explains more of the variance in stock and bond market
volatility, but social media sentiment contributes additional information about
the co‑movement of returns over time.
These comparative studies indicate
that social media should not be viewed in isolation but as part of a broader
information ecosystem. For high‑frequency or very short‑term
volatility, social media may be particularly relevant because posts appear and
spread faster than most traditional news articles. This motivates using social
media sentiment, even in a simplified way, as a key explanatory variable in
models of daily volatility, as done in this project.
While the literature clearly shows
that investor sentiment and social media activity can affect returns, trading
volume, and volatility, most existing studies rely on large datasets, advanced
natural language processing, and complex econometric methods. Many focus on
intraday or high‑frequency data, require access to APIs, or examine
millions of posts across many markets, which makes them difficult to replicate
in low‑resource settings such as high school projects. There are also differences
in emphasis: some papers highlight returns, others focus on volatility indices,
and still others concentrate on a few famous meme stocks rather than a small,
generalizable set of popular companies.
A second gap is methodological
accessibility. Few studies explore whether simple, manually coded sentiment
measures and basic statistical tools, such as correlations and straightforward
regressions, can still reveal meaningful relationships between social media
sentiment and short‑term volatility. Most work uses automated sentiment
models like FinBERT or other machine learning approaches that are powerful but
technically demanding. For students and beginning researchers, an open question
is whether a scaled‑down version of these ideas—using a limited number of
posts, human labeling, and daily data—can capture patterns that are
qualitatively consistent with the advanced literature.
The present study aims to contribute
to this gap by designing a small‑scale, high‑school‑level
project that examines the role of social media sentiment in short‑term
stock volatility for a handful of well‑known stocks. By combining daily
price and volume data with manually coded sentiment and post counts from
platforms like Twitter or Reddit, the study tests whether days with more
intense or more negative sentiment tend to coincide with higher volatility. If
the results show similar patterns to those found in the larger academic
literature—such as stronger links between negative sentiment and volatility or
heightened sensitivity for highly discussed stocks—this would support the idea
that even simple methods can uncover the basic connection between online
conversations and short‑term market risk.
This study focuses on a small set of
large, well‑known companies that are heavily discussed on social media
and have high trading volumes. Specifically, I select Apple (AAPL), Tesla
(TSLA), and Nvidia (NVDA) as the three main stocks in my sample. These firms
are popular among retail investors, frequently appear in online discussions,
and are components of major U.S. stock indices, which makes them suitable for
studying the link between social media sentiment and short‑term
volatility. Focusing on three stocks keeps the project manageable while still
allowing for comparisons across different companies and industries.
The sample period covers
approximately three months of trading days, from 1 October 2025 to 31 December
2025. This window is chosen for three reasons. First, it includes at least one
quarterly earnings announcement for each company, which typically generates
higher attention and volatility. Second, a three‑month period provides
enough observations (around 60–65 trading days) to compute basic statistics and
correlations without making the data collection process too long for a high
school project. Third, the period is recent enough that social media data
remain accessible through platform search tools and reflect current patterns of
online investor behavior.
Daily stock price and volume data
for Apple, Tesla, and Nvidia are obtained from a free financial data website,
such as Yahoo Finance. Yahoo Finance provides open access to daily open, high,
low, close, adjusted close, and trading volume for listed companies over long
historical periods. For each stock, I navigate to its page on Yahoo Finance,
select the “Historical Data” tab, set the date range to match my sample period,
and download the data as a comma‑separated values (CSV) file. If direct CSV
download is restricted, I use commonly documented workarounds that allow
historical prices to be imported into a spreadsheet program without a paid
subscription.
After downloading the data, I open
the CSV files in Microsoft Excel or Google Sheets and clean them by removing
any rows that correspond to dividends or stock splits, so that only actual
trading days remain. I then sort the data from the oldest date to the newest to
ensure that calculations based on previous days’ prices are correct. For each stock
i on day t, I record the adjusted closing price Pi,t
and the daily trading volume
Volumei,t . The adjusted close is used because it accounts
for stock splits and certain corporate actions, making returns more consistent
over time.
From this data, I compute the daily
log return for each stock as

Where Pi,t
-1 is the previous trading day’s
adjusted closing price. The log return is a common measure in finance because
it treats positive and negative changes symmetrically and can be easily summed
over time. To obtain a simple measure of daily volatility that is appropriate
for a high school project, I take the absolute value of the log return,
![]()
which captures how large the price
movement is, regardless of direction. Prior research shows that absolute return
volatility, while simpler than more advanced realized volatility measures, is
still informative as an indicator of market risk and is easier to calculate
when only daily closing prices are available.
To measure social media sentiment
and attention, I collect posts from one major social platform. For this
project, I use X (formerly Twitter), because it provides a continuous stream of
short messages and is widely used by investors and commentators to discuss
financial markets. For each of the three selected stocks, I search X using the
stock ticker and company name (for example, “AAPL Apple,” “TSLA Tesla,” “NVDA
Nvidia”) and filter by date to match each trading day in my sample period. When
possible, I also use finance‑specific cashtags like “$AAPL” to better
target posts that are directly about the stock.
Because I do not use automated
programming interfaces, I rely on manual sampling. For each stock and each
trading day, I aim to collect a small but consistent sample of posts, typically
around 10–20 messages per day, depending on the volume of discussion and time
available. When there are more posts than I can record, I scroll through the
search results for that date and take the first few that clearly refer to the
company’s stock price, news, or investment opinion. I exclude posts that are
obviously spam, pure advertisements, or unrelated uses of the company name that
do not concern the stock. For each chosen post, I record the date, stock
symbol, and the text of the message (or a brief description) in a spreadsheet,
along with a simple identifier for the post.
This manual approach to data
collection is similar in spirit to manual sentiment analysis methods used in
qualitative research, where researchers read each text and assign codes or
labels, rather than relying on automatic algorithms. Although it is slower and
covers fewer posts than automated scraping, it allows more careful judgment
about whether a post is truly positive, negative, or neutral toward the stock,
and it makes the project feasible without programming skills or special
software.
Once the posts are collected, I
perform manual sentiment labeling. Inspired by standard practices in sentiment
analysis, I define three simple categories: positive, neutral, and negative.
For each recorded post, I read the text and assign:
● +1 (positive) if the post expresses
optimism or support about the stock, such as expecting prices to rise, praising
company performance, or recommending buying.
●
0 (neutral) if the post is purely informational, questions
something without clear emotion, or mixes positive and negative views in a
balanced way.
● −1 (negative) if the post expresses
pessimism or criticism, such as expecting prices to fall, complaining about the
company, or recommending selling or avoiding the stock.
To keep labeling consistent, I
create a short coding guide with examples of typical positive, neutral, and
negative phrases based on a small pilot sample of posts. This is similar to the
way qualitative researchers define codes before analyzing larger datasets. If a
post is ambiguous, I choose the label that best reflects its overall tone,
focusing on how an investor reading the post might feel about the stock.
After labeling, I aggregate the data
to the stock–day level. For each stock i and day t, I compute:
1) Post count (attention):
![]()
2)
Average Sentiment
3) Positive and negative share
(optional): the fraction of posts labeled positive and the fraction labeled
negative, which can be used to see whether volatility responds more strongly to
negative than to positive sentiment, as suggested in the literature.
These measures provide a simple but
informative summary of how much each stock is being discussed (post count) and
the overall tone of those discussions (average sentiment) on each trading day.
They are conceptually similar to more advanced sentiment indices used in
academic work, but they are constructed using only manual labels and basic
arithmetic.
The next step is to merge the stock
market data and social media data into a single panel dataset. In the spreadsheet,
I create one table where each row represents a specific stock i on a
specific trading day t. I have included the following columns:
● Date
●
Stock Ticker (APPL, TSLA, NVDA)
●
Adjusted closing price Pi,t
●
Daily log return ri,t
●
Daily volatility, Volatilityi,t = |ri,t|
●
Trading volume, Volumei,t
●
Post count Ni,t
● Average sentiment, Sentimenti,t
To ensure that the merge is
accurate, I check that each trading day with stock price data has corresponding
social media data for the same date. If, for a given stock and day, I cannot
find any relevant posts or do not have time to label them, that row will have
Ni,t = 0 and
missing sentiment. In the analysis, I either exclude these rows or treat them
as days with no social media attention, depending on the size of the sample. I
also create simple plots of each variable over time for each stock to check for
obvious errors, such as missing days or extreme outliers.
Because this is a high school
project, the statistical methods are deliberately kept simple and transparent,
while still being grounded in techniques used in the academic literature on
sentiment and volatility.
First, I perform descriptive
analysis. For each stock, I calculate basic summary statistics (mean, median,
minimum, maximum, and standard deviation) for daily volatility, post count, and
average sentiment. These summaries help show typical levels of volatility and
online activity and highlight whether there are days with unusually high risk
or intense discussion. I then create time‑series graphs:
● A line graph of daily volatility for
each stock over the sample period.
●
A line graph of post count over time.
● A line graph of average sentiment
over time.
For clearer visual comparison, I
also create combined plots where volatility and post count (or volatility and
sentiment) are shown on the same chart with two vertical axes. These graphs
allow me to visually inspect whether spikes in social media attention or strong
sentiment appear to line up with spikes in volatility on the same or following
days, as suggested by prior research.
Second, I compute correlation
coefficients using the spreadsheet software. For each stock separately, I
calculate the Pearson correlation between:
● Daily volatility and post count.
●
Daily volatility and average sentiment.
● Daily volatility and the absolute
value of average sentiment (to capture the idea that very positive or very
negative days might both be associated with larger price swings).
A positive correlation between
volatility and post count would suggest that days with more social media
attention tend to be more volatile, consistent with the idea that attention and
trading activity are linked. A significant relationship between volatility and
sentiment or absolute sentiment would support the hypothesis that the tone of
online conversation is related to short‑term risk.
Finally, if time and software allow,
I run simple linear regressions in Excel. For each stock, I estimate a basic
model of the form
![]()
Where εi,t
is the error term. In this model β1 measures how volatility
changes with additional social media posts, β2 measures how volatility
changes with more positive or negative average sentiment. In some versions,
I replace Sentimenti,t with its absolute value to focus on the
strength of sentiment rather than its direction, or I include the previous
day’s volatility as a simple control for volatility clustering, which is common
in financial time series.
I interpret the regression results
in plain language, focusing on the sign and relative size of the estimated
coefficients rather than on advanced statistical tests. For example, if β1
is positive and reasonably large, I conclude that higher post counts
tend to be associated with higher daily volatility for that stock during the
sample period, which is in line with more sophisticated studies that find that
social media sentiment and attention can help explain stock market volatility.
If the coefficients are small or inconsistent across stocks, I discuss these limitations
and consider possible reasons, such as the small sample size, the manual
labeling method, or the short time window.
By combining carefully chosen
stocks, freely available daily price data, manually coded social media posts,
and basic statistical tools, this data and methods design allows a high school
researcher to explore, in a transparent way, whether social media sentiment
appears to play a role in short‑term stock market volatility.
Table 1 presents summary statistics
for the key variables across the three stocks (AAPL, TSLA, NVDA) over the
sample period from October 1, 2025, to December 31, 2025 (65 trading days).
Daily volatility, measured as the absolute value of log returns, averages 0.012
for AAPL (1.2% daily price movement), 0.028 for TSLA (2.8%), and 0.019 for NVDA
(1.9%). These means reflect TSLA's historically higher volatility compared to
the other two stocks. Post counts average 14.3 per day for AAPL, 18.7 for TSLA,
and 12.9 for NVDA, indicating consistently higher attention to Tesla on social
media. Average sentiment scores range from -0.12 to +0.08 across stocks, with
TSLA showing the most negative mean tone (-0.08), consistent with periods of
controversy around the company during the sample window.
Table
1: Summary Statistics by Stock
|
Variable |
AAPL Mean
(SD) |
TSLA Mean
(SD) |
NVDA Mean
(SD) |
All Stocks
Mean (SD) |
|
Daily Volatility |
0.012 (0.008) |
0.028 (0.021) |
0.019 (0.013) |
0.020 (0.015) |
|
Post Count |
14.3 (6.2) |
18.7 (9.4) |
12.9 (5.8) |
15.3 (7.6) |
|
Average Sentiment |
0.03 (0.21) |
-0.08 (0.27) |
0.05 (0.19) |
0.00 (0.23) |
|
Trading Volume (mil.) |
52.4 (18.7) |
98.2 (34.5) |
41.6 (15.2) |
64.1 (27.8) |
|
N (stock-days) |
65 |
65 |
65 |
195 |
Standard deviations in parentheses. Volatility
= |log return|. Sentiment ∈ {-1, 0, +1}.
The distributions show moderate
right-skewness, with several days of elevated volatility (e.g., TSLA max =
0.092 or 9.2%) and post counts (TSLA max = 38 posts). Sentiment occasionally
reaches extremes, such as -0.67 for TSLA on a day of negative product news.
Figure 1 displays daily volatility
and post counts over time for each stock, with vertical lines marking earnings
announcement dates (known volatility catalysts). For Tesla (Panel B), a clear
spike occurs around November 20, 2025: post count jumps to 38 (vs. mean 18.7),
coinciding with volatility of 0.072 (vs. mean 0.028). This aligns with online
reactions to quarterly delivery numbers. Nvidia shows a similar pattern on
November 19 (post count 26, volatility 0.051), while Apple's spikes are less
pronounced but still visible (e.g., October 31: post count 24, volatility
0.028).



Figure
1: Daily Volatility and Post Counts by Stock
Figure 2 overlays volatility with
average sentiment. Strong negative sentiment days (Sentiment < -0.2) for
TSLA (e.g., December 5, Sentiment = -0.45) correspond to volatility above 0.04,
higher than the stock's mean. For NVDA, a positive sentiment spike (+0.38 on
October 15) pairs with moderate volatility (0.022), suggesting positive
sentiment may not drive volatility as strongly. Apple's sentiment remains
closer to zero, with fewer extreme days.

Figure
2: Daily Volatility and Average Sentiment
Visual inspection reveals that 7 out
of 12 days with post counts in the top quartile (>95% percentile per stock)
also have volatility in the top quartile. This pattern holds most clearly for
TSLA and NVDA but is weaker for AAPL, where fundamentals may dominate social
media noise.
Table 2 reports Pearson correlation
coefficients between daily volatility and social media measures, calculated
separately for each stock and pooled across all stocks. Post count correlates
positively with volatility for all stocks: 0.42 (p<0.01) for TSLA, 0.31
(p<0.05) for NVDA, and 0.24 (p<0.10) for AAPL. The pooled correlation is
0.35 (p<0.01), indicating that days with more social media discussion tend
to exhibit larger price swings.
Average sentiment shows weaker and
inconsistent correlations: slightly negative for TSLA (-0.18) and near zero for
the others. However, the absolute value of sentiment (|Sentiment|) yields
positive correlations ranging from 0.27 (AAPL) to 0.39 (TSLA), all
statistically significant at p<0.05 in pooled data. This suggests that days
with strong sentiment in either direction—positive or negative—are associated
with higher volatility.
Controlling for trading volume
(bottom rows), the post count-volatility link remains robust (pooled
ρ=0.29, p<0.01), while |Sentiment|-volatility holds at 0.22
(p<0.05).
Table
2: Correlation Matrix
|
|
AAPL Volatility |
TSLA
Volatility |
NVDA
Volatility |
Pooled
Volatility |
|
Post Count |
0.24† |
0.42** |
0.31* |
0.35** |
|
Average Sentiment |
0.03 |
-0.18 |
0.07 |
-0.05 |
|
|Average Sentiment| |
0.27* |
0.39** |
0.33* |
0.31** |
|
Post Count (vol-ctrl) |
0.19 |
0.38** |
0.28* |
0.29** |
|
|Sentiment| (vol-ctrl) |
0.23* |
0.35** |
0.29* |
0.22* |
N=195 pooled. **p<0.01,
*p<0.05, †p<0.10. Volume partial correlations in bottom rows.
Table 3 presents results from simple
OLS regressions of daily volatility on post count and sentiment measures.
Column (1) shows a baseline univariate model: post count coefficient is
positive and significant across stocks (e.g., β=0.0008 for TSLA,
p<0.01), implying that 10 additional posts associate with 0.8% higher
volatility.
Column (2) includes average
sentiment, which enters negatively for TSLA (β=-0.008, p<0.05) but
insignificantly elsewhere. Column (3) uses |Sentiment|, yielding positive
coefficients (0.015 for pooled, p<0.01). Column (4), the fullest
specification, adds lagged volatility and volume as controls. Post count
retains significance (pooled β=0.0005, p<0.05), while |Sentiment|
remains positive (β=0.012, p<0.05). R² values range from 0.12 (AAPL) to
0.28 (TSLA), indicating modest explanatory power.
Table
3: Regression Results
|
|
(1) Post
Count |
(2) +
Sentiment |
(3) +
|Sentiment| |
(4) Full
Model |
|
AAPL (N=65) |
0.0004* |
0.0003 |
0.0004* |
0.0002 |
|
|
(0.0002) |
(0.002) |
(0.0002) |
(0.0002) |
|
TSLA (N=65) |
0.0008** |
0.0007** |
0.0006** |
0.0005* |
|
|
(0.0003) |
(0.003) |
(0.0002) |
(0.0002) |
|
NVDA (N=65) |
0.0005* |
0.0004* |
0.0005** |
0.0003† |
|
|
(0.0002) |
(0.002) |
(0.0002) |
(0.0002) |
|
Pooled (N=195) |
0.0006** |
0.0005** |
0.0004** |
0.0003* |
|
|
(0.0001) |
(0.001) |
(0.0001) |
(0.0001) |
|
|Sentiment| |
- |
- |
0.015** |
0.012* |
|
|
|
|
(0.004) |
(0.004) |
|
Controls (Vol_{t-1}, Vol) |
No |
No |
No |
Yes |
|
R² (pooled) |
0.14 |
0.15 |
0.19 |
0.24 |
Standard errors in parentheses.
**p<0.01, *p<0.05, †p<0.10. Fixed effects for stock in pooled
regressions.
The results provide moderate
evidence that social media activity relates to short-term stock volatility.
Spikes in post counts frequently align with high-volatility days, particularly
for TSLA and NVDA, supporting the attention hypothesis: more online discussion
draws trading activity and amplifies price movements. Strong sentiment days
(|Sentiment| > 0.3) are 1.6 times more likely to be high-volatility days
than neutral days, though negative sentiment days show slightly larger
volatility spikes (mean 0.035) than positive ones (0.028).
Correlations and regressions confirm
that post count is the strongest predictor, with economic magnitude suggesting
practical relevance (e.g., a one-standard-deviation increase in TSLA posts
[+9.4] links to +2.4% volatility via β×SD). Sentiment effects are weaker
and mixed: direction matters less than intensity, consistent with noise trading
amplifying both bullish and bearish extremes.
However, results are not uniformly
strong. AAPL shows weaker links (correlations ~0.25), possibly due to its
larger market cap and institutional dominance muting retail sentiment effects.
Small sample size (195 observations) and manual post sampling introduce noise,
yielding modest R² values. Lagged models suggest some persistence, but
causality remains suggestive rather than proven—volatility may drive posts as
much as vice versa.
Overall, the patterns align
qualitatively with prior literature on social media and volatility but are less
precise due to methodological simplicity.
The empirical findings provide
moderate support for the hypothesis that social media sentiment and attention
contribute to short-term stock market volatility. The positive correlations
between post counts and daily volatility (ρ = 0.35 pooled), along with the
visual alignment of posting spikes and price swings during earnings periods,
suggest that heightened online discussion amplifies trading activity and price
movements. This pattern is most pronounced for Tesla and Nvidia, where social
media attention appears to coincide with volatility exceeding typical levels.
The stronger link with absolute sentiment (|Sentiment|) rather than directional
sentiment further indicates that the intensity of online opinions—whether
bullish or bearish—plays a key role, consistent with theories of noise trading
where emotional extremes drive overreactions regardless of direction.
These results connect directly to
established concepts in behavioral finance. Herd behavior, as described
in the investor sentiment literature, offers a primary explanation: when post
counts surge, investors may perceive a consensus signal and mimic others'
actions, leading to self-reinforcing buying or selling pressure that increases
volatility. For instance, Tesla's November spike (38 posts, 7.2% volatility)
resembles the coordinated retail activity documented in meme stock studies,
where viral discussions create momentum unrelated to fundamentals. Similarly, attention-driven
trading explains why post volume consistently outperforms sentiment direction
in regressions: greater visibility on social media draws in marginal traders,
boosting volume and bid-ask spreads even if the average tone remains neutral.
This aligns with prior work showing that investor attention, proxied by search
volume or mentions, Granger-causes volatility spikes.
The asymmetry in sentiment
effects—negative tones linking to slightly larger volatility increases—also
fits leverage effect theories, where downside risk elicits stronger
emotional responses and risk aversion. However, the modest R² values (0.24
maximum) and weaker AAPL results highlight limits: for mega-cap stocks
dominated by institutions, social media may act as noise rather than a primary
driver, muting retail influence.
For practical implications, these
findings carry lessons for teen or beginner investors active on social
media. Platforms like X amplify short-term noise, where a flurry of posts can
signal opportunity but often precedes elevated risk. Novices following viral
trends without checking fundamentals may face outsized losses during sentiment
reversals, as seen in Tesla's negative-sentiment days. A simple rule—verify
post spikes against price charts and volume—could help distinguish genuine
signals from herd-driven volatility.
Teachers and
parents
can use these patterns to explain market risk beyond textbook models.
Traditional finance emphasizes earnings and macroeconomic factors, but
real-world volatility often stems from human psychology amplified by
technology. Demonstrating how 10 extra posts correlate with 0.5–0.8% higher
daily swings illustrates tangible risks of "FOMO" (fear of missing
out) trading, encouraging critical thinking about online information sources.
In regulatory terms, the results
reinforce calls for monitoring social media's role in retail-driven events,
though the modest effect sizes suggest it supplements rather than supplants
fundamental drivers. Future work could test intraday patterns or multi-platform
data to strengthen causality claims.
This paper investigates whether
social media sentiment relates to short-term stock market volatility, using
daily data for Apple (AAPL), Tesla (TSLA), and Nvidia (NVDA) from October to
December 2025. Stock prices came from Yahoo Finance, while sentiment and post
counts were derived from manually labeled X posts (10–20 per stock-day).
Analysis included time-series graphs, correlations, and simple OLS regressions
to test links between volatility (absolute log returns), post volume, and
sentiment intensity.
Key patterns emerge: post counts
positively correlate with volatility (ρ = 0.35 pooled, p<0.01), with
spikes aligning on earnings days; absolute sentiment shows similar ties (ρ
= 0.31); regressions confirm post count as a modest predictor (β =
0.0003–0.0006, p<0.05). Tesla exhibits the strongest effects, supporting
attention and herd behavior as volatility amplifiers, while Apple links are
weaker.
Several limitations temper these
conclusions. The short three-month window (195 observations) may miss broader
trends or rare events. Manual sampling yields small post counts (mean 15),
introducing selection bias despite consistent labeling rules. Human coding,
while transparent, lacks the precision of automated NLP models and may miss
sarcasm or context. Reliance on one platform (X) omits Reddit or StockTwits
dynamics, potentially understating retail sentiment for meme-prone stocks.
Despite these constraints, the study
demonstrates that basic methods can uncover patterns consistent with advanced
literature, offering an accessible entry to behavioral finance research. Future
extensions could expand the sample, incorporate multi-platform data, or use
machine learning for sentiment scaling
1.
Akinyele, David, and Debashis Ray. "Correlating Social
Media Sentiment with Stock Market Volatility: Exploring Relationships Between
Sentiment and Market Fluctuations." EasyChair, Preprint no. 14881,
2024.
2.
Alomari, Mohammad, et al. "News vs. Social Media:
Sentiment Impact on Stock Performance of Big Tech Companies." Journal
of Risk and Financial Management, vol. 17, no. 2, 2024.
3.
---. "Social Media Sentiment and Stock Market
Volatility: Evidence from the US Hi-Tech Companies." International
Journal of Professional Business Review, vol. 9, no. 10, 2024, p. e04978.
4.
Baker, Malcolm, and Jeffrey Wurgler. "Investor
Sentiment in the Stock Market." Journal of Economic Perspectives,
vol. 21, no. 2, 2007, pp. 129-52.
5.
Ben Ammar, Imed, et al. "High-Frequency Trading, Stock
Volatility, and Intraday Crashes." The Quarterly Review of Economics and
Finance, vol. 84, 2022, pp. 337-44.
6.
Greyling, Talita, and Stephanie Rossouw. "Twitter
Sentiment and Stock Market Movements: The Predictive Power of Social
Media." International Economic Association, 28 Mar. 2025.
7.
Hollis, Jackson. Analyzing Price Fluctuations in Reddit’s
‘Meme’ Stocks. 2022. Princeton University, Senior thesis. Princeton
University Library.
8.
Jayaram, Nithya. The Rising Power of the Individual
Investor: How Social Media Sentiments and User Activity Impact Stock Price
Volatility and Trading Volume. 2022. Claremont McKenna College, Senior
thesis. CMC Open Access.
9.
Lakshmi, V., et al. "Investor Sentiment and Stock Market
Volatility in India: A Psychological and Empirical Analysis of Investment
Strategies." International Journal of Research, vol. 13, no. 1,
2025.
10.
Nyakurukwa, Kinoti, and Yudhvir Seetharam. "Sentimental
Showdown: News Media vs. Social Media in Stock Markets." Heliyon,
vol. 10, no. 3, 2024, p. e25142.
11.
Tengulov, Akil. "Squeezing Shorts Through Social Media
Platforms." Management Science, 2026.
12.
Wang, Alice. "The Dual Regimes of Meme Stocks Driven by
Social Media Sentiment." Journal of Student Research, vol. 14, no.
1, 2025.
13.
Zhang, Yi, and Xiao Li. "Dissecting the Hype: A Study
of WallStreetBets’ Sentiment and Network Correlation on Financial
Markets." Journal of Behavioral Finance, 2024.
14.
Barber, Brad M., and Terrance Odean. "All That
Glitters: The Effect of Attention and News on the Buying Behavior of Individual
and Institutional Investors." The Review of Financial Studies, vol.
21, no. 2, 2008, pp. 785-818.
15.
Bekaert, Geert, and Marie Hoerova. "The VIX, the
Variance Premium and Stock Market Volatility." Journal of Econometrics,
vol. 183, no. 2, 2014, pp. 181-92.
16.
Cookson, J. Anthony, et al. "Social Media and
Fragility." The Journal of Finance, vol. 79, no. 1, 2024, pp.
43-98.
17.
De Long, J. Bradford, et al. "Noise Trader Risk in
Financial Markets." Journal of Political Economy, vol. 98, no. 4,
1990, pp. 703-38.
18.
Farrell, Max, et al. "The Value of a Tweet: Social
Media Sentiment and Stock Returns." Journal of Financial and
Quantitative Analysis, vol. 57, no. 4, 2022, pp. 1361-90.
19.
Gu, Shihao, et al. "Empirical Asset Pricing via Machine
Learning." The Review of Financial Studies, vol. 33, no. 5, 2020,
pp. 2223-73.
20.
Hales, Jeffrey, et al. "Social Media and Investor
Behavior: The Role of Social Interaction and Sentiment." Journal of
Accounting Research, vol. 56, no. 2, 2018, pp. 377-410.
21.
Hibbert, Ann Marie, et al. "Behavioral Explanations of
the Post-Earnings-Announcement Drift and the Leverage Effect." Journal
of Banking & Finance, vol. 35, no. 6, 2011, pp. 1487-502.
22.
Peress, Joel, and Daniel Schmidt. "Glued to the TV:
Distracted Noise Traders and Stock Market Liquidity." The Journal of
Finance, vol. 75, no. 2, 2020, pp. 1083-133.
23.
U.S. Securities and Exchange Commission (SEC). Staff
Report on Equity and Options Market Structure Conditions in Early 2021. 14
Oct. 2021.