Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting
Unlocking Predictive Signals from News Sentiment for Aluminum Markets
By capturing the prevailing sentiment and market mood, textual data has become increasingly vital for forecasting commodity prices, particularly in metal markets. However, the effectiveness of lightweight, finetuned large language models (LLMs) in extracting predictive signals for aluminum prices—and the specific market conditions under which these signals are most informative—remains under-explored. This study generates monthly sentiment scores from English and Chinese news headlines (Reuters, Dow Jones Newswires, and China News Service) and integrates them with traditional tabular data, including base metal indices, exchange rates, inflation rates, and energy prices. We evaluate the predictive performance and economic utility of these models through long-short simulations on the Shanghai Metal Exchange from 2007 to 2024. Our results demonstrate that during periods of high volatility, Long Short-Term Memory (LSTM) models incorporating sentiment data from a finetuned Qwen3 model (Sharpe ratio 1.04) significantly outperform baseline models using tabular data alone (Sharpe ratio 0.23). Subsequent analysis elucidates the nuanced roles of news sources, topics, and event types in aluminum price forecasting.
Authors: Alvaro Paredes Amorin, Andre Python, Christoph Weisser
Executive Impact
This research shows that sentiment extracted from news headlines by finetuned LLMs can improve aluminum price forecasts, with the clearest gains in volatile markets.
LSTM models with sentiment data showed a 359% improvement in Sharpe ratio over tabular-only baselines in high-volatility periods (1.04 versus 0.23 after rounding).
In medium-volatility periods, sentiment-only strategies achieved the highest Sharpe ratio of 1.19, outperforming combined approaches.
Sentiment-augmented LSTM models achieved a 292% total return, significantly outperforming the 131% from no-sentiment models.
Deep Analysis & Enterprise Applications
Peak R² Score for Aluminum Price Prediction
0.90, achieved by LSTM (no-sentiment) with a 1-month window.
The Long Short-Term Memory (LSTM) model, even without sentiment data, demonstrated a peak R² score of 0.90, indicating high predictive accuracy for monthly aluminum prices when using optimal configurations and a 1-month historical window.
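For readers unfamiliar with the metric, R² compares the model's squared errors with those of a naive forecast that always predicts the mean. The sketch below uses the standard definition; the function name `r_squared` and the price series are made up for illustration and are not taken from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's forecasts.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean of the actual series.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual series is constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical monthly prices in USD/ton (illustrative, not from the paper).
actual = [1900.0, 1950.0, 2000.0, 2050.0]
predicted = [1910.0, 1940.0, 2020.0, 2030.0]
score = r_squared(actual, predicted)
print(round(score, 2))  # 0.92
```

A high R² on price levels does not by itself imply profitable direction calls, which is why the page also reports hit rates and Sharpe ratios.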
| Feature | Highlight | Significance |
|---|---|---|
| Overall Accuracy | Average Hit Rate of 0.556 | Above random chance (0.50) when averaged across configurations and horizons, indicating a usable directional signal. |
| Optimal Horizon | 3-month forecasting window | Consistently delivers peak performance across both models and sentiment sources, balancing noise filtering and business cycle context. |
| Best Model (1-month) | ConvLSTM (0.585 hit rate) | Demonstrates strong short-term forecasting capability. |
| Best Sentiment Source | Reuters sentiment (0.559 average hit rate) | Most effective source overall, showing robust performance across models. |
| Baseline Competitiveness | No-sentiment baseline (0.551 average hit rate) | Raw price history contains substantial directional information, with sentiment providing additional enhancement. |
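The hit rates in the table measure how often a forecast gets the direction of the next price move right. The source does not spell out its exact definition, so the sketch below assumes a common one: a hit occurs when the predicted move and the realized move have the same sign, and months with no price change are skipped. The function name and the data are illustrative.

```python
def hit_rate(prices, forecasts):
    """Share of months in which the forecast direction matched the actual one.

    prices[t] is the observed price in month t; forecasts[t] is the forecast
    for month t + 1 made in month t (an assumed convention).
    """
    hits = 0
    counted = 0
    for t in range(len(prices) - 1):
        actual_move = prices[t + 1] - prices[t]
        predicted_move = forecasts[t] - prices[t]
        if actual_move == 0:
            continue  # no direction to score
        counted += 1
        if (actual_move > 0) == (predicted_move > 0):
            hits += 1
    if counted == 0:
        raise ValueError("no months with a price change to score")
    return hits / counted


# Hypothetical prices and forecasts (illustrative only).
prices = [100.0, 102.0, 101.0, 105.0, 103.0]
forecasts = [101.0, 103.0, 103.0, 104.0]
rate = hit_rate(prices, forecasts)
print(rate)  # 0.75
```

Under this definition a coin flip scores about 0.50, which is the benchmark the table compares against.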
Max Sharpe Ratio with Sentiment
1.19, achieved by a sentiment-only strategy in medium volatility.
During periods of medium volatility, a sentiment-only trading strategy, driven by finetuned LLM signals, yielded a Sharpe ratio of 1.19, outperforming all other strategies, including those combining sentiment with tabular data.
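As a reminder of what the Sharpe ratio measures, the sketch below computes it from a series of monthly strategy returns. It assumes a zero risk-free rate and annualization by the square root of 12; the source does not state which conventions the study used, and the returns are made up.

```python
import math
from statistics import mean, stdev


def annualized_sharpe(monthly_returns, periods_per_year=12):
    """Mean return divided by sample standard deviation, annualized.

    Assumes a zero risk-free rate (an assumption, not from the source).
    """
    if len(monthly_returns) < 2:
        raise ValueError("need at least two returns")
    volatility = stdev(monthly_returns)
    if volatility == 0:
        raise ValueError("returns are constant; Sharpe ratio is undefined")
    return mean(monthly_returns) / volatility * math.sqrt(periods_per_year)


# Hypothetical monthly returns of a long-short strategy (illustrative only).
returns = [0.02, -0.01, 0.03, 0.00]
sharpe = annualized_sharpe(returns)
print(round(sharpe, 2))  # 1.9
```

Because the ratio divides by volatility, a strategy with lower total return can still have the higher Sharpe ratio if its returns are steadier.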
Case Study: Sentiment-Driven Profit in High Volatility (February 2020)
Challenge: In February 2020, amid rising market volatility and coronavirus fears, the observed aluminum price was $1940.50/ton; it fell to $1759.43/ton by March 2020. The no-sentiment model predicted a slight increase to $1974.37/ton (a long signal), which would have led to a loss.
Solution: The Reuters-based sentiment model predicted a price of $1899.57/ton, below the current price, and therefore produced a short signal. This correctly anticipated the direction of the downturn, even though the predicted decline was smaller than the realized one.
Outcome: The no-sentiment long position lost 9.33%, while the sentiment-driven short position gained 9.33% on the same price move. This illustrates the value of sentiment during turbulent periods, when models based on tabular data alone can miss the direction of the market.
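The arithmetic behind the case study can be reproduced in a few lines. The prices and forecasts are the ones quoted above; the signal rule (long if the forecast exceeds the current price, short otherwise) and the function names are assumptions made for this sketch, and transaction costs are ignored.

```python
def signal(current_price, forecast_price):
    """Return +1 (long) if the forecast is above the current price, else -1 (short)."""
    return 1 if forecast_price > current_price else -1


def trade_return(current_price, next_price, position):
    """One-period return of holding `position` (+1 or -1) over the price move."""
    return position * (next_price - current_price) / current_price


feb_price, mar_price = 1940.50, 1759.43  # observed prices, USD/ton

no_sentiment = trade_return(feb_price, mar_price, signal(feb_price, 1974.37))
reuters = trade_return(feb_price, mar_price, signal(feb_price, 1899.57))

print(f"{no_sentiment:.2%}")  # -9.33%
print(f"{reuters:.2%}")       # 9.33%
```

The two returns are mirror images because both strategies trade the same price move in opposite directions.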
Aluminum Price Forecasting Workflow
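The abstract describes the first steps of the workflow: headline-level sentiment is turned into monthly scores and combined with tabular features before modeling. The sketch below shows one plausible way to do that. Averaging headline scores within a month, the neutral fallback of 0.0, the feature name `usd_cny`, and all values are assumptions for illustration; the study's actual aggregation is not described here.

```python
from collections import defaultdict
from statistics import mean


def monthly_sentiment(scored_headlines):
    """Average headline scores per month.

    scored_headlines: iterable of (date "YYYY-MM-DD", score) pairs, where the
    score would come from a sentiment model (hypothetical values below).
    """
    by_month = defaultdict(list)
    for date, score in scored_headlines:
        by_month[date[:7]].append(score)  # key on "YYYY-MM"
    return {month: mean(scores) for month, scores in by_month.items()}


def join_features(tabular_by_month, sentiment_by_month, neutral=0.0):
    """Attach a sentiment column to each month's tabular features."""
    rows = {}
    for month, features in tabular_by_month.items():
        row = dict(features)
        # Months without headlines fall back to a neutral score (assumption).
        row["sentiment"] = sentiment_by_month.get(month, neutral)
        rows[month] = row
    return rows


# Hypothetical inputs (illustrative only).
headlines = [("2020-02-03", -0.8), ("2020-02-17", -0.4), ("2020-03-02", 0.2)]
tabular = {"2020-02": {"usd_cny": 7.0}, "2020-03": {"usd_cny": 7.1},
           "2020-04": {"usd_cny": 7.05}}

features = join_features(tabular, monthly_sentiment(headlines))
```

Each resulting row could then feed a sequence model such as an LSTM, with the sentiment column dropped to obtain the tabular-only baseline.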
Reuters Dominance in Price Movement Sentiment
0.71 Sharpe ratio for Reuters' Price Movement topic.
Among individual topics, Reuters headlines categorized as 'Price Movement' yielded the highest Sharpe ratio, 0.71. This suggests that Reuters' reporting on direct price dynamics carries a higher signal-to-noise ratio and is more timely than that of the other sources.
| Topic | Reuters (SR) | Dow Jones (SR) | China News Service (SR) | Key Insight |
|---|---|---|---|---|
| Price Movement | 0.714 | 0.525 | 0.674 | Reuters leads in direct price dynamics (Sharpe ratio 36% higher than Dow Jones). |
| Environmental | 0.601 | 0.347 | 0.395 | Reuters provides superior signals from regulatory and climate news. |
| Company News | 0.270 | -0.338 | 0.048 | Reuters sentiment offers modest positive value, while Dow Jones signals are counterproductive. |
| Production Output | 0.042 | 0.125 | 0.283 | China News Service leads here, reflecting its industrial focus. |
| Supply Disruption | -0.073 | 0.164 | 0.124 | Reuters sentiment is counterproductive in this category; Dow Jones and China News Service show modest positive value. |
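The table can be read programmatically to pick the strongest source for each topic. The snippet below hard-codes the Sharpe ratios from the table; the dictionary layout and variable names are choices made for this sketch.

```python
# Sharpe ratios by topic and news source, copied from the table above.
sharpe_by_topic = {
    "Price Movement": {"Reuters": 0.714, "Dow Jones": 0.525, "China News Service": 0.674},
    "Environmental": {"Reuters": 0.601, "Dow Jones": 0.347, "China News Service": 0.395},
    "Company News": {"Reuters": 0.270, "Dow Jones": -0.338, "China News Service": 0.048},
    "Production Output": {"Reuters": 0.042, "Dow Jones": 0.125, "China News Service": 0.283},
    "Supply Disruption": {"Reuters": -0.073, "Dow Jones": 0.164, "China News Service": 0.124},
}

# For each topic, keep the source with the highest Sharpe ratio.
best_source = {
    topic: max(scores, key=scores.get)
    for topic, scores in sharpe_by_topic.items()
}

for topic, source in best_source.items():
    print(f"{topic}: {source} ({sharpe_by_topic[topic][source]:.3f})")
```

Reuters leads in three of the five topics, which is consistent with its ranking as the best overall sentiment source, but no single source is best everywhere.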
Your AI Implementation Roadmap
A typical phased approach to integrate AI-driven sentiment analysis into your enterprise operations.
Phase 1: Discovery & Strategy
Initial consultation to understand your current forecasting methods, data sources, and business objectives. Define clear KPIs and a tailored AI strategy.
Phase 2: Data Integration & Model Finetuning
Integrate your internal and external data sources. Finetune large language models with your domain-specific financial data and market insights for optimal sentiment extraction.
Phase 3: Prototype & Validation
Develop a prototype forecasting model incorporating sentiment signals. Conduct backtesting and walk-forward validation against historical data to prove economic utility and predictive accuracy.
Phase 4: Deployment & Optimization
Deploy the validated AI model into your existing trading or analysis infrastructure. Continuously monitor performance, refine sentiment models, and integrate new data sources for ongoing optimization and enhanced decision-making.
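The walk-forward validation mentioned in Phase 3 can be sketched as a sequence of train/test splits in which the model only ever sees data from before the period it is tested on. The version below uses an expanding training window; the function name, the window sizes, and the choice of expanding rather than rolling windows are assumptions for illustration.

```python
def walk_forward_splits(n_samples, initial_train, test_size=1):
    """Expanding-window splits: train on [0, start), test on the next block.

    Returns a list of (train_indices, test_indices) pairs in time order.
    """
    if initial_train < 1 or test_size < 1:
        raise ValueError("initial_train and test_size must be positive")
    splits = []
    start = initial_train
    while start + test_size <= n_samples:
        train_idx = list(range(0, start))
        test_idx = list(range(start, start + test_size))
        splits.append((train_idx, test_idx))
        start += test_size  # the tested block joins the next training set
    return splits


# Six months of data, first three reserved for the initial fit (illustrative).
splits = walk_forward_splits(n_samples=6, initial_train=3)
for train_idx, test_idx in splits:
    print(train_idx, "->", test_idx)
```

Keeping the test block strictly after the training block avoids look-ahead bias, which would otherwise inflate backtested hit rates and Sharpe ratios.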