feat: add optional market sentiment feature enrichment #1413

Open
alexander-schneider wants to merge 4 commits into AI4Finance-Foundation:master from alexander-schneider:codex/adanos-sentiment-features

Conversation

@alexander-schneider

Summary

This PR adds an optional market sentiment feature enricher for existing OHLCV datasets.

It does not add a new price data source. Instead, it provides a small helper that merges structured, lagged sentiment features onto an existing FinRL dataframe so those features can be appended to tech_indicator_list and used as part of the RL state.

What is included

  • finrl.meta.preprocessor.adanos_sentiment.add_adanos_market_sentiment(...)
  • a small example script showing how to enrich a Yahoo Finance dataset before training
  • targeted unit tests
  • short documentation updates in the README, examples, FAQ, and data layer docs

Design choices

  • Optional and fail-open: without ADANOS_API_KEY, the helper is a no-op and returns the original dataframe
  • No provider changes: Yahoo/WRDS/Alpaca remain unchanged
  • Lagged features only: sentiment columns are shifted before joining to avoid same-day leakage
  • Recent-window positioning: intended for short-horizon or recent-window research where structured market sentiment is available
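The "lagged features only" choice can be sketched as follows. This is a minimal illustration, not the PR's implementation: the helper name add_lagged_features, the raw column name adanos_buzz_mean, and the assumption of one row per ticker per trading day (so a row shift equals a one-day lag) are all hypothetical.

```python
import pandas as pd


def add_lagged_features(prices, sentiment, feature_cols, lag=1):
    """Left-join per-ticker sentiment onto prices, shifted by `lag` rows.

    Hypothetical helper for illustration; assumes both frames have
    `date` and `tic` columns and one row per ticker per trading day.
    """
    if lag < 1:
        raise ValueError("lag must be >= 1 to avoid same-day leakage")
    sentiment = sentiment.sort_values(["tic", "date"]).copy()
    lagged_names = {col: f"{col}_lag{lag}" for col in feature_cols}
    # Shift within each ticker so row t carries the value observed at t - lag.
    shifted = sentiment.groupby("tic")[feature_cols].shift(lag)
    lagged = pd.concat(
        [sentiment[["date", "tic"]], shifted.rename(columns=lagged_names)],
        axis=1,
    )
    # A left join keeps every price row; missing sentiment becomes NaN.
    return prices.merge(lagged, on=["date", "tic"], how="left")


dates = ["2024-01-02", "2024-01-03", "2024-01-04"]
prices = pd.DataFrame({"date": dates, "tic": ["AAPL"] * 3, "close": [1.0, 2.0, 3.0]})
sentiment = pd.DataFrame(
    {"date": dates, "tic": ["AAPL"] * 3, "adanos_buzz_mean": [10.0, 20.0, 30.0]}
)
enriched = add_lagged_features(prices, sentiment, ["adanos_buzz_mean"])
```

With lag=1, the first row of each ticker has no prior observation, so its lagged value is NaN; downstream code would need to drop or fill those rows before training.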

Example features

The helper adds lagged aggregate and source-level features such as:

  • adanos_buzz_mean_lag1
  • adanos_sentiment_mean_lag1
  • adanos_source_coverage_lag1
  • adanos_reddit_buzz_lag1
  • adanos_x_buzz_lag1
  • adanos_news_buzz_lag1
  • adanos_polymarket_buzz_lag1
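One way to append such columns to tech_indicator_list is sketched below. The helper extend_indicator_list and the base indicator names are assumptions made for illustration; the PR's example script may wire this up differently.

```python
def extend_indicator_list(base_indicators, columns, prefix="adanos_"):
    """Return base indicators plus any enrichment columns found in `columns`.

    Hypothetical helper: selects columns by prefix and keeps their order.
    """
    extra = [
        col
        for col in columns
        if col.startswith(prefix) and col not in base_indicators
    ]
    return list(base_indicators) + extra


# Column names as they might appear on an enriched dataframe.
columns = ["date", "tic", "close", "macd", "adanos_buzz_mean_lag1", "adanos_x_buzz_lag1"]
tech_indicator_list = extend_indicator_list(["macd", "rsi_30"], columns)
```

Selecting by prefix means the list stays correct when the helper is a no-op: with no sentiment columns present, the base indicators are returned unchanged.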

Validation

Ran locally:

  • python3 -m pytest unit_tests/preprocessors/test_adanos_sentiment.py -q
  • python3 -m py_compile finrl/meta/preprocessor/adanos_sentiment.py examples/FinRL_StockTrading_2026_1_data_with_adanos.py unit_tests/preprocessors/test_adanos_sentiment.py
  • git diff --check

I also tried python3 -m pytest unit_tests/preprocessors -q, but it currently fails in this environment: the upstream tests import the top-level finrl package, which requires optional dependencies such as gymnasium at collection time.


@atharvajoshi01 left a comment


The sentiment enrichment idea is useful but a couple of concerns:

  1. The adanos_sentiment module adds a hard dependency on an external paid API. If ADANOS_API_KEY isn't set, does the pipeline still work cleanly? The example checks for it but the preprocessor module should handle the missing key gracefully at import time.

  2. The lagged features approach is good for avoiding look-ahead bias. Worth documenting what lag values are used and why (e.g. t-1, t-2 to avoid data leakage).

@alexander-schneider
Author

Thanks for the review. I pushed a small follow-up in 5d6818a to make both points explicit.

  • Optional API behavior: the module now documents that import has no side effects, does not read ADANOS_API_KEY, and does not call the external API. add_adanos_market_sentiment(..., api_key=None) returns a copy of the input dataframe unchanged, and I added a regression test that passes a failing session to verify no network call is made without an explicit key.
  • Lag behavior: the docs now state that the default is lag=1, producing t-1 columns such as adanos_buzz_mean_lag1 to avoid same-day sentiment leakage/look-ahead bias. I also documented that users can pass lag=2 or higher for a more conservative delay and added a lag=2 regression test.
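The "failing session" regression test described above might look roughly like this. It is a sketch under stated assumptions: enrich_with_sentiment is a hypothetical stand-in for the real helper, the session parameter and the placeholder URL are invented for illustration, and the actual fetch-and-merge logic is omitted.

```python
import pandas as pd

# Placeholder only; not the real service endpoint.
PLACEHOLDER_URL = "https://example.invalid/sentiment"


class FailingSession:
    """Test double that fails loudly if any HTTP request is attempted."""

    def get(self, *args, **kwargs):
        raise AssertionError("network call attempted without an API key")


def enrich_with_sentiment(df, api_key=None, session=None):
    """Hypothetical stand-in for the helper: a no-op without an explicit key."""
    if not api_key:
        # Fail-open: return an unchanged copy and never touch the network.
        return df.copy()
    session.get(PLACEHOLDER_URL)  # real authentication and parsing omitted
    return df.copy()


def test_no_network_call_without_key():
    df = pd.DataFrame({"date": ["2024-01-02"], "tic": ["AAPL"], "close": [1.0]})
    out = enrich_with_sentiment(df, api_key=None, session=FailingSession())
    assert out.equals(df)   # same contents
    assert out is not df    # but a copy, not the caller's object


test_no_network_call_without_key()
```

Injecting the session makes the no-network guarantee testable without mocking a global HTTP library: any request made on the key-less path raises immediately.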

Validation run:

  • python3 -m pytest unit_tests/preprocessors/test_adanos_sentiment.py -q
  • python3 -m py_compile finrl/meta/preprocessor/adanos_sentiment.py unit_tests/preprocessors/test_adanos_sentiment.py examples/FinRL_StockTrading_2026_1_data_with_adanos.py
  • pre-commit run --files README.md docs/source/finrl_meta/Data_layer.rst examples/FinRL_StockTrading_2026_1_data_with_adanos.py examples/README.md finrl/meta/preprocessor/adanos_sentiment.py unit_tests/preprocessors/test_adanos_sentiment.py
  • git diff --check
