4 changes: 4 additions & 0 deletions 4 .gitignore
@@ -62,3 +62,7 @@ target/

#Ipython Notebook
.ipynb_checkpoints

# Data files
*.parquet
*.csv
102 changes: 102 additions & 0 deletions 102 examples/rest/STOCKS_TRADES_README.md
@@ -0,0 +1,102 @@
# Stock Trades Downloader

This script downloads all trades for a specified ticker within a date range and saves them to a Parquet file.

## Features

- Downloads all trades for a ticker within a specified date range
- Handles API pagination automatically (50,000 records per request limit)
- Saves data in efficient Parquet format
- Progress tracking with per-day trade counts
- Error handling for API issues

## Requirements

- Python 3.10 or higher
- Valid Polygon.io API key
- Required packages: `polygon-api-client`, `pandas`, `pyarrow`

## Configuration

Edit the following variables in `stocks-trades.py`:

```python
TICKER = "IREN" # Stock ticker symbol
START_DATE = "2025-01-01" # Start date (YYYY-MM-DD)
END_DATE = "2025-12-31" # End date (YYYY-MM-DD)
```

## Usage

1. Set your Polygon.io API key as an environment variable:
```bash
export POLYGON_API_KEY="your_api_key_here"
```

2. Run the script:
```bash
cd examples/rest
python stocks-trades.py
```

3. The script will:
- Fetch trades for each day in the date range
- Display progress for each day
- Save all trades to a Parquet file named: `{TICKER}_trades_{START_DATE}_to_{END_DATE}.parquet`

## Output

The script generates a Parquet file containing the following fields for each trade:

- `conditions`: List of trade conditions
- `correction`: Trade correction indicator
- `exchange`: Exchange ID
- `id`: Trade ID
- `participant_timestamp`: Participant timestamp
- `price`: Trade price
- `sequence_number`: Sequence number
- `sip_timestamp`: SIP timestamp
- `size`: Trade size
- `tape`: Tape
- `trf_id`: TRF ID
- `trf_timestamp`: TRF timestamp
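The timestamp fields are stored as raw integers. Assuming they are nanoseconds since the Unix epoch (UTC), as Polygon's v3 trades endpoint uses for `sip_timestamp`, they can be converted to readable datetimes with pandas. A small sketch with a hypothetical single-trade frame:

```python
import pandas as pd

# Hypothetical single trade row; sip_timestamp is assumed to be
# nanoseconds since the Unix epoch (UTC)
df = pd.DataFrame(
    {"sip_timestamp": [1735776000000000000], "price": [10.25], "size": [100]}
)

# Convert the nanosecond integer to a timezone-aware pandas Timestamp
df["sip_time"] = pd.to_datetime(df["sip_timestamp"], unit="ns", utc=True)
print(df["sip_time"].iloc[0])  # 2025-01-02 00:00:00+00:00
```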

## Example Output

**Note**: The example below shows the configured ticker (IREN) and date range (2025). Actual trade counts will vary based on market activity during that period.

```
Downloading trades for IREN from 2025-01-01 to 2025-12-31...
This may take some time due to API rate limits and pagination...
Fetching trades for 2025-01-01... 1234 trades
Fetching trades for 2025-01-02... 2345 trades
...
Total trades downloaded: 123456
Saving to IREN_trades_2025-01-01_to_2025-12-31.parquet...
Successfully saved 123456 trades to IREN_trades_2025-01-01_to_2025-12-31.parquet
File size: 5.67 MB
```

## Notes

- The API has a limit of 50,000 records per request. The script handles pagination automatically.
- For large date ranges, the script may take significant time to complete.
- The script processes one day at a time to manage memory efficiently.
- Output files are excluded from git via `.gitignore`.

## Reading the Parquet File

You can read the generated Parquet file using pandas:

```python
import pandas as pd

# Read the parquet file
df = pd.read_parquet("IREN_trades_2025-01-01_to_2025-12-31.parquet")

# Display basic statistics
print(df.head())
print(f"Total trades: {len(df)}")
print(f"\nColumns: {df.columns.tolist()}")
print(f"\nData types:\n{df.dtypes}")
```
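Once loaded, the `price` and `size` columns support quick aggregate analysis. For example, a volume-weighted average price (VWAP) over a hypothetical three-trade sample:

```python
import pandas as pd

# Hypothetical sample with the price and size columns described above
df = pd.DataFrame({"price": [10.0, 10.5, 11.0], "size": [100, 200, 100]})

# VWAP = sum(price * size) / sum(size)
vwap = (df["price"] * df["size"]).sum() / df["size"].sum()
print(f"VWAP: {vwap:.2f}")  # VWAP: 10.50
```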
76 changes: 71 additions & 5 deletions 76 examples/rest/stocks-trades.py
@@ -1,4 +1,7 @@
from polygon import RESTClient
from polygon.rest.models import Trade
import pandas as pd  # type: ignore
import os
from datetime import datetime, timedelta

# docs
# https://polygon.io/docs/stocks/get_v3_trades__stockticker
@@ -10,12 +13,75 @@
# and is used by traders, investors, and researchers to gain insights into
# market behavior and inform their investment decisions.

# Configuration
TICKER = "IREN"
START_DATE = "2025-01-01"
END_DATE = "2025-12-31"
OUTPUT_FILE = f"{TICKER}_trades_{START_DATE}_to_{END_DATE}.parquet"

# client = RESTClient("XXXXXX")  # hardcoded api_key is used
client = RESTClient()  # POLYGON_API_KEY environment variable is used

print(f"Downloading trades for {TICKER} from {START_DATE} to {END_DATE}...")
print("This may take some time due to API rate limits and pagination...")

all_trades = []
trade_count = 0

# Generate date range to iterate through each day
start = datetime.strptime(START_DATE, "%Y-%m-%d")
end = datetime.strptime(END_DATE, "%Y-%m-%d")
current_date = start

while current_date <= end:
    date_str = current_date.strftime("%Y-%m-%d")
    print(f"Fetching trades for {date_str}...", end=" ")

    day_trades = 0
    try:
        # The API handles pagination automatically when we iterate;
        # limit=50000 is the maximum allowed per request
        for t in client.list_trades(TICKER, date_str, limit=50000):
            # Verify this is a Trade object
            if isinstance(t, Trade):
                # Convert trade object to dictionary
                trade_dict = {
                    "conditions": t.conditions,
                    "correction": t.correction,
                    "exchange": t.exchange,
                    "id": t.id,
                    "participant_timestamp": t.participant_timestamp,
                    "price": t.price,
                    "sequence_number": t.sequence_number,
                    "sip_timestamp": t.sip_timestamp,
                    "size": t.size,
                    "tape": t.tape,
                    "trf_id": t.trf_id,
                    "trf_timestamp": t.trf_timestamp,
                }
                all_trades.append(trade_dict)
                day_trades += 1
                trade_count += 1

        print(f"{day_trades} trades")
    except Exception as e:
        print(f"Error: {e}")

    # Move to next day
    current_date += timedelta(days=1)

print(f"\nTotal trades downloaded: {trade_count}")

if trade_count > 0:
    # Convert to pandas DataFrame
    df = pd.DataFrame(all_trades)

    # Save to parquet file
    print(f"Saving to {OUTPUT_FILE}...")
    df.to_parquet(OUTPUT_FILE, index=False)
    print(f"Successfully saved {trade_count} trades to {OUTPUT_FILE}")

    # Report the on-disk size of the written file
    file_size_mb = os.path.getsize(OUTPUT_FILE) / 1024 / 1024
    print(f"File size: {file_size_mb:.2f} MB")
else:
    print("No trades found for the specified period.")