Introduction
In the fast-paced world of financial markets, the ability to efficiently store, manipulate, and analyze data is crucial. This is where Pandas, a powerful and flexible Python library, becomes an important tool for finance professionals, traders, and data analysts. Pandas is known for its simplicity, great performance and popularity across Data enthusiasts. At the end of this article, we will provide a Sample code to pull and store Historical Data into a DataFrame using our own TWS API (Documentation can be found here: https://ibkrcampus.com/ibkr-api-page/twsapi-doc/).
What is Pandas?
Pandas is an open-source library that provides high-performance, easy-to-use data structures, and data analysis tools for the Python programming language. At the heart of Pandas are two primary data structures: Series and DataFrames. A Series is essentially a one-dimensional array capable of storing any data type, whereas a DataFrame is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns).
How Pandas Works with Market Data
Receiving Data
The first step in working with market data is data ingestion. Market data can come from various sources, including APIs. You can easily import data from various file formats (CSV, JSON, Excel, etc.) or directly from a dictionary in your code; which can be created by requesting Market Data via an API.
Data Storage and Manipulation
Once the market data is imported into Pandas, it is stored in DataFrame format. This structure is particularly suited for financial time series data as it allows for easy manipulation and transformation of data. For example, you can:
- Resample or aggregate data: Convert high-frequency data into lower-frequency data (e.g., from minute-by-minute to daily), which is particularly useful for historical trend analysis.
- Handle missing values: Fill or drop missing values, which is common in real-world market data.
- Perform data transformations: Apply mathematical operations to adjust prices for splits, dividends, or to calculate returns.
- Filtering and sorting: Easily filter out data based on certain criteria or sort data according to date, price, or any other column.
Analysis and Visualization
Beyond data storage and manipulation, Pandas seamlessly integrates with other libraries for data analysis and visualization, such as NumPy for numerical computations and Matplotlib or Seaborn for plotting. This integration allows for comprehensive financial analysis, such as calculating financial indicators (moving averages, RSI, etc.), performing statistical tests, or visualizing price movements and trading signals.
In this example, we will be requesting historical data for Apple stock (Symbol AAPL) in the form of 1 hour bars for the past 2 days. To request this data, we will be using our TWS API method, reqHistoricalData. You can check our Python tutorials on how to request historical data: (Tutorial). Once this data is received, we will store it into a Pandas DataFrame then set the index to Datetime (explain why). Finally, using the MatplotLib library, we will graph the prices and volume in a simple chart. Code Snippet below:
Example of pulling Historical Data using TWS API
In this example, we will demonstrate how to request historical data for Apple Inc. (AAPL) stock in the form of 1-hour bars over the past two days. We will utilize the TWS API method, reqHistoricalData, to obtain this data. For detailed instructions on using this method, you can refer to our Python tutorials.
Once the historical data is received, we will store it in a Pandas DataFrame. Setting the index to Datetime is crucial because it allows us to efficiently handle and analyze time-series data. This indexing facilitates various time-based operations and visualizations.
Finally, we will use the Matplotlib library to create a simple chart displaying the prices and volume of the stock. Below is a code snippet illustrating the process.
from ibapi.client import * from ibapi.wrapper import * import pandas as pd import matplotlib.pyplot as plt import matplotlib.dates as mdates port = 7496 class TestApp(EClient, EWrapper): def __init__(self): EClient.__init__(self, self) self.data = [] def nextValidId(self, orderId: OrderId): mycontract = Contract() mycontract.conId = 265598 mycontract.exchange = "SMART" self.reqHistoricalData( reqId=123, contract=mycontract, endDateTime="", durationStr= "2 D", barSizeSetting = "1 hour", whatToShow= "TRADES", useRTH=0, formatDate=1, keepUpToDate=False, chartOptions=[], ) def historicalData(self, reqId: int, bar: BarData): print("Historical Bar", bar) # Remove timezone information before appending date_without_tz = bar.date.split(' ')[0] + ' ' + bar.date.split(' ')[1] self.data.append([ date_without_tz, bar.open, bar.high, bar.low, bar.close, bar.volume, bar.wap, bar.barCount ]) def historicalDataEnd(self, reqId: int, start: str, end: str): print("nHistorical Data received n") df = pd.DataFrame(self.data, columns=[ 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'WAP', 'BarCount' ]) # Convert 'Date' to datetime and set as index df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d %H:%M:%S') df.set_index('Date', inplace=True) print("Creating Data Frame...Printing DataFrame:n") print(df) # Plot the data fig, ax1 = plt.subplots(figsize=(12, 8)) # Plot OHLC data ax1.plot(df.index, df['Open'], label='Open', color='blue') ax1.plot(df.index, df['High'], label='High', color='green') ax1.plot(df.index, df['Low'], label='Low', color='red') ax1.plot(df.index, df['Close'], label='Close', color='black') ax1.set_ylabel('Price') ax1.legend(loc='upper left') # Create another y-axis for the volume data ax2 = ax1.twinx() ax2.fill_between(df.index, df['Volume'], color='gray', alpha=0.3, label='Volume') ax2.set_ylabel('Volume') ax2.legend(loc='upper right') # Format the x-axis to show the full date and time ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d %H:%M:%S')) plt.xticks(rotation=45) plt.gcf().autofmt_xdate() # Auto format date for better visibility plt.title('Stock Price and Volume') plt.show() return super().historicalDataEnd(reqId, start, end) def error(self, reqId: TickerId, errorCode: int, errorString: str, advancedOrderRejectJson=""): print(reqId, errorCode, errorString, advancedOrderRejectJson) app = TestApp() app.connect("127.0.0.1", port, 1) app.run()
Conclusion
Pandas is a cornerstone tool for anyone dealing with market data. Its comprehensive features for data ingestion, manipulation, analysis, and visualization make it a robust and useful tool in a financial analyst’s toolkit. By abstracting away much of the complexity involved in data handling, Pandas allows analysts to focus more on extracting insights and making informed decisions, thereby playing a crucial role in the financial decision-making process.
Reference: https://pandas.pydata.org/
Disclosure: Interactive Brokers
The analysis in this material is provided for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IBKR to buy, sell or hold such investments. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
The views and opinions expressed herein are those of the author and do not necessarily reflect the views of Interactive Brokers, its affiliates, or its employees.
Disclosure: Order Types / TWS
The order types available through Interactive Brokers LLC’s Trader Workstation are designed to help you limit your loss and/or lock in a profit. Market conditions and other factors may affect execution. In general, orders guarantee a fill or guarantee a price, but not both. In extreme market conditions, an order may either be executed at a different price than anticipated or may not be filled in the marketplace.
Disclosure: API Examples Discussed
Throughout the lesson, please keep in mind that the examples discussed are purely for technical demonstration purposes, and do not constitute trading advice. Also, it is important to remember that placing trades in a paper account is recommended before any live trading.