Tushare financial data interface
Plotting stock k-line charts
Different types of financial data differ in structure and in how they change over time, so the chart forms best suited to describing them differ as well. Line and point charts are the two-dimensional charts that financial analysts use most often, because they show how financial data changes and are simple to draw. Using stock k-line charts (also known as candlestick charts) as an example, this section introduces the basic methods of visualising time series data in Python.
Plotting a stock k-line chart from Tushare platform data in the Python runtime environment involves three main steps.
Step 1: Determine the data source. This section obtains stock quote data from the Tushare platform through the interface functions built into the Tushare package, so the program first imports the package with the command "import tushare as ts".
Step 2: Determine the form of the visualisation and the tools to implement it. In this section, the financial charting package mpl_finance is chosen as the tool for drawing k-line charts of stock prices, so the function candlestick_ochl must be imported with the command "from mpl_finance import candlestick_ochl". The mpl_finance package is distributed separately from Matplotlib (the command "pip install mpl_finance" installs it) and is usually used for plotting stock price k-line charts and line charts. The package has since been superseded by mplfinance, which keeps the same function in its original_flavor module, so the program in this section uses "from mplfinance.original_flavor import candlestick_ochl" instead (the command "pip install mplfinance" installs it). Because the data structure returned by the pro interface functions of the Tushare package differs from that returned by the ordinary interface functions, the data format must be matched to what the plotting function expects. The program in this section uses the pro interface's pro.daily() function to obtain daily stock data and rearranges it into the parameter format required by the candlestick_ochl() function. In addition, the chart should not contain too many k-lines: the more k-lines there are, the smaller the spacing between them and the harder the chart is to read.
Step 3: Determine the output tool for drawing the chart. This section uses the charting package Matplotlib as the output tool for the k-line chart, because Matplotlib provides a rich set of chart output functions that make it relatively easy to set the layout, colours, axis formats and many other aspects of a chart, making the chart more attractive and easier to understand. The program therefore imports Matplotlib's plotting module with the command "import matplotlib.pyplot as plt".
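The format matching mentioned in Step 2 can be sketched without any third-party package. The snippet below is a minimal illustration, not the program of this section: the sample rows, the helper name to_ochl_quotes and the max_bars parameter are all hypothetical. It assumes the rows arrive newest first with fields in the order (trade_date, open, high, low, close), sorts them by date, keeps only the most recent bars, and emits (position, open, close, high, low) tuples, which is the order candlestick_ochl() expects.

```python
# Hypothetical sample rows: (trade_date, open, high, low, close), newest first.
rows = [
    ("20191204", 10.4, 10.9, 10.3, 10.8),
    ("20191203", 10.6, 10.7, 10.2, 10.4),
    ("20191202", 10.0, 10.8, 9.9, 10.6),
]


def to_ochl_quotes(rows, max_bars=60):
    """Return (quotes, labels) for the most recent max_bars rows.

    quotes: list of (position, open, close, high, low) tuples in date order.
    labels: the matching trade_date strings, usable later as tick labels.
    max_bars is assumed to be a positive integer.
    """
    # YYYYMMDD strings sort chronologically when compared as text.
    ordered = sorted(rows, key=lambda r: r[0])
    # Keep only the most recent bars so the candles stay readable.
    recent = ordered[-max_bars:]
    # Reorder the price fields from (open, high, low, close) to OCHL.
    quotes = [
        (float(i), o, c, h, l)
        for i, (_, o, h, l, c) in enumerate(recent)
    ]
    labels = [r[0] for r in recent]
    return quotes, labels


quotes, labels = to_ochl_quotes(rows, max_bars=2)
print(quotes)
print(labels)
```

The full program below does the same job with pandas: sort_values() orders the rows by date, and the 'dates' column holds the integer positions.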
import tushare as ts
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import ticker
from matplotlib.dates import date2num  # converts real dates to floats; not needed below because integer positions are used
# from mpl_finance import candlestick_ochl  # legacy package, installed separately with: pip install mpl_finance
from mplfinance.original_flavor import candlestick_ochl  # requires the mplfinance package: pip install mplfinance
plt.rcParams['font.sans-serif'] = ['SimHei'] # Used to display Chinese labels normally
# Users with sufficient permissions use the pro interface to get the data
# (assumes a Tushare token has already been saved, e.g. with ts.set_token())
pro = ts.pro_api()
code = '600004.SH'
df = pro.daily(ts_code=code, start_date='20191201')
df.shape
#stock_daily = pro.daily(ts_code=code, start_date='20181201')
# stock_daily.to_excel('stock_daily.xlsx') # save as spreadsheet
# Users without sufficient permissions for the pro interface run the following code instead, reading the data directly from the xlsx file
df = pd.read_excel('stock_daily.xlsx', dtype={'code': 'str','trade_date': 'str'})
df.drop(df.columns[0], axis=1, inplace=True)
df.shape
df2 = df.query('trade_date >= "20171001"').reset_index() # select data after Oct 1, 2017
df2 = df2.sort_values(by='trade_date', ascending=True) # sort the data in ascending order by date (oldest first)
df2['dates'] = np.arange(0, len(df2)) # integer positions 0..len(df2)-1 used as x coordinates; len(df2) is the number of records
fig, ax = plt.subplots(figsize=(20, 9))
fig.subplots_adjust(bottom=0.2) # leave room at the bottom of the figure for the rotated date labels
### arguments to the candlestick_ochl() function
# ax        the Axes instance to draw on
# quotes    sequence of (time, open, close, high, low); time must be a float, so dates must be converted to floats
# width     the width of each red or green rectangle, in x-axis units (here one unit is one trading day)
# colorup   the colour of the rectangle if the closing price is greater than or equal to the opening price
# colordown the colour of the rectangle if the closing price is lower than the opening price
# alpha     the transparency of the colour of the rectangle
candlestick_ochl(ax, quotes=df2[['dates', 'open', 'close', 'high', 'low']].values,
                 width=0.55, colorup='r', colordown='g', alpha=0.95)
date_tickers = df2['trade_date'].values
def format_date(x, pos):
    # ticks outside the plotted range get an empty label
    if (x < 0) or (x > len(date_tickers)-1):
        return ''
    return date_tickers[int(x)]
ax.xaxis.set_major_formatter(ticker.FuncFormatter(format_date)) # label each tick on the horizontal axis with the trade date at that integer position
plt.xticks(rotation=30) # set the angle of rotation of the date scale
ax.set_ylabel('transaction_price')
plt.title(code)
plt.grid(True) # add grid, optional, just makes the image look better
plt.xlabel('trade date')
plt.show()
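Because the x coordinates are integer positions rather than real dates, weekends and holidays leave no gaps between the k-lines, but the axis then needs a formatter to turn positions back into dates. The following stand-alone sketch isolates that lookup so it can be checked without drawing anything. The sample dates are hypothetical, and make_date_formatter is an illustrative helper, not part of Matplotlib; it applies the same bounds check and int() truncation as format_date above.

```python
def make_date_formatter(date_labels):
    """Build an (x, pos) -> label function usable with ticker.FuncFormatter."""
    def format_date(x, pos=None):
        # Ticks that fall outside the plotted range get an empty label.
        if x < 0 or x > len(date_labels) - 1:
            return ''
        # Truncate the float tick position to an integer index.
        return date_labels[int(x)]
    return format_date


# Hypothetical trading days: the weekend of 7-8 December 2019 is simply absent.
labels = ['20191205', '20191206', '20191209']
fmt = make_date_formatter(labels)
print([fmt(x) for x in (-1, 0, 1.0, 2, 3)])
```

In the program above, the equivalent call would be ax.xaxis.set_major_formatter(ticker.FuncFormatter(make_date_formatter(date_tickers))), which avoids relying on the module-level variable date_tickers inside the formatter.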